
WO2013145512A1 - Management device and distributed processing management method - Google Patents

Management device and distributed processing management method Download PDF

Info

Publication number
WO2013145512A1
WO2013145512A1 (PCT/JP2013/000305)
Authority
WO
WIPO (PCT)
Prior art keywords
data
processing
information
server
vertex
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2013/000305
Other languages
French (fr)
Japanese (ja)
Inventor
Masato Asahara
Shinji Nakadai
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp filed Critical NEC Corp
Publication of WO2013145512A1 publication Critical patent/WO2013145512A1/en


Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04L — TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 — Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/14 — Network analysis or design
    • H04L 41/145 — Network analysis or design involving simulating, designing, planning or modelling of a network

Definitions

  • the present invention relates to a technique for managing distributed processing of data in a distributed system in which data devices for storing data and processing devices for processing the data are distributedly arranged.
  • Patent Document 1 proposes a distributed system that determines a calculation server to process data stored in a plurality of computers, and that determines communication paths for all data by sequentially selecting the nearest available calculation server to the computer storing each piece of data.
  • Japanese Patent Application Laid-Open No. 2004-228688 proposes a technique that avoids reducing the speed of file input / output on the storage side even when a plurality of file transfer requests occur simultaneously.
  • Japanese Patent Application Laid-Open Publication No. 2003-259259 proposes a distributed file system that provides an address space that can collectively manage a group of files stored on a plurality of disks.
  • Patent Document 4 proposes, in order to reduce the network load in a distributed database system, moving a relay server arranged on a certain computer to another computer in consideration of the data transfer time when transferring database data to a client.
  • Patent Document 5 proposes a method of dividing a file according to the line speed and load status of each transfer path through which the file is transferred, and transferring the divided file.
  • Patent Document 6 proposes a stream processing apparatus that determines allocation of resources with high use efficiency in a short time in response to stream input / output requests for which various speeds are specified.
  • Patent Document 7 proposes a method for efficiently using I / O resources by dynamically assigning an I / O node to each job, without stopping execution of the job, in a computer system in which computing nodes access a file system using I / O nodes.
  • Patent Documents 3 and 7 mentioned above merely propose a method for centrally handling data stored in a plurality of data servers and a method for determining the I / O node occupancy necessary for accessing a file system.
  • the present invention has been made in view of the circumstances described above, and provides a technique for reducing the data processing time of an entire distributed system in which data devices that store data and processing devices that process the data are distributedly arranged.
  • the first aspect relates to a management device.
  • the management device according to the first aspect includes a model generation unit that generates model information capable of constructing a conceptual model including: a plurality of first vertices indicating a plurality of data devices that store data; a plurality of second vertices indicating a plurality of processing devices that process the data; a plurality of first sides, each leading from a first vertex to a second vertex, on which transfer amount constraint conditions are set whose upper limits are the data transferable amounts per unit time from each data device to each processing device; and, from each second vertex, at least one second side reaching at least one subsequent third vertex, on which processing amount constraint conditions are set whose upper limits are the data processing capacities per unit time of the processing devices.
  • the management device further includes a determination unit that, using the sum of the data processing amounts per unit time executable under the transfer amount constraint conditions and the processing amount constraint conditions set on the first sides and second sides included in each route on the conceptual model, determines the flow rate of each side on the conceptual model, selects routes on the conceptual model that satisfy the flow rates of the sides, and determines, according to the vertices included in each selected route, a plurality of combinations of a processing device and the data device storing the data to be processed by that processing device.
  • the second aspect relates to a distributed processing management method.
  • in the distributed processing management method according to the second aspect, at least one computer generates model information capable of constructing a conceptual model including: a plurality of first vertices indicating a plurality of data devices that store data; a plurality of second vertices indicating a plurality of processing devices that process the data; a plurality of first sides, each set with a transfer amount constraint condition whose upper limit is the data transferable amount per unit time from each data device to each processing device; and, from each second vertex, at least one second side reaching at least one subsequent third vertex, each set with a processing amount constraint condition whose upper limit is the data processing capacity per unit time of each processing device.
  • using, for each route on the conceptual model, the sum of the data processing amounts per unit time executable under the transfer amount constraint condition and the processing amount constraint condition set on the first side and the second side included in the route, the computer determines the flow rate of each side on the conceptual model, selects routes on the conceptual model that satisfy the flow rates of the sides, and determines, according to the vertices included in each selected route, a plurality of combinations of a processing device and the data device storing the data processed by that processing device.
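As a rough illustration of the model information described above, the conceptual model can be represented as a capacitated directed graph: the first sides carry the transfer amount constraints, and the second sides carry the processing amount constraints. The function name `build_model`, the sink vertex name `SINK`, and the nested-dict representation below are assumptions made for illustration, not taken from this publication.

```python
# Sketch: the conceptual model as a capacitated directed graph.
# build_model, SINK, and the {u: {v: capacity}} representation are assumptions.

SINK = "t"  # a third vertex placed after the processing-device vertices

def build_model(transferable, processable):
    """transferable: {(data_device, processing_device): MB/s upper limit}
                     -> first sides with transfer amount constraints
    processable:  {processing_device: MB/s upper limit}
                     -> second sides with processing amount constraints
    Returns edge capacities as {u: {v: capacity}}."""
    cap = {}
    for (d, p), bw in transferable.items():
        cap.setdefault(d, {})[p] = bw       # first side: data -> processing
    for p, rate in processable.items():
        cap.setdefault(p, {})[SINK] = rate  # second side: processing -> sink
    return cap
```

A determination unit can then compute the flow rate of each side on this graph, for example by solving a maximum flow problem as Example 5 of this publication does.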
  • another aspect of the present invention may be a management program that causes at least one computer to implement each configuration of the first aspect, or a computer-readable recording medium storing such a program.
  • This recording medium includes a non-transitory tangible medium.
  • A flowchart showing the detailed operation of the master server of the first embodiment in step (S404-10).
  • A flowchart showing the detailed operation of the master server of the first embodiment in step (S404-20).
  • A flowchart showing the detailed operation of the master server in the first modified example of the second embodiment regarding step (S404-20).
  • A diagram showing the information stored in the input / output communication path information storage unit in Example 1.
  • A diagram showing the information stored in the data location storage unit in Example 1.
  • A diagram showing the model information generated in Example 1.
  • A diagram showing the conceptual model constructed from the model information in Example 1.
  • A diagram showing the information stored in the job information storage unit in Example 2.
  • A diagram showing the information stored in the server state storage unit in Example 2.
  • A diagram showing the information stored in the data location storage unit in Example 2.
  • A diagram showing the model information generated in Example 2.
  • A diagram conceptually showing the data transmission and reception performed in Example 2.
  • A diagram showing the information stored in the data location storage unit in Example 3.
  • A diagram showing the model information generated in Example 3.
  • A diagram showing the conceptual model constructed from the model information in Example 3.
  • A diagram conceptually showing the configuration of the distributed system in Example 4.
  • A diagram showing the information stored in the server state storage unit in Example 4.
  • A diagram showing the information stored in the input / output communication path information storage unit in Example 4.
  • A diagram showing the model information generated in Example 4.
  • FIG. 62 is a diagram showing the conceptual model constructed from the model information shown in FIG. 61.
  • A diagram showing the information stored in the job information storage unit in Example 5.
  • FIG. 68 is a diagram showing the conceptual model constructed from the model information shown in FIG. 67.
  • Diagrams conceptually showing the determination process of the flow function f by the flow augmenting method in the maximum flow problem in Example 5, and the determination process of the data flow information.
  • A diagram conceptually showing the data transmission and reception performed in Example 5.
  • A diagram showing the information stored in the server state storage unit in Example 6.
  • A diagram showing the model information generated in Example 6.
  • A diagram showing the conceptual model constructed from the model information in Example 6.
  • FIG. 1A is a diagram conceptually illustrating a configuration example of a distributed system in the first embodiment.
  • a configuration overview and an operation overview of the distributed system 350 in the first embodiment, and differences between the first embodiment and related technologies will be described with reference to FIG. 1A.
  • the distributed system 350 includes a master server 300, a network switch 320, a plurality of processing servers 330 # 1 to 330 # n, a plurality of data servers 340 # 1 to 340 # n, and the like that are connected to each other by a network 370.
  • the distributed system 350 may include a client 360, another server 399, and the like.
  • the data servers 340 # 1 to 340 # n may be collectively referred to as the data server 340
  • the processing servers 330 # 1 to 330 # n may be collectively referred to as the processing server 330.
  • the data server 340 stores data that can be processed by the processing server 330.
  • the processing server 330 receives the data from the data server 340 and processes the received data by executing a processing program.
  • the client 360 transmits request information that is information for requesting the master server 300 to start data processing.
  • the request information includes information indicating a processing program and data used by the processing program.
  • the master server 300 determines a processing server 330 for processing one or more of the data stored in the data server 340, for each piece of data. For each determined processing server 330, the master server 300 generates determination information that includes information indicating the data to be processed and the data server 340 storing that data, and information indicating the data processing amount per unit time. The data server 340 and the processing server 330 transmit and receive data based on the determination information, and the processing server 330 processes the received data.
  • the master server 300, the processing server 330, the data server 340, and the client 360 may be individually realized by dedicated devices or may be realized by general-purpose computers.
  • a plurality of the master server 300, the processing server 330, the data server 340, and the client 360 may be realized by one dedicated device or one computer.
  • one dedicated device or one computer as hardware is collectively referred to as one computer device.
  • the processing server 330 and the data server 340 are realized by the one computer device.
  • the single computer device includes, for example, a CPU (Central Processing Unit), a memory, an input / output interface (I / F), and the like that are connected to each other via a bus.
  • the memory is a RAM (Random Access Memory), a ROM (Read Only Memory), a hard disk, a portable storage medium, or the like.
  • the input / output I / F is connected to a communication device or the like that communicates with other devices via the network 370.
  • the input / output I / F may be connected to a user interface device such as a display device or an input device. Note that this embodiment does not limit the hardware configurations of the master server 300, the processing server 330, the data server 340, and the client 360.
  • FIG. 1B, FIG. 2A, and FIG. 2B are diagrams conceptually illustrating each configuration example of the distributed system 350.
  • the processing server 330 and the data server 340 are represented as computers, and the network 370 is represented as a data transmission / reception path via a switch.
  • the master server 300 is not shown.
  • the switch is a network device such as a hub or a router.
  • the distributed system 350 includes, for example, a plurality of computers 111 and 112 and switches 101 to 103 that connect them to each other.
  • a plurality of computers 111 and switches 102 are accommodated in a rack 121
  • a plurality of computers 112 and switches 103 are accommodated in a rack 122.
  • the racks 121 and 122 are accommodated in the data center 131, and the data centers 131 and 132 are connected by the inter-base communication network 141.
  • FIG. 1B illustrates a distributed system 350 in which switches and computers are connected in a star configuration.
  • FIGS. 2A and 2B illustrate a distributed system 350 configured by cascade-connected switches.
  • 2A and 2B show examples of data transmission / reception between the data server 340 and the processing server 330, respectively.
  • the computers 207 to 210 function as the data server 340, and the computers 207 and 209 also function as the processing server 330.
  • the computer 210 functions as the master server 300.
  • the unusable computer 208 stores processing target data 211 and 212 in the storage disk 204.
  • the unusable computer 210 stores the processing target data 213 in the storage disk 206.
  • the available computer 207 executes the processing process 214, and the available computer 209 executes the processing process 215.
  • FIG. 3 is a diagram showing the transferable amount per unit time between computers. Table 220 shown in FIG. 3 lists the transferable amount per unit time when transferring the above processing target data to another computer. In this example, it is assumed that each processing process can process its allocated data in parallel.
  • the transferable amount per unit time between the computer 208 and the computer 207 is 50 megabytes per second (MB/s)
  • the transferable amount between the computer 208 and the computer 209 is 50 MB/s
  • the transferable amount between the computer 210 and the computer 207 is 50 MB/s
  • the transferable amount between the computer 210 and the computer 209 is 100 MB/s
  • FIG. 4 is a diagram showing the processable amount per unit time of each computer. According to the table 221 shown in FIG. 4, the processable amount per unit time for the computer 207 to process the processing target data is 50 MB/s, and the processable amount for the computer 209 is 150 MB/s.
  • the throughput of the data processing is the smaller of the transferable amount per unit time of the path that transfers the processing target data and the processable amount per unit time of the computer that performs the processing.
  • in FIG. 2A, the processing target data 211 is transmitted via the data transfer path 216 and processed by the available computer 207, and the processing target data 213 is transmitted via the data transfer path 217 and processed by the available computer 209.
  • in FIG. 2B, the processing target data 211 is transmitted via the data transfer path 230 and processed by the available computer 207, and the processing target data 212 is transmitted via the data transfer path 231 and processed by the available computer 209.
  • the data 213 to be processed is transmitted through the data transfer path 232 and processed by the available computer 209.
  • the total throughput of data processing in FIG. 2A is 150 MB / s, which is the sum of the throughput (50 MB / s) related to the data 211 to be processed and the throughput (100 MB / s) related to the data 213 to be processed.
  • the throughput (50 MB/s) for the processing target data 211 is the smaller of the transferable amount of the data transfer path 216 (50 MB/s) and the processable amount of the computer 207 (50 MB/s).
  • the throughput (100 MB/s) for the processing target data 213 is the smaller of the transferable amount of the data transfer path 217 (100 MB/s) and the processable amount of the computer 209 (150 MB/s).
  • the total throughput of the data processing in FIG. 2B is 200 MB/s, the sum of the throughput for the processing target data 211 (50 MB/s), the throughput for the processing target data 212 (50 MB/s), and the throughput for the processing target data 213 (100 MB/s).
  • the throughput (50 MB/s) for the processing target data 211 is the smaller of the transferable amount of the data transfer path 230 (50 MB/s) and the processable amount of the computer 207 (50 MB/s).
  • the throughput (50 MB/s) for the processing target data 212 is the smaller of the transferable amount of the data transfer path 231 (50 MB/s) and the processable amount of the computer 209 (150 MB/s).
  • the throughput (100 MB/s) for the processing target data 213 is the smaller of the transferable amount of the data transfer path 232 (100 MB/s) and the processable amount of the computer 209 (150 MB/s).
  • the data processing in FIG. 2B has a higher total throughput and is more efficient than the data processing in FIG. 2A.
  • the distributed system 350 in the first embodiment performs efficient data allocation as shown in FIG. 2B in the situation illustrated in FIGS. 2A and 2B.
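Example 5 of this publication determines flows by a flow augmenting method for the maximum flow problem. As a hedged sketch of that idea (the vertex names, the Edmonds-Karp variant of the augmenting method, and the graph layout are illustrative assumptions, not the patent's implementation), the FIG. 2B situation can be modeled as a flow network whose maximum flow equals the 200 MB/s total throughput derived above:

```python
from collections import deque

# Transferable amounts between computers (FIG. 3) and processable amounts
# (FIG. 4) modeled as edge capacities; "s"/"t" and the "cNNN" vertex
# names are assumptions for illustration.
INF = float("inf")
CAPACITIES = {
    ("s", "c208"): INF, ("s", "c210"): INF,     # source to data-holding computers
    ("c208", "c207"): 50, ("c208", "c209"): 50,  # transfer amount constraints
    ("c210", "c207"): 50, ("c210", "c209"): 100,
    ("c207", "t"): 50, ("c209", "t"): 150,       # processing amount constraints
}

def max_flow(capacities, s, t):
    """Edmonds-Karp: repeatedly augment along shortest residual paths."""
    res = {}
    for (u, v), c in capacities.items():
        res.setdefault(u, {})[v] = res.setdefault(u, {}).get(v, 0) + c
        res.setdefault(v, {}).setdefault(u, 0)   # reverse residual edge
    total = 0
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:             # BFS for an augmenting path
            u = q.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return total
        bottleneck, v = INF, t                   # bottleneck capacity on the path
        while parent[v] is not None:
            bottleneck = min(bottleneck, res[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:             # push flow, update residuals
            res[parent[v]][v] -= bottleneck
            res[v][parent[v]] += bottleneck
            v = parent[v]
        total += bottleneck
```

Running `max_flow(CAPACITIES, "s", "t")` yields 200, matching the total throughput of the FIG. 2B allocation; the paths carrying positive flow correspond to the data-server-to-processing-server combinations.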
  • the details of the distributed system 350 in the first embodiment will be described below with reference to FIGS. 2A and 2B.
  • FIG. 5 is a diagram conceptually illustrating a processing configuration example of each device of the distributed system 350 in the first embodiment.
  • each of these processing units may be realized individually or in combination, as a hardware component, a software component, or a combination of a hardware component and a software component.
  • a hardware component is a hardware circuit such as a field programmable gate array (FPGA), an application specific integrated circuit (ASIC), a gate array, a combination of logic gates, a signal processing circuit, or an analog circuit.
  • a software component is realized by one or more processors (for example, a CPU (Central Processing Unit) or a DSP (Digital Signal Processor)) executing a program held in one or more memories.
  • the processing server 330 includes a processing server management unit 331, a processing execution unit 332, a processing program storage unit 333, a data transmission / reception unit 334, and the like.
  • the processing server management unit 331 holds information regarding the execution state of the processing program used when the processing execution unit 332 processes data.
  • the processing server management unit 331 updates the information regarding the execution state of the processing program according to the change in the execution state of the processing program.
  • the execution state of the processing program includes, for example, a pre-execution state, a running state, and an execution completion state.
  • the pre-execution state indicates a state where the process of assigning data to the process execution unit 332 has been completed, but the process execution unit 332 has not yet executed the process of the data.
  • the in-execution state indicates a state in which the process execution unit 332 is executing the data.
  • the execution completion state indicates a state in which the process execution unit 332 has completed processing the data.
  • as the execution state of the processing program, a state determined based on the ratio of the amount of data already processed by the processing execution unit 332 to the total amount of data allocated to the processing execution unit 332 may also be used.
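As a small sketch of the ratio-based alternative just described (the function name and state labels are assumptions for illustration, not this publication's API):

```python
# Derive an execution state from the ratio of processed data to allocated data.
# Function name and state labels are illustrative assumptions.

def execution_state(processed_amount, allocated_amount):
    if processed_amount <= 0:
        return "pre-execution"        # data assigned, processing not yet started
    if processed_amount < allocated_amount:
        return "running"              # part of the allocated data is processed
    return "execution-completed"      # all allocated data has been processed
```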
  • the data transmission / reception unit 334 transmits / receives data to / from another processing server 330 or the data server 340.
  • the processing target data is sent from the data server 340 designated by the master server 300, via the data transmission / reception unit 343 of the data server 340 and the data transmission / reception unit 322 of the network switch 320, to the processing server 330, where the data transmission / reception unit 334 receives the data.
  • the process execution unit 332 of the process server 330 processes the received data to be processed.
  • the processing server 330 may directly acquire processing target data from the processing data storage unit 342.
  • the data transmission / reception unit 343 of the data server 340 and the data transmission / reception unit 334 of the processing server 330 may directly communicate without passing through the data transmission / reception unit 322 of the network switch 320.
  • the data server 340 includes a data server management unit 341, a processing data storage unit 342, a data transmission / reception unit 343, and the like.
  • the data server management unit 341 transmits location information of data stored in the processing data storage unit 342 to the master server 300.
  • the processing data storage unit 342 stores data uniquely identified in the distributed system 350.
  • the processing data storage unit 342 is realized by, for example, a hard disk drive (HDD), a solid state drive (SSD), a USB memory (Universal Serial Bus flash drive), or a RAM (Random Access Memory) disk.
  • the data stored in the processing data storage unit 342 may be data output by the processing server 330 or data being output.
  • the data stored in the processing data storage unit 342 may be received by the processing data storage unit 342 from another server or the like, or may be read by the processing data storage unit 342 from a portable storage medium or the like.
  • the data transmission / reception unit 343 transmits / receives data to / from another processing server 330 or another data server 340.
  • the network switch 320 has a data transmission / reception unit 322.
  • the data transmission / reception unit 322 relays data transmitted / received between the processing server 330 and the data server 340.
  • the master server 300 includes a data location storage unit 3070, a server state storage unit 3060, an input / output channel information storage unit 3080, a model generation unit 301, a determination unit 303, and the like.
  • Data may be explicitly specified by an identification name in a structure program that defines the structure of a directory or data, or may be specified based on other processing results such as an output result of a specified processing program.
  • the structure program is information that defines data to be processed by the processing program.
  • the structure program receives information (name or identifier) indicating certain data as an input, and outputs a directory name in which data corresponding to the input is stored and a file name indicating a file constituting the data.
  • the structure program may be a list of directory names or file names.
  • for example, when the processing program receives a file as an argument, the data is each distributed file
  • when the unit of information received as an argument by the processing program is a row or a record, the data is a plurality of rows or a plurality of records in the distributed file
  • when the unit of information received as an argument by the processing program is a “row” of a table in a relational database, the data is a set of rows obtained by a predetermined search from a set of tables, or a set of rows obtained by a range search on a certain attribute from that set
  • the data may be a container, such as a Map or Vector in a program in C++ or JAVA (registered trademark), or an element of such a container. Furthermore, the data may be a matrix, or a row, column, or element of a matrix.
  • the data to be processed is determined by registering one or more data identifiers in the data location storage unit 3070.
  • the name of the data to be processed is stored in the data location storage unit 3070 in association with the identifier of the data and the identifier of the data storage device.
  • Each data may be divided into a plurality of subsets (partial data), and the plurality of subsets may be distributed in a plurality of storage devices. Further, certain data may be multiplexed and arranged in two or more storage devices. In this case, data multiplexed from one data is also collectively referred to as distributed data.
  • the processing server 330 may input any one of the distributed data as the processing data in order to process the multiplexed data.
  • FIG. 6 is a diagram illustrating an example of information stored in the data location storage unit 3070.
  • the data location storage unit 3070 stores a plurality of data location information entries, each of which associates a data name 3071 (or a partial data name 3077) with a distributed form 3073 and a data description 3074.
  • the distribution form 3073 is information indicating a data storage form.
  • for data stored in a single location (for example, MyDataSet1), “single” is set in the distributed form 3073 of the corresponding row (data location information)
  • for data arranged in a distributed manner (for example, MyDataSet2), “distributed arrangement” is set in the distributed form 3073 of the corresponding row (data location information)
  • for multiplexed data (for example, MyDataSet3), “n-duplication (1/n)” (n is an integer of 2 or more) is set in the distributed form 3073 of the corresponding row (data location information)
  • the data description 3074 includes a data identifier 3075, an identifier 3076 of a data storage device (data server 340 or its processing data storage unit 342), and a processing status 3078.
  • the data identifier 3075 is an identifier that uniquely indicates the data in each data storage device.
  • the information specified by the data identifier 3075 is determined according to the type of target data. For example, when the data is a file, the data identifier 3075 is information specifying a file name. When the data is a database record, the data identifier 3075 may be information specifying the SQL (Structured Query Language) statement for extracting the record.
  • the storage device identifier 3076 is an identifier of the data server 340 or the processing data storage unit 342 for storing each data.
  • the identifier 3076 may be unique information in the distributed system 350, or may be an IP (Internet Protocol) address assigned to each device.
  • the processing status 3078 is information indicating the processing status of the data specified by the data identifier 3075.
  • in the processing status 3078, “unprocessed” is set to indicate that none of the data has been processed, “processing” is set to indicate that the data is being processed by the processing server 330, and “processed” is set to indicate that all of the data has been processed.
  • the processing status 3078 may be information indicating the progress of processing the data (for example, unprocessed after the 50th MB). Further, in the case of multiplexing or the like, when the processing statuses of the data indicated by all the data identifiers are equal, they may be described together.
  • the processing status 3078 is updated by the master server 300 or the like according to the progress status of processing by the processing server 330.
  • the data location storage unit 3070 stores each of the partial data names 3077 as the data name 3071 in association with the distribution form 3073 and the data description 3074 (for example, the fifth line in FIG. 6).
  • for multiplexed data (for example, SubSet1), the data name 3071 is stored in the data location storage unit 3070 in association with a distribution form 3073 and a data description 3074 for each multiplexed copy included in the data.
  • the data description 3074 includes an identifier 3076 of a storage device that stores the multiplexed data and an identifier (data identifier 3075) that uniquely indicates the data in the storage device.
  • information in each row (each data location information) of the data location storage unit 3070 is deleted by the master server 300, the processing server 330, or the data server 340 when processing of the corresponding data is completed. Alternatively, instead of deleting each row (each data location information), information indicating whether data processing is complete or incomplete may be added to each row to record the completion of data processing.
  • the data location storage unit 3070 may not include the distributed form 3073.
  • in that case, the type of data distribution is fixed to one of the above-described types.
  • the master server 300, the data server 340, and the processing server 330 may switch processes described below based on the description of the distributed form 3073.
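A minimal sketch of one data location record, mirroring the fields of FIG. 6; the dataclass and field names below are assumptions for illustration, not this publication's data format.

```python
from dataclasses import dataclass

@dataclass
class DataDescription:           # data description 3074
    data_identifier: str         # 3075: unique within the storage device
    storage_device_id: str       # 3076: data server / processing data storage unit
    processing_status: str       # 3078: "unprocessed" / "processing" / "processed"

@dataclass
class DataLocation:
    data_name: str               # 3071 (or partial data name 3077)
    distributed_form: str        # 3073: e.g. "single", "distributed", "n-duplication (1/n)"
    descriptions: list           # one DataDescription per stored copy or part

# Example row corresponding to singly-stored data.
record = DataLocation(
    data_name="MyDataSet1",
    distributed_form="single",
    descriptions=[DataDescription("file-0001", "dataserver-01", "unprocessed")],
)
```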
  • FIG. 7 is a diagram illustrating an example of information stored in the input / output communication path information storage unit 3080.
  • the input / output communication path information storage unit 3080 stores, for each input / output communication path constituting the distributed system 350, input / output communication path information that associates a communication path ID 3081, a usable bandwidth 3082, an input source apparatus ID 3083, and an output destination apparatus ID 3084.
  • the communication path ID 3081 is an identifier of an input / output communication path between devices in which input / output communication occurs.
  • the available bandwidth 3082 is bandwidth information currently available on the input / output communication path. The available bandwidth generally indicates the amount of data that can be transferred per unit time.
  • the band information may be an actual measurement value or an estimated value.
  • the input source device ID 3083 is an identifier of a device that inputs data to the input / output communication path.
  • the output destination device ID 3084 is an identifier of a device to which the input / output communication path outputs data.
  • The device identifiers indicated by the input source device ID 3083 and the output destination device ID 3084 may be unique identifiers in the distributed system 350 assigned to the data server 340, the processing server 330, the network switch 320, the processing data storage unit 342, and the like, or may be IP addresses assigned to each device.
  • the input / output communication path may be a communication path between the data transmission / reception unit 343 of the data server 340 and the data transmission / reception unit 334 of the processing server 330, or the processing data storage unit 342 and the data transmission / reception unit 343 in the data server 340. Or a communication path between the data transmission / reception unit 343 of the data server 340 and the data transmission / reception unit 322 of the network switch 320.
  • the input / output communication path may be a communication path between the data transmission / reception unit 322 of the network switch 320 and the data transmission / reception unit 334 of the processing server 330, or a communication path between the data transmission / reception units 322 of two network switches 320.
  • A communication path inside a single device is also included in the input / output communication paths.
  • such an input / output communication path is also simply referred to as a communication path.
  • FIG. 8 is a diagram illustrating an example of information stored in the server state storage unit 3060.
  • The server status storage unit 3060 stores, for each processing server 330 and each data server 340 operating in the distributed system 350, processing server status information that associates a server ID 3061, load information 3062, configuration information 3063, processing data storage unit information 3064, and processable amount information 3065 with one another.
  • the server ID 3061 is an identifier of the processing server 330 or the data server 340.
  • the identifiers of the processing server 330 and the data server 340 may be unique identifiers in the distributed system 350, or may be IP addresses assigned to them.
  • the load information 3062 includes information regarding the processing load of the processing server 330 or the data server 340.
  • the load information 3062 is, for example, a CPU usage rate, a memory usage amount, a network usage band, and the like.
  • Configuration information 3063 includes configuration status information of the processing server 330 or the data server 340.
  • the configuration information 3063 is, for example, hardware specifications such as the CPU frequency, the number of cores, and the memory amount in the processing server 330, and software specifications such as an OS (Operating System).
  • the processing data storage unit information 3064 includes an identifier of the processing data storage unit 342 included in the data server 340.
  • the processable amount information 3065 indicates the amount of data that can be processed by the processing server 330 per unit time.
  • Information stored in the server status storage unit 3060, the data location storage unit 3070, and the input / output communication path information storage unit 3080 may be updated by status notifications transmitted from the switch 320, the processing server 330, the data server 340, and the like.
  • Alternatively, the master server 300 may update this information with response information obtained through inquiries to those devices.
  • The switch 320 generates information indicating the communication throughput of each of its ports and the identifier (MAC (Media Access Control) address, IP address, etc.) of the connection destination device of each port, and transmits the generated information to the master server 300 as the status notification.
  • The information stored in the server status storage unit 3060, the data location storage unit 3070, and the input / output communication path information storage unit 3080 is updated based on the information sent as the status notification.
  • the processing server 330 generates information indicating the throughput of the network interface, information indicating the allocation status of the processing target data to the processing execution unit 332, and information indicating the usage status of the processing execution unit 332.
  • the generated information may be transmitted to the master server 300 as the state notification.
  • the data server 340 generates information indicating the throughput of its own processing data storage unit 342 (disk) or network interface, and information indicating a list of data elements stored in the data server 340. Information may be transmitted to the master server 300 as the status notification.
  • the master server 300 may receive the status notification as described above by transmitting information requesting the status notification as described above to the switch 320, the processing server 330, and the data server 340.
  • Information stored in the server state storage unit 3060, the data location storage unit 3070, and the input / output communication path information storage unit 3080 may be given in advance by the administrator of the client 360 or the distributed system 350. Also, these pieces of information may be collected by a program such as a crawler that searches the distributed system 350. Further, the input / output communication path information storage unit 3080 and the data location storage unit 3070 may be provided in a distributed device by a technique such as a distributed hash table.
  • The model information includes information indicating each communication path from each data server 340 to each processing server 330, a transfer amount constraint condition whose upper limit value is the data transferable amount per unit time from each data server 340 to each processing server 330, and a processing amount constraint condition whose upper limit value is the data processing capacity per unit time of each processing server 330.
  • FIG. 9A is a diagram showing an example of model information.
  • Each line (each entry) of the model information 500 includes an identifier, a lower limit value of the flow rate, an upper limit value of the flow rate, and a pointer to the next element.
  • the identifier is information for specifying a node included in the model.
  • logical software elements may be set in the nodes included in the model.
  • In this example, the data server 340 and the processing server 330 are allocated as nodes indicating hardware elements, but a storage device (data device) included in the data server 340, or a processing device such as a CPU included in the processing server 330, may also be allocated. The logical elements will be described later.
  • In the pointer to the next element, an identifier indicating another node connected from the node indicated by the corresponding identifier is set.
  • the pointer to the next element may be set with a line number that can identify each line or memory address information.
  • the transfer amount restriction condition or the processing amount restriction condition is set in the flow rate lower limit value and the flow rate upper limit value.
  • From the model information, the following conceptual model, represented by a plurality of vertices (nodes) and a plurality of edges connecting the vertices, can be constructed.
  • This conceptual model is called a network model, or a directed graph in graph theory.
  • Each vertex corresponds to each node in the model information of FIG. 9A.
  • Each side corresponds to an input / output communication path (communication path) that connects the hardware elements indicated by each vertex, or a process for target data itself.
  • an available bandwidth of the input / output communication path is set as the transfer amount restriction condition.
  • the processing amount restriction condition is set for each side indicating the processing for the target data itself.
  • the input / output communication path is indicated by a subgraph composed of sides and nodes that are end points of the sides.
  • FIG. 9B is a diagram showing an example of a conceptual model constructed by model information.
  • the side connecting the data server D and the processing server P has the available bandwidth of the corresponding communication path as an attribute value (transfer amount constraint condition).
  • the edge connecting the processing server P and the logical vertex ω has the possible processing amount per unit time of the processing server P as an attribute value (processing amount constraint condition).
  • An edge for which there is no restriction on the usable bandwidth or the possible processing amount is treated as having no constraint condition, that is, as if the usable bandwidth or the possible processing amount were infinite.
  • the available bandwidth and the possible processing amount on the side where there is no such constraint condition may be treated as special values other than infinity.
  • a plurality of vertices and edges may exist before the vertex indicating the data server D, between the data server D and the processing server P, and between the processing server P and the vertex ω.
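The conceptual model of FIG. 9B can be sketched as a small graph structure. The sketch below is illustrative only; the helper name `add_edge` and the concrete capacity values are assumptions, not part of the embodiment.

```python
# Illustrative sketch of the conceptual model in FIG. 9B: each directed edge
# carries a flow lower limit and upper limit (the transfer amount or processing
# amount constraint condition). Edges without a constraint use float("inf").
graph = {}  # vertex -> list of (next_vertex, lower_limit, upper_limit)

def add_edge(graph, src, dst, lower=0.0, upper=float("inf")):
    """Register a directed edge src -> dst with flow bounds [lower, upper]."""
    graph.setdefault(src, []).append((dst, lower, upper))

# Data server D1 -> processing servers, bounded by the usable bandwidth
add_edge(graph, "D1", "P1", upper=100.0)   # assumed 100 units/s usable bandwidth
add_edge(graph, "D1", "P2", upper=50.0)
# Processing server -> logical vertex omega, bounded by the possible processing amount
add_edge(graph, "P1", "omega", upper=80.0)
add_edge(graph, "P2", "omega", upper=60.0)
```

Each row of the model information 500 (identifier, flow limits, pointer to the next element) corresponds to one such edge.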
  • the job executed in the distributed system 350 is a unit of program processing requested to be executed by the distributed system 350, for example.
  • the form of the model information is not limited to the form of reference numeral 500 in FIG. 9A.
  • the model information may be realized by a linked list in which data field groups storing vertex and edge information are linked by reference.
  • the model generation unit 301 may change the generation method of the model information according to the operation state of the hardware element. For example, the model generation unit 301 may determine that a processing server 330 with a high CPU usage rate cannot be used, and exclude such processing server 330 from the model information target.
  • For each route on the conceptual model that includes the first edge (connecting the vertex indicating the data server 340 to the vertex indicating the processing server 330) and the second edge (connecting the vertex indicating the processing server 330 to the third, logical vertex), the flow rate of each edge on the conceptual model is determined in accordance with the transfer amount constraint condition and the processing amount constraint condition set for the first edge and the second edge included in that route.
  • Each path in the conceptual model has a data flow from when certain data to be processed is sent from the processing data storage unit 342 of the data server 340 toward the processing server 330 until it is processed by the processing server 330. Indicates.
  • the flow rate of each side is determined so that, for example, the total amount of data processing per unit time for each route on the conceptual model is maximized.
  • Alternatively, the multiple paths on the conceptual model may first be narrowed down to the paths that minimize the number of edges (number of communication hops) included in each path, and the flow rate of each edge may then be determined so that the total data processing amount per unit time for the narrowed-down paths is maximized.
  • The determination unit 303 selects each path on the conceptual model that satisfies the flow rate of each edge determined in this way, and determines, according to the vertices included in each selected path, a plurality of combinations of a processing server 330 and a data server 340 that stores the data to be processed by that processing server 330.
  • the information including the routes including the first side and the second side in the conceptual model and the flow rate of each route may be referred to as data flow Fi or data flow information.
  • the determination unit 303 generates such data flow information.
  • the flow rate of each side can also be expressed as a flow rate function f (e) that satisfies the following constraint expression on all sides e on the conceptual model.
  • Constraint expression: l(e) ≤ f(e) ≤ u(e). Here, u(e) represents an upper limit capacity function that outputs the upper limit value (the flow rate upper limit value of the model information) of the transfer amount constraint condition or the processing amount constraint condition set for each edge e, and l(e) represents the lower limit capacity function that outputs the lower limit value (the flow rate lower limit value of the model information) of the transfer amount constraint condition or the processing amount constraint condition set for each edge e.
  • the determination unit 303 determines the flow function f that minimizes the processing time of the job executed in the distributed system 350.
  • The flow function f can be determined, for example, by maximizing the objective function Σe∈E′ f(e) for a certain edge set E′. Maximization of the objective function can be realized by using a linear programming method, a flow increasing method in a maximum flow problem, a preflow push method, or the like.
  • the determination unit 303 may add logical vertices and edges to the conceptual model constructed from the model information in order to determine the flow function f.
  • For example, the determination unit 303 may add to the conceptual model a logical start point, a set of edges connecting the start point to each vertex indicating a data server 340, a logical end point, and a set of edges connecting each vertex indicating a processing server 330 to the end point.
  • Such logical vertices and edges may be included in the model information by the model generation unit 301.
  • E represents the set of edges constituting the network model.
  • V represents the set of vertices constituting the network model.
  • f(e) represents the flow function of edge e.
  • s represents the start point of the network model.
  • t represents the end point of the network model.
  • δ− indicates the set of edges that exit from a certain vertex.
  • δ+ indicates the set of edges that enter a certain vertex.
  • u(e) is the upper-constraint capacity function that outputs the flow rate upper limit value of edge e.
  • l(e) is the lower-constraint capacity function that outputs the flow rate lower limit value of edge e.
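The symbol list above corresponds to a standard maximum-flow formulation with upper and lower capacity bounds. A plausible reconstruction (the exact formulation of the original is not reproduced here) is:

```latex
\begin{aligned}
\text{maximize} \quad & \textstyle\sum_{e \in \delta^{-}(s)} f(e) \\
\text{subject to} \quad & l(e) \le f(e) \le u(e) && \forall e \in E, \\
& \textstyle\sum_{e \in \delta^{-}(v)} f(e) = \sum_{e \in \delta^{+}(v)} f(e) && \forall v \in V \setminus \{s, t\}.
\end{aligned}
```

The second constraint is flow conservation at every vertex other than the start point s and the end point t.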
  • the determination unit 303 may add a constraint condition so that the determined data flow information can be executed by the distributed system 350.
  • For model information that does not include any flow in which the correspondence between the transferred data and the processed data is inconsistent, the flow function f may be determined without adding such a constraint condition.
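The flow increasing method in the maximum flow problem mentioned above can be sketched as follows. This is an illustrative Edmonds-Karp-style implementation, assuming all lower limits l(e) are zero; the function and variable names are my own choices, not from the embodiment.

```python
from collections import deque

def max_flow(capacity, s, t):
    """Flow increasing method: repeatedly find an augmenting path from s to t
    by breadth-first search and push the bottleneck capacity along it.
    `capacity` maps (u, v) -> edge capacity; lower limits are assumed to be 0."""
    cap = dict(capacity)
    adj = {}
    for (u, v) in list(cap):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
        cap.setdefault((v, u), 0)          # reverse edges for residual capacity
    flow = {e: 0 for e in cap}
    total = 0
    while True:
        parent = {s: None}                 # BFS for an augmenting path
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and cap[(u, v)] - flow[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:                # no augmenting path remains
            return total, flow
        path, v = [], t                    # walk parents back to s
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        aug = min(cap[e] - flow[e] for e in path)   # bottleneck on the path
        for (u, v) in path:
            flow[(u, v)] += aug
            flow[(v, u)] -= aug
        total += aug
```

For the FIG. 9B-style graph with a logical start point s and end point t, the resulting `total` is bounded by both the usable bandwidths and the possible processing amounts, as the constraint expression requires.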
  • FIG. 10 is a diagram showing an example of data flow information.
  • the data flow information includes route information and information on the flow rate of the route.
  • the processing amount per unit time (unit processing amount) is set as the flow rate of the route.
  • the route information is indicated by information on each vertex (data server D1, processing server P1, and logical vertex ω) included in the route.
  • In the data flow information, an identifier (Flow1) for specifying the data flow Fi is set.
  • Based on the data flow information generated as described above, the determination unit 303 generates determination information indicating a combination of the processing server 330 and the data server 340 from which the processing target data of the processing server 330 is acquired.
  • the generated determination information is acquired by each processing server 330 included in the combination.
  • FIG. 11 is a diagram showing an example of decision information.
  • the determination information includes a data server ID, a processing data storage unit ID, a data ID, received data specifying information, and a data processing amount per unit time.
  • the data server ID is an identifier of the data server 340 that stores data to be processed by the processing server 330.
  • the processing data storage unit ID is an identifier of the processing data storage unit 342 of the data server 340.
  • the data ID is the identifier of the data to be processed.
  • the data ID may not be included in the decision information.
  • The processing server 330 may acquire the processing target data of the job from among the data stored in the processing data storage unit 342 of the data server 340 specified by the data server ID and the processing data storage unit ID.
  • The received data identification information is set when the processing target data stored in a certain processing data storage unit 342 of a certain data server 340 is processed by a plurality of processing servers 330, or when the processing target data stored in multiplexed form across a plurality of data servers 340 is processed by a plurality of processing servers 330.
  • In the received data specifying information, for example, information specifying a predetermined section in the data (for example, the start position of the section and the processing amount) is set. In cases other than those described above, the received data specifying information may not be set in the determination information.
  • the data processing amount per unit time that can be included in the decision information is set based on the unit processing amount included in the data flow information.
  • The processing server 330 requests the data server 340 to transfer the data specified by the determination information at the data processing amount per unit time. If the data processing amount per unit time is not included in the determination information, the processing server 330 may request the data server 340 to transfer at an arbitrary processing amount.
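As one illustration, the determination information of FIG. 11 and the transfer request derived from it could be represented as simple records. All field names below are hypothetical, chosen for illustration only; the embodiment does not prescribe a concrete encoding.

```python
# Hypothetical record mirroring the determination information of FIG. 11.
determination_info = {
    "data_server_id": "DS-340-1",
    "processing_data_storage_unit_id": "PDS-342-1",
    "data_id": "data-0001",                            # may be omitted
    "received_data_spec": {"start": 0, "amount": 64},  # section start / amount
    "unit_processing_amount": 80,                      # data processing amount per unit time
}

def transfer_request(info):
    """Build the transfer request a processing server 330 would send to the
    data server 340 specified by the determination information."""
    return {
        "to": info["data_server_id"],
        "storage_unit": info["processing_data_storage_unit_id"],
        "data": info.get("data_id"),                   # None: select by storage unit
        "rate_per_unit_time": info.get("unit_processing_amount"),  # None: arbitrary
    }
```

When `unit_processing_amount` is absent, `rate_per_unit_time` is `None`, corresponding to a transfer at an arbitrary processing amount.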
  • the determination information may be acquired by each data server 340 included in the combination.
  • the determination information may include the processing server ID, which is the identifier of the processing server 330, instead of the data server ID.
  • the determination unit 303 may distribute the processing program received from the client 360 to the processing server 330, for example.
  • The determination unit 303 inquires of the processing server 330 whether or not the processing program corresponding to the determination information is stored, and when it is determined that the processing server 330 does not store the processing program, the determination unit 303 may distribute the processing program received from the client 360 to the processing server 330.
  • the information for designating the above-described conceptual model, constraint conditions, and objective function may be described in a structure program or the like, and the structure program or the like may be given from the client 360 to the master server 300. Further, information for designating the conceptual model, the constraint condition, and the objective function may be given from the client 360 to the master server 300 as an activation parameter or the like.
  • the master server 300 may determine the conceptual model with reference to the data location storage unit 3070 and the like.
  • The master server 300 may store the model information generated by the model generation unit 301 and the data flow information generated by the determination unit 303 in a memory or the like, and may give the stored model information and data flow information as inputs to the model generation unit 301 and the determination unit 303. In this case, the model generation unit 301 and the determination unit 303 may use the model information and data flow information for model generation and optimal arrangement calculation. Further, the master server 300 may be realized so as to be compatible with all conceptual models, constraint conditions, and objective functions, or may be realized so as to be compatible only with a specific conceptual model.
  • FIG. 12 is a flowchart showing an overall outline of an operation example of the distributed system 350.
  • When the master server 300 receives request information, which is a request to execute a processing program, from the client 360, the master server 300 acquires the following pieces of information (S401).
  • the master server 300 includes a set of input / output communication path information in the distributed system 350, a set of data location information in which processing target data is associated with the data server 340 storing the data, and identifiers of usable processing servers 330. Get a set.
  • the master server 300 determines whether or not unprocessed data remains in the acquired set of processing target data (S402). When the master server 300 determines that unprocessed data does not remain in the acquired set of processing target data (S402; No), the process ends.
  • When the master server 300 determines that unprocessed data remains in the acquired processing target data set (S402; Yes), the master server 300 further determines, based on the acquired identifiers of the available processing servers 330, whether there is a processing server 330 that can additionally execute processing (S403).
  • When the master server 300 determines that there is a processing server 330 that can additionally execute processing (S403; Yes), the master server 300 acquires the input / output communication path information and the processing server state information using the acquired set of identifiers of the processing servers 330 and the set of identifiers of the data servers 340 as keys, and generates model information based on these pieces of information (S404).
  • the processing server 330 that can additionally execute processing is also referred to as an available processing server 330.
  • the master server 300 determines each combination of the processing server 330 and the data server 340 that maximizes a predetermined objective function under predetermined constraint conditions based on the generated model information (S405).
  • the master server 300 generates data flow information indicating each determined combination.
  • Each processing server 330 and each data server 340 corresponding to the combination determined in (S405) by the master server 300 transmits and receives the processing target data, and each processing server 330 processes the received processing target data. (S406). Thereafter, the processing of the distributed system 350 returns to the step (S401).
  • FIG. 13 is a flowchart showing the detailed operation of the master server 300 of the first embodiment in the step (S401).
  • the model generation unit 301 of the master server 300 acquires from the data location storage unit 3070 a set of identifiers of the data server 340 that stores the processing target data specified by the request information from the client 360 (S401-1).
  • the model generation unit 301 acquires a set of identifiers of the data server 340 and a set of identifiers of the processing server 330 from the server state storage unit 3060 (S401-2). Note that the step (S401-2) may be executed before the step (S401-1).
  • FIG. 14 is a flowchart showing a detailed operation of the master server 300 of the first embodiment in the step (S404).
  • The model generation unit 301 of the master server 300 acquires, from the input / output communication path information storage unit 3080, input / output communication path information indicating the communication paths through which the processing servers 330 process the processing target data. Based on the acquired input / output communication path information, the model generation unit 301 adds information on the communication paths from the data servers 340 to the processing servers 330 to the model information (for example, reference numeral 500 in FIG. 9A) stored in the memory or the like (S404-10).
  • the model generation unit 301 adds logical communication path information from the processing server 330 to the subsequent logical vertex to the model information (S404-20). Note that the step (S404-20) may be executed before the step (S404-10).
  • FIG. 15 is a flowchart showing a detailed operation of the master server 300 of the first embodiment in the step (S404-10).
  • The model generation unit 301 refers to the information acquired from the data location storage unit 3070 based on the request information, and executes the step (S404-12) for each data server Di storing the processing target data (S404-11).
  • the model generation unit 301 executes steps (S404-13) to (S404-15) for each available processing server Pj (S404-12).
  • the model generation unit 301 adds a line including the name (or identifier) of the data server Di to the model information 500 (S404-13).
  • the model generation unit 301 sets the name (or identifier) of the processing server Pj as a pointer to the next element of the added row (S404-14).
  • the “identifier” and the “pointer to the next element” in the model information 500 may be information that can identify a certain node in the conceptual model.
  • The model generation unit 301 sets the usable bandwidth of the communication path between the data server Di and the processing server Pj as the flow rate upper limit value of the added row, and sets the flow rate lower limit value of the added row to a value of 0 or more and not more than the flow rate upper limit value (S404-15). Note that the step (S404-15) may be executed before the step (S404-14).
  • FIG. 16 is a flowchart showing a detailed operation of the master server 300 of the first embodiment in the step (S404-20).
  • The model generation unit 301 executes steps (S404-22) to (S404-26) for each available processing server Pj acquired from the server state storage unit 3060 based on the request information (S404-21).
  • the model generation unit 301 adds a line including the name (or identifier) of the processing server Pj to the model information 500 (S404-22).
  • the model generation unit 301 determines whether or not a vertex exists in the subsequent stage of the processing server (S404-23).
  • the vertex at the subsequent stage of the processing server refers to an identifier of a line that can be reached by following a pointer to the next element of the line including the name (or identifier) of an arbitrary processing server in the model information 500.
  • When the model generation unit 301 determines that there is no vertex in the subsequent stage of the processing server (S404-23; No), the model generation unit 301 sets an identifier ω, which is an arbitrary name that does not match any identifier included in the model information 500 (S404-24). Note that if the model generation unit 301 determines that there is a vertex in the subsequent stage of the processing server (S404-23; Yes), the model generation unit 301 does not execute the step (S404-24). Subsequently, the model generation unit 301 sets the identifier ω as the pointer to the next element of the added row (S404-25).
  • The model generation unit 301 sets the possible processing amount per unit time of the processing server Pj as the flow rate upper limit value of the added row, and sets the flow rate lower limit value of the added row to a value of 0 or more and not more than the flow rate upper limit value (S404-26). Note that the step (S404-25) may be performed after the step (S404-26). Further, the step (S404-26) may be executed at any time after the step (S404-22).
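The model-information construction loops of FIGS. 15 and 16 can be sketched as follows. The row layout mirrors FIG. 9A (identifier, flow rate lower limit, flow rate upper limit, pointer to the next element); the helper name and argument shapes are assumptions for illustration.

```python
def build_model_info(data_servers, processing_servers, bandwidth, capacity):
    """Sketch of steps S404-10 and S404-20.
    bandwidth[(Di, Pj)]: usable bandwidth between data server Di and server Pj.
    capacity[Pj]: possible processing amount per unit time of Pj.
    Each row: [identifier, flow_lower, flow_upper, next_element]."""
    rows = []
    # S404-10: one row per (data server, processing server) communication path,
    # with the usable bandwidth as the flow rate upper limit (S404-13..15)
    for di in data_servers:
        for pj in processing_servers:
            rows.append([di, 0, bandwidth[(di, pj)], pj])
    # S404-20: one row per processing server toward the logical vertex omega,
    # with the possible processing amount as the upper limit (S404-22..26)
    for pj in processing_servers:
        rows.append([pj, 0, capacity[pj], "omega"])
    return rows

model_info = build_model_info(
    ["D1"], ["P1", "P2"],
    bandwidth={("D1", "P1"): 100, ("D1", "P2"): 50},
    capacity={"P1": 80, "P2": 60},
)
```

The lower limit is fixed to 0 here; per steps (S404-15) and (S404-26) any value between 0 and the upper limit would be permissible.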
  • FIG. 17 is a flowchart showing the detailed operation of the master server 300 of the first embodiment in the step (S405).
  • the determination unit 303 of the master server 300 operates as follows using a conceptual model (herein referred to as a directed graph) that can be constructed based on the model information generated as described above.
  • the determining unit 303 determines the flow rate (data flow Fi) of each side based on the directed graph so that the processing time of the job executed by the distributed system 350 is minimized (S405-1).
  • the determination unit 303 generates the flow rate function f (e) so that the job processing time is minimized.
  • the determination unit 303 maximizes the objective function (Σe∈E′ f(e) for a certain edge set E′) based on the network model constructed from the model information.
  • the determination unit 303 performs processing for maximizing the objective function using a linear programming method, a flow increase method in the maximum flow problem, or the like. A specific example of the operation using the flow increasing method in the maximum flow problem will be described later as a first embodiment.
  • the determining unit 303 sets the vertex indicating the starting point in the directed graph to the vertex variable i (S405-2). Next, the determination unit 303 secures an area for storing the path information array and the unit processing amount on the memory, and initializes the value of the unit processing amount to infinity (S405-3).
  • the determination unit 303 determines whether the vertex indicated by the vertex variable i is the end point of the directed graph (S405-4). Hereinafter, the vertex indicated by the vertex variable i is simply expressed as the vertex variable i.
  • When the determination unit 303 determines that the vertex variable i is not the end point of the directed graph (S405-4; No), the determination unit 303 determines whether or not there is a communication path with a non-zero flow rate among the communication paths exiting from the vertex variable i in the directed graph (S405-5). When there is no communication path with a non-zero flow rate (S405-5; No), the determination unit 303 ends the process.
  • When there is a communication path with a non-zero flow rate (S405-5; Yes), the determination unit 303 selects that communication path (S405-6). Subsequently, the determination unit 303 adds the vertex variable i to the path information array secured on the memory (S405-7).
  • the determination unit 303 determines whether the unit processing amount secured in the memory is smaller than or equal to the flow rate of the communication path selected in step (S405-6) (S405-8), and determines the unit processing amount. Is larger than the flow rate of the communication channel (S405-8; No), the unit processing amount secured in the memory is updated with the flow rate of the communication channel (S405-9). If the unit processing amount is smaller than or equal to the flow rate of the communication path (S405-8; Yes), the determination unit 303 does not execute the step (S405-9).
  • the determination unit 303 sets the vertex serving as the other end point of the communication path selected in the step (S405-6) to the vertex variable i (S405-10), returns to the step (S405-4), and executes it.
  • When the vertex variable i is the end point of the directed graph (S405-4; Yes), the determination unit 303 generates data flow information from the path information stored in the path information array and the unit processing amount, and stores the data flow information in the memory (S405-11).
  • In the path information of the data flow information generated here, at least the vertex indicating the data server 340 and the vertex indicating the processing server 330 included in one path from the start point to the end point in the directed graph are set.
  • In the unit processing amount of the data flow information, the data processing amount per unit time indicated by that path from the start point to the end point in the directed graph is set.
  • the determination unit 303 updates the flow rate of each side connecting the vertices included in the route information with a value obtained by subtracting the unit processing amount from the original flow rate (S405-12). Thereafter, the determination unit 303 returns to the step (S405-2) and executes the step again.
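The path-extraction loop of steps (S405-2) through (S405-12) amounts to a flow decomposition: repeatedly walk from the start point along edges with non-zero flow, record the bottleneck flow rate as the unit processing amount, and subtract it from each traversed edge. A sketch under assumed data structures (the edge-flow map layout is my choice, not the embodiment's):

```python
def decompose_flow(flow_out, start, end):
    """Sketch of S405-2..S405-12: decompose an edge-flow map into a list of
    (path, unit_processing_amount) pairs, i.e. data flow information.
    flow_out[v]: dict mapping next vertex -> remaining flow on edge (v, next)."""
    flows = []
    while True:
        path, unit = [start], float("inf")           # S405-2, S405-3
        v = start
        while v != end:                              # S405-4
            nexts = [(w, f) for w, f in flow_out.get(v, {}).items() if f > 0]
            if not nexts:                            # S405-5; No -> finish
                return flows
            w, f = nexts[0]                          # S405-6: pick a non-zero edge
            unit = min(unit, f)                      # S405-8, S405-9: bottleneck
            path.append(w)                           # S405-7: record the vertex
            v = w                                    # S405-10: advance
        flows.append((path, unit))                   # S405-11: one data flow
        for a, b in zip(path, path[1:]):             # S405-12: subtract the unit
            flow_out[a][b] -= unit
```

For an acyclic flow such as the one determined on the conceptual model (start point, data servers, processing servers, end point), this terminates once every edge leaving the start point is exhausted.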
  • FIG. 18 is a flowchart showing the detailed operation of the master server 300 of the first embodiment in the step (S406).
  • the determination unit 303 executes the process (S406-2) for each processing server Pj in the set of available processing servers 330 (S406-1).
  • the determining unit 303 executes steps (S406-3) to (S406-4) for each piece of route information Fj in the set of route information including the processing server Pj (S406-2). Each path information Fj is included in the data flow information generated in the step (S405).
  • the determination unit 303 extracts the identifier of the data server 340 storing the processing target data from the path information Fj (S406-3).
  • the determination unit 303 transmits the processing program and the determination information to the processing server Pj (S406-4).
  • the processing program is a processing program for instructing the data server 340 storing the processing target data to transfer the data.
  • the data server 340 and the processing target data are specified by information included in the determination information.
  • model information is generated that considers, over all arbitrary combinations of each data server 340 and each processing server 330, the communication bandwidth of each input / output communication path in the distributed system 350 and the processing capability of each processing server 330.
  • a combination of the processing server 330 and the data server 340 that is the acquisition destination of data to be processed by the processing server 330 is determined, and, according to the combination, transmission / reception and processing of the processing target data constituting the job executed in the distributed system 350 are performed.
  • in the entire distributed system 350 including the plurality of data servers 340 and the plurality of processing servers 330, the job can be executed while avoiding a decrease in efficiency due to bottlenecks in communication bandwidth and processing server capability, so the job processing time can be minimized.
  • since the network model is generated in consideration of the communication bandwidth of each input / output communication path in the distributed system 350, it is possible to determine a combination of the processing server 330 and the data server 340 based on the data transfer paths that maximize the total amount of data processed per unit time by all the processing servers 330 in the distributed system 350.
  • the first embodiment may be configured such that the master server 300 outputs the data flow information generated by the determination unit 303.
  • the determination unit 303 outputs the generated data flow information after executing the step (S405-11) of FIG.
  • the determination unit 303 outputs the information shown in the example of FIG.
  • This output form is not limited.
  • Data flow information may be output to a file, transmitted to another device, displayed on a display device, or sent to a printing device.
  • the data flow information output in this way can be used for planning more detailed data processing.
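As an illustration of this output form, one data flow information entry generated in the step (S405-11) might be rendered as follows; the field names and layout are hypothetical, since the text leaves the concrete format open:

```python
# Hypothetical shape of one data flow information entry: a path from the
# start point through a data server and a processing server to the end
# point, plus the unit processing amount carried by that path.
data_flow = {
    "path": ["start", "dataserver-1", "procserver-2", "end"],
    "unit_processing_amount": 30,  # data processed per unit time on this path
}

def format_data_flow(flow):
    """Render an entry in a human-readable form for file or display output."""
    return "{} : {}".format(" -> ".join(flow["path"]),
                            flow["unit_processing_amount"])
```

Such a rendering could be written to a file, sent to another device, or passed to a display or printing device, as the text describes.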
  • the distributed system 350 can dynamically determine the data transfer path according to the processing status.
  • the distributed system 350 according to the second embodiment will be described focusing on the content different from the first embodiment. The same contents as those in the first embodiment are omitted as appropriate.
  • the master server 300 handles a plurality of program processes requested to be executed by the distributed system 350. A unit of program processing requested to be executed by the distributed system is expressed as a job.
  • a mode in which the processing amount per unit time is changed according to the portion of the processing target data in the program processing requested to be executed by the distributed system 350 is also supported.
  • the job is handled by being replaced with a set of data having the same processing amount per unit time.
  • This data set may be expressed as a logical data set.
  • FIG. 19 is a diagram conceptually illustrating a processing configuration example of each device of the distributed system 350 in the second embodiment.
  • the master server 300 in the second embodiment further includes a job information storage unit 3040 in addition to the configuration of the first embodiment.
  • FIG. 20 is a diagram illustrating an example of information stored in the job information storage unit 3040.
  • Each row (each entry) stored in the job information storage unit 3040 includes a job ID 3041, a data name 3042, a minimum unit processing amount 3043, and a maximum unit processing amount 3044.
  • the job ID 3041 a unique identifier in the distributed system 350 assigned for each job executed by the distributed system 350 is set.
  • the data name 3042 the name (identifier) of data handled by the job is set.
  • the minimum unit processing amount 3043 is the minimum value of the processing amount per unit time specified for the logical data set that is data handled by the job.
  • the maximum unit processing amount 3044 is the maximum value of the processing amount per unit time specified for the logical data set.
  • the job information storage unit 3040 may store a plurality of rows having the same job ID, and each of these rows may store a different data name 3042 together with its own minimum unit processing amount 3043 and maximum unit processing amount 3044.
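The rows of the job information storage unit 3040 described above could be sketched as records like the following; the class and field names are assumptions, not the patent's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class JobInfoEntry:
    """One row of the job information storage unit 3040 (names assumed)."""
    job_id: str             # 3041: identifier unique within the distributed system
    data_name: str          # 3042: name of data handled by the job
    min_unit_amount: float  # 3043: minimum processing amount per unit time
    max_unit_amount: float  # 3044: maximum processing amount per unit time

# A job may occupy several rows, one per data name it handles.
rows = [
    JobInfoEntry("job-1", "logical-data-A", 10, 40),
    JobInfoEntry("job-1", "logical-data-B", 5, 20),
]

def data_names(rows, job_id):
    """Collect the data names registered for one job."""
    return [r.data_name for r in rows if r.job_id == job_id]
```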
  • the model generation unit 301 of the master server 300 further reflects the job configuration information stored in the job information storage unit 3040 in the model information 500.
  • This reflection operation will be described in the following operation example section.
  • a logical vertex indicating the job is placed in the stage preceding the vertex indicating the data server 340, and an edge indicating the logical communication path from the job vertex to the vertex indicating the data server 340 is added.
  • in addition, a logical vertex preceding the job vertex and an edge indicating the logical communication path from that preceding vertex to the job vertex are added.
  • FIG. 21 is a flowchart showing a detailed operation of the master server 300 of the second embodiment in the step (S401).
  • step (S401-0) is added to FIG. 13 showing the detailed operation of the first embodiment.
  • the model generation unit 301 acquires a set of jobs being executed from the job information storage unit 3040.
  • FIG. 22 is a flowchart showing a detailed operation of the master server 300 of the second embodiment in the step (S404).
  • the step (S404-30) is added to FIG. 14 showing the detailed operation of the first embodiment.
  • the model generation unit 301 adds, to the model information 500, logical communication path information to each job in the job set acquired from the job information storage unit 3040, and logical communication path information from each job to the data servers 340 storing the data to be processed in that job (S404-30). Note that the order of the steps (S404-30), (S404-10), and (S404-20) shown in FIG. 22 may be changed.
  • FIG. 23 is a flowchart showing a detailed operation of the master server 300 of the second embodiment in the step (S404-30).
  • the model generation unit 301 of the master server 300 executes the process (S404-32) and subsequent steps (S404-31) for each job Ji in the acquired job set.
  • the model generation unit 301 determines whether or not there is a vertex in the previous stage of the job Ji (S404-32).
  • the vertex in the previous stage of the job Ji corresponds to the identifier of a line in the model information 500 in which information (the job name) indicating that job is set as the pointer to the next element.
  • when there is no vertex in the previous stage of the job (S404-32; No), the model generation unit 301 newly sets an identifier for that stage (S404-34).
  • this identifier is an arbitrary name that does not match any identifier already included in the model information 500.
  • when there is a vertex in the previous stage (S404-32; Yes), the model generation unit 301 acquires the identifier of that previous-stage vertex (S404-33).
  • the model generation unit 301 adds a line including the identifier thus obtained to the model information 500 (S404-35).
  • the model generation unit 301 sets the name of the job Ji as a pointer to the next element of the added row (S404-36).
  • the model generation unit 301 sets the maximum unit processing amount and the minimum unit processing amount assigned to the job Ji to the upper limit flow rate and lower limit flow rate of the additional row (S404-37).
  • the model generation unit 301 executes steps (S404-39) to (S404-3B) for each data server Dj that stores data handled by the job Ji (S404-38).
  • the model generation unit 301 adds a line whose identifier indicates job Ji to the model information 500 (S404-39).
  • the model generation unit 301 sets the name (or identifier) of the data server Dj as a pointer to the next element of the added row (S404-3A).
  • the model generation unit 301 sets the transfer amount that the job Ji can allocate to the data server Dj as the flow rate upper limit value of the additional row, and sets a value that is greater than or equal to 0 and less than or equal to that upper limit as the flow rate lower limit value of the additional row (S404-3B).
  • the transfer amount that the job Ji can allocate to the data server Dj indicates, for example, the requested processing amount specified for each piece of data handled by the job Ji, and may be given by the user or may be determined by the distributed system 350.
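Steps (S404-32) through (S404-3B) above can be sketched as follows; the row layout (identifier, pointer to next element, flow rate lower limit, flow rate upper limit) and all names are assumptions based on the description of the model information 500:

```python
def add_job_rows(model_info, job, prev_vertex, data_servers):
    """Append rows for one job to the model information, following
    steps (S404-32) .. (S404-3B) in simplified form.

    model_info: list of rows (identifier, next_element, lower, upper).
    job: dict with 'name', 'min_unit_amount', 'max_unit_amount'.
    prev_vertex: identifier of the vertex preceding the job, or None.
    data_servers: list of (server_name, assignable_transfer_amount).
    """
    # If no preceding vertex exists, introduce a new identifier that
    # collides with nothing in the model information (S404-34).
    sigma = prev_vertex if prev_vertex is not None else "sigma-" + job["name"]
    # Edge from the preceding vertex to the job, carrying the job's
    # min/max unit processing amounts (S404-35 .. S404-37).
    model_info.append((sigma, job["name"],
                       job["min_unit_amount"], job["max_unit_amount"]))
    # Edges from the job vertex to each data server holding its data
    # (S404-38 .. S404-3B); the lower limit may be any value in [0, upper].
    for server, amount in data_servers:
        model_info.append((job["name"], server, 0, amount))
    return model_info
```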
  • a combination of the processing server 330 and the data server 340 that is the acquisition destination of data to be processed by the processing server 330 is determined from a network model (conceptual model) that can be generated based on model information to which the unit processing amount constraints specified for each job and the unit processing amount constraints specified for each piece of data handled in each job are added.
  • the unit processing amount specified for the job executed in the distributed system 350 is taken into consideration, and transmission / reception and processing of the processing target data constituting the job are executed. , The processing time of the job can be minimized.
  • when a priority is set for each job, each priority can be expressed as a ratio between jobs of the unit processing amounts specified for the jobs. Therefore, according to the second embodiment, even when a priority is set for each job, transmission / reception and processing of the processing target data can be executed so as to satisfy the set priority constraints and minimize the overall processing time.
  • the master server 300 sets a termination point for each job as a pointer to the next element in the line including the identifier indicating the processing server 330 of the model information 500.
  • the number of rows including the identifier indicating the processing server 330 is equal to the number of job end points.
  • FIG. 24 is a diagram illustrating an example of information stored in the server state storage unit 3060 in the first modification of the second embodiment. As illustrated in FIG. 24, the server state storage unit 3060 stores the processable amount for each job as the processable amount information 3065 of each processing server 330.
  • FIG. 25 is a flowchart showing detailed operations of the master server 300 in the first modified example of the second embodiment regarding the step (S404-20) shown in FIG.
  • the model generation unit 301 executes the process (S404-2B) for each available processing server Pi acquired from the server state storage unit 3060 based on the request information (S404-2A).
  • the model generation unit 301 executes the steps (S404-2C) to (S404-2E) for each job Jj (S404-2B).
  • the model generation unit 301 adds a line including the name (or identifier) of the processing server Pi to the model information 500 (S404-2C).
  • the model generation unit 301 sets an identifier indicating the end point of the job Jj as a pointer to the next element of the additional row (S404-2D).
  • the model generation unit 301 sets the processable amount per unit time of the processing server Pi for the job Jj as the flow rate upper limit value of the additional row, and sets a value that is greater than or equal to 0 and less than or equal to that upper limit as the flow rate lower limit value of the additional row (S404-2E). Note that the step (S404-2E) may be performed before the step (S404-2D).
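The loop of steps (S404-2A) through (S404-2E) could be sketched as follows, with an assumed row layout of (identifier, pointer to next element, flow rate lower limit, flow rate upper limit) and hypothetical end-point names:

```python
def add_processing_server_job_rows(model_info, servers, jobs, capacity):
    """For each available processing server Pi and each job Jj, add an
    edge from Pi to the end point of Jj whose upper limit is Pi's
    processable amount per unit time for Jj (S404-2A .. S404-2E).

    capacity: {(server, job): processable amount per unit time}.
    The 'end:<job>' naming for job end points is an assumption.
    """
    for pi in servers:
        for jj in jobs:
            model_info.append((pi, "end:" + jj, 0, capacity[(pi, jj)]))
    return model_info
```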
  • jobs having different processable amounts in the processing server 330 can be handled, and the processable amount for each job in each processing server 330 can be taken into account when determining the combination of the processing server 330 and the data server 340. Therefore, according to the first modification of the second embodiment, the processing time of each job executed in the entire system can be minimized more accurately.
  • the master server 300 sets information (name) indicating a job in the pointer to the next element in the line including the identifier indicating the processing server 330 in the model information 500.
  • the number of rows including the identifier indicating the processing server 330 is equal to the number of jobs.
  • the pointer to the next element is set to the name (or identifier) of the job Jj in the above-described step of FIG. 25 (S404-2D).
  • the information stored in the server state storage unit 3060 is the same as that in the first modification of the second embodiment. That is, also in the second modification of the second embodiment, the processable amount per unit time for each job of the processing server 330 is set as the flow rate lower limit value and the flow rate upper limit value of the line including the identifier indicating the processing server 330 in the model information 500.
  • the determination process of the flow rate of each side (see step (S405-1) in FIG. 17) by the determination unit 303 is easier and faster than the first modification.
  • in the second modification, the model information is modeled as a circulation flow.
  • an identifier indicating the data server 340 and an identifier indicating the data are set in the pointer to the next element in the line including the identifier indicating the processing server 330.
  • an edge from the vertex indicating the processing server 330 to the vertex indicating the data server 340 or the logical vertex indicating data is provided.
  • the distributed system 350 according to the third embodiment will be described focusing on the content different from the first embodiment and the second embodiment. The same contents as those in the first embodiment and the second embodiment are omitted as appropriate.
  • the master server 300 also handles multiplexed processing target data.
  • the data location storage unit 3070 of the master server 300 further stores data size information.
  • FIG. 26 is a flowchart showing a detailed operation of the master server 300 of the third embodiment in the step (S404).
  • step (S404-40) is added to FIG. 14 showing the detailed operation of the first embodiment.
  • the model generation unit 301 adds logical communication path information from the data to the data server to the model information 500. Note that the order of the steps (S404-40), (S404-10), and (S404-20) shown in FIG. 26 may be changed.
  • FIG. 27 is a flowchart showing detailed operations of the master server 300 of the third embodiment in the step (S404-40).
  • the model generation unit 301 executes the process (S404-42) for each data di in the set of processing target data specified based on the request information (S404-41).
  • the model generation unit 301 executes steps (S404-43) to (S404-45) for each data server Dj that stores the multiplexed data di (S404-42).
  • as many copies of the multiplexed data di exist as the multiplexing degree (the number of replicas).
  • the model generation unit 301 adds a line including di as an identifier (S404-43).
  • the model generation unit 301 sets the name (or identifier) of the data server Dj as a pointer to the next element of the added row (S404-44).
  • the model generation unit 301 sets the maximum processing amount and the minimum processing amount of the data di specified for the data server Dj to the flow rate upper limit value and the flow rate lower limit value in the additional row (S404-45).
  • the master server 300 may determine the upper limit flow rate and the lower limit flow rate. In this case, for example, the master server 300 sets infinity as the flow rate upper limit value and sets 0 as the flow rate lower limit value.
  • a common identifier di is attached to the multiplexed data in the model information 500 generated by the model generation unit 301. That is, as many lines including the common identifier di are added as there are replicas of the data di.
  • FIG. 28 is a flowchart showing a detailed operation of the master server 300 of the third embodiment in the step (S406).
  • each processing server 330 is assigned to each piece of data multiplexed and stored in different data servers 340. As a result, the same vertex indicating the same multiplexed data may be contained in several pieces of data flow information.
  • the determination unit 303 of the master server 300 executes the step (S406-2-1) and the step (S406-3-1) for each data di in the set of processing target data (S406-1-1). .
  • the determination unit 303 identifies the data flow information whose path information includes the data di, and aggregates the unit processing amounts set in the identified data flow information for each processing server 330 included in that path information.
  • the determination unit 303 divides the data di according to the ratio of the aggregated unit processing amounts of the processing servers 330, and associates each divided portion of the data di with the data server 340 that stores it (S406-2-1).
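The proportional division in the step (S406-2-1) can be sketched as follows; the function and its inputs are assumptions, and a real implementation would also round shares to whole records and track which replica each share is read from:

```python
def split_data_among_servers(size, flows):
    """Divide one piece of multiplexed data among processing servers in
    proportion to their aggregated unit processing amounts (S406-2-1).

    size:  total size of the data di.
    flows: list of (processing_server, unit_processing_amount) taken from
           the data flow information whose path includes di.
    Returns {processing_server: assigned_size}.
    """
    totals = {}
    for server, amount in flows:             # aggregate per server
        totals[server] = totals.get(server, 0) + amount
    grand = sum(totals.values())
    return {server: size * amount / grand    # divide by the ratio
            for server, amount in totals.items()}
```

With 100 units of data and aggregated unit processing amounts of 40 for P1 and 10 for P2, P1 is assigned 80 units and P2 20 units.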
  • the determination unit 303 executes the step (S406-4-1) for each piece of route information fj in the set of route information including the data di (S406-3-1).
  • the determination unit 303 sends the processing program and the determination information to the processing server Pk included in the route information fj (S406-4-1).
  • the processing program is a processing program for instructing to transfer a divided portion of the data di from the data server 340 storing the data di.
  • the data server 340 and the data are specified by information included in the determination information.
  • a processing server 330 is assigned to each piece of multiplexed data, and model information is generated that considers the communication bandwidth for processing each piece of multiplexed data and the processing capability of the processing servers 330. Then, based on a conceptual model (network model) that can be constructed from the model information, a combination of the processing server 330 and the data server 340 that is the acquisition destination of data to be processed by the processing server 330 is determined.
  • each piece of multiplexed data is not transferred and processed in duplicate; instead, portions of the data divided into amounts corresponding to the communication bandwidth and processing capacity are allocated to the processing servers 330, and the multiplexed data is controlled so as to be transferred and processed efficiently in the entire system.
  • the distributed system 350 according to the fourth embodiment will be described focusing on the contents different from those of the first to third embodiments. The same contents as those in the first to third embodiments are omitted as appropriate.
  • the master server 300 further models the section between the data server 340 and the processing server 330 in terms of the intermediate devices and communication paths included between them, and generates model information in consideration of detailed constraint information (available bandwidth). Therefore, in the fourth embodiment, the section between the data server 340 and the processing server 330 is referred to as a data transfer path instead of a communication path (input / output communication path), and each path that forms the data transfer path is referred to as a communication path.
  • FIG. 29 is a diagram conceptually illustrating a processing configuration example of each device of the distributed system 350 in the fourth embodiment.
  • the network switch 320 in the fourth embodiment further includes a switch management unit 321 in addition to the configuration of the first embodiment.
  • the processing server management unit 331 transmits status information such as the disk available bandwidth and the network available bandwidth of the processing server 330 to the master server 300.
  • the data server management unit 341 transmits status information including the disk available bandwidth and the network available bandwidth of the data server 340 to the master server 300.
  • the switch management unit 321 acquires information such as an available bandwidth of a communication path connected to the network switch 320 and transmits the information to the master server 300 via the data transmission / reception unit 322.
  • the input / output communication path information storage unit 3080 stores information on communication paths included in the data transfer path from the data server 340 to the processing server 330.
  • the communication path information includes an identifier of the connection source apparatus, an identifier of the connection destination apparatus, usable bandwidth information of the communication path, and the like.
  • the model generation unit 301 further generates model information from which a conceptual model can be constructed that includes at least one of the following: a vertex indicating an intermediate device (for example, the network switch 320) through which data stored in the data server 340 passes before being received by the processing server 330; an edge from the vertex indicating the data server 340 to the vertex indicating the nearest intermediate device of the data server 340, whose transfer amount upper limit is set to the transferable amount per unit time from the data server 340 to the nearest intermediate device; an edge from the vertex indicating an intermediate device to the vertex indicating another intermediate device, whose transfer amount upper limit is set to the transferable amount per unit time between those intermediate devices; and an edge whose transfer amount upper limit is set to the transferable amount per unit time from the nearest intermediate device to the processing server 330.
  • the conceptual model includes the vertex indicating the intermediate device, the edge from the vertex indicating the data server 340 to the vertex indicating the intermediate device, and the edge from the vertex indicating the intermediate device to the vertex indicating the processing server 330.
  • the conceptual model may also include two vertices indicating a first intermediate device and a second intermediate device, an edge from the vertex indicating the data server 340 to the vertex indicating the first intermediate device, an edge from the vertex indicating the first intermediate device to the vertex indicating the second intermediate device, and an edge from the vertex indicating the second intermediate device to the vertex indicating the processing server 330.
  • the model generation unit 301 may compose the vertex indicating the intermediate device of one or more vertices indicating one or more input units of the intermediate device, one or more vertices indicating one or more output units of the intermediate device, and one or more edges connecting each pair of input and output units between which data can be transferred.
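The per-port modeling of a switch suggested above might be sketched as follows; the port-naming scheme and the row layout (identifier, pointer to next element, flow rate lower limit, flow rate upper limit) are assumptions:

```python
def switch_rows(switch_id, inputs, outputs, link_bandwidths):
    """Model one network switch as port vertices: an input-port vertex per
    upstream device, an output-port vertex per downstream device, internal
    edges between ports that can forward to each other (flow rate upper
    limit infinity), and an external edge from each output port to its
    downstream device limited by the link bandwidth.

    inputs:  upstream device ids; outputs: downstream device ids.
    link_bandwidths: {(switch_id, out_dev): available bandwidth}.
    Returns model rows (identifier, next_element, lower, upper).
    """
    rows = []
    inf = float('inf')
    for out_dev in outputs:
        out_port = "{}:out:{}".format(switch_id, out_dev)
        # output port -> downstream device, limited by the link bandwidth
        rows.append((out_port, out_dev, 0,
                     link_bandwidths[(switch_id, out_dev)]))
        for in_dev in inputs:
            if in_dev == out_dev:      # a port does not loop back to itself
                continue
            in_port = "{}:in:{}".format(switch_id, in_dev)
            # internal edge: flow rate upper limit is infinity
            rows.append((in_port, out_port, 0, inf))
    return rows
```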
  • FIG. 30 is a flowchart showing detailed operations of the master server 300 of the fourth embodiment in the step (S404).
  • the model generation unit 301 of the master server 300 executes the process (S404-12-10) for each data server Di storing unprocessed processing target data (S404-11).
  • the model generation unit 301 adds a line related to the data transfer path from the data server Di to the processing server 330 to the model information 500 (S404-12-10).
  • FIGS. 31A, 31B, and 31C are flowcharts showing detailed operations of the master server 300 of the fourth embodiment in the step (S404-12-10).
  • the name (or identifier) of the data server Di is set as the initial value in the device IDi.
  • the model generation unit 301 extracts information (input / output channel information) of a row in which the device IDi is set as the input source device ID from the input / output channel information storage unit 3080 (S404-12-11).
  • the model generation unit 301 identifies a set of output destination device IDs included in the extracted input / output communication path information (S404-12-12).
  • the model generation unit 301 determines whether or not the device IDi indicates a switch (S404-12-13). When the model generation unit 301 determines that the device IDi indicates a switch (S404-12-13; Yes), the model generation unit 301 performs the process of FIG. 31B. FIG. 31B will be described later.
  • when the model generation unit 301 determines that the device IDi does not indicate a switch (S404-12-13; No), it determines whether a line including the device IDi as an identifier has already been set in the model information 500 (S404-12-14). If the model generation unit 301 determines that a line including the device IDi as an identifier has already been set in the model information 500 (S404-12-14; Yes), it ends the process of FIG. 31A.
  • when the model generation unit 301 determines that a line including the device IDi as an identifier has not yet been set in the model information 500 (S404-12-14; No), it executes the process (S404-12-16) for each output destination device IDj in the set of output destination device IDs specified in the step (S404-12-12) (S404-12-15).
  • in the step (S404-12-16), the model generation unit 301 determines whether the output destination device IDj indicates a switch. When the model generation unit 301 determines that the output destination device IDj indicates a switch (S404-12-16; Yes), it performs the process of FIG. 31C, which will be described later.
  • when the model generation unit 301 determines that the output destination device IDj does not indicate a switch (S404-12-16; No), it adds a line including the device IDi as an identifier to the model information 500 (S404-12-17).
  • the model generation unit 301 sets the output destination device IDj as a pointer to the next element in the added row (S404-12-18).
  • the model generation unit 301 sets the available bandwidth of the input / output communication path between the device indicated by the device IDi and the device indicated by the output destination device IDj as the flow rate upper limit value in the additional row, and sets a value not less than 0 and not more than that upper limit as the flow rate lower limit value in the additional row (S404-12-19). Note that the step (S404-12-19) may be executed before the step (S404-12-18).
  • the model generation unit 301 determines whether or not the output destination device IDj indicates the processing server 330 (S404-12-1A).
  • when the model generation unit 301 determines that the output destination device IDj does not indicate the processing server 330 (S404-12-1A; No), it recursively executes the processing of FIG. 31A (S404-12-1B). In this recursive execution, the output destination device IDj is set as the initial value of the device IDi, so that a line including the output destination device IDj as an identifier is added to the model information 500. If the model generation unit 301 determines that the output destination device IDj indicates the processing server 330 (S404-12-1A; Yes), it does not perform the recursive execution of the processing of FIG. 31A.
  • if it is determined in the step (S404-12-13) of FIG. 31A that the device IDi indicates a switch (S404-12-13; Yes), the model generation unit 301 executes the process (S404-12-1D) for each output destination device IDj in the set of output destination device IDs identified in the step (S404-12-12) (S404-12-1C).
  • the device ID i indicating the switch may be referred to as a switch i.
  • the model generation unit 301 determines whether a line including the identifier of the output port for the output destination device IDj in the switch i exists in the model information 500 (S404-12-1D). When the model generation unit 301 determines that no line including the identifier of the output port exists in the model information (S404-12-1D; No), it adds a line including the identifier of the output port to the output destination device IDj in the switch i (S404-12-1E).
  • the model generation unit 301 sets the output destination device IDj as the pointer to the next element in the added row (S404-12-1F). Further, the model generation unit 301 sets the available bandwidth of the input / output communication path between the device indicated by the device IDi and the device indicated by the output destination device IDj as the flow rate upper limit value in the additional row, and sets a value not less than 0 and not more than that upper limit as the flow rate lower limit value in the additional row (S404-12-1G). Note that the step (S404-12-1G) may be performed before the step (S404-12-1F).
  • when the model generation unit 301 determines that a line including the identifier of the output port exists in the model information 500 (S404-12-1D; Yes), the above-described steps (S404-12-1E), (S404-12-1F), and (S404-12-1G) are not executed.
  • the model generation unit 301 executes the step (S404-12-1I) for each input port identifier k in the switch i whose input source device ID is different from the output destination device IDj (S404-12-1H).
  • the model generation unit 301 determines whether a row including k as an identifier exists in the model information 500 (S404-12-1I). When the model generation unit 301 determines that there is no corresponding row in the model information 500 (S404-12-1I; No), it adds a row including k as an identifier to the model information 500 (S404-12-1J).
  • the model generation unit 301 sets the identifier of the output port to the output destination device IDj in the switch i in the pointer to the next element in the additional row (S404-12-1K).
  • the model generation unit 301 sets infinity to the upper limit of flow rate in the additional row, and sets a value that is greater than or equal to 0 and lower than or equal to the upper limit of flow rate in the additional row (S404-12-1L). Note that the step (S404-12-1L) may be executed before the step (S404-12-1K).
  • the model generation unit 301 determines whether or not the output destination device IDj indicates the processing server 330 (S404-12-1M).
  • the model generation unit 301 determines that the output destination device IDj does not indicate the processing server 330 (S404-12-1M; No)
  • the model generation unit 301 recursively executes the processing of FIG. 31A (S404-12-1N).
  • the output destination device IDj is set as the initial value for the device IDi.
  • a line including the output destination device IDj as an identifier is added to the model information 500.
  • when the model generation unit 301 determines that the output destination device IDj indicates the processing server 330 (S404-12-1M; Yes), the model generation unit 301 does not recursively execute the processing in FIG. 31A.
  • the output destination device IDj indicating a switch may be referred to as a switch j.
  • the model generation unit 301 executes the process (S404-12-1P) for each output port identifier k in each switch j whose output destination device ID is different from the device IDi (S404-12-1O).
  • the model generation unit 301 determines whether there is a row in the model information 500 in which the identifier k is set as the pointer to the next element (S404-12-1P). When the model generation unit 301 determines that the corresponding row does not exist (S404-12-1P; No), the model generation unit 301 adds a row including the identifier of the input port from the device IDi in the switch j to the model information 500 (S404-12-1Q).
  • the model generation unit 301 sets the identifier k as the pointer to the next element in the added row (S404-12-1R). Further, the model generation unit 301 sets infinity as the flow rate upper limit value in the added row, and sets 0 as the flow rate lower limit value in the added row (S404-12-1S). Note that the step (S404-12-1S) may be executed before the step (S404-12-1R). On the other hand, when the model generation unit 301 determines that the corresponding row exists (S404-12-1P; Yes), the processes (S404-12-1Q), (S404-12-1R), and (S404-12-1S) are not executed.
  • the model generation unit 301 adds a line including the device IDi as an identifier to the model information 500 (S404-12-1T).
  • the model generation unit 301 sets the identifier of the input port from the switch j in the device IDi to the pointer to the next element in the additional row (S404-12-1U).
  • the model generation unit 301 sets the available bandwidth of the input / output communication path between the device indicated by the device IDi and the device (switch j) indicated by the output destination device IDj to the flow rate upper limit value in the additional row. Then, a value not less than 0 and not more than the upper limit of the flow rate is set as the lower limit of the flow rate in the additional row (S404-12-1V). Note that the step (S404-12-1V) may be executed before the step (S404-12-1U).
  • the model generation unit 301 recursively executes the process of FIG. 31A (S404-12-1W). At this time, the output destination device IDj is set as the initial value for the device IDi. As a result, a line including the output destination device IDj as an identifier is added to the model information 500.
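The recursive row-addition procedure above can be sketched roughly as follows. This is a simplified illustration, not the patented procedure verbatim: each row of the model information is assumed to hold an identifier, a pointer to the next element, and flow rate upper / lower limit values, and the device names, topology, and bandwidths are hypothetical examples rather than values from the figures.

```python
# Minimal sketch of recursive model-information construction: starting from a
# data server, rows (edges of the network model) are added toward the
# processing servers, recursing through each intermediate switch.

def add_row(model, ident, next_id, upper, lower=0):
    """Add a row unless one with the same identifier/pointer pair exists."""
    if not any(r["id"] == ident and r["next"] == next_id for r in model):
        model.append({"id": ident, "next": next_id,
                      "upper": upper, "lower": lower})

def build_model(model, device, topology, bandwidth, processing_servers):
    """Recursively add rows from `device` toward the processing servers."""
    for nxt in topology.get(device, []):
        add_row(model, device, nxt, bandwidth[(device, nxt)])
        if nxt not in processing_servers:   # a switch: recurse further
            build_model(model, nxt, topology, bandwidth, processing_servers)

# Hypothetical topology: data server d1 -> switch sw -> processing server p1.
topology = {"d1": ["sw"], "sw": ["p1"]}
bandwidth = {("d1", "sw"): 100, ("sw", "p1"): float("inf")}
model = []
build_model(model, "d1", topology, bandwidth, processing_servers={"p1"})
```

The duplicate check in `add_row` corresponds to the "does the corresponding row exist" determinations (e.g. S404-12-1D, S404-12-1I), and the recursion stop at a processing server corresponds to the determination of S404-12-1M.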
  • the devices constituting the data transfer path between each data server 340 and each processing server 330 and the available bandwidth of each communication path are further considered, and the combination of the processing server 330 and the data server 340 from which the processing server 330 acquires the processing target data is determined. Therefore, according to the fourth embodiment, it is possible to more accurately minimize the processing time of each job to be executed in the entire system.
  • the distributed system 350 according to the fifth embodiment will be described focusing on the content different from the above-described embodiments. Description of the same content as each of the above-described embodiments is omitted.
  • the master server 300 generates model information in consideration of a decrease in processing capability when processing of a plurality of jobs is executed by the processing server 330 in parallel.
  • FIG. 32 is a flowchart showing the detailed operation of the master server 300 of the fifth embodiment in the step (S404-20) (see FIG. 22).
  • an edge indicating a decrease in processing capability of the processing server 330 due to processing of a plurality of jobs is added to the model information.
  • the model generation unit 301 executes steps (S404-2G) to (S404-2J) for each available processing server Pi acquired from the server state storage unit 3060 based on the request information (S404-2F).
  • the model generation unit 301 adds a line including the name (or identifier) of the processing server Pi to the model information 500 (S404-2G).
  • the model generation unit 301 sets the second name (or identifier) indicating the processing server Pi to the pointer to the next element in the added row (S404-2H).
  • the second name (or identifier) indicating the processing server Pi is a name (or identifier) corresponding to Pi that is unique in the model information 500.
  • the model generation unit 301 sets the possible processing amount per unit time of the processing server Pi as the flow rate upper limit value in the added row, and sets a value not less than 0 and not more than the flow rate upper limit value as the flow rate lower limit value in the added row (S404-2I). Note that the step (S404-2I) may be performed before the step (S404-2H).
  • the model generation unit 301 executes the process (S404-2K) to the process (S404-2M) for each job Jj (S404-2J).
  • the model generation unit 301 adds a line including the second name (or identifier) indicating the processing server Pi to the model information 500 (S404-2K).
  • the model generation unit 301 sets an identifier indicating the end point of the job Jj as the pointer to the next element of the added row (S404-2L). Further, the model generation unit 301 sets the possible processing amount per unit time of the processing server Pi for the job Jj as the flow rate upper limit value of the added row, and sets a value not less than 0 and not more than the flow rate upper limit value as the flow rate lower limit value of the added row (S404-2M). Note that the step (S404-2M) may be performed before the step (S404-2L).
  • the processing capability of the processing server 330 for each job executed in parallel is further added to the model information as a constraint condition, and, using the conceptual model (network model) that can be constructed from this model information, the combination of the processing server 330 and the data server 340 that is the acquisition destination of the data to be processed by the processing server 330 is determined.
  • according to the fifth embodiment, it is possible to realize data transmission / reception and data processing of the distributed system 350 in consideration of a decrease in processing capacity due to processing of a plurality of jobs, and to minimize the processing time in the entire system.
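The vertex-splitting steps above (S404-2G through S404-2M) can be illustrated as plain edge construction: each processing server Pi is given a second vertex whose single connecting edge carries Pi's total processing capacity, while the edges from that second vertex to the job end points carry the per-job capacities. The server name, capacities, and row format below are hypothetical illustrations, not values from the embodiment.

```python
# Sketch of the per-server vertex split: even when the per-job capacities sum
# to more than the server's total capacity, the single edge Pi -> Pi' caps the
# aggregate flow through the server.

def split_server(name, total_capacity, per_job_capacity):
    """Return the model rows (u, v, capacity) produced for one server."""
    second = name + "'"                      # unique second identifier for Pi
    rows = [(name, second, total_capacity)]  # caps the server's total throughput
    for job, cap in per_job_capacity.items():
        rows.append((second, "end:" + job, cap))
    return rows

rows = split_server("P1", 100, {"MyJob1": 80, "MyJob2": 60})
```

Here the per-job capacities (80 + 60 = 140) exceed the total capacity 100, so any flow computed on this model is forced to respect the reduced aggregate processing capability, which is exactly the constraint the fifth embodiment adds.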
  • the distributed system 350 according to the sixth embodiment will be described focusing on the content different from the above-described embodiments. Description of the same content as each of the above-described embodiments is omitted.
  • the master server 300 generates model information in consideration of a decrease in processing capability when processing of a plurality of jobs having different processing loads is performed in parallel by the processing server 330.
  • FIG. 33 is a diagram illustrating an example of information stored in the server state storage unit 3060 according to the sixth embodiment.
  • the server state storage unit 3060 stores the remaining resource information as the load information 3062, and further stores new processing load information 3066.
  • the remaining resource information indicates the remaining amount of resources used by the processing server 330 when the processing is executed.
  • the processing load information indicates the amount of resources of the processing server 330 used when executing the processing.
  • the processing load information is represented by, for example, a resource usage amount per unit processing amount.
  • when the determining unit 303 of the master server 300 acquires the flow rate (flow function f(e)) of each side (step (S405-1) in FIG. 17), the remaining resource information (load information 3062) and the processing load information 3066 acquired from the server state storage unit 3060 are further considered.
  • the processing load (resource amount) consumed per unit processing amount of each job and the remaining load (remaining resource amount) are managed, and these are used as constraints when a route in the conceptual model is selected; as a result, the correspondence among the data server 340, the processing server 330, and the unit processing amount for exchanging data regarding each job is determined. That is, in the sixth embodiment, the processing capability of the processing server 330 when jobs with different processing loads are processed in parallel is further taken into account.
  • according to the sixth embodiment, it is possible to realize data transmission / reception and data processing of the distributed system 350 in consideration of a decrease in processing capacity due to processing of a plurality of jobs having different processing loads, and to minimize the processing time in the entire system.
  • in the above example, the processing load (resource amount) consumed per unit processing amount for each job is stored in the processing load information of the server state storage unit 3060; however, the processing load consumed per unit processing amount may be stored without distinguishing between jobs.
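The resource check implied by the processing load information 3066 and the remaining resource information (load information 3062) reduces to simple arithmetic: the resources consumed by the unit processing amounts assigned to a processing server must not exceed its remaining resources. All numbers and names below are hypothetical illustrations.

```python
# Per-job resource usage per unit processing amount (e.g. CPU share per MB/s).
processing_load = {"MyJob1": 0.5, "MyJob2": 0.25}

# Candidate assignment of unit processing amounts (MB/s) to one processing server.
assignment = {"MyJob1": 50, "MyJob2": 100}

remaining_resources = 60.0  # remaining resource amount of the server

# Constraint used when selecting routes in the conceptual model:
consumed = sum(processing_load[j] * rate for j, rate in assignment.items())
feasible = consumed <= remaining_resources
```

With these numbers, the assignment consumes 25 + 25 = 50 units of resource against 60 remaining, so the route combination is feasible; a larger assignment would be rejected as a constraint violation.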
  • Example 1 shows a specific example of the first embodiment described above.
  • FIG. 34 is a diagram conceptually illustrating a configuration example of the distributed system 350 in the first embodiment.
  • the distributed system 350 according to the first embodiment includes switches sw1 and sw2 and servers n1 to n4, and the servers n1 to n4 are connected to each other via the switches sw1 and sw2.
  • the servers n1 to n4 function as the processing server 330 and the data server 340 depending on the situation.
  • the servers n1 to n4 have disks D1 to D4 as the processing data storage unit 342.
  • any one of the servers n1 to n4 functions as the master server 300.
  • the server n1 has p1 as the usable process execution unit 332, and the server n3 has p3 as the usable process execution unit 332.
  • FIG. 35 is a diagram illustrating information stored in the server state storage unit 3060 according to the first embodiment.
  • the servers n1 and n3 are available processing servers 330.
  • the processable amount of the server n1 is 50 MB / s
  • the processable amount of the server n3 is 150 MB / s.
  • FIG. 36 is a diagram illustrating information stored in the input / output communication path information storage unit 3080 according to the first embodiment.
  • the available bandwidth at the time of data transmission from the server n2 to the server n1, the available bandwidth at the time of data transmission from the server n2 to the server n3, and the available bandwidth at the time of data transmission from the server n4 to the server n1 are 50 MB / s, respectively.
  • the available bandwidth at the time of data transmission from the server n4 to the server n3 is 100 MB / s.
  • FIG. 37 is a diagram illustrating information stored in the data location storage unit 3070 according to the first embodiment.
  • the processing target data (MyDataSet1) is divided and stored in files da, db, and dc.
  • the files da and db are stored in the disk D2 of the server n2, and the file dc is stored in the disk D4 of the server n4.
  • the processing target data (MyDataSet1) is data that is distributed and not multiplexed.
  • the model generation unit 301 of the master server 300 obtains {n2, n4} as a set of identifiers of the data servers 340 storing the processing target data from the data location storage unit 3070 in FIG. 37.
  • the model generation unit 301 obtains {n1, n3} as a set of identifiers of the available processing servers 330 from the server state storage unit 3060 in FIG. 35.
  • the model generation unit 301 generates model information for the acquired {n1, n3} and {n2, n4} based on the information stored in the server state storage unit 3060 and the information stored in the input / output communication path information storage unit 3080.
  • FIG. 38 is a diagram illustrating model information generated in the first embodiment.
  • FIG. 39 is a diagram showing a conceptual model (directed graph, network model) constructed by the model information shown in FIG.
  • the value given to each side on the conceptual model shown in FIG. 39 is the maximum value of the current data transfer amount per unit time in the communication channel (the upper limit value of the transfer amount constraint condition) or the maximum value of the data processing amount per unit time in the processing server corresponding to the start point of the side (the upper limit value of the processing amount constraint condition).
  • the determination unit 303 of the master server 300 determines the flow rate (flow function f) of each side in the conceptual model so that the processing time of the job corresponding to the request information is minimized in the entire system.
  • the flow function f is determined so that the data processing amount per unit time in the distributed system 350 is maximized.
  • 40A to 40G are diagrams conceptually showing the flow function f determination process and the data flow information determination process by the flow increase method in the maximum flow problem in the first embodiment.
  • the determination unit 303 constructs a network model as shown in FIG. 40A based on the model information of FIG. 38. Note that the construction of the network model does not necessarily mean that data in some software form is generated; the network model is also used simply as a concept for explaining the processing by which the flow function f is determined from the model information. In this network model, a start point s and an end point t are set.
  • the determination unit 303 identifies the residual graph of the network shown in FIG. 40C.
  • the flow with the flow rate 0 is not shown in the residual graph.
  • the determination unit 303 identifies a flow increasing path from the residual graph shown in FIG. 40C and assigns a flow to the path.
  • a flow of 50 MB / s is given to the route (s, n2, n3, t) as shown in FIG. 40D.
  • the determination unit 303 specifies the network residual graph shown in FIG. 40E.
  • the determining unit 303 identifies a flow increasing path from the residual graph shown in FIG. 40E and gives a flow to the path. Based on the residual graph shown in FIG. 40E, the determination unit 303 assigns a flow of 100 MB / s to the route (s, n4, n3, t) as shown in FIG. 40F. As a result, the determination unit 303 identifies the residual graph of the network illustrated in FIG. 40G.
  • the determination unit 303 ends the process. As a result, the obtained combination information of each route and each data flow rate becomes data flow information.
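The flow increase method of FIGS. 40A to 40G can be reproduced with a standard augmenting-path max-flow computation. The sketch below uses a breadth-first (Edmonds-Karp style) search; the edge list follows the capacities described in this example, with the sides from s to the data servers treated as unbounded (an assumption, since their limits are not stated here).

```python
from collections import deque, defaultdict

def max_flow(edges, s, t):
    """Augmenting-path max flow; edges is a list of (u, v, capacity)."""
    cap = defaultdict(int)
    adj = defaultdict(set)
    for u, v, c in edges:
        cap[(u, v)] += c
        adj[u].add(v); adj[v].add(u)
    total = 0
    while True:
        # BFS for a shortest flow increasing path in the residual graph.
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u; q.append(v)
        if t not in parent:          # no flow increasing path remains
            return total
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v)); v = parent[v]
        bottleneck = min(cap[e] for e in path)
        for u, v in path:            # push flow, update residual graph
            cap[(u, v)] -= bottleneck
            cap[(v, u)] += bottleneck
        total += bottleneck

INF = float("inf")
edges = [("s", "n2", INF), ("s", "n4", INF),
         ("n2", "n1", 50), ("n2", "n3", 50),
         ("n4", "n1", 50), ("n4", "n3", 100),
         ("n1", "t", 50), ("n3", "t", 150)]
```

On this network the maximum flow is 200 MB/s (50 MB/s into n1 plus 150 MB/s into n3), matching the combination of paths and flow rates recorded as data flow information.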
  • FIG. 41 is a diagram showing data flow information in the first embodiment.
  • the determining unit 303 transmits the processing program to the servers n1 and n3 based on the data flow information determined as described above. Furthermore, the determination unit 303 instructs the data reception and processing execution by transmitting determination information corresponding to the processing program to the processing servers n1 and n3.
  • FIG. 42 is a diagram conceptually illustrating data transmission / reception performed in the first embodiment.
  • the processing server n1 that has received the decision information acquires data in the processing data storage unit 342 of the data server n2.
  • the process execution unit p1 executes the process of the acquired data.
  • the processing server n3 acquires data in the processing data storage unit 342 of the data server n2.
  • the process execution unit p3 executes processing of the acquired data.
  • the processing server n3 acquires data in the processing data storage unit 342 of the data server n4.
  • the process execution unit p3 executes processing of the acquired data.
  • Example 2 shows a specific example of the first modification of the second embodiment described above.
  • the configuration of the distributed system 350 in the second embodiment is the same as that in the first embodiment (see FIG. 34).
  • the state of the input / output communication path information storage unit 3080 in the second embodiment is also the same as that in the first embodiment (see FIG. 36).
  • FIG. 43 is a diagram illustrating information stored in the job information storage unit 3040 according to the second embodiment.
  • a job MyJob1 and a job MyJob2 are input as units for executing a program.
  • the maximum unit processing amount of the job MyJob1 is set to 25 MB / s, and the minimum unit processing amount of the job MyJob1 is not set.
  • the minimum unit processing amount of the job MyJob2 is set to 50 MB / s, and the maximum unit processing amount of the job MyJob2 is not set.
  • FIG. 44 is a diagram illustrating information stored in the server state storage unit 3060 according to the second embodiment.
  • the server n1 can process the jobs MyJob1 and MyJob2 at 50 MB / s, respectively, and the server n3 can process the jobs MyJob1 and MyJob2 at 150 MB / s, respectively.
  • FIG. 45 is a diagram illustrating information stored in the data location storage unit 3070 according to the second embodiment.
  • the data location storage unit 3070 stores information about the processing target data MyDataSet1 and MyDataSet2.
  • MyDataSet1 is stored in the file da, and the file da is stored in the disk D2 of the server n2.
  • MyDataSet2 is divided into files db, dc, and dd and stored.
  • the files db and dc are stored in the disk D2 of the server n2, and the file dd is stored in the disk D4 of the server n4.
  • MyDataSet1 and MyDataSet2 are data that are distributed and not multiplexed.
  • the job information storage unit 3040, server state storage unit 3060, input / output communication path information storage unit 3080, and data location storage unit 3070 of the master server 300 are in the states shown in FIGS. 43, 44, 36, and 45.
  • request information, transmitted by the client 360, requesting execution of the job MyJob1 using the processing target data (MyDataSet1) and execution of the job MyJob2 using the processing target data (MyDataSet2) is received by the master server 300.
  • the model generation unit 301 of the master server 300 obtains {MyJob1, MyJob2} as a set of jobs instructed to be executed from the job information storage unit 3040 in FIG. 43.
  • the model generation unit 301 acquires the name of data used by the job, the minimum unit processing amount, and the maximum unit processing amount for each job.
  • the model generation unit 301 obtains {D2, D4} as a set of identifiers of the data servers 340 storing the processing target data from the data location storage unit 3070 of FIG. 45.
  • the model generation unit 301 acquires {n2, n4} as a set of identifiers of the data servers 340 from the server state storage unit 3060 in FIG. 44, and obtains {n1, n3} as a set of identifiers of the available processing servers 330. Further, the model generation unit 301 obtains the processable amount information of the available processing servers n1 and n3 from the server state storage unit 3060 in FIG. 44.
  • the model generation unit 301 generates model information based on each set acquired in this way and the information stored in the input / output communication path information storage unit 3080 in FIG. 36.
  • FIG. 46 is a diagram illustrating model information generated in the second embodiment.
  • FIG. 47 is a diagram showing a conceptual model constructed by the model information shown in FIG. The value given to each side on the conceptual model shown in FIG. 47 is the maximum value of the current data transfer amount per unit time in the communication channel (upper limit value of the transfer amount constraint condition) or the start point of the side The maximum value of the data processing amount per unit time in the processing server corresponding to (the upper limit value of the processing amount constraint condition) is shown.
  • the determination unit 303 determines the flow function f based on the model information of FIG. 46 so that the job processing time is minimized.
  • FIGS. 48A to 48F and FIGS. 49A to 49J are diagrams conceptually showing a flow function f determination process and a data flow information determination process by the flow increase method in the maximum flow problem in the second embodiment.
  • 48A to 48F are diagrams showing an example of an initial flow calculation procedure that satisfies the lower limit flow rate restriction.
  • the determination unit 303 constructs the network model shown in FIG. 48A based on the model information of FIG. In this network model, start points s1 and s2 are set, an end point t1 corresponding to the start point s1 is set, and an end point t2 corresponding to the start point s2 is set. Furthermore, the determination unit 303 sets a virtual start point s * and a virtual end point t * for the network model shown in FIG. 48A.
  • the determination unit 303 sets the difference between the flow rate upper limit value before the change and the flow rate lower limit value before the change as the new flow rate upper limit value of the side to which the lower limit flow rate restriction is given. Moreover, the determination unit 303 sets 0 as the new flow rate lower limit value of that side. By performing this processing on the network model shown in FIG. 48A, the network model shown in FIG. 48B is constructed.
  • the determining unit 303 connects the virtual start point s* to the end point (MyJob2) of the side connecting s2 and MyJob2 to which the lower limit flow rate restriction is given, and connects s2 to the virtual end point t*. Specifically, a side with a predetermined flow rate upper limit value is added between each pair of the aforementioned vertices. This predetermined flow rate upper limit value is the flow rate lower limit value before the change that was set on the side to which the lower limit flow rate restriction is given. Further, the determination unit 303 connects the end point t2 and the start point s2. Specifically, a side whose flow rate upper limit is infinite is added between the end point t2 and the start point s2. By performing this processing on the network model shown in FIG. 48B, the network model shown in FIG. 48C is constructed.
  • the determination unit 303 obtains an s * -t * -flow in which the flow rate of the side exiting from s * and the side entering t * is saturated for the network model shown in FIG. 48C. Note that the absence of the corresponding flow indicates that no solution satisfying the lower limit flow rate restriction exists in the original network model.
  • the path (s *, MyJob2, n4, n3, t2, s2, t *) shown in FIG. 48D corresponds to the s * -t * -flow.
  • the determining unit 303 deletes the added vertices and sides from the network model, and returns the flow rate restriction values of the sides to which the restriction was given to their original values before the change. The determination unit 303 then gives a flow equal to the flow rate lower limit to those sides.
  • as illustrated in FIG. 48E, the determination unit 303 gives a flow of 50 MB/s to the actual path (s2, MyJob2, n4, n3, t2) from which the added vertices and sides have been deleted.
  • the residual graph of the network shown in FIG. 48F is specified.
  • This path (s2, MyJob2, n4, n3, t2) is an initial flow (FIG. 49A) that satisfies the lower limit flow rate restriction.
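The transformation of FIGS. 48B and 48C for sides with a lower limit flow rate can be written generically: a side (u, v) with bounds [low, up] becomes a side with bounds [0, up - low], a side s* -> v of capacity low and a side u -> t* of capacity low are added, and an infinite-capacity side from the end point back to the start point closes the circulation. The sketch below applies this to the Example 2 side (s2, MyJob2) with lower limit 50 MB/s; the function and tuple format are illustrative assumptions.

```python
def with_lower_bounds(edges, s, t):
    """edges: list of (u, v, low, up).  Returns the transformed edge list
    (u, v, capacity) for the s*-t* feasibility max-flow problem."""
    out = []
    for u, v, low, up in edges:
        out.append((u, v, up - low))       # residual capacity above the lower bound
        if low > 0:
            out.append(("s*", v, low))     # forces `low` units into v
            out.append((u, "t*", low))     # forces `low` units out of u
    out.append((t, s, float("inf")))       # close the circulation t -> s
    return out

edges = [("s2", "MyJob2", 50, float("inf"))]
transformed = with_lower_bounds(edges, "s2", "t2")
```

A lower-bound-feasible flow exists exactly when an s*-t* max flow saturates every side leaving s*, which matches the note above that the absence of such a flow means no solution satisfies the lower limit flow rate restriction.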
  • the determination unit 303 identifies a flow increasing path from the residual graph shown in FIG. 49B (similar to FIG. 48F) and gives a flow to the path. Specifically, the determination unit 303 gives a flow of 25 MB / s to the route (s1, MyJob1, n2, n1, t1) as shown in FIG. 49C based on the residual graph shown in FIG. 49B. As a result, the determination unit 303 specifies the network residual graph shown in FIG. 49D.
  • the determining unit 303 identifies a flow increasing path from the residual graph shown in FIG. 49D and gives a flow to the path. Based on the residual graph shown in FIG. 49D, the determination unit 303 additionally gives a flow of 50 MB / s to the route (s2, MyJob2, n4, n3, t2) as shown in FIG. 49E. As a result, the determination unit 303 specifies the network residual graph shown in FIG. 49F.
  • the determining unit 303 identifies a flow increasing path from the residual graph shown in FIG. 49F and gives a flow to the path. Accordingly, the determination unit 303 additionally gives a flow of 50 MB / s to the route (s2, MyJob2, n2, n1, t2) as illustrated in FIG. 49G based on the residual graph illustrated in FIG. 49F. As a result, the determination unit 303 identifies the network residual graph shown in FIG. 49H.
  • the determining unit 303 identifies a flow increasing path from the residual graph shown in FIG. 49H and gives a flow to the path. Based on the residual graph shown in FIG. 49H, the determination unit 303 additionally gives a flow of 50 MB / s to the route (s2, MyJob2, n2, n3, t2) as shown in FIG. 49I. As a result, the determination unit 303 identifies the network residual graph shown in FIG. 49J.
  • the determination unit 303 ends the process. As a result, the obtained combination information of each route and each data flow rate becomes data flow information.
  • FIG. 50 is a diagram illustrating data flow information generated in the second embodiment.
  • the determining unit 303 transmits the processing program to the servers n1 and n3 based on the data flow information determined as described above. Furthermore, the determination unit 303 instructs the data reception and processing execution by transmitting determination information corresponding to the processing program to the processing servers n1 and n3.
  • FIG. 51 is a diagram conceptually illustrating data transmission / reception performed in the second embodiment.
  • the processing server n1 that has received the decision information acquires data in the processing data storage unit 342 of the data server n2.
  • the process execution unit p1 executes MyJob1 process on the acquired data.
  • the processing server n1 acquires data in the processing data storage unit 342 of the data server n2.
  • the process execution unit p1 executes MyJob2 process on the acquired data.
  • the processing server n3 acquires data in the processing data storage unit 342 of the data server n2.
  • the process execution unit p3 executes MyJob2 process on the acquired data.
  • the processing server n3 acquires data in the processing data storage unit 342 of the data server n4.
  • the process execution unit p3 executes MyJob2 process on the acquired data.
  • Example 3 shows a specific example of the above-described third embodiment.
  • the configuration of the distributed system 350 in the third embodiment is the same as that in the first embodiment (see FIG. 34). Further, the state of the server state storage unit 3060 in the third embodiment is the same as that in the first embodiment (see FIG. 35). The state of the input / output communication path information storage unit 3080 in the third embodiment is also the same as that in the first embodiment (see FIG. 36).
  • FIG. 52 is a diagram illustrating information stored in the data location storage unit 3070 according to the third embodiment.
  • the processing target data (MyDataSet1) is divided into the file da and the data db and stored.
  • the file da is a single file that is not multiplexed and is stored in the disk D2 of the server n2.
  • the data db is duplicated into a file db1 and a file db2, the file db1 is stored on the disk D2 of the server n2, and the file db2 is stored on the disk D4 of the server n4.
  • request information for requesting execution of a processing program that uses the processing target data MyDataSet1 is transmitted by the client 360 to the master server 300.
  • the model generation unit 301 acquires {n2, n4} as a set of identifiers of the data servers 340 from the data location storage unit 3070 in FIG. 52, and acquires {n1, n3} as a set of identifiers of the available processing servers 330 from the server state storage unit 3060 in FIG. 35.
  • the model generation unit 301 generates model information based on each acquired set, the information stored in the server state storage unit 3060 in FIG. 35, the information stored in the input / output communication path information storage unit 3080 in FIG. 36, and the information stored in the data location storage unit 3070 in FIG. 52.
  • FIG. 53 is a diagram showing model information generated in the third embodiment.
  • FIG. 54 is a diagram showing a conceptual model constructed from the model information shown in FIG. The value given to each side on the conceptual model shown in FIG. 54 is the maximum value of the current data transfer amount per unit time in the communication channel (upper limit value of the transfer amount constraint condition) or the start point of the side The maximum value of the data processing amount per unit time in the processing server corresponding to (the upper limit value of the processing amount constraint condition) is shown.
  • the determining unit 303 determines the flow function f based on the model information shown in FIG. 53 so that the job processing time is minimized.
  • FIGS. 55A to 55G are diagrams conceptually showing a flow function f determination process and a data flow information determination process by the flow increase method in the maximum flow problem in the third embodiment.
  • the determination unit 303 constructs the network model shown in FIG. 55A based on the model information of FIG. In this network model, a start point s is set and an end point t is set.
  • the determination unit 303 gives a flow of 50 MB / s to the route (s, db, n2, n1, t).
  • the determination unit 303 specifies the residual graph of the network illustrated in FIG. 55C.
  • the determining unit 303 identifies the flow increasing path from the residual graph shown in FIG. 55C and gives a flow of 50 MB / s to the path (s, da, n2, n3, t) as shown in FIG. 55D. As a result, the determination unit 303 identifies the network residual graph shown in FIG. 55E.
  • the determination unit 303 identifies the flow increasing path from the residual graph shown in FIG. 55E, and gives a flow of 100 MB / s to the path (s, db, n4, n3, t) as shown in FIG. 55F. As a result, the determination unit 303 identifies the network residual graph shown in FIG. 55G.
  • the determination unit 303 ends the process. As a result, the obtained combination information of each route and each data flow rate becomes data flow information.
  • FIG. 56 is a diagram showing data flow information in the third embodiment.
  • the determination unit 303 identifies the data to be processed by each processing server 330 for the duplicated data db. Specifically, the determination unit 303 identifies the data flow information entries (Flow 1 and Flow 3) whose path information includes the data db, and obtains the unit processing amounts (50 MB/s and 100 MB/s) set in each identified entry for each processing server 330 (n1 and n3) included in the path information of that data flow information.
  • the unit processing amounts for the processing servers n1 and n3 are 50 MB / s and 100 MB / s.
  • the master server 300 determines that the processing server n1 processes the 0th byte to the 2nd gigabyte of the file db1 stored in the data server n2, and that the processing server n3 processes the 2nd to 6th gigabytes of the file db2 stored in the data server n4. This information is included in the decision information corresponding to the processing program.
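The byte-range assignment for the duplicated data db amounts to splitting the total size in proportion to the unit processing amounts. The sketch below reproduces this example's 6 GB split between n1 (50 MB/s) and n3 (100 MB/s); the function itself is an illustrative assumption, not the patented procedure verbatim.

```python
GB = 10 ** 9

def split_by_rate(total_bytes, rates):
    """Assign contiguous, non-overlapping byte ranges in proportion to
    each server's unit processing amount (MB/s)."""
    total_rate = sum(rates.values())
    ranges, offset = {}, 0
    items = sorted(rates.items(), key=lambda kv: kv[1])  # deterministic order
    for i, (server, rate) in enumerate(items):
        if i == len(items) - 1:
            end = total_bytes                 # last server takes the remainder
        else:
            end = offset + total_bytes * rate // total_rate
        ranges[server] = (offset, end)
        offset = end
    return ranges

ranges = split_by_rate(6 * GB, {"n1": 50, "n3": 100})
```

With rates in the ratio 1:2, n1 receives bytes 0 to 2 GB and n3 receives bytes 2 GB to 6 GB, matching the determination described above.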
  • FIG. 57 is a diagram conceptually illustrating data transmission / reception performed in the third embodiment.
  • the processing server n1 that has received the decision information acquires data in the processing data storage unit 342 of the data server n2.
  • the process execution unit p1 executes a process on the acquired data from the 0th byte to the 2nd gigabyte.
  • the processing server n3 acquires data in the processing data storage unit 342 of the data server n2.
  • the process execution unit p3 executes processing of the acquired data.
  • the processing server n3 acquires data in the processing data storage unit 342 of the data server n4.
  • the process execution unit p3 executes a process on the acquired data from the 2nd to 6th gigabytes.
  • Example 4 shows a specific example of the above-described fourth embodiment.
  • FIG. 58 is a diagram conceptually illustrating the configuration of the distributed system 350 in the fourth embodiment.
  • the distributed system 350 according to the fourth embodiment includes a switch sw and servers n1 to n4.
  • Servers n1 to n4 are connected to each other via a switch sw.
  • the servers n1 to n4 function as the processing server 330 and the data server 340 depending on the situation.
  • the servers n1 to n4 have disks D1 to D4 as processing data storage units 342, respectively.
  • any one of the servers n1 to n4 functions as the master server 300.
  • the servers n1 to n3 have p1 to p3 as available processing execution units 332.
  • FIG. 59 is a diagram illustrating information stored in the server state storage unit 3060 according to the fourth embodiment.
  • the servers n1 to n3 are available processing servers 330, and the processable amounts of the servers n1 to n3 are 50 MB / s, 50 MB / s, and 100 MB / s, respectively.
  • FIG. 60 is a diagram illustrating information stored in the input / output communication path information storage unit 3080 according to the fourth embodiment.
  • Each reading speed of the disk D2 of the server n2 and the disk D4 of the server n4 is 100 MB / s.
  • the available bandwidth at the time of data transmission from the server n2 to the switch sw and the available bandwidth at the time of data transmission from the server n4 to the switch sw are 100 MB / s.
  • the available bandwidth at the time of data transmission from the switch sw to the server n1, from the switch sw to the server n2, and from the switch sw to the server n3 is 100 MB/s in each case.
  • the state of the data location storage unit 3070 in the fourth embodiment is the same as that in the first embodiment (see FIG. 37).
  • the model generation unit 301 of the master server 300 obtains {n2, n4} as the set of identifiers of the data servers 340 storing the processing target data from the data location storage unit 3070 in FIG. 37 and the server state storage unit 3060 in FIG. 59, and obtains {n1, n2, n3} as the set of identifiers of the available processing servers 330. Based on these acquired sets, the information stored in the server state storage unit 3060 in FIG. 59, and the information stored in the input / output communication path information storage unit 3080 in FIG. 60, the model generation unit 301 generates model information.
  • FIG. 61 is a diagram illustrating model information generated in the fourth embodiment.
  • FIG. 62 is a diagram showing a conceptual model constructed from the model information shown in FIG. 61. The value given to each side of the conceptual model shown in FIG. 62 indicates either the maximum amount of data that can currently be transferred per unit time over the corresponding communication path (the upper limit value of the transfer amount constraint condition) or the maximum amount of data that can be processed per unit time by the processing server corresponding to the start point of the side (the upper limit value of the processing amount constraint condition).
  • the determination unit 303 of the master server 300 determines the flow rate function f based on the model information of FIG. 61 so that the job processing time is minimized.
  • FIGS. 63A to 63G are diagrams conceptually showing a flow function f determination process and a data flow information determination process by the flow increase method in the maximum flow problem in the fourth embodiment.
  • the determination unit 303 constructs the network model shown in FIG. 63A based on the model information of FIG. 61. In this network model, a start point s and an end point t are set. The determination unit 303 gives a flow of 50 MB/s to the path (s, D2, ON2, n2, t) as illustrated in FIG. 63B. As a result, the determination unit 303 specifies the residual graph of the network illustrated in FIG. 63C.
  • the determination unit 303 identifies the flow increasing path from the residual graph illustrated in FIG. 63C, and gives a flow of 50 MB/s to the path (s, D2, ON2, i2sw, o1sw, n1, t) as illustrated in FIG. 63D. As a result, the determination unit 303 identifies the residual graph of the network illustrated in FIG. 63E.
  • the determination unit 303 identifies a flow increasing path from the residual graph illustrated in FIG. 63E, and gives a flow of 50 MB/s to the path (s, D4, ON4, i4sw, o3sw, n3, t) as illustrated in FIG. 63F. As a result, the determination unit 303 specifies the network residual graph shown in FIG. 63G.
  • the determination unit 303 ends the process. As a result, the obtained combination information of each route and each data flow rate becomes data flow information.
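In the model above the switch sw is not a single vertex: each port appears as its own vertex (for example i2sw for the input port reached from server n2 and o1sw for the output port toward server n1), so each port's bandwidth becomes a transfer amount constraint on its own side. A sketch of generating such port vertices and sides; the vertex naming and the full-crossbar connectivity inside the switch are assumptions, and the server output units (ON2, ON4) of the figures are omitted for brevity:

```python
def switch_port_edges(switch, senders, receivers, port_bw):
    """Expand one switch into input-port and output-port vertices.
    Each sending server gets an input-port vertex and each receiving
    server an output-port vertex; every input port connects to every
    output port of a different server (full crossbar), each side
    carrying the port bandwidth as its transfer amount constraint."""
    edges = []
    for n in senders:
        edges.append((n, f"i{n}{switch}", port_bw))   # server -> its input port
    for n in receivers:
        edges.append((f"o{n}{switch}", n, port_bw))   # output port -> server
    for s in senders:
        for r in receivers:
            if s != r:                                # no loopback through the switch
                edges.append((f"i{s}{switch}", f"o{r}{switch}", port_bw))
    return edges

edges = switch_port_edges("sw", ["n2", "n4"], ["n1", "n2", "n3"], 100)
for e in edges:
    print(e)
```

These sides can be fed directly into a maximum-flow computation together with the disk-read and processing-capacity sides.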
  • FIG. 64 is a diagram showing data flow information in the fourth embodiment.
  • the determination unit 303 transmits the processing program to the processing servers n1 to n3 based on the data flow information determined as described above. Furthermore, the determination unit 303 instructs data reception and processing execution by transmitting determination information corresponding to the processing program to the processing servers n1 to n3.
  • FIG. 65 is a diagram conceptually illustrating data transmission / reception performed in the fourth embodiment.
  • the processing server n1 that has received the decision information acquires data in the processing data storage unit 342 of the data server n2.
  • the process execution unit p1 executes the process of the acquired data.
  • the processing server n2 acquires data in the processing data storage unit 342 of the data server n2.
  • the process execution unit p2 executes the process of the acquired data.
  • the processing server n3 acquires data in the processing data storage unit 342 of the data server n4.
  • the process execution unit p3 executes the process of the acquired data.
  • Example 5 shows a specific example of the fifth embodiment described above.
  • the configuration of the distributed system 350 in the fifth embodiment is the same as that in the first embodiment (see FIG. 34).
  • the state of the input / output communication path information storage unit 3080 in the fifth embodiment is also the same as that in the first embodiment (see FIG. 36).
  • the state of the server state storage unit 3060 in the fifth embodiment is the same as that in the second embodiment (see FIG. 44).
  • the state of the data location storage unit 3070 in the fifth embodiment is the same as that in the second embodiment (see FIG. 45).
  • FIG. 66 is a diagram illustrating information stored in the job information storage unit 3040 according to the fifth embodiment.
  • a job MyJob1 and a job MyJob2 are input as units for executing a program.
  • the minimum unit processing amount and the maximum unit processing amount are not set for the jobs MyJob1 and MyJob2.
  • the job information storage unit 3040, server status storage unit 3060, input / output communication path information storage unit 3080, and data location storage unit 3070 of the master server 300 are in the states shown in FIGS. 66, 44, 36, and 45.
  • request information for requesting execution of the job MyJob1 using the processing target data (MyDataSet1) and the job MyJob2 using the processing target data (MyDataSet2) is transmitted to the master server 300.
  • the model generation unit 301 acquires {MyJob1, MyJob2} as the set of jobs whose execution is currently instructed from the job information storage unit 3040 in FIG. 66, and further acquires, for each job, the name of the data used by the job, the minimum unit processing amount, and the maximum unit processing amount. The model generation unit 301 also acquires {n2, n4} as the set of identifiers of the data servers storing the processing target data from the data location storage unit 3070 in FIG. 45 and the server state storage unit 3060 in FIG. 44, and acquires {n1, n3} as the set of identifiers of the available processing servers 330. Further, the model generation unit 301 acquires the processable amount information of the servers n1 and n3 for each job from the server state storage unit 3060 in FIG. 44.
  • the model generation unit 301 generates model information based on the acquired sets and information stored in the input / output communication path information storage unit 3080 in FIG.
  • FIG. 67 is a diagram illustrating model information generated in the fifth embodiment.
  • a second identifier (n1' and n3') of each of the processing servers n1 and n3 is set in the pointer to the next element of the row including the identifiers of the processing servers n1 and n3.
  • names (MyJob1' and MyJob2') indicating the end of each job (MyJob1 and MyJob2) are set in the pointer to the next element of the row including the second identifiers of the processing servers n1 and n3.
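The second identifiers above duplicate each processing server into a pair such as (n1, n1'). One plausible reading, sketched below under that assumption, is that the side n1 → n1' carries the server-wide capacity while the sides from n1' to the job-end vertices (MyJob1', MyJob2') carry the per-job processable amounts:

```python
def per_job_edges(server, total_cap, per_job_cap):
    """Split a processing server into (server, server') so that one side
    bounds the server's total throughput while separate sides bound the
    throughput it may spend on each individual job.
    per_job_cap: {job_name: processable amount of this server for that job}"""
    second = server + "'"
    edges = [(server, second, total_cap)]         # server-wide capacity
    for job, cap in per_job_cap.items():
        edges.append((second, job + "'", cap))    # job-end vertex, e.g. MyJob1'
    return edges

print(per_job_edges("n1", 50, {"MyJob1": 50, "MyJob2": 50}))
```

This construction lets a single flow computation respect both the server total and the per-job processable amounts at once.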
  • FIG. 68 is a diagram showing a conceptual model constructed from the model information shown in FIG. 67.
  • the value given to each side of the conceptual model shown in FIG. 68 indicates either the maximum amount of data that can currently be transferred per unit time over the corresponding communication path (the upper limit value of the transfer amount constraint condition) or the maximum amount of data that can be processed per unit time by the processing server corresponding to the start point of the side (the upper limit value of the processing amount constraint condition).
  • the determination unit 303 of the master server 300 determines the flow rate function f based on the model information of FIG. 67 so that the job processing time is minimized.
  • the determination unit 303 constructs the network model shown in FIG. 69A based on the model information of FIG. 67. In this network model, start points s1 and s2 are set, an end point t1 corresponding to the start point s1 (MyJob1) is set, and an end point t2 corresponding to the start point s2 (MyJob2) is set.
  • the determination unit 303 gives a flow of 50 MB/s to the path (s1, MyJob1, n2, n1, n1', t1) as shown in FIG. 69B. As a result, the determination unit 303 identifies the network residual graph shown in FIG. 69C.
  • the determination unit 303 identifies a flow increasing path from the residual graph illustrated in FIG. 69C, and additionally gives a flow of 50 MB/s to the path (s2, MyJob2, n2, n3, n3', t2) as illustrated in FIG. 69D. As a result, the determination unit 303 specifies the network residual graph shown in FIG. 69E.
  • the determination unit 303 identifies a flow increasing path from the residual graph shown in FIG. 69E, and additionally gives a flow of 100 MB/s to the path (s2, MyJob2, n4, n3, n3', t2) as shown in FIG. 69F. As a result, the determination unit 303 specifies the residual graph of the network illustrated in FIG. 69G.
  • the determination unit 303 ends the process. As a result, the obtained combination information of each route and each data flow rate becomes data flow information.
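Example 5 augments flows separately for each (start point, end point) pair (s1, t1) and (s2, t2). A common device for feeding such a multi-source, multi-sink model to a single-source max-flow routine is a super source and super sink, sketched below; note the caveat in the comment, since merging the jobs into one commodity only approximates the per-pair augmentation performed above. The function name and the capacity constant are assumptions:

```python
def add_super_terminals(edges, job_terminals, unbounded=10 ** 9):
    """Connect each job's start point to a super source S and each job's
    end point to a super sink T with effectively unbounded capacity, so a
    single-source max-flow routine can be run once over the whole model.
    Caveat: this merges all jobs into one commodity, so flow entering at
    one job's start point may leave at another job's end point unless the
    graph itself keeps the jobs apart; the embodiment instead augments
    flows per (start, end) pair."""
    extended = list(edges)
    for start, end in job_terminals:
        extended.append(("S", start, unbounded))  # super source -> job start
        extended.append((end, "T", unbounded))    # job end -> super sink
    return extended

ext = add_super_terminals([("s1", "MyJob1", 150)], [("s1", "t1"), ("s2", "t2")])
print(ext)
```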
  • FIG. 70 is a diagram illustrating data flow information in the fifth embodiment.
  • the determining unit 303 transmits the processing program to the processing servers n1 and n3 based on the data flow information determined in this way.
  • the determination unit 303 instructs the data reception and processing execution by transmitting determination information corresponding to the processing program to the processing servers n1 and n3.
  • FIG. 71 is a diagram conceptually illustrating data transmission / reception performed in the fifth embodiment.
  • the processing server n1 that has received the decision information acquires data in the processing data storage unit 342 of the data server n2.
  • the process execution unit p1 executes the process of the acquired data.
  • the processing server n3 acquires data in the processing data storage unit 342 of the data server n2 and data in the processing data storage unit 342 of the data server n4.
  • the process execution unit p3 executes a process for each acquired data.
  • Example 6 shows a specific example of the above-described sixth embodiment.
  • the configuration of the distributed system 350 in the sixth embodiment is the same as that in the first embodiment (see FIG. 34). Further, the state of the input / output communication path information storage unit 3080 in the sixth embodiment is the same as that in the first embodiment (see FIG. 36).
  • the state of the job information storage unit 3040 in the sixth embodiment is the same as that in the fifth embodiment (see FIG. 66).
  • the state of the data location storage unit 3070 in the sixth embodiment is the same as that in the second embodiment (see FIG. 45).
  • FIG. 72 is a diagram illustrating information stored in the server state storage unit 3060 according to the sixth embodiment.
  • for the server n1, one unit of resources usable for processing remains, and resources of 0.01 and 0.02 per unit processing amount (1 MB/s) are consumed by the processing of the jobs MyJob1 and MyJob2, respectively.
  • for the server n3, 0.5 units of resources usable for processing remain, and resources of 0.002 and 0.004 per unit processing amount (1 MB/s) are consumed by the processing of the jobs MyJob1 and MyJob2, respectively.
  • the job information storage unit 3040, server state storage unit 3060, input / output communication path information storage unit 3080, and data location storage unit 3070 of the master server 300 are in the states shown in FIGS. 66, 72, 36, and 45.
  • request information for requesting execution of the job MyJob1 using the processing target data (MyDataSet1) and the job MyJob2 using the processing target data (MyDataSet2) is transmitted to the master server 300 by the client 360.
  • the operation of the distributed system 350 in this situation will be described.
  • the model generation unit 301 of the master server 300 acquires {MyJob1, MyJob2} as the set of jobs whose execution is currently instructed from the job information storage unit 3040 in FIG. 66, and acquires, for each job, the name of the data used by the job, the minimum unit processing amount, and the maximum unit processing amount. The model generation unit 301 also acquires {n2, n4} as the set of identifiers of the data servers storing the processing target data from the data location storage unit 3070 in FIG. 45 and the server state storage unit 3060 in FIG. 72, and acquires {n1, n3} as the set of identifiers of the available processing servers 330. In addition, the model generation unit 301 acquires the remaining resource amount, the processable amount information, and the processing load information regarding the servers n1 and n3 from the server state storage unit 3060 in FIG. 72.
  • the model generation unit 301 generates model information based on each set acquired in this way and information stored in the input / output channel information storage unit 3080 in FIG.
  • FIG. 73 is a diagram illustrating model information generated in the sixth embodiment.
  • FIG. 74 is a diagram showing a conceptual model constructed from the model information shown in FIG. 73. The value of each side of the conceptual model shown in FIG. 74 indicates the maximum amount of data that can be transferred per unit time (the upper limit value of the transfer amount constraint condition).
  • the determination unit 303 of the master server 300 determines the flow rate function f so that the processing time of the jobs is minimized, based on the model information of FIG. 73 and on the remaining resource amount, the processable amount information, and the processing load information regarding the servers n1 and n3 acquired from the server state storage unit 3060 of FIG. 72.
  • 75A to 75I are diagrams conceptually showing a flow function f determination process and a data flow information determination process by the flow increase method in the maximum flow problem in the sixth embodiment.
  • the determination unit 303 constructs the network model shown in FIG. 75A based on the model information shown in FIG. 73. In this network model, start points s1 and s2 are set, an end point t1 corresponding to the start point s1 is set, and an end point t2 corresponding to the start point s2 is set.
  • the determination unit 303 gives a flow of 50 MB / s to the route (s1, MyJob1, n2, n1, t1) as shown in FIG. 75B.
  • the determination unit 303 specifies the residual graph of the network illustrated in FIG. 75C.
  • the resource remaining amount of the processing server n3 remains 0.5.
  • the determination unit 303 identifies a flow increasing path from the residual graph illustrated in FIG. 75C, and additionally gives a flow of 100 MB/s to the path (s2, MyJob2, n4, n3, t2) as illustrated in FIG. 75D.
  • the determination unit 303 specifies the network residual graph shown in FIG. 75E.
  • the resource remaining amount of the processing server n1 is 0.5 as it was before.
  • the determination unit 303 identifies a flow increasing path from the residual graph shown in FIG. 75E, and additionally gives a flow of 50 MB/s to the path (s1, MyJob1, n2, n3, t1) as shown in FIG. 75F.
  • the determination unit 303 specifies the network residual graph shown in FIG. 75G.
  • the resource remaining amount of the processing server n1 is 0.5 as it was before.
  • the determination unit 303 ends the process. As a result, the obtained combination information of each route and each data flow rate becomes data flow information.
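In this example each augmentation is additionally capped by the remaining resource amount of the processing server: pushing r MB/s of a job consumes r times that job's processing load per unit processing amount. A minimal sketch of that cap, with names assumed; the numbers reproduce the first augmentation on server n1 (one resource unit remaining, load 0.01 per MB/s of MyJob1), after which 0.5 resources remain as stated above.

```python
def resource_limited(amount, server, job, remaining, load):
    """Cap an augmenting flow by the server's remaining resource amount:
    r MB/s of `job` on `server` consumes r * load[(server, job)]
    resources, so at most remaining[server] / load[(server, job)] MB/s
    can still be admitted. Returns the admitted flow and updates the
    remaining resources."""
    cap = remaining[server] / load[(server, job)]
    admitted = min(amount, cap)
    remaining[server] -= admitted * load[(server, job)]
    return admitted

remaining = {"n1": 1.0, "n3": 0.5}
load = {("n1", "MyJob1"): 0.01,
        ("n3", "MyJob1"): 0.002, ("n3", "MyJob2"): 0.004}
print(resource_limited(50, "n1", "MyJob1", remaining, load))  # 50 MB/s admitted
print(remaining["n1"])                                        # 0.5 resources left
```

Applying this cap inside each augmentation step yields the resource remaining amounts narrated above (0.5 for n1 after its 50 MB/s of MyJob1).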
  • FIG. 76 is a diagram showing data flow information in the sixth embodiment.
  • the determining unit 303 transmits the processing program to the servers n1 and n3 based on the data flow information determined as described above. Furthermore, the determination unit 303 instructs the data reception and processing execution by transmitting determination information corresponding to the processing program to the processing servers n1 and n3.
  • FIG. 77 is a diagram conceptually illustrating data transmission / reception performed in the sixth embodiment.
  • the processing server n1 acquires data in each processing data storage unit 342 of the data servers n2 and n4.
  • the process execution unit p1 executes each acquired data process.
  • the processing server n3 acquires data in the processing data storage units 342 of the data servers n2 and n4, respectively.
  • the process execution unit p3 executes each acquired data process.
  • (Appendix 1) A management device comprising: a model generation unit that generates model information capable of constructing a conceptual model including a plurality of first vertices indicating a plurality of data devices that store data, a plurality of second vertices indicating a plurality of processing devices that process data, a plurality of first sides each extending from a first vertex to a second vertex and each set with a transfer amount constraint condition whose upper limit value is the amount of data that can be transferred per unit time from the corresponding data device to the corresponding processing device, and at least one second side extending from each second vertex to at least one third vertex subsequent to that second vertex and set with a processing amount constraint condition whose upper limit value is the amount of data that can be processed per unit time by the corresponding processing device; and
  • a determination unit that determines the flow rate of each side on the conceptual model based on the total amount of data processing per unit time that can be executed under the transfer amount constraint conditions and the processing amount constraint conditions set on the first sides and second sides included in the paths on the conceptual model (each path including a first vertex, a second vertex, a first side, and a second side), selects each path on the conceptual model that satisfies the flow rates of the sides, and determines a plurality of combinations of a processing device and a data device storing data to be processed by that processing device, according to the vertices included in each selected path.
  • the model generation unit generates, based on information stored in a job information storage unit, the model information capable of constructing the conceptual model further including a vertex indicating a job and a plurality of sides each extending from the vertex indicating the job to the vertices indicating the plurality of data devices storing the plurality of data handled by the job, and
  • the determination unit determines, according to the vertices included in each selected path, a plurality of combinations of the job, a data device storing at least one piece of data handled by the job, and a processing device that processes the at least one piece of data, and
  • generates, based on the information of the plurality of combinations and information stored in a data location storage unit, information indicating the correspondence between a processing device, the identifier of the data processed by that processing device, and the data device storing the data processed by that processing device. The management device according to appendix 1.
  • a data location storage unit that stores, in association with each other, the identifier of original data multiplexed into a plurality of pieces of replicated data and the identifier of each data device storing the replicated data is provided,
  • the model generation unit generates the model information capable of constructing the conceptual model further including a vertex indicating the original data and a plurality of sides each extending from the vertex indicating the original data to a first vertex indicating a data device storing the replicated data, and
  • the determination unit determines, according to the vertices included in each selected path, a plurality of combinations of the original data, a data device storing at least one of the plurality of pieces of replicated data, and a processing device that processes the at least one piece of replicated data, and generates, based on the information of the plurality of combinations and the information stored in the data location storage unit, information indicating the correspondence between a processing device, the identifier of the data processed by that processing device, and the data device storing the data processed by that processing device.
  • The management device according to appendix 1 or 2.
  • the determination unit further includes the data processing amount per unit time of each selected path in the correspondence relation corresponding to that path, and, when a plurality of correspondence relations having a common data identifier are generated, determines that each processing device indicated by the plurality of correspondence relations processes a portion of the original data indicated by the common identifier whose amount is proportional to the data processing amount per unit time included in the corresponding correspondence relation. The management device according to appendix 3.
  • the model generation unit generates the model information capable of constructing the conceptual model further including one or more sides each extending from a second vertex indicating a processing device to a first vertex indicating a data device, to a vertex indicating data, or to a vertex indicating a job that handles a plurality of data, each such side being set with the processing amount constraint condition.
  • The management device according to any one of appendices 1 to 4.
  • the model generation unit generates the model information capable of constructing the conceptual model further including a start point vertex for each job handling a plurality of data, an end point vertex for each job, a plurality of sides each extending from a start point vertex to the vertex indicating the corresponding job, and a plurality of sides each extending from a second vertex indicating a processing device to an end point vertex, the upper limit value of the processing amount constraint condition of each such side being set to the processing amount per unit time that the processing device can execute for the corresponding job. The management device according to any one of appendices 1 to 4.
  • the model generation unit generates the model information capable of constructing the conceptual model further including a vertex indicating an intermediate device through which data stored in a data device passes before being received by a processing device, and at least one of: a side extending from the first vertex indicating the data device to the vertex indicating the intermediate device nearest to the data device, whose transfer amount constraint condition has an upper limit value set to the amount transferable per unit time from the data device to that nearest intermediate device; a side extending from the vertex indicating the intermediate device to a vertex indicating another intermediate device, whose transfer amount constraint condition has an upper limit value set to the amount transferable per unit time from the intermediate device to the other intermediate device; and a side whose transfer amount constraint condition has an upper limit value set to a transferable amount per unit time.
  • The management device according to any one of appendices 1 to 6.
  • in the model generation unit, the vertex indicating the intermediate device is composed of one or more vertices indicating one or more input units of the intermediate device, one or more vertices indicating one or more output units of the intermediate device, and one or more sides each connecting an input unit and an output unit between which data can be transferred.
  • The management device according to appendix 7.
  • the model generation unit further sets a lower limit value of a data transfer amount or a data processing amount in the transfer amount constraint condition or the processing amount constraint condition set in at least one side included in the conceptual model.
  • the management device according to any one of appendices 1 to 9.
  • a communication path information storage unit for storing input / output communication path information between each data device and each processing device;
  • a data location storage unit that stores an identifier of data to be processed and the data device that stores the data in association with each other;
  • a server status storage unit for storing the data processing capacity per unit time of each processing device;
  • the management device according to any one of appendices 1 to 10, comprising these storage units.
  • A distributed system comprising: the management device; a processing device that processes data acquired from a data device according to each combination of the processing device and the data device determined by the determination unit of the management device; and a data device that transmits data to a processing device according to each combination of the processing device and the data device determined by the determination unit of the management device.
  • A distributed processing management method, wherein at least one computer:
  • generates model information capable of constructing a conceptual model including a plurality of first vertices indicating a plurality of data devices that store data, a plurality of second vertices indicating a plurality of processing devices that process data, a plurality of first sides each set with a transfer amount constraint condition whose upper limit value is the amount of data that can be transferred per unit time from a data device to a processing device, and at least one second side extending from each second vertex to at least one third vertex subsequent to that second vertex and set with a processing amount constraint condition;
  • determines the flow rate of each side on the conceptual model based on the total amount of data processing per unit time that can be executed under the transfer amount constraint conditions and the processing amount constraint conditions set on the first sides and second sides included in the paths on the conceptual model (each path including a first vertex, a second vertex, a first side, and a second side);
  • selects each path on the conceptual model that satisfies the flow rates of the sides; and
  • determines a plurality of combinations of a processing device and a data device storing data to be processed by that processing device, according to the vertices included in each selected path.
  • the at least one computer comprises a data location storage unit that stores the identifier of data in association with the identifier of the data device storing the data, and a job information storage unit that stores information about the plurality of data handled by a job,
  • the generation of the model information generates, based on the information stored in the job information storage unit, the model information capable of constructing the conceptual model further including a vertex indicating the job and a plurality of sides each extending from the vertex indicating the job to the vertices indicating the plurality of data devices storing the plurality of data handled by the job,
  • the determination of the plurality of combinations determines, according to the vertices included in each selected path, a plurality of combinations of the job, a data device storing at least one piece of data handled by the job, and a processing device that processes the at least one piece of data, and
  • the at least one computer generates, based on the information of the plurality of combinations and the information stored in the data location storage unit, information indicating the correspondence between a processing device, the identifier of the data processed by that processing device, and the data device storing the data processed by that processing device.
  • the at least one computer includes a data location storage unit that stores, in association with each other, the identifier of original data multiplexed into a plurality of pieces of replicated data and the identifier of each data device storing the replicated data,
  • the generation of the model information generates the model information capable of constructing the conceptual model further including a vertex indicating the original data and a plurality of sides each extending from the vertex indicating the original data to a first vertex indicating a data device storing the replicated data,
  • the determination of the plurality of combinations determines a plurality of combinations of the original data, a data device storing at least one of the plurality of pieces of replicated data, and a processing device that processes the at least one piece of replicated data, and the at least one computer generates, based on the information of the plurality of combinations and the information stored in the data location storage unit, information indicating the correspondence between a processing device, the identifier of the data processed by that processing device, and the data device storing the data processed by that processing device,
  • the data processing amount per unit time of each selected path is further included in the correspondence relation corresponding to that path, and, when a plurality of correspondence relations having a common data identifier are determined, each processing device indicated by the plurality of correspondence relations is determined to process a portion of the original data indicated by the common identifier whose amount is proportional to the data processing amount per unit time included in the corresponding correspondence relation. The distributed processing management method according to appendix 14.
  • the generation of the model information generates the model information capable of constructing the conceptual model further including a start point vertex for each job handling a plurality of data, an end point vertex for each job, a plurality of sides each extending from a start point vertex to the vertex indicating the corresponding job, and a plurality of sides each extending from a second vertex indicating a processing device to an end point vertex, the upper limit value of the processing amount constraint condition of each such side being set to the processing amount per unit time that the processing device can execute for the corresponding job. The distributed processing management method according to any one of appendices 12 to 15.
  • the generation of the model information generates the model information capable of constructing the conceptual model further including a vertex indicating an intermediate device through which data stored in the data device passes before being received by the processing device, and at least one of:
  • an edge extending from the first vertex indicating the data device to the vertex indicating the nearest intermediate device of the data device, set with a transfer amount constraint condition whose upper limit value is the transferable amount per unit time from the data device to that nearest intermediate device;
  • an edge extending from the vertex indicating the intermediate device to a vertex indicating another intermediate device, set with a transfer amount constraint condition whose upper limit value is the transferable amount per unit time from the intermediate device to the other intermediate device; and
  • an edge extending from the vertex indicating the nearest intermediate device of the processing device to the second vertex indicating the processing device, set with a transfer amount constraint condition whose upper limit value is the transferable amount per unit time from that nearest intermediate device to the processing device, The distributed processing management method according to any one of appendices 12 to 16.
  • in the generation of the model information, the vertex indicating the intermediate device is composed of one or more vertices indicating one or more input units of the intermediate device, one or more vertices indicating one or more output units of the intermediate device, and one or more edges connecting each input unit and each output unit between which data can be transferred.
  • Appendix 19 A program for causing at least one computer to execute the management method according to any one of appendices 12 to 18.
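One of the supplementary notes above divides an original datum, whose identifier is shared by several correspondence relations, among the processing devices in a ratio corresponding to the per-unit-time data processing amount of each selected route. A minimal sketch of that proportional split, with hypothetical server names, data size, and flow amounts (none of which come from the specification):

```python
def split_by_flow(total_size, flows):
    """Divide one original datum's size among processing servers in
    proportion to the per-unit-time flow decided for each route."""
    whole = sum(flows.values())
    return {server: total_size * amount / whole
            for server, amount in flows.items()}

# Hypothetical example: original data of 300 MB is reachable over routes
# whose decided flows are 30 MB/s toward server P1 and 20 MB/s toward P2.
shares = split_by_flow(300, {"P1": 30, "P2": 20})
print(shares)  # → {'P1': 180.0, 'P2': 120.0}
```

Each server then processes its share concurrently, so the time to finish the datum is balanced across the routes (180 MB / 30 MB/s = 120 MB / 20 MB/s = 6 s in this example).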

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Description

Management apparatus and distributed processing management method

The present invention relates to a technique for managing distributed processing of data in a distributed system in which data devices that store data and processing devices that process the data are distributedly arranged.

Patent Document 1 below proposes a distributed system that determines which calculation server processes data stored in a plurality of computers: by sequentially determining the nearest available calculation server for the computer storing each datum, communication paths for all data are determined. Patent Document 2 below proposes a technique in which the file input/output speed on the storage side does not drop even when a plurality of file transfer requests arrive simultaneously. Patent Document 3 below proposes a distributed file system that provides an address space in which groups of files stored on a plurality of disks can be managed collectively.

Patent Document 4 below proposes, in order to reduce the network load in a distributed database system, moving a relay server from one computer to another in consideration of the data transfer time when database data is transferred to a client. Patent Document 5 below proposes a method of dividing a file according to the line speed and load status of each transfer path over which the file is transferred, and transferring the divided pieces. Patent Document 6 below proposes a stream processing apparatus that determines, in a short time, a resource allocation with high utilization efficiency for stream input/output requests in which various speeds are specified. Patent Document 7 below proposes a method for efficiently using I/O resources in a computer system in which computing nodes access a file system through I/O nodes, by dynamically allocating I/O nodes to each job without stopping the execution of that job.

Patent Document 1: US Pat. No. 7,650,331
Patent Document 2: JP 2008-117091 A
Patent Document 3: JP 2010-140507 A
Patent Document 4: JP-A-8-202726
Patent Document 5: Japanese Patent No. 3390406
Patent Document 6: JP-A-8-147234
Patent Document 7: Japanese Patent No. 4569846

Each of the technologies described above focuses only on reducing the data transfer time in a distributed processing system, so it is difficult to say that the processing time of the system as a whole is necessarily reduced. Patent Documents 3 and 7 mentioned above merely propose, respectively, a method for centrally handling data stored in a plurality of data servers and a method for determining the I/O node occupancy needed to access a file system.

The present invention has been made in view of the circumstances described above, and provides a technique for reducing the overall data processing time of a distributed system in which data devices that store data and processing devices that process the data are distributedly arranged.

In order to solve the problems described above, each aspect of the present invention employs the following configurations.

The first aspect relates to a management device. The management device according to the first aspect includes a model generation unit and a determination unit. The model generation unit generates model information capable of constructing a conceptual model that includes: a plurality of first vertices indicating a plurality of data devices that store data; a plurality of second vertices indicating a plurality of processing devices that process the data; a plurality of first edges, each extending from a first vertex to a second vertex, each set with a transfer amount constraint condition whose upper limit value is the amount of data transferable per unit time from the corresponding data device to the corresponding processing device; and at least one second edge, extending from each second vertex to at least one third vertex located after that second vertex, each set with a processing amount constraint condition whose upper limit value is the amount of data processable per unit time by the corresponding processing device. The determination unit determines the flow on each edge of the conceptual model, using, for each route on the conceptual model that includes a first vertex, a second vertex, a first edge and a second edge, the total data processing amount per unit time executable under the transfer amount constraint conditions and processing amount constraint conditions set on the first and second edges included in the route; selects the routes on the conceptual model that satisfy the flow on each edge; and, according to the vertices included in each selected route, determines a plurality of combinations of a processing device and the data device storing the data to be processed by that processing device.
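The roles of the model generation unit and the determination unit can be illustrated with a small maximum-flow computation. The sketch below is a hypothetical Python illustration, not the claimed implementation: the device names and capacity values are made up. It builds a conceptual model with two data devices and two processing servers, uses the transfer-amount and processing-amount upper limits as edge capacities, finds augmenting paths (the flow augmentation method referred to in the examples), and reads the resulting positive flows on the data-device-to-server edges as the combinations of processing device and data device:

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp style flow augmentation: repeatedly find a shortest
    augmenting path in the residual network and push flow along it.
    Note: cap is augmented in place with zero-capacity reverse edges."""
    for u in list(cap):
        for v in list(cap[u]):
            cap.setdefault(v, {}).setdefault(u, 0)
    flow = {u: {v: 0 for v in cap[u]} for u in cap}
    total = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:       # BFS for a shortest path
            u = queue.popleft()
            for v in cap[u]:
                if v not in parent and cap[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total, flow                 # no augmenting path left
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] - flow[u][v] for u, v in path)
        for u, v in path:                      # push flow, update residuals
            flow[u][v] += push
            flow[v][u] -= push
        total += push

# Hypothetical conceptual model: S -> data devices (D*) -> servers (P*) -> T.
# D->P capacities are transfer limits; P->T capacities are processing limits.
cap = {
    "S":  {"D1": 60, "D2": 40},
    "D1": {"P1": 30, "P2": 50},
    "D2": {"P1": 40, "P2": 10},
    "P1": {"T": 45},
    "P2": {"T": 35},
}
total, flow = max_flow(cap, "S", "T")
pairs = [(d, p, flow[d][p]) for d in ("D1", "D2")
         for p in ("P1", "P2") if flow[d][p] > 0]
print(total)   # aggregate data processing amount per unit time -> 80
print(pairs)   # (data device, server, flow) combinations
```

Maximizing the total flow into the sink maximizes the aggregate data processing amount per unit time of the whole system, which is how the selected routes reduce overall processing time rather than just individual transfer times.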

The second aspect relates to a distributed processing management method. In the distributed processing management method according to the second aspect, at least one computer: generates model information capable of constructing a conceptual model that includes a plurality of first vertices indicating a plurality of data devices that store data, a plurality of second vertices indicating a plurality of processing devices that process the data, a plurality of first edges, each extending from a first vertex to a second vertex and each set with a transfer amount constraint condition whose upper limit value is the amount of data transferable per unit time from the corresponding data device to the corresponding processing device, and at least one second edge, extending from each second vertex to at least one third vertex located after that second vertex and each set with a processing amount constraint condition whose upper limit value is the amount of data processable per unit time by the corresponding processing device; determines the flow on each edge of the conceptual model, using, for each route on the conceptual model that includes a first vertex, a second vertex, a first edge and a second edge, the total data processing amount per unit time executable under the transfer amount constraint conditions and processing amount constraint conditions set on the first and second edges included in the route; selects the routes on the conceptual model that satisfy the flow on each edge; and determines, according to the vertices included in each selected route, a plurality of combinations of a processing device and the data device storing the data to be processed by that processing device.

Another aspect of the present invention may be a management program that causes at least one computer to realize each configuration of the first aspect, or a computer-readable recording medium storing such a program. The recording medium includes a non-transitory tangible medium.

According to each of the aspects described above, it is possible to provide a technique for reducing the overall data processing time of a distributed system in which data devices that store data and processing devices that process the data are distributedly arranged.

The above-described object and other objects, features and advantages will become more apparent from the preferred embodiments described below and the accompanying drawings.

A diagram conceptually showing a configuration example of the distributed system in the first embodiment.
A diagram conceptually showing a configuration example of a distributed system.
A diagram conceptually showing a configuration example of a distributed system.
A diagram conceptually showing a configuration example of a distributed system.
A diagram showing transferable amounts per unit time between computers.
A diagram showing the processable amount per unit time of a computer.
A diagram conceptually showing a processing configuration example of each device of the distributed system in the first embodiment.
A diagram showing an example of information stored in a data location storage unit.
A diagram showing an example of information stored in an input/output channel information storage unit.
A diagram showing an example of information stored in a server state storage unit.
A diagram showing an example of model information.
A diagram showing an example of a conceptual model constructed from model information.
A diagram showing an example of data flow information.
A diagram showing an example of decision information.
A flowchart showing an overall outline of an operation example of the distributed system.
A flowchart showing the detailed operation of the master server of the first embodiment in step (S401).
A flowchart showing the detailed operation of the master server of the first embodiment in step (S404).
A flowchart showing the detailed operation of the master server of the first embodiment in step (S404-10).
A flowchart showing the detailed operation of the master server of the first embodiment in step (S404-20).
A flowchart showing the detailed operation of the master server of the first embodiment in step (S405).
A flowchart showing the detailed operation of the master server of the first embodiment in step (S406).
A diagram conceptually showing a processing configuration example of each device of the distributed system in the second embodiment.
A diagram showing an example of information stored in a job information storage unit.
A flowchart showing the detailed operation of the master server of the second embodiment in step (S401).
A flowchart showing the detailed operation of the master server of the second embodiment in step (S404).
A flowchart showing the detailed operation of the master server of the second embodiment in step (S404-30).
A diagram showing an example of information stored in the server state storage unit in the first modification of the second embodiment.
A flowchart showing the detailed operation of the master server in the first modification of the second embodiment for step (S404-20) shown in FIG. 22.
A flowchart showing the detailed operation of the master server of the third embodiment in step (S404).
A flowchart showing the detailed operation of the master server of the third embodiment in step (S404-40).
A flowchart showing the detailed operation of the master server of the third embodiment in step (S406).
A diagram conceptually showing a processing configuration example of each device of the distributed system in the fourth embodiment.
A flowchart showing the detailed operation of the master server of the fourth embodiment in step (S404).
A flowchart showing the detailed operation of the master server of the fourth embodiment in step (S404-12-10).
A flowchart showing the detailed operation of the master server of the fourth embodiment in step (S404-12-10).
A flowchart showing the detailed operation of the master server of the fourth embodiment in step (S404-12-10).
A flowchart showing the detailed operation of the master server of the fifth embodiment in step (S404-20) (see FIG. 22).
A diagram showing an example of information stored in the server state storage unit in the sixth embodiment.
A diagram conceptually showing a configuration example of the distributed system in Example 1.
A diagram showing information stored in the server state storage unit in Example 1.
A diagram showing information stored in the input/output channel information storage unit in Example 1.
A diagram showing information stored in the data location storage unit in Example 1.
A diagram showing model information generated in Example 1.
A diagram showing a conceptual model constructed from the model information shown in FIG. 38.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 1.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 1.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 1.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 1.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 1.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 1.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 1.
A diagram showing data flow information in Example 1.
A diagram conceptually showing data transmission and reception performed in Example 1.
A diagram showing information stored in the job information storage unit in Example 2.
A diagram showing information stored in the server state storage unit in Example 2.
A diagram showing information stored in the data location storage unit in Example 2.
A diagram showing model information generated in Example 2.
A diagram showing a conceptual model constructed from the model information shown in FIG. 46.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 2.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 2.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 2.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 2.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 2.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 2.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 2.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 2.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 2.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 2.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 2.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 2.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 2.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 2.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 2.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 2.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 2.
A diagram showing data flow information generated in Example 2.
A diagram conceptually showing data transmission and reception performed in Example 2.
A diagram showing information stored in the data location storage unit in Example 3.
A diagram showing model information generated in Example 3.
A diagram showing a conceptual model constructed from the model information shown in FIG. 53.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 3.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 3.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 3.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 3.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 3.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 3.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 3.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 3.
A diagram showing data flow information in Example 3.
A diagram conceptually showing data transmission and reception performed in Example 3.
A diagram conceptually showing the configuration of the distributed system in Example 4.
A diagram showing information stored in the server state storage unit in Example 4.
A diagram showing information stored in the input/output channel information storage unit in Example 4.
A diagram showing model information generated in Example 4.
A diagram showing a conceptual model constructed from the model information shown in FIG. 61.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 4.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 4.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 4.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 4.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 4.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 4.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 4.
A diagram showing data flow information in Example 4.
A diagram conceptually showing data transmission and reception performed in Example 4.
A diagram showing information stored in the job information storage unit in Example 5.
A diagram showing model information generated in Example 5.
A diagram showing a conceptual model constructed from the model information shown in FIG. 67.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 5.
A diagram conceptually showing the process of determining the flow function f by the flow augmentation method for the maximum flow problem, and the process of determining data flow information, in Example 5.
実施例5における最大流問題におけるフロー増加法による流量関数fの決定処理及びデータフロー情報の決定処理を概念的に示す図である。It is a figure which shows notionally the determination process of the flow function f by the flow increase method in the maximum flow problem in Example 5, and the determination process of data flow information. 実施例5における最大流問題におけるフロー増加法による流量関数fの決定処理及びデータフロー情報の決定処理を概念的に示す図である。It is a figure which shows notionally the determination process of the flow function f by the flow increase method in the maximum flow problem in Example 5, and the determination process of data flow information. 実施例5における最大流問題におけるフロー増加法による流量関数fの決定処理及びデータフロー情報の決定処理を概念的に示す図である。It is a figure which shows notionally the determination process of the flow function f by the flow increase method in the maximum flow problem in Example 5, and the determination process of data flow information. 実施例5における最大流問題におけるフロー増加法による流量関数fの決定処理及びデータフロー情報の決定処理を概念的に示す図である。It is a figure which shows notionally the determination process of the flow function f by the flow increase method in the maximum flow problem in Example 5, and the determination process of data flow information. 実施例5における最大流問題におけるフロー増加法による流量関数fの決定処理及びデータフロー情報の決定処理を概念的に示す図である。It is a figure which shows notionally the determination process of the flow function f by the flow increase method in the maximum flow problem in Example 5, and the determination process of data flow information. 実施例5におけるデータフロー情報を示す図である。It is a figure which shows the data flow information in Example 5. 実施例5において実施されるデータ送受信を概念的に示す図である。It is a figure which shows notionally the data transmission / reception implemented in Example 5. FIG. 実施例6におけるサーバ状態格納部に格納される情報を示す図である。It is a figure which shows the information stored in the server state storage part in Example 6. FIG. 実施例6において生成されるモデル情報を示す図である。It is a figure which shows the model information produced | generated in Example 6. FIG. 
図73により示されるモデル情報から構築される概念モデルを示す図である。It is a figure which shows the conceptual model constructed | assembled from the model information shown by FIG. 実施例6における最大流問題におけるフロー増加法による流量関数fの決定処理及びデータフロー情報の決定処理を概念的に示す図である。It is a figure which shows notionally the determination process of the flow function f by the flow increase method in the maximum flow problem in Example 6, and the determination process of data flow information. 実施例6における最大流問題におけるフロー増加法による流量関数fの決定処理及びデータフロー情報の決定処理を概念的に示す図である。It is a figure which shows notionally the determination process of the flow function f by the flow increase method in the maximum flow problem in Example 6, and the determination process of data flow information. 実施例6における最大流問題におけるフロー増加法による流量関数fの決定処理及びデータフロー情報の決定処理を概念的に示す図である。It is a figure which shows notionally the determination process of the flow function f by the flow increase method in the maximum flow problem in Example 6, and the determination process of data flow information. 実施例6における最大流問題におけるフロー増加法による流量関数fの決定処理及びデータフロー情報の決定処理を概念的に示す図である。It is a figure which shows notionally the determination process of the flow function f by the flow increase method in the maximum flow problem in Example 6, and the determination process of data flow information. 実施例6における最大流問題におけるフロー増加法による流量関数fの決定処理及びデータフロー情報の決定処理を概念的に示す図である。It is a figure which shows notionally the determination process of the flow function f by the flow increase method in the maximum flow problem in Example 6, and the determination process of data flow information. 実施例6における最大流問題におけるフロー増加法による流量関数fの決定処理及びデータフロー情報の決定処理を概念的に示す図である。It is a figure which shows notionally the determination process of the flow function f by the flow increase method in the maximum flow problem in Example 6, and the determination process of data flow information. 
実施例6における最大流問題におけるフロー増加法による流量関数fの決定処理及びデータフロー情報の決定処理を概念的に示す図である。It is a figure which shows notionally the determination process of the flow function f by the flow increase method in the maximum flow problem in Example 6, and the determination process of data flow information. 実施例6における最大流問題におけるフロー増加法による流量関数fの決定処理及びデータフロー情報の決定処理を概念的に示す図である。It is a figure which shows notionally the determination process of the flow function f by the flow increase method in the maximum flow problem in Example 6, and the determination process of data flow information. 実施例6における最大流問題におけるフロー増加法による流量関数fの決定処理及びデータフロー情報の決定処理を概念的に示す図である。It is a figure which shows notionally the determination process of the flow function f by the flow increase method in the maximum flow problem in Example 6, and the determination process of data flow information. 実施例6におけるデータフロー情報を示す図である。It is a figure which shows the data flow information in Example 6. 実施例6において実施されるデータ送受信を概念的に示す図である。It is a figure which shows notionally the data transmission / reception implemented in Example 6. FIG.
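The figure captions above repeatedly refer to determining the flow function f by the flow-augmenting method for the maximum flow problem. As background only, that method can be sketched generically as follows: repeatedly find an augmenting path in the residual network (here by BFS, Edmonds-Karp style) and push the bottleneck amount of flow along it. The function and variable names and the example graph are assumptions for illustration, not code or data from the embodiments.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Flow-augmenting method for the maximum flow problem.
    capacity: dict mapping a directed edge (u, v) to its capacity.
    Returns (maximum flow value, per-edge flow dict)."""
    flow = {e: 0 for e in capacity}
    adj = {}  # undirected adjacency, so reverse residual edges are walkable
    for (u, v) in capacity:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def residual(u, v):
        # forward leftover capacity plus cancellable reverse flow
        return (capacity.get((u, v), 0) - flow.get((u, v), 0)
                + flow.get((v, u), 0))

    value = 0
    while True:
        # BFS for an augmenting path in the residual network
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and residual(u, v) > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return value, flow  # no augmenting path: the flow is maximum
        # recover the path and its bottleneck, then push that much flow
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual(u, v) for (u, v) in path)
        for (u, v) in path:
            push = bottleneck
            if (u, v) in capacity:  # use forward capacity first
                take = min(push, capacity[(u, v)] - flow[(u, v)])
                flow[(u, v)] += take
                push -= take
            if push:                # cancel reverse flow for the remainder
                flow[(v, u)] -= push
        value += bottleneck

# Invented example graph: the maximum flow from "s" to "t" is 15.
caps = {("s", "a"): 10, ("s", "b"): 5, ("a", "b"): 15,
        ("a", "t"): 5, ("b", "t"): 10}
print(max_flow(caps, "s", "t")[0])  # 15
```

Each iteration of the outer loop corresponds to one of the stepwise figures listed above: one augmenting path is found and the flow function f is increased along it.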

Hereinafter, embodiments of the present invention will be described. Each of the following embodiments illustrates a distributed system in which data stored in a plurality of data servers is processed in parallel by a plurality of processing servers; however, the present invention is also applicable to other systems that perform distributed processing, such as database systems and batch processing systems. The embodiments described below are illustrative, and the present invention is not limited to their configurations.

[First Embodiment]
FIG. 1A is a diagram conceptually showing a configuration example of the distributed system in the first embodiment. Hereinafter, an overview of the configuration and operation of the distributed system 350 in the first embodiment, as well as the differences between the first embodiment and related techniques, will be described with reference to FIG. 1A.

The distributed system 350 includes a master server 300, a network switch 320, a plurality of processing servers 330#1 to 330#n, a plurality of data servers 340#1 to 340#n, and the like, which are connected to one another by a network 370. The distributed system 350 may also include a client 360, another server 399, and so on. Hereinafter, the data servers 340#1 to 340#n may be collectively referred to as the data servers 340, and the processing servers 330#1 to 330#n may be collectively referred to as the processing servers 330.

The data server 340 stores data that can be a processing target of the processing server 330. The processing server 330 receives that data from the data server 340 and processes it by executing a processing program on the received data.

The client 360 transmits request information, which is information for requesting the master server 300 to start data processing. The request information includes a processing program and information indicating the data used by that processing program.

The master server 300 determines, for each piece of data, the processing server 330 that is to process one or more pieces of the data stored in the data servers 340. For each determined processing server 330, the master server 300 generates decision information that includes information indicating the data to be processed and the data server 340 storing that data, as well as information indicating the data processing amount per unit time. The data server 340 and the processing server 330 transmit and receive data based on this decision information, and the processing server 330 processes the received data.
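For illustration only, the decision information described above might take a shape like the following sketch. The field names, server names, and rates are assumptions, not part of the disclosure (the rates happen to follow the FIG. 2B example of the data servers on computers 208 and 210 and the processing servers on computers 207 and 209).

```python
from dataclasses import dataclass

@dataclass
class Decision:
    data_id: str      # identifier of the data to be processed
    data_server: str  # data server storing that data
    rate_mb_s: int    # data processing amount per unit time (MB/s)

# One list of decisions per determined processing server (names invented).
decisions = {
    "proc-server-207": [Decision("d211", "data-server-208", 50)],
    "proc-server-209": [Decision("d212", "data-server-208", 50),
                        Decision("d213", "data-server-210", 100)],
}
print(decisions["proc-server-209"][1])
```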

Here, the master server 300, the processing server 330, the data server 340, and the client 360 may each be realized individually by a dedicated device or by a general-purpose computer. Alternatively, a plurality of the master server 300, the processing server 330, the data server 340, and the client 360 may be realized by a single dedicated device or a single computer. Hereinafter, a single dedicated device or a single computer as hardware is collectively referred to as a single computer device. In many cases, the processing server 330 and the data server 340 are realized by one such computer device.

The computer device has, for example, a CPU (Central Processing Unit), a memory, an input/output interface (I/F), and the like, which are connected to one another by a bus. The memory is a RAM (Random Access Memory), a ROM (Read Only Memory), a hard disk, a portable storage medium, or the like. The input/output I/F is connected to a communication device or the like that communicates with other devices via the network 370. The input/output I/F may also be connected to a user interface device such as a display device or an input device. Note that this embodiment does not limit the hardware configurations of the master server 300, the processing server 330, the data server 340, and the client 360.

FIGS. 1B, 2A, and 2B are diagrams conceptually showing configuration examples of the distributed system 350. In these figures, the processing servers 330 and the data servers 340 are depicted as computers, and the network 370 is depicted as a data transmission/reception path passing through switches. The master server 300 is not shown. A switch here is a network device such as a hub or a router.

In FIG. 1B, the distributed system 350 includes, for example, a plurality of computers 111 and 112 and switches 101 to 103 that interconnect them. In the example of FIG. 1B, the computers 111 and the switch 102 are housed in a rack 121, and the computers 112 and the switch 103 are housed in a rack 122. Furthermore, the racks 121 and 122 are housed in a data center 131, and the data centers 131 and 132 are connected by an inter-site communication network 141. FIG. 1B thus illustrates a distributed system 350 in which the switches and computers are connected in a star topology.

On the other hand, FIGS. 2A and 2B illustrate a distributed system 350 configured with cascade-connected switches. FIGS. 2A and 2B each show an example of data transmission and reception between the data servers 340 and the processing servers 330. In this example, the computers 207 to 210 function as data servers 340, and the computers 207 and 209 also function as processing servers 330. Here, for example, the computer 210 functions as the master server 300.

In FIGS. 2A and 2B, among the computers connected by the switches 201 and 202, the computers other than the computers 207 and 209 are executing other processing and therefore cannot be used for further data processing. The unavailable computer 208 stores the processing target data 211 and 212 on the storage disk 204. Similarly, the unavailable computer 210 stores the processing target data 213 on the storage disk 206. Meanwhile, the available computer 207 is running a processing process 214, and the available computer 209 is running a processing process 215.

FIG. 3 is a diagram showing the transferable amount per unit time between computers. The table 220 shown in FIG. 3 gives the transfer amount per unit time when the above-described processing target data is transferred to another computer. In this example, it is assumed that each processing process can execute the necessary processing on its assigned data in parallel.

Specifically, the table 220 in FIG. 3 shows the following. The transferable amount per unit time when transferring processing target data from the computer 208 to the computer 207 is 50 megabytes per second (MB/s). The transferable amount between the computers 208 and 209 is 50 MB/s, that between the computers 210 and 207 is 50 MB/s, and that between the computers 210 and 209 is 100 MB/s.

FIG. 4 is a diagram showing the processable amount per unit time of each computer. According to the table 221 shown in FIG. 4, the processable amount per unit time with which the computer 207 can process target data is 50 MB/s, and that of the computer 209 is 150 MB/s.

Here, the throughput of data processing is the smaller of the transferable amount per unit time of the path over which the processing target data is transferred and the processable amount per unit time of the computer that performs the processing. In FIG. 2A, the processing target data 211 is transmitted via the data transfer path 216 and processed by the available computer 207, and the processing target data 213 is transmitted via the data transfer path 217 and processed by the available computer 209. The processing target data 212 is not assigned to any processing process and remains in a standby state.

On the other hand, in FIG. 2B, the processing target data 211 is transmitted via the data transfer path 230 and processed by the available computer 207, the processing target data 212 is transmitted via the data transfer path 231 and processed by the available computer 209, and the processing target data 213 is transmitted via the data transfer path 232 and processed by the available computer 209.

The total throughput of the data processing in FIG. 2A is 150 MB/s, the sum of the throughput for the processing target data 211 (50 MB/s) and the throughput for the processing target data 213 (100 MB/s). The throughput for the data 211 (50 MB/s) is the smaller of the transferable amount of the data transfer path 216 (50 MB/s) and the processable amount of the computer 207 (50 MB/s). The throughput for the data 213 (100 MB/s) is the smaller of the transferable amount of the data transfer path 217 (100 MB/s) and the processable amount of the computer 209 (150 MB/s).

The total throughput of the data processing in FIG. 2B is 200 MB/s, the sum of the throughput for the processing target data 211 (50 MB/s), the throughput for the processing target data 212 (50 MB/s), and the throughput for the processing target data 213 (100 MB/s). The throughput for the data 211 (50 MB/s) is the smaller of the transferable amount of the data transfer path 230 (50 MB/s) and the processable amount of the computer 207 (50 MB/s). The throughput for the data 212 (50 MB/s) is the smaller of the transferable amount of the data transfer path 231 (50 MB/s) and the processable amount of the computer 209 (150 MB/s). The throughput for the data 213 (100 MB/s) is the smaller of the transferable amount of the data transfer path 232 (100 MB/s) and the processable amount of the computer 209 (150 MB/s).

As described above, the data processing in FIG. 2B has a larger total throughput than the data processing in FIG. 2A and is therefore more efficient.
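The comparison above can be reproduced with a small sketch. The helper below is not part of the embodiments: the capacity figures follow Tables 220 and 221 and the allocations follow FIGS. 2A and 2B, while the computer/data names and the function itself are illustrative assumptions. Each assignment is charged the smaller of the link's transferable amount and the computer's remaining processable amount.

```python
def total_throughput(assignments, link_cap, server_cap):
    """Sum the per-data throughput: each transfer is limited by its link's
    transferable amount, and each computer's aggregate intake is limited
    by its processable amount."""
    remaining = dict(server_cap)  # remaining processing budget per computer (MB/s)
    total = 0
    for data_id, (src, dst) in assignments.items():
        rate = min(link_cap[(src, dst)], remaining[dst])
        remaining[dst] -= rate
        total += rate
    return total

# Transferable amounts between computers (Table 220) and processable
# amounts of the available computers (Table 221), in MB/s.
link_cap = {("c208", "c207"): 50, ("c208", "c209"): 50,
            ("c210", "c207"): 50, ("c210", "c209"): 100}
server_cap = {"c207": 50, "c209": 150}

# Allocation of FIG. 2A: data 212 is left unassigned.
fig_2a = {"d211": ("c208", "c207"), "d213": ("c210", "c209")}
# Allocation of FIG. 2B: all three data items are assigned.
fig_2b = {"d211": ("c208", "c207"), "d212": ("c208", "c209"),
          "d213": ("c210", "c209")}

print(total_throughput(fig_2a, link_cap, server_cap))  # 150
print(total_throughput(fig_2b, link_cap, server_cap))  # 200
```

The two printed totals match the 150 MB/s and 200 MB/s derived in the text.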

Incidentally, as a technique related to the first embodiment described above, there exists a distributed system that sequentially assigns each piece of processing target data to processing servers based on a distance derived from the network configuration (for example, the number of hops). In such a distributed system, inefficient data assignment as shown in FIG. 2A may occur. This is because, in such a related technique, the assignment is performed based only on the distance in the network configuration, without considering the transferable amounts between computers or the processable amount of the computer that executes the processing.

In the situation illustrated in FIGS. 2A and 2B, the distributed system 350 in the first embodiment performs the efficient data assignment shown in FIG. 2B. The distributed system 350 in the first embodiment is described in further detail below.

[Processing configuration example in the first embodiment]
FIG. 5 is a diagram conceptually showing a processing configuration example of each device of the distributed system 350 in the first embodiment. Each of these processing units, individually or in combination, may be realized as a hardware component, as a software component, or as a combination of hardware and software components. A hardware component is a hardware circuit such as a field programmable gate array (FPGA), an application specific integrated circuit (ASIC), a gate array, a combination of logic gates, a signal processing circuit, or an analog circuit. A software component is a software part (fragment), such as a task, a process, or a function, realized by executing data (a program) held on one or more memories with one or more processors (for example, a CPU (Central Processing Unit) or a DSP (Digital Signal Processor)). When a single computer device realizes a plurality of nodes among the master server 300, the processing server 330, and the data server 340, it realizes the processing configuration of each of those nodes, and overlapping processing configurations, such as the data transmission/reception units, may be shared.

<Processing server 330>
The processing server 330 has a processing server management unit 331, a processing execution unit 332, a processing program storage unit 333, a data transmission/reception unit 334, and the like.

=== Processing Server Management Unit 331 ===
The processing server management unit 331 causes the processing execution unit 332 to execute processing in accordance with the processing assignment from the master server 300, and also manages the state of the processing currently being executed. Specifically, the processing server management unit 331 receives, from the master server 300, decision information including the identifier of the data, the identifier of the processing data storage unit 342 of the data server 340 storing that data, and the like. The processing server management unit 331 passes the received decision information to the processing execution unit 332. The decision information may be generated for each processing execution unit 332. The decision information may also include an identifier indicating a processing execution unit 332, in which case the processing server management unit 331 may pass the decision information to the processing execution unit 332 identified by that identifier.

The processing server management unit 331 holds information on the execution state of the processing program used when the processing execution unit 332 processes data, and updates this information as the execution state of the processing program changes. The execution states of a processing program include, for example, a pre-execution state, an executing state, and an execution-completed state. The pre-execution state indicates that the assignment of data to the processing execution unit 332 has finished but the processing execution unit 332 has not yet started processing that data. The executing state indicates that the processing execution unit 332 is processing the data. The execution-completed state indicates that the processing execution unit 332 has completed processing the data. As the execution state of the processing program, a state determined based on the ratio of the amount of data already processed by the processing execution unit 332 to the total amount of data assigned to it may also be used.
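As a rough illustration of the three execution states described above, the following sketch derives a state from the ratio of processed to assigned data, which the text notes may also serve as the execution state. The class, names, and thresholds are illustrative assumptions, not part of the embodiment.

```python
from enum import Enum

class ExecState(Enum):
    PRE_EXECUTION = "data assigned, processing not yet started"
    EXECUTING = "processing the assigned data"
    COMPLETED = "processing of the assigned data finished"

def state_from_progress(processed_amount, assigned_amount):
    """Derive an execution state from the ratio of processed to assigned data."""
    if processed_amount == 0:
        return ExecState.PRE_EXECUTION
    if processed_amount < assigned_amount:
        return ExecState.EXECUTING
    return ExecState.COMPLETED

print(state_from_progress(0, 100).name)    # PRE_EXECUTION
print(state_from_progress(60, 100).name)   # EXECUTING
print(state_from_progress(100, 100).name)  # COMPLETED
```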

=== Process Execution Unit 332 ===
In accordance with instructions from the processing server management unit 331, the processing execution unit 332 receives processing target data from the data server 340 via the data transmission/reception unit 334 and executes processing on that data. Specifically, the processing execution unit 332 receives from the processing server management unit 331 the identifier of the data and the identifier of the processing data storage unit 342 of the data server 340 storing that data. The processing execution unit 332 requests the data server 340 corresponding to the received identifier of the processing data storage unit 342 to transmit the data indicated by the data identifier. The processing execution unit 332 receives the data transmitted from that data server 340 in response to the request and executes processing on it. A plurality of processing execution units 332 may exist within the processing server 330 in order to execute a plurality of processes in parallel.

=== Processing Program Storage Unit 333 ===
The processing program storage unit 333 receives a processing program from another server 399 or the client 360 and stores it.

=== Data Transmission/Reception Unit 334 ===
The data transmission/reception unit 334 transmits and receives data to and from other processing servers 330 and data servers 340.

In this way, the processing server 330 receives the processing target data from the data server 340 designated by the master server 300, via the data transmission/reception unit 343 of the data server 340, the data transmission/reception unit 322 of the network switch 320, and the data transmission/reception unit 334 of the processing server 330. The processing execution unit 332 of the processing server 330 processes the received processing target data. When the processing server 330 is realized by the same computer device as the data server 340, the processing server 330 may acquire the processing target data directly from the processing data storage unit 342. Alternatively, the data transmission/reception unit 343 of the data server 340 and the data transmission/reception unit 334 of the processing server 330 may communicate directly, without going through the data transmission/reception unit 322 of the network switch 320.

<Data server 340>
The data server 340 has a data server management unit 341, a processing data storage unit 342, a data transmission/reception unit 343, and the like.

=== Data Server Management Unit 341 ===
The data server management unit 341 transmits, to the master server 300, location information of the data stored in the processing data storage unit 342. The processing data storage unit 342 stores data that is uniquely identified within the distributed system 350.

=== Processing Data Storage Unit 342 ===
The processing data storage unit 342 stores the data to be processed by the processing servers 330, together with an identifier for each piece of data. The processing data storage unit 342 is realized on a storage medium such as a hard disk drive (HDD), a solid state drive (SSD), a USB memory (Universal Serial Bus flash drive), or a RAM (Random Access Memory) disk. The data stored in the processing data storage unit 342 may be data that a processing server 330 has output or is in the process of outputting. The data stored in the processing data storage unit 342 may also be data that the processing data storage unit 342 has received from another server or the like, or data that it has read from a portable storage medium or the like.

=== Data Transmission/Reception Unit 343 ===
The data transmission/reception unit 343 transmits and receives data to and from other processing servers 330 and other data servers 340.

<Network switch 320>
The network switch 320 has a data transmission/reception unit 322.
The data transmission/reception unit 322 relays data transmitted and received between the processing servers 330 and the data servers 340.

<Master server 300>
The master server 300 has a data location storage unit 3070, a server state storage unit 3060, an input/output channel information storage unit 3080, a model generation unit 301, a determination unit 303, and the like.

=== Data Location Storage Unit 3070 ===
The data location storage unit 3070 stores the identifier of each piece of data in association with the identifier of the storage device (the data server 340 or its processing data storage unit 342) storing that data.

Data may be explicitly designated by an identification name in a structure program that defines the structure of directories and data, or may be designated based on other processing results, such as the output of a designated processing program. The structure program is information that defines the data to be processed by a processing program. The structure program receives, as input, information (a name or identifier) indicating certain data, and outputs the name of the directory in which the data corresponding to that input is stored and the file names of the files constituting that data. The structure program may also be a list of directory names or file names.
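Purely for illustration, a structure program as described above could look like the following sketch: given a data name, it returns the directory storing the data and the file names of the files constituting it. The table contents, paths, and function name are invented, not part of the disclosure.

```python
# Invented mapping from a data name to (directory, constituent file names).
STRUCTURE = {
    "dataset-a": ("/data/dataset-a", ["part-000", "part-001"]),
    "dataset-b": ("/data/dataset-b", ["part-000"]),
}

def structure_program(data_name):
    """Return the directory storing the named data and the file names
    of the files constituting it."""
    return STRUCTURE[data_name]

print(structure_program("dataset-a"))
```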

When the unit of information that the processing program receives as an argument is an individual distributed file in a distributed file system (Distributed File System), the data is each distributed file. When the unit is a row or a record, the data is a set of rows or records in a distributed file. When the unit is a "row" of a table in a relational database, the data is, for example, a set of rows obtained by a predetermined query over a set of tables, or a set of rows obtained by a range search on some attribute of that set of tables. The data may also be a container, such as a Map or Vector of a program written in C++ or JAVA (registered trademark), or an element of such a container. Furthermore, the data may be a matrix, or a row, column, or element of a matrix.

The data to be processed is determined by registering one or more data identifiers in the data location storage unit 3070. The name of the data to be processed is stored in the data location storage unit 3070 in association with the identifier of the data and the identifier of the storage device holding that data. Each piece of data may be divided into a plurality of subsets (partial data), and those subsets may be distributed across a plurality of storage devices. Certain data may also be multiplexed (replicated) across two or more storage devices. In this case, the copies multiplexed from one piece of data are collectively called distributed data. To process multiplexed data, a processing server 330 only needs to read any one of the distributed data as its input.

FIG. 6 is a diagram illustrating an example of the information stored in the data location storage unit 3070. As shown in FIG. 6, the data location storage unit 3070 stores a plurality of pieces of data location information, each of which associates a data name 3071 with a distribution form 3073 and either a data description 3074 or a data name 3077.

The distribution form 3073 is information indicating how the data is stored. When data (for example, MyDataSet1) is stored as a single copy, "single" is set in the distribution form 3073 of the corresponding row (data location information). When data (for example, MyDataSet2) is stored in a distributed manner, "distributed" is set in the distribution form 3073 of the corresponding row. When data (for example, MyDataSet3) is multiplexed, "n-fold (1/n)" (where n is an integer of 2 or more) is set in the distribution form 3073 of the corresponding row.

The data description 3074 includes a data identifier 3075, an identifier 3076 of the storage device (a data server 340 or its processing data storage unit 342), and a processing status 3078. The data identifier 3075 uniquely identifies the data within its storage device. What the data identifier 3075 specifies depends on the type of the target data. For example, when the data is a file, the data identifier 3075 is information specifying a file name. When the data is database records, the data identifier 3075 may be information specifying an SQL (Structured Query Language) statement that extracts the records.

The storage device identifier 3076 is the identifier of the data server 340 or processing data storage unit 342 that stores each piece of data. The identifier 3076 may be information unique within the distributed system 350, or may be an IP (Internet Protocol) address assigned to each device.

The processing status 3078 is information indicating the processing status of the data specified by the data identifier 3075. The processing status 3078 is set to "unprocessed" when none of the data has been processed, "processing" when the data is being processed by a processing server 330, or "processed" when all of the data has been processed. The processing status 3078 may also be information indicating the progress of processing the data (for example, that everything from the 50th MB onward is unprocessed). In cases such as multiplexing, when the data indicated by all the data identifiers share the same processing status, the status may be described collectively. The processing status 3078 is updated by the master server 300 or the like according to the progress of processing by the processing servers 330.

When part or all of the data is multiplexed, a description indicating "distributed" (distribution form 3073) and data names 3077 (SubSet1, etc.) are stored in association with the data name 3071 of that data. In this case, the data location storage unit 3070 stores each of the partial data names 3077 as a data name 3071, associated with its own distribution form 3073 and data description 3074 (for example, the fifth row in FIG. 6). When data (for example, SubSet1) is multiplexed (for example, duplicated), the name 3071 of that data is stored in the data location storage unit 3070 in association with a distribution form 3073 and a data description 3074 for each copy of the multiplexed data. Each such data description 3074 includes the identifier 3076 of the storage device holding the copy and an identifier (data identifier 3075) that uniquely identifies the data within that storage device.

Each row of the data location storage unit 3070 (each piece of data location information) is deleted by the master server 300, a processing server 330, or a data server 340 when processing of the corresponding data is completed. Alternatively, instead of deleting each row, information indicating whether processing of the data is complete or incomplete may be added to each row, so that the completion of data processing is recorded.
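As an informal illustration of the rows described above, the data location information of FIG. 6 can be pictured as records like the following. This is only a sketch: the field names, the concrete values (MyDataSet2, ds-01, part-000, and so on), and the helper function are hypothetical and not part of the embodiment.

```python
# Hypothetical sketch of data location information (cf. FIG. 6).
# Field names mirror the reference numerals in the text; the concrete
# values are invented for illustration.

data_location = [
    {
        "data_name": "MyDataSet2",           # data name 3071
        "distribution_form": "distributed",  # distribution form 3073
        "descriptions": [                    # data descriptions 3074
            {"data_id": "part-000",          # data identifier 3075
             "storage_id": "ds-01",          # storage device identifier 3076
             "status": "unprocessed"},       # processing status 3078
            {"data_id": "part-001",
             "storage_id": "ds-02",
             "status": "processing"},
        ],
    },
]

def mark_processed(records, data_name, data_id):
    """Update the processing status 3078 when processing of a piece of data completes."""
    for row in records:
        if row["data_name"] != data_name:
            continue
        for desc in row["descriptions"]:
            if desc["data_id"] == data_id:
                desc["status"] = "processed"

mark_processed(data_location, "MyDataSet2", "part-000")
```

Recording completion by updating the status field, rather than deleting the row, corresponds to the second alternative described above.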

When the distributed system 350 handles only one form of data distribution, the data location storage unit 3070 need not include the distribution form 3073. In the following, for simplicity, it is assumed in principle that the data is distributed in just one of the forms described above. To support combinations of distribution forms, the master server 300, the data servers 340, and the processing servers 330 may switch among the processes described below based on the description in the distribution form 3073.

=== Input/Output Communication Path Information Storage Unit 3080 ===
FIG. 7 is a diagram illustrating an example of the information stored in the input/output communication path information storage unit 3080. For each input/output communication path in the distributed system 350, the input/output communication path information storage unit 3080 stores input/output communication path information that associates a communication path ID 3081, an available bandwidth 3082, an input source device ID 3083, and an output destination device ID 3084. The communication path ID 3081 is the identifier of an input/output communication path between devices that perform input/output communication. The available bandwidth 3082 is the bandwidth currently available on that path; available bandwidth generally indicates the amount of data that can be transferred per unit time, and may be a measured value or an estimated value. The input source device ID 3083 identifies the device that inputs data to the path, and the output destination device ID 3084 identifies the device to which the path outputs data. The device identifiers indicated by the input source device ID 3083 and the output destination device ID 3084 may be identifiers unique within the distributed system 350 that are assigned to the data servers 340, the processing servers 330, the network switches 320, the processing data storage units 342, and so on, or they may be IP addresses assigned to those devices.

An input/output communication path may be a path between the data transmission/reception unit 343 of a data server 340 and the data transmission/reception unit 334 of a processing server 330, a path between the processing data storage unit 342 and the data transmission/reception unit 343 within a data server 340, or a path between the data transmission/reception unit 343 of a data server 340 and the data transmission/reception unit 322 of a network switch 320. It may also be a path between the data transmission/reception unit 322 of a network switch 320 and the data transmission/reception unit 334 of a processing server 330, or a path between the data transmission/reception units 322 of two network switches 320. When a path is established directly between the data transmission/reception unit 343 of a data server 340 and the data transmission/reception unit 334 of a processing server 330 without passing through the data transmission/reception unit 322 of a network switch 320, that path is also included. Hereinafter, such an input/output communication path is also referred to simply as a communication path.
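As a rough illustration of the rows of FIG. 7, the communication path information can be pictured as records like the following. The path IDs, device IDs, and bandwidth figures are all invented for illustration and are not part of the embodiment.

```python
# Hypothetical sketch of input/output communication path information (cf. FIG. 7).
# Path IDs, device IDs, and bandwidths are invented.

channels = [
    # communication path ID 3081, available bandwidth 3082,
    # input source device ID 3083, output destination device ID 3084
    {"path_id": "c1", "bandwidth": 100, "src": "ds-01", "dst": "sw-01"},
    {"path_id": "c2", "bandwidth": 80,  "src": "sw-01", "dst": "ps-01"},
    {"path_id": "c3", "bandwidth": 40,  "src": "ds-02", "dst": "ps-01"},  # direct path, no switch
]

def paths_from(device_id):
    """Return the IDs of the paths whose input source is the given device."""
    return [c["path_id"] for c in channels if c["src"] == device_id]
```

The third record sketches the case, mentioned above, of a direct path between a data server and a processing server that does not pass through a network switch.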

=== Server State Storage Unit 3060 ===
FIG. 8 is a diagram illustrating an example of the information stored in the server state storage unit 3060. For each processing server 330 and each data server 340 operating in the distributed system 350, the server state storage unit 3060 stores server state information that associates a server ID 3061, load information 3062, configuration information 3063, processing data storage unit information 3064, and processable amount information 3065.

The server ID 3061 is the identifier of a processing server 330 or a data server 340. These identifiers may be unique within the distributed system 350, or may be the IP addresses assigned to the servers. The load information 3062 includes information about the processing load of the processing server 330 or data server 340, such as CPU utilization, memory usage, and network bandwidth usage.

The configuration information 3063 includes state information about the configuration of the processing server 330 or data server 340, for example the hardware specifications of the processing server 330, such as CPU frequency, number of cores, and memory capacity, and its software specifications, such as the OS (Operating System). The processing data storage unit information 3064 includes the identifier of the processing data storage unit 342 of a data server 340. The processable amount information 3065 indicates the amount of data that a processing server 330 can process per unit time.

The information stored in the server state storage unit 3060, the data location storage unit 3070, and the input/output communication path information storage unit 3080 may be updated by state notifications transmitted from the switches 320, the processing servers 330, the data servers 340, and the like, or by response information obtained through inquiries made by the master server 300.

Here, an example of how the information stored in the server state storage unit 3060, the data location storage unit 3070, and the input/output communication path information storage unit 3080 is updated will be described. A switch 320 generates information indicating the communication throughput of each of its ports and the identifier (MAC (Media Access Control) address, IP address, etc.) of the device connected to each port, and transmits the generated information to the master server 300 as a state notification. In the master server 300, the server state storage unit 3060, the data location storage unit 3070, and the input/output communication path information storage unit 3080 each update their stored information based on the information sent in the state notification. As another example, a processing server 330 may generate information indicating the throughput of its network interface, the allocation status of target data to its processing execution units 332, and the usage status of the processing execution units 332, and transmit the generated information to the master server 300 as a state notification. As yet another example, a data server 340 may generate information indicating the throughput of its own processing data storage unit 342 (disk) and network interface, together with a list of the data elements it stores, and transmit the generated information to the master server 300 as a state notification. The master server 300 may obtain such state notifications by transmitting requests for them to the switches 320, the processing servers 330, and the data servers 340.
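The update step in the master server 300 can be pictured as follows. This is only a sketch: the notification format, path IDs, and throughput figures are invented, and a real implementation would also have to map switch ports to communication paths.

```python
# Hypothetical sketch: applying a state notification from a switch 320 to the
# stored input/output communication path information. The notification shape
# and all values are invented for illustration.

channel_info = {
    "c1": {"bandwidth": 100, "src": "ds-01", "dst": "sw-01"},
    "c2": {"bandwidth": 80,  "src": "sw-01", "dst": "ps-01"},
}

def apply_state_notification(info, notification):
    """Refresh the available bandwidth 3082 from a per-path throughput report."""
    for path_id, measured in notification["throughput"].items():
        if path_id in info:
            info[path_id]["bandwidth"] = measured

apply_state_notification(channel_info, {"sender": "sw-01",
                                        "throughput": {"c2": 55}})
```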

The information stored in the server state storage unit 3060, the data location storage unit 3070, and the input/output communication path information storage unit 3080 may instead be given in advance by the client 360 or by an administrator of the distributed system 350, or may be collected by a program such as a crawler that explores the distributed system 350. The input/output communication path information storage unit 3080 and the data location storage unit 3070 may also be provided on distributed devices using a technique such as a distributed hash table.

=== Model Generation Unit 301 ===
The model generation unit 301 generates model information based on information acquired from the server state storage unit 3060, the data location storage unit 3070, and the input/output communication path information storage unit 3080. The model information includes information indicating each communication path from each data server 340 to each processing server 330, a transfer amount constraint whose upper limit is the amount of data that can be transferred per unit time from each data server 340 to each processing server 330, and a processing amount constraint whose upper limit is the amount of data that each processing server can process per unit time.

FIG. 9A is a diagram showing an example of model information. Each row (entry) of the model information 500 includes an identifier, a flow lower limit, a flow upper limit, and a pointer to the next element. The identifier is information for specifying a node included in the model. In addition to hardware elements such as the data servers 340 and processing servers 330, logical software elements may be set as nodes of the model. In this embodiment, the data servers 340 and processing servers 330 are assigned as nodes representing hardware elements, but a storage device (data device) of a data server 340 or a processing device such as the CPU of a processing server 330 may be assigned instead. The logical elements are described later. The pointer to the next element is set to the identifier of another node connected from the node indicated by the corresponding identifier; it may also be set to a row number or a memory address that identifies a row. The transfer amount constraint or the processing amount constraint is set in the flow lower limit and flow upper limit fields.

From such model information, the following conceptual model, represented by a plurality of vertices (nodes) and a plurality of edges connecting those vertices, can be constructed. Such a conceptual model is called a network model, or a directed graph in graph-theoretic terms. Each vertex corresponds to a node in the model information of FIG. 9A. Each edge corresponds either to an input/output communication path (communication path) connecting the hardware elements indicated by its endpoint vertices, or to the processing of the target data itself. Each edge representing a data transfer path has, as its transfer amount constraint, for example the available bandwidth of the corresponding input/output communication path. Each edge representing the processing of the target data itself has the processing amount constraint. In this conceptual model, an input/output communication path is represented by a subgraph consisting of edges and the vertices at their endpoints.

FIG. 9B is a diagram showing an example of a conceptual model constructed from model information. This conceptual model shows all the communication paths over which the target data of a job executed in the distributed system 350 travels from transmission by a data server D to reception by a processing server P. The edge connecting data server D and processing server P has, as its attribute value (transfer amount constraint), the available bandwidth of the corresponding communication path. The edge connecting processing server P and the logical vertex α has, as its attribute value (processing amount constraint), the amount that processing server P can process per unit time. An edge with no limit on available bandwidth or processing amount is treated as having no constraint, that is, as having infinite available bandwidth or processing capacity; such edges may instead be treated as having a special value other than infinity. A plurality of vertices and edges may exist before the vertex representing data server D, between data server D and processing server P, and between processing server P and vertex α. A job executed in the distributed system 350 is, for example, one unit of program processing that the distributed system 350 has been requested to execute.

Note that the form of the model information is not limited to the form denoted by reference numeral 500 in FIG. 9A. For example, the model information may be realized as a linked list in which groups of data fields storing vertex and edge information are linked by references. The model generation unit 301 may also change how it generates the model information according to the operating state of the hardware elements. For example, the model generation unit 301 may determine that a processing server 330 with high CPU utilization is unavailable and exclude it from the model information.
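To make the relationship between the model information of FIG. 9A and the conceptual model of FIG. 9B concrete, the following sketch builds a directed graph with flow bounds from rows shaped like the model information 500. The row values, field names, and helper function are hypothetical, invented only for illustration.

```python
# Hypothetical sketch: constructing the directed conceptual model of FIG. 9B
# from rows shaped like the model information 500 of FIG. 9A. Identifiers and
# bounds are invented.

model_rows = [
    # identifier, flow lower limit, flow upper limit, pointer to next element
    {"id": "D1", "lo": 0, "hi": 100, "next": "P1"},     # transfer constraint (D1 -> P1)
    {"id": "P1", "lo": 0, "hi": 60,  "next": "alpha"},  # processing constraint (P1 -> alpha)
]

def build_graph(rows):
    """Return {u: [(v, lo, hi), ...]}, the bounded edges of the conceptual model."""
    graph = {}
    for row in rows:
        graph.setdefault(row["id"], []).append((row["next"], row["lo"], row["hi"]))
    return graph

graph = build_graph(model_rows)
```

Here the edge from D1 to P1 carries the transfer amount constraint (available bandwidth) and the edge from P1 to the logical vertex α carries the processing amount constraint, mirroring FIG. 9B.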

=== Determination Unit 303 ===
Based on the model information generated by the model generation unit 301, the determination unit 303 determines the flow on each edge of the conceptual model constructed from that model information so that the processing time of the job executed in the distributed system 350 is minimized. The flow on each edge indicates, for example, the amount of data processed (transferred) per unit time. In other words, consider each path on the conceptual model that includes a first vertex representing a data server 340, a second vertex representing a processing server 330, a first edge from the first vertex to the second vertex, and a second edge from the second vertex to a third vertex downstream of the second vertex. The determination unit 303 determines the flow on each edge of the conceptual model using the total amount of data that can be processed per unit time under the transfer amount constraint and processing amount constraint set on the first and second edges of each such path. Each such path in the conceptual model represents a data flow over which certain target data is sent from the processing data storage unit 342 of a data server 340 toward a processing server 330 and then processed by that processing server 330. The flow on each edge is determined, for example, so that the total amount of data processed per unit time over all paths on the conceptual model is maximized. Alternatively, to reduce the load on the data communication paths, the paths on the conceptual model may first be narrowed down to those with the minimum number of edges (communication hops), and the flows then determined so that the total amount of data processed per unit time over the narrowed-down paths is maximized.

The determination unit 303 selects paths on the conceptual model that satisfy the edge flows determined in this way and, according to the vertices included in each selected path, determines a plurality of combinations of a processing server 330 and a data server 340 storing the data to be processed by that processing server 330. Hereinafter, the information including each path containing the first and second edges of the conceptual model, together with the flow of each path, may be referred to as a data flow Fi or data flow information. The determination unit 303 generates such data flow information.

The flow on each edge can also be expressed as a flow function f(e) that satisfies the following constraint on every edge e of the conceptual model.
Constraint: l(e) ≦ f(e) ≦ u(e)
Here, u(e) is the upper-bound capacity function that outputs the upper limit (the flow upper limit of the model information) of the transfer amount constraint or processing amount constraint set on each edge e, and l(e) is the lower-bound capacity function that outputs the corresponding lower limit (the flow lower limit of the model information).

That is, the determination unit 303 determines a flow function f that minimizes the processing time of the job executed in the distributed system 350. The flow function f can be determined, for example, by maximizing the objective function Σe∈E'(f(e)) over some edge set E'. The objective function can be maximized using linear programming, the flow increasing (augmenting path) method for the maximum flow problem, the preflow-push method, or the like.

Note that, in order to determine the flow function f, the determination unit 303 may add logical vertices and edges to the conceptual model constructed from the model information. For example, in order to apply a solution algorithm for a network model with a single source and a single sink, the determination unit 303 may add to the conceptual model a logical source, a set of edges connecting the source to the vertices representing the data servers 340, a logical sink, and a set of edges connecting the vertices representing the processing servers 330 to the sink. Such logical vertices and edges may instead be included in the model information by the model generation unit 301.

The following expressions (1), (2), and (3) show an example of the objective function (expression (1)) and the constraint functions (expressions (2) and (3)) when the network model is generated in the form of a maximum flow problem.

  maximize Σ_{e∈δ−(s)} f(e)   …(1)

  subject to Σ_{e∈δ−(v)} f(e) = Σ_{e∈δ+(v)} f(e) for all v ∈ V∖{s, t}   …(2)

  l(e) ≤ f(e) ≤ u(e) for all e ∈ E   …(3)

Here, E is the set of edges constituting the network model, V is the set of vertices constituting the network model, f(e) is the flow function of edge e, s is the source (start point) of the network model, t is the sink (end point) of the network model, δ− is the set of edges leaving a given vertex, δ+ is the set of edges entering a given vertex, u(e) is the upper-bound capacity function that outputs the flow rate upper limit of edge e, and l(e) is the lower-bound capacity function that outputs the flow rate lower limit of edge e.
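As a concrete illustration of solving the formulation of expressions (1) through (3), the following sketch computes a maximum flow with an augmenting-path (Edmonds-Karp) search on a toy network. The graph, capacities, and vertex names here are hypothetical examples, and the lower-bound function l(e) is assumed to be 0 for all edges.

```python
from collections import deque

def max_flow(capacity, s, t):
    """Edmonds-Karp: repeatedly find a shortest augmenting path from s to t
    and push the bottleneck flow along it. capacity maps (u, v) -> u(e)."""
    residual = {}
    adj = {}
    for (u, v), cap in capacity.items():
        residual[(u, v)] = cap
        residual.setdefault((v, u), 0)  # reverse (residual) edges start at 0
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    flow = {e: 0 for e in capacity}  # f(e), initially the zero flow
    total = 0
    while True:
        # BFS for a path with positive residual capacity (augmenting path).
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj.get(u, []):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total, flow  # no augmenting path: the flow is maximum
        # Bottleneck = minimum residual capacity along the path found.
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[e] for e in path)
        for (u, v) in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            if (u, v) in flow:
                flow[(u, v)] += bottleneck  # push along a forward edge
            else:
                flow[(v, u)] -= bottleneck  # cancel flow on the original edge
        total += bottleneck

# Hypothetical network: source S, data servers D1/D2, processing servers
# P1/P2, logical sink t; capacities play the role of u(e).
caps = {("S", "D1"): 10, ("S", "D2"): 10,
        ("D1", "P1"): 8, ("D1", "P2"): 4,
        ("D2", "P2"): 9,
        ("P1", "t"): 6, ("P2", "t"): 7}
value, f = max_flow(caps, "S", "t")
```

On this network the total flow is bounded by the two edges into the sink (6 + 7), and that bound is attainable, so the maximized objective (1) equals 13 while constraints (2) and (3) hold for the returned f.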

The determination unit 303 may add constraint conditions so that the determined data flow information can actually be executed by the distributed system 350. For example, the flow function f may be determined with a constraint added to the model information that excludes from the solution any flow in which the correspondence between the transferred data and the processed data is inconsistent.

FIG. 10 is a diagram showing an example of the data flow information. As described above, the data flow information includes route information and information on the flow rate of the route. In the example of FIG. 10, the processing amount per unit time (unit processing amount) is set as the flow rate of the route. The route information is indicated by information on each vertex included in the route (data server D1, processing server P1, and logical vertex α). An identifier (Flow1) for identifying the data flow Fi is also set. Note that when a plurality of data flows have identical route information, they may be merged.
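A minimal sketch of how data flow entries such as the one in FIG. 10 might be represented, including the merging of flows with identical route information mentioned above; the field names, flow identifiers, and amounts are hypothetical.

```python
def merge_flows(flows):
    """Merge data flow entries that share identical route information by
    summing their unit processing amounts (route flow rates)."""
    merged = {}
    for flow in flows:
        route = tuple(flow["route"])
        if route in merged:
            merged[route]["unit_amount"] += flow["unit_amount"]
        else:
            merged[route] = {"id": flow["id"], "route": list(route),
                             "unit_amount": flow["unit_amount"]}
    return list(merged.values())

# Hypothetical entries modeled on FIG. 10: a route lists the vertices
# on the path (data server, processing server, logical vertex).
flows = [
    {"id": "Flow1", "route": ["D1", "P1", "alpha"], "unit_amount": 4},
    {"id": "Flow2", "route": ["D1", "P1", "alpha"], "unit_amount": 2},
    {"id": "Flow3", "route": ["D2", "P2", "alpha"], "unit_amount": 5},
]
merged = merge_flows(flows)
```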

Based on the data flow information generated in this way, the determination unit 303 determines combinations of a processing server 330 and the data server 340 from which that processing server 330 acquires its processing target data, and generates determination information indicating these combinations. The generated determination information is acquired by each processing server 330 included in the combinations.

FIG. 11 is a diagram showing an example of the determination information. As shown in FIG. 11, the determination information includes a data server ID, a processing data storage unit ID, a data ID, received data specifying information, and a data processing amount per unit time. The data server ID is the identifier of the data server 340 storing the data to be processed by the processing server 330, the processing data storage unit ID is the identifier of the processing data storage unit 342 of that data server 340, and the data ID is the identifier of the processing target data.

The determination information need not include a data ID. In this case, the processing server 330 may acquire, from the data stored in the processing data storage unit 342 of the data server 340 specified by the data server ID and the processing data storage unit ID, unprocessed data that is subject to the job.

The received data specifying information is set, for example, when the processing target data stored in a given processing data storage unit 342 of a given data server 340 is processed by a plurality of processing servers 330, or when processing target data stored redundantly on a plurality of data servers 340 is processed by a plurality of processing servers 330. As the received data specifying information, for example, information specifying a predetermined section within the data (for example, the start position of the section and the processing amount) is set. In cases other than those described above, the received data specifying information need not be set in the determination information.

The data processing amount per unit time that can be included in the determination information is set based on the unit processing amount included in the data flow information. When the determination information includes a data processing amount per unit time, the processing server 330 requests the data server 340 to transfer the data specified by the determination information at that data processing amount per unit time. When the determination information does not include a data processing amount per unit time, the processing server 330 may request the data server 340 to transfer the data at an arbitrary rate.
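One way a determination-information record of FIG. 11 could be assembled from a data flow entry is sketched below. All identifiers and the section-specifying fields are hypothetical, and the data processing amount per unit time is simply copied from the flow's unit processing amount as described above.

```python
def build_determination_info(flow, data_server_id, storage_id,
                             data_id=None, section=None):
    """Assemble a determination-information record for a processing server.
    `section`, when given, plays the role of the received data specifying
    information (start position and amount of the data section to process)."""
    info = {
        "data_server_id": data_server_id,
        "storage_id": storage_id,
        "unit_amount": flow["unit_amount"],  # requested transfer rate
    }
    if data_id is not None:          # the data ID may be omitted
        info["data_id"] = data_id
    if section is not None:          # received data specifying information
        info["received_data"] = {"start": section[0], "amount": section[1]}
    return info

flow = {"id": "Flow1", "route": ["D1", "P1", "alpha"], "unit_amount": 4}
info = build_determination_info(flow, "D1", "S1", data_id="blockA",
                                section=(0, 1024))
```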

The determination information may instead be acquired by each data server 340 included in the combinations. In this case, the determination information may include a processing server ID, which is the identifier of the processing server 330, in place of the data server ID.

When the processing server 330 does not store the processing program corresponding to the determination information in its processing program storage unit 333, the determination unit 303 may distribute to the processing server 330, for example, a processing program received from the client 360. The determination unit 303 may inquire of the processing server 330 whether it stores the processing program corresponding to the determination information and, upon determining that the processing server 330 does not store it, distribute the processing program received from the client to that processing server 330.

The information for specifying the above-described conceptual model, constraint conditions, and objective function may be described in a structure program or the like, and the structure program or the like may be given from the client 360 to the master server 300. Alternatively, this information may be given from the client 360 to the master server 300 as startup parameters or the like. The master server 300 may also determine the conceptual model by referring to the data location storage unit 3070 and the like.

The master server 300 may store the model information generated by the model generation unit 301, the data flow information generated by the determination unit 303, and so on in a memory or the like, and supply that model information and data flow information as inputs to the model generation unit 301 and the determination unit 303. In this case, the model generation unit 301 and the determination unit 303 may use the model information and data flow information for model generation and optimal placement calculation. Furthermore, the master server 300 may be implemented so as to support all conceptual models, constraint conditions, and objective functions, or so as to support only specific conceptual models and the like.

[Operation Example in the First Embodiment]
An operation example of the distributed system 350 in the first embodiment is described below.
FIG. 12 is a flowchart showing an overall outline of an operation example of the distributed system 350.

Upon receiving from the client 360 request information that requests execution of a processing program, the master server 300 acquires the following pieces of information (S401): the set of input/output communication path information in the distributed system 350, the set of data location information associating each piece of processing target data with the data server 340 storing it, and the set of identifiers of the available processing servers 330.

The master server 300 determines whether unprocessed data remains in the acquired set of processing target data (S402). When the master server 300 determines that no unprocessed data remains in the acquired set of processing target data (S402; No), it ends the processing.

On the other hand, when the master server 300 determines that unprocessed data remains in the acquired set of processing target data (S402; Yes), it further determines whether the acquired set of identifiers of available processing servers 330 includes a processing server 330 that can execute additional processing (S403). When the master server 300 determines that there is no processing server 330 that can execute additional processing (S403; No), it returns to step (S401) and continues processing.

When the master server 300 determines that there is a processing server 330 that can execute additional processing (S403; Yes), it acquires the input/output communication path information and the processing server state information using the acquired set of identifiers of the processing servers 330 and the set of identifiers of the data servers 340 as keys, and generates model information based on this information (S404). Hereinafter, a processing server 330 that can execute additional processing is also referred to as an available processing server 330.

Based on the generated model information, the master server 300 determines the combinations of processing servers 330 and data servers 340 that maximize a predetermined objective function under predetermined constraint conditions (S405). The master server 300 generates data flow information indicating the determined combinations.

Each processing server 330 and each data server 340 corresponding to the combinations determined by the master server 300 in step (S405) transmit and receive the processing target data, and each processing server 330 processes the received processing target data (S406). Thereafter, the processing of the distributed system 350 returns to step (S401).
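The overall loop of FIG. 12 (steps S401 through S406) can be sketched as follows. Every name here is a hypothetical stand-in for the behavior described above, and the pairing of servers to data is trivialized; it is not the actual placement computation of steps (S404) and (S405).

```python
def run_distributed_job(state):
    """Skeleton of the master server's control loop (S401 to S406).
    `state` is a hypothetical stand-in for the storages the master consults."""
    while True:
        # S401: acquire data locations and available processing servers.
        unprocessed = [d for d in state["data"] if not d["done"]]
        # S402: finish when no unprocessed data remains.
        if not unprocessed:
            return
        # S403: if no processing server can take more work, re-poll (S401).
        idle = [p for p in state["servers"] if p["free"]]
        if not idle:
            continue  # a real system would wait before retrying S401
        # S404/S405: build the model and decide server/data combinations.
        # Here, trivially pair each idle server with one unprocessed datum.
        for server, datum in zip(idle, unprocessed):
            # S406: the server acquires the data and processes it.
            datum["done"] = True

state = {"data": [{"id": "d1", "done": False}, {"id": "d2", "done": False}],
         "servers": [{"id": "p1", "free": True}]}
run_distributed_job(state)
```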

FIG. 13 is a flowchart showing the detailed operation of the master server 300 of the first embodiment in step (S401). The model generation unit 301 of the master server 300 acquires from the data location storage unit 3070 the set of identifiers of the data servers 340 storing the processing target data specified by the request information from the client 360 (S401-1). Next, the model generation unit 301 acquires the set of identifiers of the data servers 340 and the set of identifiers of the processing servers 330 from the server state storage unit 3060 (S401-2). Note that step (S401-2) may be executed before step (S401-1).

FIG. 14 is a flowchart showing the detailed operation of the master server 300 of the first embodiment in step (S404). The model generation unit 301 of the master server 300 acquires, from the input/output communication path information storage unit 3080, input/output communication path information indicating the communication paths over which the processing servers 330 process the processing target data. Based on the acquired input/output communication path information, the model generation unit 301 adds information on the communication paths from the data servers 340 to the processing servers 330 to the model information held in a memory or the like (for example, reference numeral 500 in FIG. 9A) (S404-10).

Subsequently, the model generation unit 301 adds, to the model information, logical communication path information from the processing servers 330 to the subsequent logical vertex (S404-20). Note that step (S404-20) may be executed before step (S404-10).

FIG. 15 is a flowchart showing the detailed operation of the master server 300 of the first embodiment in step (S404-10). By referring to the information acquired from the data location storage unit 3070 based on the request information, the model generation unit 301 executes step (S404-12) for each data server Di storing processing target data (S404-11). In step (S404-12), the model generation unit 301 executes steps (S404-13) through (S404-15) for each available processing server Pj.

The model generation unit 301 adds to the model information 500 a row including the name (or identifier) of the data server Di (S404-13). The model generation unit 301 sets the name (or identifier) of the processing server Pj in the pointer to the next element of the added row (S404-14). Note that the "identifier" and the "pointer to the next element" of a row of the model information 500 may be any information that can identify a given node in the conceptual model.

The model generation unit 301 sets the available bandwidth of the communication path between the data server Di and the processing server Pj as the flow rate upper limit value of the added row, and sets a value of 0 or more and no greater than the flow rate upper limit value as the flow rate lower limit value of the added row (S404-15). Note that step (S404-15) may be executed before step (S404-14).

FIG. 16 is a flowchart showing the detailed operation of the master server 300 of the first embodiment in step (S404-20). The model generation unit 301 executes steps (S404-22) through (S404-26) for each available processing server Pj acquired from the server state storage unit 3060 based on the request information (S404-21).

The model generation unit 301 adds to the model information 500 a row including the name (or identifier) of the processing server Pj (S404-22). The model generation unit 301 determines whether a vertex exists at the stage following the processing servers (S404-23). A vertex at the stage following the processing servers refers to the identifier of a row in the model information 500 that can be reached by following the pointer to the next element of a row including the name (or identifier) of any processing server.

When the model generation unit 301 determines that no vertex exists at the stage following the processing servers (S404-23; No), it sets an identifier α, which is an arbitrary name that does not match any identifier included in the model information 500 (S404-24). When the model generation unit 301 determines that a vertex exists at the stage following the processing servers (S404-23; Yes), it does not execute step (S404-24). Subsequently, the model generation unit 301 sets the identifier α in the pointer to the next element of the added row (S404-25).

The model generation unit 301 sets the possible processing amount per unit time of the processing server Pj as the flow rate upper limit value of the added row, and sets a value of 0 or more and no greater than the flow rate upper limit value as the flow rate lower limit value of the added row (S404-26). Note that step (S404-25) may be executed after step (S404-26). Step (S404-26) may also be executed at any point after step (S404-22).
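Putting steps (S404-10) and (S404-20) together, the row-based model information 500 might be built as in the following sketch. The bandwidth and processing-capacity figures and server names are hypothetical, and the flow rate lower limit is simply set to 0.

```python
def build_model_info(bandwidth, capacity):
    """Build model information rows (identifier, pointer to next element,
    flow rate upper limit, flow rate lower limit), as in S404-10 and S404-20.
    bandwidth: dict[(data_server, processing_server)] -> available bandwidth.
    capacity: dict[processing_server] -> possible processing amount per unit time."""
    rows = []
    # S404-10: one row per data-server -> processing-server communication path,
    # capped by the available bandwidth of that path (S404-13 to S404-15).
    for (di, pj), bw in bandwidth.items():
        rows.append({"id": di, "next": pj, "upper": bw, "lower": 0})
    # S404-20: one row per processing server pointing to the logical vertex,
    # capped by the server's processing capacity (S404-22 to S404-26).
    alpha = "alpha"  # identifier not used elsewhere in the model information
    for pj, cap in capacity.items():
        rows.append({"id": pj, "next": alpha, "upper": cap, "lower": 0})
    return rows

rows = build_model_info({("D1", "P1"): 8, ("D2", "P1"): 5}, {"P1": 6})
```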

FIG. 17 is a flowchart showing the detailed operation of the master server 300 of the first embodiment in step (S405). The determination unit 303 of the master server 300 operates as follows, using the conceptual model (herein referred to as a directed graph) that can be constructed based on the model information generated as described above.

Based on the directed graph, the determination unit 303 determines the flow rate (data flow Fi) of each edge so that the processing time of the job executed by the distributed system 350 is minimized (S405-1). When the flow rate of each edge e is defined by a flow function f(e), the determination unit 303 generates the flow function f(e) so that the job processing time is minimized. For example, the determination unit 303 maximizes the objective function (Σ_{e∈E'} f(e) for a certain edge set E') based on the network model constructed from the model information. The determination unit 303 performs this maximization using linear programming, the augmenting-path method for the maximum flow problem, or the like. A specific example of the operation using the augmenting-path method for the maximum flow problem is described later as Example 1.

The determination unit 303 sets the vertex indicating the source of the directed graph in a vertex variable i (S405-2). Next, the determination unit 303 allocates an area in memory for storing a route information array and a unit processing amount, and initializes the value of the unit processing amount to infinity (S405-3).

The determination unit 303 determines whether the vertex indicated by the vertex variable i is the sink of the directed graph (S405-4). Hereinafter, the vertex indicated by the vertex variable i is simply referred to as the vertex variable i. When the determination unit 303 determines that the vertex variable i is not the sink of the directed graph (S405-4; No), it determines whether there exists, among the communication paths leaving the vertex variable i in the directed graph, a communication path with a non-zero flow rate (S405-5). When no communication path with a non-zero flow rate exists (S405-5; No), the determination unit 303 ends the processing.

On the other hand, when a communication path with a non-zero flow rate exists (S405-5; Yes), the determination unit 303 selects that communication path (S405-6). Subsequently, the determination unit 303 adds the vertex variable i to the route information array allocated in memory (S405-7).

The determination unit 303 determines whether the unit processing amount held in memory is less than or equal to the flow rate of the communication path selected in step (S405-6) (S405-8). When the unit processing amount is greater than the flow rate of the communication path (S405-8; No), the determination unit 303 updates the unit processing amount held in memory with the flow rate of that communication path (S405-9). When the unit processing amount is less than or equal to the flow rate of the communication path (S405-8; Yes), the determination unit 303 does not execute step (S405-9).

The determination unit 303 sets the vertex at the other end point of the communication path selected in step (S405-6) in the vertex variable i (S405-10), returns to step (S405-4), and continues execution.

When the determination unit 303 determines in step (S405-4) that the vertex variable i is the sink of the directed graph (S405-4; Yes), it generates data flow information from the route information stored in the route information array and from the unit processing amount, and stores that data flow information in memory (S405-11). In the route information of the data flow information generated here, at least the vertex indicating a data server 340 and the vertex indicating a processing server 330 included in one route from the source to the sink of the directed graph are set. In the unit processing amount of the data flow information, the data processing amount per unit time indicated by that route from the source to the sink of the directed graph is set.

The determination unit 303 updates the flow rate of each edge connecting the vertices included in the route information with the value obtained by subtracting the unit processing amount from the original flow rate (S405-12). Thereafter, the determination unit 303 returns to step (S405-2) and executes that step again.
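Steps (S405-2) through (S405-12) amount to decomposing the computed flow into source-to-sink routes, each with the bottleneck flow rate as its unit processing amount. A sketch under the assumption that the flow is given as a dict of per-edge flow rates; the vertex names are hypothetical.

```python
def decompose_flow(flow, s, t):
    """Repeatedly walk from source s to sink t along edges with non-zero
    flow, record the route and its bottleneck (unit processing amount),
    and subtract that amount along the route (steps S405-2 to S405-12)."""
    flow = dict(flow)  # work on a copy
    data_flows = []
    while True:
        route = [s]           # route information array (S405-3, S405-7)
        unit = float("inf")   # unit processing amount, initialized to infinity
        v = s
        while v != t:         # S405-4
            # S405-5/S405-6: pick an outgoing edge with non-zero flow.
            nxt = next(((u, w) for (u, w) in flow
                        if u == v and flow[(u, w)] > 0), None)
            if nxt is None:
                return data_flows  # S405-5; No: decomposition finished
            unit = min(unit, flow[nxt])  # S405-8/S405-9
            v = nxt[1]                   # S405-10
            route.append(v)
        # S405-11: record the route and its unit processing amount.
        data_flows.append({"route": route, "unit_amount": unit})
        # S405-12: subtract the unit amount from every edge on the route.
        for e in zip(route, route[1:]):
            flow[e] -= unit

flows = decompose_flow({("S", "D1"): 6, ("D1", "P1"): 6, ("P1", "t"): 6,
                        ("S", "D2"): 7, ("D2", "P2"): 7, ("P2", "t"): 7},
                       "S", "t")
```

For the sample flow above the decomposition yields two data flow entries, one per source-to-sink route, with unit processing amounts 6 and 7.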

FIG. 18 is a flowchart showing the detailed operation of the master server 300 of the first embodiment in step (S406). The determination unit 303 executes step (S406-2) for each processing server Pj in the set of available processing servers 330 (S406-1).

The determination unit 303 executes steps (S406-3) and (S406-4) for each piece of route information Fj in the set of route information including the processing server Pj (S406-2). Each piece of route information Fj is included in the data flow information generated in step (S405).

The determination unit 303 extracts, from the route information Fj, the identifier of the data server 340 storing the processing target data (S406-3). The determination unit 303 then transmits the processing program and the determination information to the processing server Pj (S406-4). Here, the processing program is a program for instructing the data server 340 storing the processing target data to transfer that data. The data server 340 and the processing target data are specified by the information included in the determination information.

[Operation and Effect in the First Embodiment]
In the distributed system 350 according to the first embodiment, model information is generated, over all possible combinations of the data servers 340 and the processing servers 330, in consideration of the communication bandwidth of each input/output communication path in the distributed system 350 and the processing capability of each processing server 330. Then, based on the conceptual model (network model) that can be constructed from the model information, combinations of a processing server 330 and the data server 340 from which that processing server 330 acquires its processing target data are determined, and according to these combinations, the transmission, reception, and processing of the processing target data constituting the job executed in the distributed system 350 are carried out.

Thus, according to the first embodiment, the distributed system 350 as a whole, comprising the plurality of data servers 340 and the plurality of processing servers 330, can avoid the loss of efficiency caused by bottlenecks in communication bandwidth or processing server capability, and can minimize the processing time of the executed job.

Furthermore, in the first embodiment, since the network model is generated in consideration of the communication bandwidth of each input/output communication path in the distributed system 350, combinations of processing servers 330 and data servers 340 can be determined based on the data transfer routes that maximize the total amount of data processed per unit time by all processing servers 330 in the distributed system 350.

[First Modification of the First Embodiment]
The first embodiment may be configured so that the master server 300 outputs the data flow information generated by the determination unit 303. In this case, the determination unit 303 outputs the generated data flow information after executing step (S405-11) of FIG. 17. The determination unit 303 outputs, for example, the information shown in the example of FIG. 10. The output form is not limited: the data flow information may be output to a file, transmitted to another device, displayed on a display device, or sent to a printing device.

The data flow information output in this way can be used for planning more detailed data processing. For example, if the master server 300 adds data transfer route information to this data flow information, the distributed system 350 becomes able to determine data transfer routes dynamically according to the processing status.

[Second Embodiment]
The distributed system 350 according to the second embodiment is described below, focusing on the points that differ from the first embodiment; content identical to the first embodiment is omitted as appropriate. In the distributed system 350 according to the second embodiment, the master server 300 handles a plurality of program processes that the distributed system 350 is requested to execute. One unit of program processing that the distributed system is requested to execute is referred to as a job.

The second embodiment also supports a mode in which, in the program processing that the distributed system 350 is requested to execute, the processing amount per unit time varies depending on which portion of the processing target data is processed. In this case, the job is handled as being replaced by sets of data whose portions share the same processing amount per unit time. Such a data set may be referred to as a logical data set.

FIG. 19 is a diagram conceptually illustrating a processing configuration example of each device of the distributed system 350 in the second embodiment. The master server 300 in the second embodiment further includes a job information storage unit 3040 in addition to the configuration of the first embodiment.

=== Job Information Storage Unit 3040 ===
The job information storage unit 3040 stores configuration information on the program processing that the distributed system 350 is requested to execute.

FIG. 20 is a diagram illustrating an example of information stored in the job information storage unit 3040. Each row (entry) stored in the job information storage unit 3040 includes a job ID 3041, a data name 3042, a minimum unit processing amount 3043, and a maximum unit processing amount 3044. The job ID 3041 is set to an identifier, unique within the distributed system 350, assigned to each job executed by the distributed system 350. The data name 3042 is set to the name (identifier) of the data handled by the job. The minimum unit processing amount 3043 is the minimum value of the processing amount per unit time specified for the logical data set, that is, the data handled by the job. The maximum unit processing amount 3044 is the maximum value of the processing amount per unit time specified for that logical data set.

When one job handles a plurality of logical data sets, the job information storage unit 3040 stores a plurality of rows having the same job ID, and each of these rows may store a different data name 3042, minimum unit processing amount 3043, and maximum unit processing amount 3044.
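As an illustration, the table of FIG. 20 can be sketched as a list of rows. The Python field names (job_id, data_name, min_unit, max_unit) and the concrete jobs and values below are hypothetical stand-ins for the fields 3041 to 3044, not taken from the embodiment:

```python
# Hypothetical sketch of the job information storage unit 3040 (FIG. 20).
job_info = [
    # One job ("J1") handling two logical data sets appears as two rows
    # that share the same job ID but differ in the other three fields.
    {"job_id": "J1", "data_name": "dataA", "min_unit": 10, "max_unit": 50},
    {"job_id": "J1", "data_name": "dataB", "min_unit": 5,  "max_unit": 20},
    {"job_id": "J2", "data_name": "dataC", "min_unit": 0,  "max_unit": 30},
]

def rows_for_job(store, job_id):
    """Return all rows (one per logical data set) registered for one job."""
    return [row for row in store if row["job_id"] == job_id]

print(len(rows_for_job(job_info, "J1")))  # 2 logical data sets for J1
```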

In the second embodiment, the model generation unit 301 of the master server 300 further reflects the job configuration information stored in the job information storage unit 3040 in the model information 500. This reflecting operation is described in the operation example section below. As a result, in the conceptual model that can be constructed from the model information, at least the following are added to the configuration of the first embodiment: a logical vertex indicating a job, placed before the vertex indicating the data server 340; an edge indicating the logical communication path from that job vertex to the data server 340; a logical vertex placed before the job vertex; and an edge indicating the logical communication path from that preceding vertex to the job vertex.

[Operation Example in the Second Embodiment]

FIG. 21 is a flowchart showing the detailed operation of the master server 300 of the second embodiment in the step (S401). In FIG. 21, a step (S401-0) is added to FIG. 13, which shows the detailed operation of the first embodiment. In the step (S401-0), the model generation unit 301 acquires the set of jobs being executed from the job information storage unit 3040.

FIG. 22 is a flowchart showing the detailed operation of the master server 300 of the second embodiment in the step (S404). In FIG. 22, a step (S404-30) is added to FIG. 14, which shows the detailed operation of the first embodiment. In the step (S404-30), the model generation unit 301 adds, to the model information 500, logical communication path information to each job in the job set acquired from the job information storage unit 3040, and logical communication path information from each job to the data servers 340 storing the data processed by that job (S404-30). The order of the steps (S404-30), (S404-10), and (S404-20) shown in FIG. 22 may be changed.

FIG. 23 is a flowchart showing the detailed operation of the master server 300 of the second embodiment in the step (S404-30). The model generation unit 301 of the master server 300 executes the step (S404-32) and subsequent steps for each job Ji in the acquired job set (S404-31).

The model generation unit 301 determines whether or not a vertex exists in the stage preceding the job Ji (S404-32). The vertex preceding the job Ji corresponds to the identifier of a row in the model information 500 in which information indicating the job (the job name) is set as the pointer to the next element. When no vertex exists in the stage preceding the job (S404-32; No), the model generation unit 301 sets an identifier β (S404-34). The identifier β is an arbitrary name that does not match any identifier included in the model information 500. On the other hand, when the model generation unit 301 determines that a vertex exists in the stage preceding the job Ji (S404-32; Yes), it acquires the identifier β of that preceding vertex (S404-33).

The model generation unit 301 adds a row including β as an identifier to the model information 500 (S404-35). The model generation unit 301 sets the name of the job Ji as the pointer to the next element of the added row (S404-36). The model generation unit 301 sets the maximum unit processing amount and the minimum unit processing amount assigned to the job Ji as the flow rate upper limit value and the flow rate lower limit value of the added row (S404-37).

The model generation unit 301 executes the steps (S404-39) to (S404-3B) for each data server Dj that stores data handled by the job Ji (S404-38).

The model generation unit 301 adds a row whose identifier indicates the job Ji to the model information 500 (S404-39). The model generation unit 301 sets the name (or identifier) of the data server Dj as the pointer to the next element of the added row (S404-3A). The model generation unit 301 sets, as the flow rate upper limit value of the added row, the transfer amount that the job Ji can allocate to the data server Dj, and sets, as the flow rate lower limit value of the added row, a value greater than or equal to 0 and less than or equal to the flow rate upper limit value (S404-3B). The transfer amount that the job Ji can allocate to the data server Dj indicates, for example, the requested processing amount specified for each piece of data handled by the job Ji, and may be given by the user or determined by the distributed system 350.
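The row-adding steps (S404-35) through (S404-3B) above can be sketched as follows. The row fields (id, next, lower, upper) and all concrete names and amounts are illustrative assumptions; each row stands for one edge of the conceptual model with its flow rate bounds:

```python
# Minimal sketch of steps S404-35..S404-3B: append rows to the model
# information, each row holding an identifier, a next-element pointer,
# and flow rate lower/upper bounds.
def add_job_rows(model, beta, job, data_servers):
    # S404-35..37: edge from the preceding vertex (beta) to the job vertex,
    # bounded by the job's minimum / maximum unit processing amounts.
    model.append({"id": beta, "next": job["name"],
                  "lower": job["min_unit"], "upper": job["max_unit"]})
    # S404-38..3B: one edge from the job vertex to each data server that
    # stores data the job handles, capped by the allocatable transfer amount.
    for ds in data_servers:
        model.append({"id": job["name"], "next": ds["name"],
                      "lower": 0, "upper": ds["assignable"]})
    return model

model_info = []
job = {"name": "J1", "min_unit": 10, "max_unit": 50}
servers = [{"name": "D1", "assignable": 30}, {"name": "D2", "assignable": 40}]
add_job_rows(model_info, "src", job, servers)
print(len(model_info))  # 3 rows: one beta->job edge, two job->server edges
```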

[Operation and Effect of Second Embodiment]
In the second embodiment, in addition to the constraints of the first embodiment on the communication bandwidth of each input/output communication path in the distributed system 350 and the processing capability of each processing server 330, model information is generated that further takes into account the unit processing amount constraint specified for each job and the unit processing amount constraint specified for each piece of data handled by each job. Then, from the network model (conceptual model) that can be constructed based on that model information, the combination of a processing server 330 and the data server 340 from which that processing server 330 acquires its processing target data is determined.

Thus, according to the second embodiment, the transmission, reception, and processing of the processing target data constituting a job are executed with the unit processing amount specified for that job taken into account, so that the processing time of the job can be minimized.

Further, in the second embodiment, when a priority is set for each job, each priority can be set as a ratio, between jobs, of the unit processing amounts specified for the jobs. Therefore, according to the second embodiment, even when priorities are set for the jobs, the transmission, reception, and processing of the processing target data can be executed so as to satisfy the set priority constraints and to minimize the overall processing time.

[First Modification of Second Embodiment]
The master server 300 according to the first modification of the second embodiment sets, in each row of the model information 500 that includes an identifier indicating a processing server 330, a per-job terminal point as the pointer to the next element. In this case, the number of rows including the identifier indicating the processing server 330 is equal to the number of job terminal points.

Further, the possible processing amount per unit time of the processing server 330 for each job may be set as the flow rate lower limit value and the flow rate upper limit value of the row including the identifier indicating that processing server 330 in the model information 500. FIG. 24 is a diagram illustrating an example of information stored in the server state storage unit 3060 in the first modification of the second embodiment. As illustrated in FIG. 24, the server state storage unit 3060 stores, as the processable amount information 3065 of each processing server 330, the processable amount for each job.

Hereinafter, the detailed operation of the master server 300 in the first modification of the second embodiment will be described, focusing on the points that differ from the second embodiment. FIG. 25 is a flowchart showing the detailed operation of the master server 300 in the first modification of the second embodiment regarding the step (S404-20) shown in FIG. 22.

The model generation unit 301 executes the step (S404-2B) for each available processing server Pi acquired from the server state storage unit 3060 based on the request information (S404-2A).

The model generation unit 301 executes the steps (S404-2C) to (S404-2E) for each job Jj (S404-2B).

The model generation unit 301 adds a row including the name (or identifier) of the processing server Pi to the model information 500 (S404-2C). The model generation unit 301 sets an identifier indicating the terminal point of the job Jj as the pointer to the next element of the added row (S404-2D). The model generation unit 301 sets the possible processing amount per unit time of the processing server Pi for the job Jj as the flow rate upper limit value of the added row, and sets a value greater than or equal to 0 and less than or equal to the flow rate upper limit value as the flow rate lower limit value of the added row (S404-2E). The step (S404-2E) may be executed before the step (S404-2D).
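The per-job rows of steps (S404-2C) to (S404-2E) can be sketched as below. The terminal-point naming (`sink_<job>`), field names, and capacities are hypothetical; the point illustrated is that one row is added per (processing server, job) pair, capped by that server's per-job possible processing amount:

```python
# Minimal sketch of steps S404-2C..S404-2E in the first modification of the
# second embodiment: one row per job, pointing at that job's terminal point.
def add_server_rows(model, server_name, per_job_capacity):
    for job_id, capacity in per_job_capacity.items():
        model.append({"id": server_name,
                      "next": "sink_" + job_id,  # job-specific terminal point
                      "lower": 0, "upper": capacity})
    return model

# P1 can process 25 units/time of J1 and 40 units/time of J2 (invented values)
model = add_server_rows([], "P1", {"J1": 25, "J2": 40})
print(len(model))  # one row per job terminal point
```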

As described above, according to the first modification of the second embodiment, jobs whose processable amounts differ between processing servers 330 can also be handled, and the processable amount of each processing server 330 for each job can be taken into account when determining the combination of a processing server 330 and a data server 340. Therefore, according to the first modification of the second embodiment, the system-wide processing time of each executed job can be minimized more precisely.

[Second Modification of Second Embodiment]
The master server 300 according to the second modification of the second embodiment sets, in each row of the model information 500 that includes an identifier indicating a processing server 330, information (a name) indicating a job as the pointer to the next element. In this case, the number of rows including the identifier indicating the processing server 330 is equal to the number of jobs. In the second modification of the second embodiment, in the above-described step (S404-2D) of FIG. 25, the pointer to the next element is set to the name (or identifier) of the job Jj.

The information stored in the server state storage unit 3060 is the same as in the first modification of the second embodiment. That is, also in the second modification of the second embodiment, the possible processing amount per unit time of the processing server 330 for each job is set as the flow rate lower limit value and the flow rate upper limit value of the row including the identifier indicating that processing server 330 in the model information 500.

According to the second modification of the second embodiment, compared to the first modification, the process by which the determination unit 303 determines the flow rate of each edge (see the step (S405-1) of FIG. 17) can be made easier and faster. This is because the first modification models the model information as a multi-commodity flow, whereas the second modification models the model information as a circulation flow, which increases the range of algorithms applicable to the conceptual model.

The form in which the model information is modeled as a circulation flow is also applicable to the other embodiments. In this case, in the model information, an identifier indicating a data server 340 or an identifier indicating data is set as the pointer to the next element of a row including an identifier indicating a processing server 330. In the conceptual model constructed from this model information, an edge is provided from the vertex indicating the processing server 330 to the vertex indicating the data server 340 or to the logical vertex indicating the data.
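A minimal sketch of the distinguishing property of such a circulation model, under the simplifying assumption that every edge carries one unit of flow: once the back edge from the processing server to the data vertex is added, every vertex has equal in-degree and out-degree, i.e. flow is conserved everywhere with no external source or sink. All vertex names are invented:

```python
from collections import Counter

# Forward edges: data vertex -> data server -> processing server,
# plus a back edge closing the cycle (processing server -> data vertex).
forward = [("d1", "D1"), ("D1", "P1")]
back = [("P1", "d1")]
edges = forward + back

def is_circulation(edge_list):
    # In a unit-flow circulation, every vertex's in-degree equals its
    # out-degree (flow conservation with no source/sink).
    outs = Counter(u for u, _ in edge_list)
    ins = Counter(v for _, v in edge_list)
    return outs == ins

print(is_circulation(edges))          # True: the back edge closes the cycle
print(is_circulation(forward))        # False without the back edge
```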

[Third Embodiment]
Hereinafter, the distributed system 350 according to the third embodiment will be described, focusing on the points that differ from the first and second embodiments. Description of the contents shared with the first and second embodiments is omitted as appropriate. In the distributed system 350 according to the third embodiment, the master server 300 also handles multiplexed (replicated) processing target data.

In the third embodiment, the data location storage unit 3070 of the master server 300 further stores data size information.

FIG. 26 is a flowchart showing the detailed operation of the master server 300 of the third embodiment in the step (S404). In FIG. 26, a step (S404-40) is added to FIG. 14, which shows the detailed operation of the first embodiment. In the step (S404-40), the model generation unit 301 adds, to the model information 500, logical communication path information from the data to the data servers. The order of the steps (S404-40), (S404-10), and (S404-20) shown in FIG. 26 may be changed.

FIG. 27 is a flowchart showing the detailed operation of the master server 300 of the third embodiment in the step (S404-40). The model generation unit 301 executes the step (S404-42) for each data di in the set of processing target data specified based on the request information (S404-41).

The model generation unit 301 executes the steps (S404-43) to (S404-45) for each data server Dj that stores the multiplexed data di (S404-42). Here, as many copies of the multiplexed data di exist as its multiplicity.

The model generation unit 301 adds a row including di as an identifier (S404-43). The model generation unit 301 sets the name (or identifier) of the data server Dj as the pointer to the next element of the added row (S404-44). The model generation unit 301 sets the maximum processing amount and the minimum processing amount of the data di specified for the data server Dj as the flow rate upper limit value and the flow rate lower limit value of the added row (S404-45). When the maximum processing amount and/or the minimum processing amount of the data di is not specified for the data server Dj, the master server 300 may determine the flow rate upper limit value and the flow rate lower limit value. In this case, the master server 300, for example, sets the flow rate upper limit value to infinity and the flow rate lower limit value to 0.

Thus, in the model information 500 generated by the model generation unit 301, the multiplexed data are given the common identifier di. That is, rows bearing the common identifier di are added as many times as the multiplicity of the data di.
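The replica rows of steps (S404-43) to (S404-45) can be sketched as below. The field names and the `min`/`max` keys are illustrative; the sketch shows one row per replica location, all sharing the identifier di, with infinity/0 defaults when no bounds are specified:

```python
import math

# Minimal sketch of steps S404-43..S404-45 for multiplexed data di:
# one row per data server holding a replica, all rows sharing identifier di.
def add_replica_rows(model, data_id, replicas):
    for r in replicas:
        model.append({"id": data_id, "next": r["server"],
                      "lower": r.get("min", 0),          # default lower bound 0
                      "upper": r.get("max", math.inf)})  # default upper: infinity
    return model

rows = add_replica_rows([], "d1",
                        [{"server": "D1", "max": 40},
                         {"server": "D2"}])  # no bounds specified for D2
print([row["id"] for row in rows])  # both rows carry the common identifier d1
```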

FIG. 28 is a flowchart showing the detailed operation of the master server 300 of the third embodiment in the step (S406). In the third embodiment, a processing server 330 is assigned to each piece of data multiplexed and stored in different data servers 340. As a result, a plurality of pieces of data flow information may include the same vertex indicating the same multiplexed data.

The determination unit 303 of the master server 300 executes the step (S406-2-1) and the step (S406-3-1) for each data di in the set of processing target data (S406-1-1).

The determination unit 303 identifies the pieces of data flow information whose path information includes the data di, and aggregates the unit processing amounts set in the identified pieces of data flow information for each processing server 330 included in their path information. The determination unit 303 then divides the data di in the ratio of the aggregated unit processing amounts of the processing servers 330, and associates each divided portion of the data di with the data server 340 that stores it (S406-2-1).
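The aggregation and proportional split of step (S406-2-1) can be sketched with invented numbers: each flow entry contributes its unit processing amount to its processing server's total, and the data is divided in the ratio of those totals:

```python
from collections import defaultdict

# Minimal sketch of step S406-2-1: aggregate the unit processing amounts of
# the data-flow entries per processing server, then split data di in that ratio.
def split_data(size, flows):
    """flows: list of (processing_server, unit_processing_amount) pairs."""
    totals = defaultdict(int)
    for server, amount in flows:
        totals[server] += amount          # aggregate per processing server
    grand = sum(totals.values())
    # divide the data of `size` in the ratio of the per-server totals
    return {server: size * t // grand for server, t in totals.items()}

# 120 units of di, with P1 and P2 aggregated at 10 and 30 units/time
parts = split_data(120, [("P1", 10), ("P2", 30)])
print(parts)  # P1 gets 30, P2 gets 90 (a 1:3 split)
```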

The determination unit 303 executes the step (S406-4-1) for each piece of path information fj in the set of path information including the data di (S406-3-1). The determination unit 303 sends a processing program and decision information to the processing server Pk included in the path information fj (S406-4-1). Here, the processing program is a program for instructing the transfer of the divided portion of the data di from the data server 340 storing the data di. The data server 340 and the data are specified by information included in the decision information.

[Operation and Effect of Third Embodiment]
In the distributed system 350 according to the third embodiment, when the processing target data is multiplexed, a processing server 330 is assigned to each piece of the multiplexed data, and model information is generated that takes into account the communication bandwidth and the processing capability of the processing servers 330 for processing each piece of the multiplexed data. Then, based on the conceptual model (network model) that can be constructed from that model information, the combination of a processing server 330 and the data server 340 from which that processing server 330 acquires its processing target data is determined. At this time, the pieces of multiplexed data are not transferred and processed redundantly; instead, the data is divided into portions sized according to the communication bandwidth and processing capability, each portion is assigned to a processing server 330, and the system is controlled so that, as a whole, the multiplexed data is transferred and processed exactly once.

Therefore, according to the third embodiment, even when the processing target data is multiplexed, the job processing time can be minimized for the distributed system 350 as a whole.

[Fourth Embodiment]
Hereinafter, the distributed system 350 according to the fourth embodiment will be described, focusing on the points that differ from the first to third embodiments. Description of the contents shared with the first to third embodiments is omitted as appropriate. In the distributed system 350 according to the fourth embodiment, the master server 300 defines the path between a data server 340 and a processing server 330 in finer detail, in terms of the intermediate devices and communication channels contained in it, and generates the model information in consideration of their detailed constraint information (available bandwidth). Therefore, in the fourth embodiment, the path between a data server 340 and a processing server 330 is referred to as a data transfer path instead of a communication path (input/output communication path), and the links forming that data transfer path are referred to as communication channels.

FIG. 29 is a diagram conceptually illustrating a processing configuration example of each device of the distributed system 350 in the fourth embodiment. The network switch 320 in the fourth embodiment further includes a switch management unit 321 in addition to the configuration of the first embodiment. In the fourth embodiment, a network switch 320 exists as an intermediate device.

<Processing Server 330>
The processing server management unit 331 transmits, to the master server 300, status information such as the disk available bandwidth and the network available bandwidth of the processing server 330.

<Data Server 340>
The data server management unit 341 transmits, to the master server 300, status information including the disk available bandwidth and the network available bandwidth of the data server 340.

<Network Switch 320>
The switch management unit 321 acquires information such as the available bandwidth of the communication channels connected to the network switch 320 and transmits the information to the master server 300 via the data transmission/reception unit 322.

<Master Server 300>
The input/output communication channel information storage unit 3080 stores information on the communication channels between devices that are included in the data transfer paths from the data servers 340 to the processing servers 330. The communication channel information includes the identifier of the connection source device, the identifier of the connection destination device, the available bandwidth information of the communication channel, and the like.

The model generation unit 301 generates model information from which a conceptual model can be constructed that further includes vertices indicating the intermediate devices (for example, network switches 320) through which the data stored in a data server 340 passes before being received by a processing server 330, and that further includes at least one of the following edges: an edge from the vertex indicating the data server 340 to the vertex indicating the intermediate device nearest to that data server 340, whose transfer amount constraint upper limit is set to the transferable amount per unit time from the data server 340 to that nearest intermediate device; an edge from the vertex indicating an intermediate device to the vertex indicating another intermediate device, whose transfer amount constraint upper limit is set to the transferable amount per unit time from that intermediate device to the other intermediate device; and an edge from the vertex indicating the intermediate device nearest to the processing server 330 to the vertex indicating the processing server 330, whose transfer amount constraint upper limit is set to the transferable amount per unit time from that nearest intermediate device to the processing server 330. For example, when one such intermediate device exists, the conceptual model includes a vertex indicating that intermediate device, an edge from the vertex indicating the data server 340 to the vertex indicating that intermediate device, and an edge from the vertex indicating that intermediate device to the vertex indicating the processing server 330. Further, for example, when a first intermediate device nearest to the data server 340 and a second intermediate device nearest to the processing server 330 exist as the intermediate devices, the conceptual model includes two vertices indicating the first and second intermediate devices, an edge from the vertex indicating the data server 340 to the vertex indicating the first intermediate device, an edge from the vertex indicating the first intermediate device to the vertex indicating the second intermediate device, and an edge from the vertex indicating the second intermediate device to the vertex indicating the processing server 330. Further, as shown in Example 4 described later, the model generation unit 301 may compose the vertex indicating an intermediate device from one or more vertices indicating one or more input units of the intermediate device, one or more vertices indicating one or more output units of the intermediate device, and one or more edges connecting the input units and output units between which data can be transferred.
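The two-switch case described above can be sketched with invented device names and capacities; each edge is capped by the transferable amount per unit time of its hop, so the throughput of the whole transfer path is limited by its narrowest hop:

```python
# Illustrative sketch of a data transfer path through two intermediate
# devices (switches): data server D1 -> SW1 -> SW2 -> processing server P1.
# Device names and per-hop capacities are invented for illustration.
edges = [
    ("D1", "SW1", 100),   # data server to its nearest intermediate device
    ("SW1", "SW2", 80),   # between the two intermediate devices
    ("SW2", "P1", 60),    # nearest intermediate device to processing server
]

def path_capacity(path_edges):
    # the achievable transfer amount per unit time over the path is
    # bounded by the smallest per-hop upper limit
    return min(cap for _, _, cap in path_edges)

print(path_capacity(edges))  # 60
```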

[Operation Example in Fourth Embodiment]
FIG. 30 is a flowchart showing the detailed operation of the master server 300 of the fourth embodiment in step (S404). The model generation unit 301 of the master server 300 executes step (S404-12-10) for each data server Di that stores unprocessed processing target data (S404-11).

The model generation unit 301 adds, to the model information 500, rows relating to the data transfer path from the data server Di to the processing server 330 (S404-12-10).

FIGS. 31A, 31B, and 31C are flowcharts showing the detailed operation of the master server 300 of the fourth embodiment in step (S404-12-10). At the start of the processing of FIG. 31A, the device IDi is initialized to the name (or identifier) of the data server Di.

The model generation unit 301 extracts, from the input/output communication path information storage unit 3080, the rows (input/output communication path information) in which the device IDi is set as the input source device ID (S404-12-11). The model generation unit 301 then identifies the set of output destination device IDs contained in the extracted input/output communication path information (S404-12-12).

The model generation unit 301 determines whether the device IDi indicates a switch (S404-12-13). If it determines that the device IDi indicates a switch (S404-12-13; Yes), the model generation unit 301 performs the processing of FIG. 31B, which is described later.

On the other hand, if the model generation unit 301 determines that the device IDi does not indicate a switch (S404-12-13; No), it further determines whether a row containing the device IDi as an identifier has already been set in the model information 500 (S404-12-14). If such a row has already been set (S404-12-14; Yes), the model generation unit 301 ends the processing of FIG. 31A.

If the model generation unit 301 determines that no row containing the device IDi as an identifier has yet been set in the model information 500 (S404-12-14; No), it executes step (S404-12-16) for each output destination device IDj in the set identified in step (S404-12-12) (S404-12-15).

In step (S404-12-16), the model generation unit 301 determines whether the output destination device IDj indicates a switch. If it determines that the output destination device IDj indicates a switch (S404-12-16; Yes), the model generation unit 301 performs the processing of FIG. 31C, which is described later.

On the other hand, if the model generation unit 301 determines that the output destination device IDj does not indicate a switch (S404-12-16; No), it adds a row containing the device IDi as an identifier to the model information 500 (S404-12-17), and sets the output destination device IDj in the pointer to the next element of the added row (S404-12-18). Further, the model generation unit 301 sets the flow upper limit of the added row to the available bandwidth of the input/output communication path between the device indicated by the device IDi and the device indicated by the output destination device IDj, and sets the flow lower limit of the added row to a value not less than 0 and not more than the flow upper limit (S404-12-19). Note that step (S404-12-19) may be executed before step (S404-12-18).

Subsequently, the model generation unit 301 determines whether the output destination device IDj indicates the processing server 330 (S404-12-1A). If it determines that the output destination device IDj does not indicate the processing server 330 (S404-12-1A; No), the model generation unit 301 recursively executes the processing of FIG. 31A (S404-12-1B). In this recursive execution, the device IDi is initialized to the output destination device IDj, so a row containing the output destination device IDj as an identifier is added to the model information 500. If the model generation unit 301 determines that the output destination device IDj indicates the processing server 330 (S404-12-1A; Yes), it does not perform the recursive execution of the processing of FIG. 31A.
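The recursion of FIG. 31A for the non-switch case can be sketched roughly as follows. The data structures and names here are illustrative assumptions, not the patent's implementation; the switch branches of FIGS. 31B and 31C are omitted:

```python
# Simplified sketch of the FIG. 31A recursion for non-switch devices:
# starting from a data server, follow the input/output communication path
# information and add one model-information row per traversed edge,
# stopping at processing servers and at devices whose rows already exist.

def add_routes(device, topology, bandwidth, processing_servers, model_info):
    """topology: device -> list of output destination devices;
    bandwidth: (src, dst) -> available bandwidth (flow upper limit)."""
    if any(row["identifier"] == device for row in model_info):
        return                                   # (S404-12-14; Yes)
    for dst in topology.get(device, []):
        model_info.append({"identifier": device, "next": dst,
                           "lower": 0,
                           "upper": bandwidth[(device, dst)]})
        if dst not in processing_servers:        # (S404-12-1A; No)
            add_routes(dst, topology, bandwidth,
                       processing_servers, model_info)  # (S404-12-1B)

# Hypothetical two-hop topology: data server d1 -> relay device -> p1.
topology = {"d1": ["relay"], "relay": ["p1"]}
bandwidth = {("d1", "relay"): 50, ("relay", "p1"): 100}
rows = []
add_routes("d1", topology, bandwidth, {"p1"}, rows)
```

The recursion terminates at the processing server, mirroring the check in step (S404-12-1A).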

FIG. 31B is described below.
If it is determined in step (S404-12-13) of FIG. 31A that the device IDi indicates a switch (S404-12-13; Yes), the model generation unit 301 executes step (S404-12-1D) for each output destination device IDj in the set identified in step (S404-12-12) (S404-12-1C). Hereinafter, the device IDi indicating a switch may also be written as switch i.

The model generation unit 301 determines whether a row containing the identifier of the output port of switch i toward the output destination device IDj exists in the model information 500 (S404-12-1D). If it determines that no such row exists (S404-12-1D; No), the model generation unit 301 adds, to the model information 500, a row containing the identifier of the output port of switch i toward the output destination device IDj (S404-12-1E).

The model generation unit 301 sets the output destination device IDj in the pointer to the next element of the added row (S404-12-1F). Further, the model generation unit 301 sets the flow upper limit of the added row to the available bandwidth of the input/output communication path between the device indicated by the device IDi and the device indicated by the output destination device IDj, and sets the flow lower limit of the added row to a value not less than 0 and not more than the flow upper limit (S404-12-1G). Note that step (S404-12-1G) may be executed before step (S404-12-1F). On the other hand, if the model generation unit 301 determines that a row containing the identifier of that output port exists in the model information 500 (S404-12-1D; Yes), it does not execute steps (S404-12-1E), (S404-12-1F), and (S404-12-1G).

The model generation unit 301 executes step (S404-12-1I) for each input port identifier k of switch i whose input source device ID differs from the output destination device IDj (S404-12-1H).

The model generation unit 301 determines whether the model information 500 contains a row that has the identifier k and in which the pointer to the next element is set to the identifier of the output port toward the output destination device IDj (S404-12-1I). If it determines that no such row exists in the model information 500 (S404-12-1I; No), the model generation unit 301 adds a row containing k as an identifier to the model information 500 (S404-12-1J).

The model generation unit 301 sets, in the pointer to the next element of the added row, the identifier of the output port of switch i toward the output destination device IDj (S404-12-1K). The model generation unit 301 sets the flow upper limit of the added row to infinity, and sets the flow lower limit of the added row to a value not less than 0 and not more than the flow upper limit (S404-12-1L). Note that step (S404-12-1L) may be executed before step (S404-12-1K).

On the other hand, if it is determined that such a row exists in the model information 500 (S404-12-1I; Yes), steps (S404-12-1J), (S404-12-1K), and (S404-12-1L) are not executed. Note that the processing loop of step (S404-12-1H) may be executed before step (S404-12-1D).

Subsequently, the model generation unit 301 determines whether the output destination device IDj indicates the processing server 330 (S404-12-1M). If it determines that the output destination device IDj does not indicate the processing server 330 (S404-12-1M; No), the model generation unit 301 recursively executes the processing of FIG. 31A (S404-12-1N). In this recursive execution, the device IDi is initialized to the output destination device IDj, so a row containing the output destination device IDj as an identifier is added to the model information 500. On the other hand, if the model generation unit 301 determines that the output destination device IDj indicates the processing server 330 (S404-12-1M; Yes), it does not perform the recursive execution of the processing of FIG. 31A.

FIG. 31C is described below. Hereinafter, the output destination device IDj indicating a switch may also be written as switch j.

If it is determined in step (S404-12-16) of FIG. 31A that the output destination device IDj indicates a switch (S404-12-16; Yes), the model generation unit 301 executes step (S404-12-1P) for each output port identifier k of switch j whose output destination device ID differs from the device IDi (S404-12-1O).

The model generation unit 301 determines whether the model information 500 contains a row in which the pointer to the next element is set to the identifier k (S404-12-1P). If it determines that no such row exists (S404-12-1P; No), the model generation unit 301 adds, to the model information 500, a row containing the identifier of the input port of switch j from the device IDi (S404-12-1Q).

The model generation unit 301 sets the identifier k in the pointer to the next element of the added row (S404-12-1R). Further, the model generation unit 301 sets the flow upper limit of the added row to infinity, and sets the flow lower limit of the added row to 0 (S404-12-1S). Note that step (S404-12-1S) may be executed before step (S404-12-1R). On the other hand, if the model generation unit 301 determines that such a row exists (S404-12-1P; Yes), it does not execute steps (S404-12-1Q), (S404-12-1R), and (S404-12-1S).

Subsequently, the model generation unit 301 adds a row containing the device IDi as an identifier to the model information 500 (S404-12-1T). The model generation unit 301 sets, in the pointer to the next element of the added row, the identifier of the input port of switch j from the device IDi (S404-12-1U). Further, the model generation unit 301 sets the flow upper limit of the added row to the available bandwidth of the input/output communication path between the device indicated by the device IDi and the device (switch j) indicated by the output destination device IDj, and sets the flow lower limit of the added row to a value not less than 0 and not more than the flow upper limit (S404-12-1V). Note that step (S404-12-1V) may be executed before step (S404-12-1U).

The model generation unit 301 then recursively executes the processing of FIG. 31A (S404-12-1W). In this recursive execution, the device IDi is initialized to the output destination device IDj, so a row containing the output destination device IDj as an identifier is added to the model information 500.
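Taken together, FIGS. 31B and 31C decompose each switch into input-port and output-port vertices: edges corresponding to physical links carry the available bandwidth as their flow upper limit, while edges inside the switch, from an input port to an output port, are unbounded. The sketch below illustrates that decomposition under assumed vertex names; it is not the patent's implementation:

```python
import math

# Illustrative decomposition of one switch "sw" with inputs from d1 and d2
# and an output toward p1, in the spirit of FIGS. 31B/31C: link edges carry
# the available bandwidth, and in-switch edges are unbounded (infinity).

INF = math.inf

def switch_edges(switch, inputs, output, link_bw):
    """Return (src, dst, capacity) edges for one output of the switch."""
    edges = []
    out_port = f"{switch}.out:{output}"
    for src in inputs:
        in_port = f"{switch}.in:{src}"
        edges.append((src, in_port, link_bw[(src, switch)]))  # link edge
        edges.append((in_port, out_port, INF))                # inside switch
    edges.append((out_port, output, link_bw[(switch, output)]))
    return edges

edges = switch_edges("sw", ["d1", "d2"], "p1",
                     {("d1", "sw"): 50, ("d2", "sw"): 50, ("sw", "p1"): 100})
```

This keeps the bandwidth constraint on each physical link while letting the switch forward traffic from any input port to any output port.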

[Operation and Effect of Fourth Embodiment]
In the fourth embodiment, the combination of a processing server 330 and the data server 340 from which that processing server acquires its processing target data is determined with further consideration of the devices that constitute the data transfer path between each data server 340 and each processing server 330 and of the available bandwidth of each communication path. Therefore, according to the fourth embodiment, the system-wide processing time of each executed job can be minimized more precisely.

[Fifth Embodiment]
The distributed system 350 according to the fifth embodiment is described below, focusing on the points that differ from the embodiments described above; descriptions of the same content are omitted as appropriate. In the distributed system 350 according to the fifth embodiment, the master server 300 generates model information that takes into account the decrease in processing capability that occurs when a processing server 330 executes the processing of a plurality of jobs in parallel.

FIG. 32 is a flowchart showing the detailed operation of the master server 300 of the fifth embodiment in step (S404-20) (see FIG. 22). In the fifth embodiment, in addition to the information generated in the second embodiment, edges representing the decrease in processing capability of a processing server 330 due to the processing of a plurality of jobs are added to the model information.

The model generation unit 301 executes steps (S404-2G) through (S404-2J) for each available processing server Pi acquired from the server state storage unit 3060 based on the request information (S404-2F).

The model generation unit 301 adds a row containing the name (or identifier) of the processing server Pi to the model information 500 (S404-2G), and sets, in the pointer to the next element of the added row, a second name (or identifier) indicating the processing server Pi (S404-2H). The second name (or identifier) indicating the processing server Pi is a name (or identifier) that corresponds to Pi and is unique within the model information 500. Further, the model generation unit 301 sets the flow upper limit of the added row to the possible processing amount per unit time of the processing server Pi, and sets the flow lower limit of the added row to a value not less than 0 and not more than the flow upper limit (S404-2I). Note that step (S404-2I) may be executed before step (S404-2H).

The model generation unit 301 executes steps (S404-2K) through (S404-2M) for each job Jj (S404-2J).

The model generation unit 301 adds a row containing the second name (or identifier) indicating the processing server Pi to the model information 500 (S404-2K), and sets, in the pointer to the next element of the added row, an identifier indicating the end point of the job Jj (S404-2L). Further, the model generation unit 301 sets the flow upper limit of the added row to the possible processing amount per unit time of the processing server Pi for the job Jj, and sets the flow lower limit of the added row to a value not less than 0 and not more than the flow upper limit (S404-2M). Note that step (S404-2M) may be executed before step (S404-2L).
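The effect of steps (S404-2G) through (S404-2M) is to split each processing server into two vertices: the edge between them caps the server's total throughput across all jobs, and one edge per job caps the throughput achievable for that particular job. A minimal illustrative sketch, with assumed names and example rates:

```python
# Sketch of the fifth embodiment's per-server edges: the Pi -> Pi' edge
# limits the server's total processing rate across all jobs, and each
# Pi' -> job-end-point edge limits the rate for one particular job.

def server_edges(server, total_rate, per_job_rate):
    """per_job_rate: job id -> possible processing amount per unit time."""
    second = server + "'"                     # second, unique identifier
    edges = [(server, second, total_rate)]    # steps (S404-2G)..(S404-2I)
    for job, rate in sorted(per_job_rate.items()):
        edges.append((second, f"end:{job}", rate))  # (S404-2K)..(S404-2M)
    return edges

edges = server_edges("n3", 150, {"J1": 150, "J2": 100})
# Even if each job could individually run at its per-job rate, the shared
# n3 -> n3' edge caps the combined throughput of J1 and J2 at 150 MB/s.
```

This is how parallel jobs are prevented from exceeding the server's overall processing capability in the flow computation.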

[Operation and Effect of Fifth Embodiment]
In the distributed system 350 of the fifth embodiment, the processing capability of each processing server 330 for each job executed in parallel is further added to the model information as a constraint condition, and the combination of a processing server 330 and the data server 340 from which that processing server acquires its processing target data is determined based on the conceptual model (network model) that can be constructed from this model information.

Therefore, according to the fifth embodiment, data transmission/reception and data processing of the distributed system 350 can be realized in consideration of the decrease in processing capability caused by the processing of a plurality of jobs, and the processing time of the entire system can be minimized.

[Sixth Embodiment]
The distributed system 350 according to the sixth embodiment is described below, focusing on the points that differ from the embodiments described above; descriptions of the same content are omitted as appropriate. In the distributed system 350 according to the sixth embodiment, the master server 300 generates model information that takes into account the decrease in processing capability that occurs when a processing server 330 executes a plurality of jobs with different processing loads in parallel.

<Master server 300>
FIG. 33 is a diagram illustrating an example of the information stored in the server state storage unit 3060 in the sixth embodiment. As illustrated in FIG. 33, the server state storage unit 3060 stores remaining resource information as the load information 3062, and additionally stores new processing load information 3066. The remaining resource information indicates the remaining amount of resources available to a processing server 330 for executing processing. The processing load information indicates the amount of resources of a processing server 330 consumed in executing processing, and is expressed, for example, as the resource usage per unit processing amount.

In determining the flow of each edge (the flow function f(e)) (step (S405-1) in FIG. 17), the determination unit 303 of the master server 300 according to the sixth embodiment further considers the remaining resource information (load information 3062) and the processing load information 3066 acquired from the server state storage unit 3060.
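One natural way to combine these two pieces of information is to divide the remaining resource amount by the resource usage per unit processing amount, which yields the largest processing rate the server can still sustain for that job. The following is a hedged illustrative sketch of this relationship only; the function and values are assumptions, not the patent's formula:

```python
# Illustrative sketch (sixth embodiment): the feasible processing rate of
# a server for a job is bounded by its remaining resources divided by the
# job's resource usage per unit processing amount. Names are assumed.

def feasible_rate(remaining_resource, load_per_unit):
    """Largest unit-time processing amount the remaining resources allow."""
    return remaining_resource / load_per_unit

# e.g. 60 resource units remaining, job consumes 0.5 units per MB processed:
rate = feasible_rate(60, 0.5)  # at most 120 MB/s for this job
```

Such a bound lets jobs with heavier processing loads receive proportionally smaller flows in the edge-flow determination.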

[Operation and Effect of Sixth Embodiment]
In the sixth embodiment, for each processing server 330, the processing load (resource amount) consumed per unit processing amount of each job and the remaining load (remaining resource amount) are managed, and routes in the conceptual model are selected with these as constraint conditions. As a result, the correspondence among the data servers 340, the processing servers 330, and the unit processing amounts for exchanging data is determined for each job. That is, the sixth embodiment further takes into account the processing capability of a processing server 330 when jobs with different processing loads are processed in parallel.

Therefore, according to the sixth embodiment, data transmission/reception and data processing of the distributed system 350 can be realized in consideration of the decrease in processing capability caused by the processing of a plurality of jobs with different processing loads, and the processing time of the entire system can be minimized.

In the sixth embodiment described above, the processing load information of the server state storage unit 3060 stores the processing load (resource amount) consumed per unit processing amount for each job; however, the processing load information may instead store the processing load consumed per unit processing amount without distinguishing between jobs.

Examples are given below to describe the above embodiments in further detail. The present invention is not limited in any way by the following examples.

Example 1 shows a specific example of the first embodiment described above.
FIG. 34 is a diagram conceptually illustrating a configuration example of the distributed system 350 in Example 1. The distributed system 350 in Example 1 has switches sw1 and sw2 and servers n1 to n4, and the servers n1 to n4 are connected to each other via the switches sw1 and sw2. The servers n1 to n4 function as processing servers 330 and data servers 340 depending on the situation, and have disks D1 to D4 as their processing data storage units 342. In Example 1, any one of the servers n1 to n4 functions as the master server 300. The server n1 has p1 as an available process execution unit 332, and the server n3 has p3 as an available process execution unit 332.

FIG. 35 is a diagram illustrating the information stored in the server state storage unit 3060 in Example 1. In Example 1, the servers n1 and n3 are the available processing servers 330. The processable amount of the server n1 is 50 MB/s, and the processable amount of the server n3 is 150 MB/s.

FIG. 36 is a diagram illustrating the information stored in the input/output communication path information storage unit 3080 in Example 1. The available bandwidth for data transmission from the server n2 to the server n1, from the server n2 to the server n3, and from the server n4 to the server n1 is 50 MB/s in each case, while the available bandwidth for data transmission from the server n4 to the server n3 is 100 MB/s.

FIG. 37 is a diagram illustrating the information stored in the data location storage unit 3070 in Example 1. The processing target data (MyDataSet1) is divided into and stored as the files da, db, and dc; the files da and db are stored on the disk D2 of the server n2, and the file dc is stored on the disk D4 of the server n4. The processing target data (MyDataSet1) is thus distributed and not multiplexed.

Assume that, with the server state storage unit 3060, the input/output communication path information storage unit 3080, and the data location storage unit 3070 of the master server 300 in the states shown in FIGS. 35, 36, and 37, the client 360 transmits to the master server 300 request information for requesting the execution of a processing program that uses the processing target data (MyDataSet1). The operation of the distributed system 350 in this situation is described below.

The model generation unit 301 of the master server 300 obtains {n2, n4} from the data location storage unit 3070 of FIG. 37 as the set of identifiers of the data servers 340 storing the processing target data, and obtains {n1, n3} from the server state storage unit 3060 of FIG. 35 as the set of identifiers of the available processing servers 330. For the acquired sets {n1, n3} and {n2, n4}, the model generation unit 301 generates the model information based on the information stored in the server state storage unit 3060 and the information stored in the input/output communication path information storage unit 3080.

FIG. 38 is a diagram illustrating the model information generated in Example 1. FIG. 39 is a diagram illustrating the conceptual model (directed graph, network model) constructed from the model information shown in FIG. 38. The value attached to each edge of the conceptual model in FIG. 39 indicates the current maximum data transfer amount per unit time of the corresponding communication path (the upper limit of the transfer amount constraint condition), or the current maximum data processing amount per unit time of the processing server corresponding to the start point of the edge (the upper limit of the processing amount constraint condition).

Based on the model information of FIG. 38, the determination unit 303 of the master server 300 determines the flow of each edge of the conceptual model (the flow function f) so that the processing time of the job corresponding to the request information is minimized for the entire system. In Example 1, the flow function f is determined so that the data processing amount per unit time of the distributed system 350 is maximized. FIGS. 40A to 40G are diagrams conceptually illustrating the determination of the flow function f by the augmenting-path method for the maximum flow problem and the determination of the data flow information in Example 1.

 決定部303は、図38のモデル情報に基づいて、図40Aに示されるようなネットワークモデルを構築する。なお、ネットワークモデルの構築とは、ソフトウェア要素としての何かしらのデータ形態を生成するといった意味のみを持つものではなく、モデル情報が用いられて流量関数fが決定されるまでの処理を説明するための単なる概念の意味でも用いられる。このネットワークモデルでは、始点sが設定され、終点tが設定される。 The determination unit 303 constructs a network model as shown in FIG. 40A based on the model information of FIG. 38. Note that "constructing the network model" does not necessarily mean generating some concrete data structure as a software element; the term is also used purely conceptually to explain the processing up to the point where the flow function f is determined from the model information. In this network model, a start point s and an end point t are set.

 ここで、図40Bに示されるように、経路(s、n2、n1、t)に50MB/sのフロー(流量)が与えられると、決定部303は、図40Cに示されるネットワークの残余グラフを特定する。以降、残余グラフでは、流量0のフローは図示されない。 Here, as shown in FIG. 40B, when a flow (flow rate) of 50 MB/s is given to the path (s, n2, n1, t), the determination unit 303 identifies the residual graph of the network shown in FIG. 40C. In the residual graphs hereafter, flows with a flow rate of 0 are not shown.

 次に、決定部303は、図40Cに示される残余グラフからフロー増加路を特定し、その経路に対してフローを付与する。ここでは、図40Cに示される残余グラフに基づいて、図40Dに示されるように、経路(s、n2、n3、t)に50MB/sのフローが与えられる。結果、決定部303は、図40Eに示されるネットワークの残余グラフを特定する。 Next, the determination unit 303 identifies a flow increasing path from the residual graph shown in FIG. 40C and assigns a flow to the path. Here, based on the residual graph shown in FIG. 40C, a flow of 50 MB / s is given to the route (s, n2, n3, t) as shown in FIG. 40D. As a result, the determination unit 303 specifies the network residual graph shown in FIG. 40E.

 決定部303は、図40Eに示される残余グラフからフロー増加路を特定し、その経路に対してフローを与える。決定部303は、図40Eに示される残余グラフに基づいて、図40Fに示されるように、経路(s、n4、n3、t)に100MB/sのフローを付与する。結果、決定部303は、図40Gに示されるネットワークの残余グラフを特定する。 The determining unit 303 identifies a flow increasing path from the residual graph shown in FIG. 40E and gives a flow to the path. Based on the residual graph shown in FIG. 40E, the determination unit 303 assigns a flow of 100 MB / s to the route (s, n4, n3, t) as shown in FIG. 40F. As a result, the determination unit 303 identifies the residual graph of the network illustrated in FIG. 40G.

 ここで、図40Gに示されるようにこれ以上のフロー増加路は存在しないため、決定部303は処理を終了する。結果、得られた各経路及び各データ流量の組み合わせ情報がデータフロー情報となる。 Here, since there is no more flow increase path as shown in FIG. 40G, the determination unit 303 ends the process. As a result, the obtained combination information of each route and each data flow rate becomes data flow information.
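The flow-augmenting computation of FIGS. 40A to 40G can be illustrated with a short, self-contained sketch. This is not the claimed implementation: it is a generic Edmonds-Karp maximum-flow routine, and the edge capacities are assumed values inferred from the walkthrough (100 MB/s for each disk and link, 50 MB/s and 150 MB/s for the processing capacities of n1 and n3).

```python
from collections import defaultdict, deque

def max_flow(cap, s, t):
    """Edmonds-Karp: repeatedly find a shortest augmenting path in the
    residual graph (BFS) and push the bottleneck flow along it."""
    nodes = set(cap) | {v for u in cap for v in cap[u]}
    flow = defaultdict(lambda: defaultdict(int))

    def residual(u, v):
        return cap.get(u, {}).get(v, 0) - flow[u][v]

    total = 0
    while True:
        parent, queue = {s: None}, deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in nodes:
                if v not in parent and residual(u, v) > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:          # no augmenting path is left
            return total, flow
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual(u, v) for u, v in path)
        for u, v in path:
            flow[u][v] += bottleneck
            flow[v][u] -= bottleneck  # reverse (residual) edge
        total += bottleneck

# Assumed capacities (MB/s) read off the conceptual model of FIG. 39:
# disks and links at 100, processing capacity 50 for n1 and 150 for n3.
cap = {
    's': {'n2': 100, 'n4': 100},
    'n2': {'n1': 100, 'n3': 100},
    'n4': {'n3': 100},
    'n1': {'t': 50},
    'n3': {'t': 150},
}
total, flow = max_flow(cap, 's', 't')
print(total)  # 200: n1 processes 50 MB/s and n3 processes 150 MB/s
```

The 200 MB/s total matches the three augmenting paths of the walkthrough (50 + 50 + 100).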

 図41は、実施例1におけるデータフロー情報を示す図である。決定部303は、このように決定されたデータフロー情報に基づいて、処理プログラムをサーバn1及びn3に送信する。更に、決定部303は、処理サーバn1及びn3に、処理プログラムに対応する決定情報を送信することによって、データ受信と処理実行とを指示する。 FIG. 41 is a diagram showing the data flow information in Example 1. Based on the data flow information determined in this way, the determination unit 303 transmits the processing program to the servers n1 and n3. Furthermore, the determination unit 303 instructs the processing servers n1 and n3 to receive data and execute processing by transmitting to them decision information corresponding to the processing program.

 図42は、実施例1において実施されるデータ送受信を概念的に示す図である。決定情報を受信した処理サーバn1は、データサーバn2の処理データ格納部342内のデータを取得する。処理実行部p1は取得されたデータの処理を実行する。処理サーバn3は、データサーバn2の処理データ格納部342内のデータを取得する。処理実行部p3は取得されたデータの処理を実行する。また、処理サーバn3は、データサーバn4の処理データ格納部342内のデータを取得する。処理実行部p3は取得されたデータの処理を実行する。 FIG. 42 is a diagram conceptually illustrating data transmission / reception performed in the first embodiment. The processing server n1 that has received the decision information acquires data in the processing data storage unit 342 of the data server n2. The process execution unit p1 executes the process of the acquired data. The processing server n3 acquires data in the processing data storage unit 342 of the data server n2. The process execution unit p3 executes processing of the acquired data. The processing server n3 acquires data in the processing data storage unit 342 of the data server n4. The process execution unit p3 executes processing of the acquired data.

 実施例2では、上述の第2実施形態の第1変形例の具体例を示す。実施例2における分散システム350の構成は実施例1と同様である(図34参照)。また、実施例2における入出力通信路情報格納部3080の状態についても実施例1と同様である(図36参照)。 Example 2 shows a specific example of the first modification of the second embodiment described above. The configuration of the distributed system 350 in the second embodiment is the same as that in the first embodiment (see FIG. 34). The state of the input / output communication path information storage unit 3080 in the second embodiment is also the same as that in the first embodiment (see FIG. 36).

 図43は、実施例2におけるジョブ情報格納部3040に格納される情報を示す図である。実施例2では、プログラムを実行する単位として、ジョブMyJob1とジョブMyJob2が投入される。ジョブMyJob1の最大単位処理量は25MB/sに設定され、ジョブMyJob1の最低単位処理量は設定されていない。一方、ジョブMyJob2の最低単位処理量が50MB/sに設定されており、ジョブMyJob2の最大単位処理量は設定されていない。 FIG. 43 is a diagram illustrating information stored in the job information storage unit 3040 according to the second embodiment. In the second embodiment, a job MyJob1 and a job MyJob2 are input as units for executing a program. The maximum unit processing amount of the job MyJob1 is set to 25 MB / s, and the minimum unit processing amount of the job MyJob1 is not set. On the other hand, the minimum unit processing amount of the job MyJob2 is set to 50 MB / s, and the maximum unit processing amount of the job MyJob2 is not set.

 図44は、実施例2におけるサーバ状態格納部3060に格納される情報を示す図である。実施例2では、サーバn1はジョブMyJob1及びMyJob2をそれぞれ50MB/sで処理することが可能であり、サーバn3はジョブMyJob1及びMyJob2をそれぞれ150MB/sで処理することが可能である。 FIG. 44 is a diagram illustrating information stored in the server state storage unit 3060 according to the second embodiment. In the second embodiment, the server n1 can process the jobs MyJob1 and MyJob2 at 50 MB / s, respectively, and the server n3 can process the jobs MyJob1 and MyJob2 at 150 MB / s, respectively.

 図45は、実施例2におけるデータ所在格納部3070に格納される情報を示す図である。図45に示されるように、データ所在格納部3070には、処理対象データMyDataSet1及びMyDataSet2についての各情報がそれぞれ格納されている。MyDataSet1はファイルdaに格納されており、そのファイルdaはサーバn2のディスクD2内に格納されている。MyDataSet2は、ファイルdb、dc、及びddに分割されて格納されており、ファイルdb及びdcは、サーバn2のディスクD2内に格納されており、ファイルddは、サーバn4のディスクD4内に格納されている。MyDataSet1及びMyDataSet2は、分散配置されており、多重化されていないデータである。 FIG. 45 is a diagram illustrating the information stored in the data location storage unit 3070 in Example 2. As shown in FIG. 45, the data location storage unit 3070 stores information about each of the processing target data MyDataSet1 and MyDataSet2. MyDataSet1 is stored in the file da, and the file da is stored on the disk D2 of the server n2. MyDataSet2 is divided into and stored as the files db, dc, and dd; the files db and dc are stored on the disk D2 of the server n2, and the file dd is stored on the disk D4 of the server n4. MyDataSet1 and MyDataSet2 are distributed data that are not multiplexed.

 マスタサーバ300のジョブ情報格納部3040、サーバ状態格納部3060、入出力通信路情報格納部3080、及び、データ所在格納部3070が、図43、図44、図36、図45に示す状態である場合に、クライアント360によって、処理対象データ(MyDataSet1)を使用するジョブMyJob1の実行を要求し、かつ、処理対象データ(MyDataSet2)を使用するジョブMyJob2の実行を要求するための要求情報がマスタサーバ300に送信されたと仮定する。以下、この状況における分散システム350の動作について説明する。 Assume that, while the job information storage unit 3040, the server state storage unit 3060, the input/output communication path information storage unit 3080, and the data location storage unit 3070 of the master server 300 are in the states shown in FIGS. 43, 44, 36, and 45, the client 360 transmits to the master server 300 request information requesting execution of the job MyJob1, which uses the processing target data (MyDataSet1), and execution of the job MyJob2, which uses the processing target data (MyDataSet2). The operation of the distributed system 350 in this situation is described below.

 マスタサーバ300のモデル生成部301は、図43のジョブ情報格納部3040から、実行が指示されているジョブの集合として{MyJob1、MyJob2}を得る。モデル生成部301は、各ジョブに関する、ジョブが使用するデータの名称、最低単位処理量及び最大単位処理量をそれぞれ取得する。 The model generation unit 301 of the master server 300 obtains {MyJob1, MyJob2} as a set of jobs instructed to be executed from the job information storage unit 3040 in FIG. The model generation unit 301 acquires the name of data used by the job, the minimum unit processing amount, and the maximum unit processing amount for each job.

 モデル生成部301は、図45のデータ所在格納部3070から、処理対象データが格納されているデータサーバ340の識別子の集合として{D2、D4}を得る。モデル生成部301は、図44のサーバ状態格納部3060から、データサーバ340の識別子の集合として{n2、n4}を取得し、利用可能な処理サーバ330の識別子の集合として{n1、n3}を得る。更に、モデル生成部301は、図44のサーバ状態格納部3060から、利用可能な処理サーバn1及びn3の処理可能量情報を得る。 The model generation unit 301 obtains {D2, D4} from the data location storage unit 3070 in FIG. 45 as a set of identifiers of the data servers 340 storing the processing target data. The model generation unit 301 acquires {n2, n4} from the server state storage unit 3060 in FIG. 44 as the set of identifiers of the data servers 340, and obtains {n1, n3} as the set of identifiers of the available processing servers 330. Further, the model generation unit 301 obtains the processable amount information of the available processing servers n1 and n3 from the server state storage unit 3060 in FIG. 44.

 モデル生成部301は、このように取得された各集合及び図36の入出力通信路情報格納部3080に格納された情報に基づいて、モデル情報を生成する。図46は、実施例2において生成されるモデル情報を示す図である。図47は、図46に示されるモデル情報により構築される概念モデルを示す図である。図47で示される概念モデル上の各辺に付された値は、その通信路における現在の単位時間当たりのデータ転送量の最大値(転送量制約条件の上限値)、又は、その辺の始点に対応する処理サーバにおける現在の単位時間当たりのデータ処理量の最大値(処理量制約条件の上限値)を示す。 The model generation unit 301 generates model information based on the sets acquired in this way and on the information stored in the input/output communication path information storage unit 3080 in FIG. 36. FIG. 46 is a diagram illustrating the model information generated in Example 2. FIG. 47 is a diagram showing the conceptual model constructed from the model information shown in FIG. 46. The value assigned to each edge of the conceptual model in FIG. 47 indicates either the current maximum data transfer amount per unit time on that communication channel (the upper limit of the transfer amount constraint), or the current maximum data processing amount per unit time of the processing server corresponding to the start point of that edge (the upper limit of the processing amount constraint).

 決定部303は、図46のモデル情報に基づいて、ジョブの処理時間が最小となるように流量関数fを決定する。図48Aから図48F及び図49Aから図49Jは、実施例2における最大流問題におけるフロー増加法による流量関数fの決定処理及びデータフロー情報の決定処理を概念的に示す図である。 The determination unit 303 determines the flow function f based on the model information of FIG. 46 so that the job processing time is minimized. FIGS. 48A to 48F and FIGS. 49A to 49J are diagrams conceptually showing a flow function f determination process and a data flow information determination process by the flow increase method in the maximum flow problem in the second embodiment.

 図48Aから図48Fは、下限流量制限を満たす初期フローの算出手順の一例を示す図である。まず、決定部303は、図46のモデル情報に基づいて、図48Aに示されるネットワークモデルを構築する。このネットワークモデルでは、始点s1及びs2が設定され、始点s1に対応する終点t1が設定され、始点s2に対応する終点t2が設定される。更に、決定部303は、図48Aに示されるネットワークモデルに対し、仮想始点s*及び仮想終点t*を設定する。決定部303は、流量制限が付与された辺の新たな流量上限値に、変更前の流量上限値と変更前の流量下限値との差分値を設定する。また、決定部303は、当該辺の新たな流量下限値に0を設定する。このような処理が図48Aに示されるネットワークモデルに対して行われることにより図48Bに示されるネットワークモデルが構築される。 FIGS. 48A to 48F are diagrams showing an example of the procedure for computing an initial flow that satisfies the lower-bound flow restrictions. First, the determination unit 303 constructs the network model shown in FIG. 48A based on the model information of FIG. 46. In this network model, start points s1 and s2 are set, an end point t1 corresponding to the start point s1 is set, and an end point t2 corresponding to the start point s2 is set. Furthermore, the determination unit 303 sets a virtual start point s* and a virtual end point t* for the network model shown in FIG. 48A. For each edge to which a flow restriction is attached, the determination unit 303 sets its new flow upper limit to the difference between the original upper limit and the original lower limit, and sets its new flow lower limit to 0. Applying this processing to the network model shown in FIG. 48A yields the network model shown in FIG. 48B.

 決定部303は、下限流量制限が付与されているs2とMyJob2を接続する辺の終点と仮想始点s*との間、及び、s2と仮想終点t*との間をそれぞれ接続する。具体的には、前述の各頂点の間に、所定の流量上限値が設定された辺が追加される。この所定の流量上限値とは、下限流量制限が付与されている当該辺に設定されていた変更前の流量下限値である。また、決定部303は、終点t2と始点s2との間を接続する。具体的には終点t2と始点s2との間に、流量上限値が無限大である辺を追加する。このような処理が図48Bに示されたネットワークモデルに対して実行されることにより、図48Cに示されるネットワークモデルが構築される。 The determination unit 303 connects the virtual start point s* to the head (end point) of the edge from s2 to MyJob2, to which the lower-bound flow restriction is attached, and connects s2 to the virtual end point t*. Specifically, an edge with a predetermined flow upper limit is added between each of these pairs of vertices; this predetermined upper limit is the original flow lower limit that had been set on the edge carrying the lower-bound restriction. The determination unit 303 also connects the end point t2 to the start point s2; specifically, an edge whose flow upper limit is infinite is added between the end point t2 and the start point s2. Applying this processing to the network model shown in FIG. 48B yields the network model shown in FIG. 48C.
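The construction of FIGS. 48B and 48C corresponds to the standard reduction from a network with lower-bound flow restrictions to a plain maximum-flow instance. A minimal sketch follows; the edge list, the function name, and the unbounded upper limit on the (s2, MyJob2) edge are illustrative assumptions, not part of the specification:

```python
INF = float('inf')

def reduce_lower_bounds(edges):
    """Standard reduction of lower-bound flow restrictions to a plain
    max-flow network: each edge (u, v) with lower bound low and capacity
    cap keeps capacity cap - low, while low units are rerouted through
    the virtual terminals 's*' -> v and u -> 't*'."""
    out = []
    for u, v, low, cap in edges:
        out.append((u, v, cap - low))
        if low > 0:
            out.append(('s*', v, low))   # virtual start point feeds the head
            out.append((u, 't*', low))   # tail drains into the virtual end point
    return out

# The edge from s2 to MyJob2 carries the 50 MB/s lower bound of FIG. 48B;
# its upper limit is treated as unbounded here (an assumption).
edges = [('s2', 'MyJob2', 50, INF)]
transformed = reduce_lower_bounds(edges)
# As in FIG. 48C, an infinite-capacity edge from t2 back to s2 closes the loop.
transformed.append(('t2', 's2', INF))
```

A saturating s*-t*-flow in the transformed network then corresponds to a feasible flow satisfying the lower bounds in the original network, as the walkthrough describes.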

 決定部303は、図48Cに示されるネットワークモデルに対して、s*から出る辺及びt*に入る辺の流量が飽和するs*-t*-フローを求める。なお、該当するフローが存在しないことは、下限流量制限を満たす解が元のネットワークモデルに存在しないことを示す。第2実施例では、図48Dに示される経路(s*、MyJob2、n4、n3、t2、s2、t*)がs*-t*-フローに該当する。 For the network model shown in FIG. 48C, the determination unit 303 finds an s*-t*-flow that saturates the edges leaving s* and the edges entering t*. Note that if no such flow exists, no solution satisfying the lower-bound flow restrictions exists in the original network model. In Example 2, the path (s*, MyJob2, n4, n3, t2, s2, t*) shown in FIG. 48D constitutes such an s*-t*-flow.

 決定部303は、ネットワークモデルから、追加された頂点及び辺を削除し、流量制限が付与されている当該辺の流量制限値を変更前の元の値に戻す。そして、決定部303は、下限流量制限が付与されている当該辺に対し、流量下限値の分だけフローを与える。具体的には、決定部303は、図48Aに示されるネットワークにおいて、図48Eに示されるように、追加された頂点及び辺が削除された現実の経路(s2、MyJob2、n4、n3、t2)に50MB/sのフローを与える。結果、図48Fに示されるネットワークの残余グラフが特定される。この経路(s2、MyJob2、n4、n3、t2)が、下限流量制限を満たす初期フロー(図49A)である。 The determination unit 303 deletes the added vertices and edges from the network model and restores the flow restriction values of the restricted edges to their original values. The determination unit 303 then gives each edge carrying a lower-bound flow restriction a flow equal to its flow lower limit. Specifically, in the network shown in FIG. 48A, the determination unit 303 gives a flow of 50 MB/s to the actual path (s2, MyJob2, n4, n3, t2) from which the added vertices and edges have been deleted, as shown in FIG. 48E. As a result, the residual graph of the network shown in FIG. 48F is identified. This path (s2, MyJob2, n4, n3, t2) is the initial flow that satisfies the lower-bound flow restrictions (FIG. 49A).

 決定部303は、図49B(図48Fと同様)に示される残余グラフからフロー増加路を特定し、その経路に対してフローを与える。具体的には、決定部303は、図49Bに示される残余グラフに基づいて、図49Cに示されるような経路(s1、MyJob1、n2、n1、t1)に25MB/sのフローを与える。結果、決定部303は、図49Dに示されるネットワークの残余グラフを特定する。 The determination unit 303 identifies a flow increasing path from the residual graph shown in FIG. 49B (similar to FIG. 48F) and gives a flow to the path. Specifically, the determination unit 303 gives a flow of 25 MB / s to the route (s1, MyJob1, n2, n1, t1) as shown in FIG. 49C based on the residual graph shown in FIG. 49B. As a result, the determination unit 303 specifies the network residual graph shown in FIG. 49D.

 決定部303は、図49Dに示される残余グラフからフロー増加路を特定し、その経路に対してフローを与える。決定部303は、図49Dに示される残余グラフに基づいて、図49Eに示されるように、経路(s2、MyJob2、n4、n3、t2)に50MB/sのフローを追加で与える。結果、決定部303は、図49Fに示されるネットワークの残余グラフを特定する。 The determining unit 303 identifies a flow increasing path from the residual graph shown in FIG. 49D and gives a flow to the path. Based on the residual graph shown in FIG. 49D, the determination unit 303 additionally gives a flow of 50 MB / s to the route (s2, MyJob2, n4, n3, t2) as shown in FIG. 49E. As a result, the determination unit 303 specifies the network residual graph shown in FIG. 49F.

 決定部303は、図49Fに示される残余グラフからフロー増加路を特定し、その経路に対してフローを与える。これにより、決定部303は、図49Fに示される残余グラフに基づいて、図49Gに示されるように、経路(s2、MyJob2、n2、n1、t2)に50MB/sのフローを追加で与える。結果、決定部303は、図49Hに示されるネットワークの残余グラフを特定する。 The determining unit 303 identifies a flow increasing path from the residual graph shown in FIG. 49F and gives a flow to the path. Accordingly, the determination unit 303 additionally gives a flow of 50 MB / s to the route (s2, MyJob2, n2, n1, t2) as illustrated in FIG. 49G based on the residual graph illustrated in FIG. 49F. As a result, the determination unit 303 identifies the network residual graph shown in FIG. 49H.

 決定部303は、図49Hに示される残余グラフからフロー増加路を特定し、その経路に対してフローを与える。決定部303は、図49Hに示される残余グラフに基づいて、図49Iに示されるように、経路(s2、MyJob2、n2、n3、t2)に50MB/sのフローを追加で与える。結果、決定部303は、図49Jに示されるネットワークの残余グラフを特定する。 The determining unit 303 identifies a flow increasing path from the residual graph shown in FIG. 49H and gives a flow to the path. Based on the residual graph shown in FIG. 49H, the determination unit 303 additionally gives a flow of 50 MB / s to the route (s2, MyJob2, n2, n3, t2) as shown in FIG. 49I. As a result, the determination unit 303 identifies the network residual graph shown in FIG. 49J.

 ここで、図49Jに示されるようにこれ以上のフロー増加路は存在しないため、決定部303は処理を終了する。結果、得られた各経路及び各データ流量の組み合わせ情報がデータフロー情報となる。 Here, since there is no more flow increase path as shown in FIG. 49J, the determination unit 303 ends the process. As a result, the obtained combination information of each route and each data flow rate becomes data flow information.
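As a consistency check, the per-job totals of the flows read off FIGS. 49A to 49J can be compared against the unit processing amount restrictions of FIG. 43 (MyJob1 at most 25 MB/s, MyJob2 at least 50 MB/s). The sketch below is illustrative only; the flow list is transcribed from the walkthrough above.

```python
# Flow paths and rates (MB/s) transcribed from FIGS. 49A to 49J
flows = [
    (('s2', 'MyJob2', 'n4', 'n3', 't2'), 50),  # initial flow, FIG. 49A
    (('s1', 'MyJob1', 'n2', 'n1', 't1'), 25),  # FIG. 49C
    (('s2', 'MyJob2', 'n4', 'n3', 't2'), 50),  # FIG. 49E
    (('s2', 'MyJob2', 'n2', 'n1', 't2'), 50),  # FIG. 49G
    (('s2', 'MyJob2', 'n2', 'n3', 't2'), 50),  # FIG. 49I
]

def per_job_totals(flows):
    """Sum the flow rates per job; the second vertex of each path
    is the job vertex in the conceptual model."""
    totals = {}
    for path, rate in flows:
        totals[path[1]] = totals.get(path[1], 0) + rate
    return totals

totals = per_job_totals(flows)
# FIG. 43: MyJob1 has a 25 MB/s maximum, MyJob2 a 50 MB/s minimum
assert totals['MyJob1'] <= 25
assert totals['MyJob2'] >= 50
```

The totals come out to 25 MB/s for MyJob1 and 200 MB/s for MyJob2, so both restrictions of FIG. 43 are satisfied.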

 図50は、実施例2において生成されたデータフロー情報を示す図である。決定部303は、このように決定されたデータフロー情報に基づいて、処理プログラムをサーバn1及びn3に送信する。更に、決定部303は、処理サーバn1及びn3に、処理プログラムに対応する決定情報を送信することによって、データ受信と処理実行とを指示する。 FIG. 50 is a diagram illustrating the data flow information generated in Example 2. Based on the data flow information determined in this way, the determination unit 303 transmits the processing program to the servers n1 and n3. Furthermore, the determination unit 303 instructs the processing servers n1 and n3 to receive data and execute processing by transmitting to them decision information corresponding to the processing program.

 図51は、実施例2において実施されるデータ送受信を概念的に示す図である。決定情報を受信した処理サーバn1は、データサーバn2の処理データ格納部342内のデータを取得する。処理実行部p1は取得されたデータに対しMyJob1の処理を実行する。また、処理サーバn1は、データサーバn2の処理データ格納部342内のデータを取得する。処理実行部p1は取得されたデータに対し、MyJob2の処理を実行する。一方、処理サーバn3は、データサーバn2の処理データ格納部342内のデータを取得する。処理実行部p3は取得されたデータに対しMyJob2の処理を実行する。また、処理サーバn3は、データサーバn4の処理データ格納部342内のデータを取得する。処理実行部p3は取得されたデータに対しMyJob2の処理を実行する。 FIG. 51 is a diagram conceptually illustrating data transmission / reception performed in the second embodiment. The processing server n1 that has received the decision information acquires data in the processing data storage unit 342 of the data server n2. The process execution unit p1 executes MyJob1 process on the acquired data. In addition, the processing server n1 acquires data in the processing data storage unit 342 of the data server n2. The process execution unit p1 executes MyJob2 process on the acquired data. On the other hand, the processing server n3 acquires data in the processing data storage unit 342 of the data server n2. The process execution unit p3 executes MyJob2 process on the acquired data. The processing server n3 acquires data in the processing data storage unit 342 of the data server n4. The process execution unit p3 executes MyJob2 process on the acquired data.

 実施例3では、上述の第3実施形態の具体例を示す。実施例3における分散システム350の構成は実施例1と同様である(図34参照)。また、実施例3におけるサーバ状態格納部3060の状態についても実施例1と同様である(図35参照)。また、実施例3における入出力通信路情報格納部3080の状態についても実施例1と同様である(図36参照)。 Example 3 shows a specific example of the above-described third embodiment. The configuration of the distributed system 350 in the third embodiment is the same as that in the first embodiment (see FIG. 34). Further, the state of the server state storage unit 3060 in the third embodiment is the same as that in the first embodiment (see FIG. 35). The state of the input / output communication path information storage unit 3080 in the third embodiment is also the same as that in the first embodiment (see FIG. 36).

 図52は、実施例3におけるデータ所在格納部3070に格納される情報を示す図である。図52によれば、処理対象データ(MyDataSet1)はファイルda及びデータdbに分割され格納されている。ファイルdaは、多重化されていない単一のファイルであり、サーバn2のディスクD2内に格納されている。一方、データdbはファイルdb1及びファイルdb2に2重化されており、ファイルdb1はサーバn2のディスクD2に格納され、ファイルdb2はサーバn4のディスクD4に格納されている。 FIG. 52 is a diagram illustrating information stored in the data location storage unit 3070 according to the third embodiment. According to FIG. 52, the processing target data (MyDataSet1) is divided and stored into a file da and data db. The file da is a single file that is not multiplexed and is stored in the disk D2 of the server n2. On the other hand, the data db is duplicated into a file db1 and a file db2, the file db1 is stored on the disk D2 of the server n2, and the file db2 is stored on the disk D4 of the server n4.

 マスタサーバ300のサーバ状態格納部3060、入出力通信路情報格納部3080、及び、データ所在格納部3070が、図35、図36及び図52に示される状態である場合に、クライアント360によって処理対象データ(MyDataSet1)を使用する処理プログラムの実行を要求するための要求情報がマスタサーバ300に送信されたと仮定する。以下、この状況における分散システム350の動作について説明する。 Assume that, while the server state storage unit 3060, the input/output communication path information storage unit 3080, and the data location storage unit 3070 of the master server 300 are in the states shown in FIGS. 35, 36, and 52, the client 360 transmits to the master server 300 request information requesting execution of a processing program that uses the processing target data (MyDataSet1). The operation of the distributed system 350 in this situation is described below.

 モデル生成部301は、図52のデータ所在格納部3070及び図35のサーバ状態格納部3060から、処理対象データが格納されているデータサーバ340の識別子の集合として{n2、n4}を取得し、処理可能な処理サーバ330の識別子の集合として{n1、n3}を取得する。モデル生成部301は、これら取得された各集合、図35のサーバ状態格納部3060に格納されている情報、図36の入出力通信路情報格納部3080に格納されている情報、及び、図52のデータ所在格納部3070に格納されている情報に基づいて、モデル情報を生成する。 The model generation unit 301 acquires {n2, n4} as the set of identifiers of the data servers 340 storing the processing target data, and {n1, n3} as the set of identifiers of the available processing servers 330, from the data location storage unit 3070 in FIG. 52 and the server state storage unit 3060 in FIG. 35. The model generation unit 301 generates model information based on these acquired sets, the information stored in the server state storage unit 3060 in FIG. 35, the information stored in the input/output communication path information storage unit 3080 in FIG. 36, and the information stored in the data location storage unit 3070 in FIG. 52.

 図53は、第3実施例において生成されるモデル情報を示す図である。図54は、図53により示されるモデル情報から構築される概念モデルを示す図である。図54で示される概念モデル上の各辺に付された値は、その通信路における現在の単位時間当たりのデータ転送量の最大値(転送量制約条件の上限値)、又は、その辺の始点に対応する処理サーバにおける現在の単位時間当たりのデータ処理量の最大値(処理量制約条件の上限値)を示す。 FIG. 53 is a diagram showing the model information generated in Example 3. FIG. 54 is a diagram showing the conceptual model constructed from the model information shown in FIG. 53. The value assigned to each edge of the conceptual model in FIG. 54 indicates either the current maximum data transfer amount per unit time on that communication channel (the upper limit of the transfer amount constraint), or the current maximum data processing amount per unit time of the processing server corresponding to the start point of that edge (the upper limit of the processing amount constraint).

 決定部303は、図53に示されるモデル情報に基づいて、ジョブの処理時間が最小となるように流量関数fを決定する。図55Aから図55Gは、第3実施例における最大流問題におけるフロー増加法による流量関数fの決定処理及びデータフロー情報の決定処理を概念的に示す図である。まず、決定部303は、図53のモデル情報に基づいて図55Aに示されるネットワークモデルを構築する。このネットワークモデルでは、始点sが設定され、終点tが設定される。決定部303は、図55Bに示されるように、経路(s、db、n2、n1、t)に50MB/sのフローを与える。結果、決定部303は、図55Cに示されるネットワークの残余グラフを特定する。 The determination unit 303 determines the flow function f based on the model information shown in FIG. 53 so that the job processing time is minimized. FIGS. 55A to 55G are diagrams conceptually showing the determination of the flow function f and of the data flow information by the flow-augmenting method for the maximum flow problem in Example 3. First, the determination unit 303 constructs the network model shown in FIG. 55A based on the model information of FIG. 53. In this network model, a start point s and an end point t are set. As shown in FIG. 55B, the determination unit 303 gives a flow of 50 MB/s to the path (s, db, n2, n1, t). As a result, the determination unit 303 identifies the residual graph of the network shown in FIG. 55C.

 決定部303は、図55Cに示される残余グラフからフロー増加路を特定し、図55Dに示されるように、経路(s、da、n2、n3、t)に50MB/sのフローを与える。結果、決定部303は、図55Eに示されるネットワークの残余グラフを特定する。 The determining unit 303 identifies the flow increasing path from the residual graph shown in FIG. 55C and gives a flow of 50 MB / s to the path (s, da, n2, n3, t) as shown in FIG. 55D. As a result, the determination unit 303 identifies the network residual graph shown in FIG. 55E.

 決定部303は、図55Eに示される残余グラフからフロー増加路を特定し、図55Fに示されるように、経路(s、db、n4、n3、t)に100MB/sのフローを与える。結果、決定部303は、図55Gに示されるネットワークの残余グラフを特定する。 The determination unit 303 identifies the flow increasing path from the residual graph shown in FIG. 55E, and gives a flow of 100 MB / s to the path (s, db, n4, n3, t) as shown in FIG. 55F. As a result, the determination unit 303 identifies the network residual graph shown in FIG. 55G.

 ここで、図55Gに示されるようにこれ以上のフロー増加路は存在しないため、決定部303は処理を終了する。結果、得られた各経路及び各データ流量の組み合わせ情報がデータフロー情報となる。 Here, since there is no more flow increase path as shown in FIG. 55G, the determination unit 303 ends the process. As a result, the obtained combination information of each route and each data flow rate becomes data flow information.

 図56は、実施例3におけるデータフロー情報を示す図である。このようなデータフロー情報とデータ所在格納部3070に格納されているデータサイズの情報とに基づいて、決定部303は、2重化されているデータdbについて、各処理サーバ330が処理するデータを特定する。具体的には、決定部303は、データdbを経路情報に含むデータフロー情報(Flow1及びFlow3)を特定し、特定された各データフロー情報に設定されている単位処理量(50MB/s及び100MB/s)を、当該各データフロー情報の経路情報に含まれる処理サーバ330(n1及びn3)毎に集計する。ここでは、処理サーバn1及びn3に関する各単位処理量が50MB/s及び100MB/sとなる。決定部303は、データdbをその集計された処理サーバn1及びn3の各単位処理量の比(1:2)で分割し、分割データdb(db1:6GB×1/3=2GB)とそれを格納するデータサーバn2とを対応付け、分割データdb(db2:6GB×2/3=4GB)とそれを格納するデータサーバn4とを対応付ける。これにより、実施例3では、マスタサーバ300は、処理サーバn1に、データサーバn2に格納されるファイルdb1の0バイト目から2ギガバイト目までを処理させ、処理サーバn3に、データサーバn4に格納されるファイルdb2の2ギガバイト目から6ギガバイト目までを処理させることを決定する。この情報は処理プログラムに対応する決定情報に含まれる。 FIG. 56 is a diagram showing the data flow information in Example 3. Based on this data flow information and the data size information stored in the data location storage unit 3070, the determination unit 303 identifies, for the duplicated data db, which data each processing server 330 processes. Specifically, the determination unit 303 identifies the data flow information entries (Flow1 and Flow3) whose path information includes the data db, and totals the unit processing amounts set in those entries (50 MB/s and 100 MB/s) for each processing server 330 (n1 and n3) included in their path information. Here, the unit processing amounts for the processing servers n1 and n3 are 50 MB/s and 100 MB/s, respectively. The determination unit 303 divides the data db according to the ratio of these totaled unit processing amounts of the processing servers n1 and n3 (1:2), associating the divided data db (db1: 6 GB × 1/3 = 2 GB) with the data server n2 that stores it, and the divided data db (db2: 6 GB × 2/3 = 4 GB) with the data server n4 that stores it. Thus, in Example 3, the master server 300 decides to have the processing server n1 process the data from byte 0 to the 2-gigabyte point of the file db1 stored on the data server n2, and to have the processing server n3 process the data from the 2-gigabyte point to the 6-gigabyte point of the file db2 stored on the data server n4. This information is included in the decision information corresponding to the processing program.
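The proportional split described above can be expressed as a short sketch. The helper name and the integer-division rounding rule are illustrative assumptions; only the 1:2 ratio and the 6 GB size of the data db come from the walkthrough.

```python
def split_ranges(total_bytes, rates):
    """Split a duplicated data block among processing servers in
    proportion to their aggregated unit processing amounts, returning
    contiguous byte ranges [start, end) per server."""
    total_rate = sum(rate for _, rate in rates)
    ranges, start = {}, 0
    for server, rate in rates:
        size = total_bytes * rate // total_rate
        ranges[server] = (start, start + size)
        start += size
    # hand any integer-division remainder to the last server
    last_server = rates[-1][0]
    ranges[last_server] = (ranges[last_server][0], total_bytes)
    return ranges

GB = 1024 ** 3
# The data db is 6 GB; n1 and n3 process it at 50 MB/s and 100 MB/s (ratio 1:2).
ranges = split_ranges(6 * GB, [('n1', 50), ('n3', 100)])
# n1 handles bytes 0 .. 2 GB (file db1 on n2); n3 handles 2 GB .. 6 GB (db2 on n4)
```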

 図57は、実施例3において実施されるデータ送受信を概念的に示す図である。決定情報を受信した処理サーバn1は、データサーバn2の処理データ格納部342内のデータを取得する。処理実行部p1は取得されたデータの0バイト目から2ギガバイト目までに対し処理を実行する。処理サーバn3は、データサーバn2の処理データ格納部342内のデータを取得する。処理実行部p3は取得されたデータの処理を実行する。処理サーバn3は、データサーバn4の処理データ格納部342内のデータを取得する。処理実行部p3は取得されたデータの2ギガバイト目から6ギガバイト目までに対し処理を実行する。 FIG. 57 is a diagram conceptually illustrating data transmission / reception performed in the third embodiment. The processing server n1 that has received the decision information acquires data in the processing data storage unit 342 of the data server n2. The process execution unit p1 executes a process on the acquired data from the 0th byte to the 2nd gigabyte. The processing server n3 acquires data in the processing data storage unit 342 of the data server n2. The process execution unit p3 executes processing of the acquired data. The processing server n3 acquires data in the processing data storage unit 342 of the data server n4. The process execution unit p3 executes a process on the acquired data from the 2nd to 6th gigabytes.

 実施例4では、上述の第4実施形態の具体例を示す。 Example 4 shows a specific example of the above-described fourth embodiment.

 図58は、実施例4における分散システム350の構成を概念的に示す図である。実施例4における本分散システム350は、スイッチsw、サーバn1からn4を有する。サーバn1からn4は、スイッチswを介して相互に接続されている。サーバn1からn4は、状況に応じて処理サーバ330及びデータサーバ340として機能する。サーバn1からn4は、処理データ格納部342として、ディスクD1からD4をそれぞれ有する。実施例4では、サーバn1からn4のいずれか1つがマスタサーバ300として機能する。サーバn1からn3は利用可能な処理実行部332としてp1からp3を有する。 FIG. 58 is a diagram conceptually illustrating the configuration of the distributed system 350 in the fourth embodiment. The distributed system 350 according to the fourth embodiment includes a switch sw and servers n1 to n4. Servers n1 to n4 are connected to each other via a switch sw. The servers n1 to n4 function as the processing server 330 and the data server 340 depending on the situation. The servers n1 to n4 have disks D1 to D4 as processing data storage units 342, respectively. In the fourth embodiment, any one of the servers n1 to n4 functions as the master server 300. The servers n1 to n3 have p1 to p3 as available processing execution units 332.

 図59は、実施例4におけるサーバ状態格納部3060に格納される情報を示す図である。実施例4では、サーバn1からn3が利用可能な処理サーバ330であり、サーバn1からn3の各処理可能量は、50MB/s、50MB/s及び100MB/sである。 FIG. 59 is a diagram illustrating information stored in the server state storage unit 3060 according to the fourth embodiment. In the fourth embodiment, the servers n1 to n3 are available processing servers 330, and the processable amounts of the servers n1 to n3 are 50 MB / s, 50 MB / s, and 100 MB / s, respectively.

 図60は、実施例4における入出力通信路情報格納部3080に格納される情報を示す図である。サーバn2のディスクD2及びサーバn4のディスクD4の各読み出し速度はそれぞれ100MB/sである。サーバn2からスイッチswへのデータ送信時の可用帯域及びサーバn4からスイッチswへのデータ送信時の可用帯域は100MB/sである。スイッチswからサーバn1へのデータ送信時の可用帯域、スイッチswからサーバn2へのデータ送信時の可用帯域、スイッチswからサーバn3へのデータ送信時の可用帯域は100MB/sである。 FIG. 60 is a diagram illustrating information stored in the input / output communication path information storage unit 3080 according to the fourth embodiment. Each reading speed of the disk D2 of the server n2 and the disk D4 of the server n4 is 100 MB / s. The available bandwidth at the time of data transmission from the server n2 to the switch sw and the available bandwidth at the time of data transmission from the server n4 to the switch sw are 100 MB / s. The available bandwidth at the time of data transmission from the switch sw to the server n1, the available bandwidth at the time of data transmission from the switch sw to the server n2, and the available bandwidth at the time of data transmission from the switch sw to the server n3 are 100 MB / s.

 実施例4におけるデータ所在格納部3070の状態は、実施例1と同様である(図37参照)。 The state of the data location storage unit 3070 in the fourth embodiment is the same as that in the first embodiment (see FIG. 37).

 マスタサーバ300のサーバ状態格納部3060、入出力通信路情報格納部3080、及び、データ所在格納部3070が、図59、図60及び図37に示す状態である場合に、クライアント360によって処理対象データ(MyDataSet1)を使用する処理プログラムの実行を要求するための要求情報がマスタサーバ300に送信されたと仮定する。以下、この状況における分散システム350の動作について説明する。 Assume that, while the server state storage unit 3060, the input/output communication path information storage unit 3080, and the data location storage unit 3070 of the master server 300 are in the states shown in FIGS. 59, 60, and 37, the client 360 transmits to the master server 300 request information requesting execution of a processing program that uses the processing target data (MyDataSet1). The operation of the distributed system 350 in this situation is described below.

 マスタサーバ300のモデル生成部301は、図37のデータ所在格納部3070及び図59のサーバ状態格納部3060から、処理対象データが格納されているデータサーバ340の識別子の集合として{n2、n4}を取得し、利用可能な処理サーバ330の識別子の集合として{n1、n2、n3}を取得する。モデル生成部301は、これら取得された各集合、図59のサーバ状態格納部3060に格納されている情報、図60の入出力通信路情報格納部3080に格納されている情報に基づいて、モデル情報を生成する。 The model generation unit 301 of the master server 300 acquires, from the data location storage unit 3070 in FIG. 37 and the server state storage unit 3060 in FIG. 59, {n2, n4} as the set of identifiers of the data servers 340 storing the processing target data, and {n1, n2, n3} as the set of identifiers of the available processing servers 330. The model generation unit 301 generates model information based on these acquired sets, the information stored in the server state storage unit 3060 in FIG. 59, and the information stored in the input/output communication path information storage unit 3080 in FIG. 60.

 図61は、実施例4において生成されるモデル情報を示す図である。図62は、図61により示されるモデル情報から構築される概念モデルを示す図である。図62で示される概念モデル上の各辺に付された値は、その通信路における現在の単位時間当たりのデータ転送量の最大値(転送量制約条件の上限値)、又は、その辺の始点に対応する処理サーバにおける現在の単位時間当たりのデータ処理量の最大値(処理量制約条件の上限値)を示す。マスタサーバ300の決定部303は、図61のモデル情報に基づいてジョブの処理時間が最小となるように流量関数fを決定する。図63Aから63Gは、実施例4における最大流問題におけるフロー増加法による流量関数fの決定処理及びデータフロー情報の決定処理を概念的に示す図である。 FIG. 61 is a diagram illustrating the model information generated in the fourth embodiment. FIG. 62 is a diagram illustrating a conceptual model constructed from the model information shown in FIG. 61. The value given to each side on the conceptual model shown in FIG. 62 indicates the maximum value of the current data transfer amount per unit time on the communication path (the upper limit value of the transfer amount constraint condition), or the maximum value of the current data processing amount per unit time in the processing server corresponding to the start point of the side (the upper limit value of the processing amount constraint condition). The determination unit 303 of the master server 300 determines the flow rate function f based on the model information of FIG. 61 so that the processing time of the job is minimized. FIGS. 63A to 63G are diagrams conceptually illustrating the determination processing of the flow rate function f and of the data flow information by the flow augmentation method for the maximum flow problem in the fourth embodiment.

 決定部303は、図61のモデル情報に基づいて図63Aに示されるネットワークモデルを構築する。このネットワークモデルでは、始点sが設定され、終点tが設定される。決定部303は、図63Bに示されるように、経路(s、D2、ON2、n2、t)に50MB/sのフローを与える。結果、決定部303は、図63Cに示されるネットワークの残余グラフを特定する。 The determination unit 303 constructs the network model shown in FIG. 63A based on the model information of FIG. In this network model, a start point s is set and an end point t is set. The determination unit 303 gives a flow of 50 MB / s to the route (s, D2, ON2, n2, t) as illustrated in FIG. 63B. As a result, the determination unit 303 specifies the residual graph of the network illustrated in FIG. 63C.

 決定部303は、図63Cに示される残余グラフからフロー増加路を特定し、図63Dに示されるように、経路(s、D2、ON2、i2sw、o1sw、n1、t)に50MB/sのフローを与える。結果、決定部303は、図63Eに示されるネットワークの残余グラフを特定する。 The determination unit 303 identifies a flow increasing path from the residual graph illustrated in FIG. 63C and, as illustrated in FIG. 63D, gives a flow of 50 MB/s to the path (s, D2, ON2, i2sw, o1sw, n1, t). As a result, the determination unit 303 identifies the residual graph of the network illustrated in FIG. 63E.

 決定部303は、図63Eに示される残余グラフからフロー増加路を特定し、図63Fに示されるように、経路(s、D4、ON4、i4sw、o3sw、n3、t)に50MB/sのフローを与える。結果、決定部303は、図63Gに示されるネットワークの残余グラフを特定する。 The determination unit 303 identifies a flow increasing path from the residual graph illustrated in FIG. 63E and, as illustrated in FIG. 63F, gives a flow of 50 MB/s to the path (s, D4, ON4, i4sw, o3sw, n3, t). As a result, the determination unit 303 identifies the residual graph of the network illustrated in FIG. 63G.

 ここで、図63Gに示されるようにこれ以上のフロー増加路は存在しないため、決定部303は、処理を終了する。結果、得られた各経路及び各データ流量の組み合わせ情報がデータフロー情報となる。 Here, since there is no more flow increase path as shown in FIG. 63G, the determination unit 303 ends the process. As a result, the obtained combination information of each route and each data flow rate becomes data flow information.
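The procedure of FIGS. 63A to 63G is an instance of the standard augmenting-path (Edmonds-Karp) method for the maximum flow problem. As a non-authoritative sketch, the following code runs that method on a simplified version of the network of FIG. 62; the vertex set and capacities below are illustrative assumptions (local disk access and the switch are collapsed into single edges), so the resulting flow decomposition is not identical to that of FIGS. 63B to 63F:

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp: repeatedly find a shortest augmenting path in the
    residual graph and push the bottleneck flow along it.
    Note: this mutates `cap` by adding zero-capacity reverse edges."""
    flow = {u: {v: 0 for v in cap[u]} for u in cap}
    for u in list(cap):                      # ensure residual (reverse) edges exist
        for v in list(cap[u]):
            cap.setdefault(v, {}).setdefault(u, 0)
            flow.setdefault(v, {}).setdefault(u, 0)
    total = 0
    while True:
        # BFS for an augmenting path s -> t in the residual graph
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in cap[u]:
                if v not in parent and cap[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return total, flow               # no flow increasing path remains
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] - flow[u][v] for u, v in path)
        for u, v in path:
            flow[u][v] += push               # forward edge
            flow[v][u] -= push               # residual (reverse) edge
        total += push

# Simplified capacities (MB/s): disks D2/D4 read at 100, every switch link
# runs at 100, and servers n1/n2/n3 process 50/50/100 (FIGS. 59-60).
cap = {
    "s":  {"D2": 100, "D4": 100},
    "D2": {"n2": 100, "sw": 100},            # local access on n2, or via the switch
    "D4": {"sw": 100},
    "sw": {"n1": 100, "n2": 100, "n3": 100},
    "n1": {"t": 50}, "n2": {"t": 50}, "n3": {"t": 100},
}
total, _ = max_flow(cap, "s", "t")
print(total)  # 200 under these simplified capacities
```

Each loop iteration corresponds to one "give a flow, then identify the residual graph" step in the description above, and the loop ends exactly when no flow increasing path remains.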

 図64は、実施例4におけるデータフロー情報を示す図である。決定部303は、このように決定されたデータフロー情報に基づいて、処理プログラムを処理サーバn1からn3に送信する。更に、決定部303は、処理サーバn1からn3に、処理プログラムに対応する決定情報を送信することによって、データ受信と処理実行とを指示する。 FIG. 64 is a diagram illustrating the data flow information in the fourth embodiment. Based on the data flow information determined in this way, the determination unit 303 transmits the processing program to the processing servers n1 to n3. Furthermore, the determination unit 303 instructs data reception and processing execution by transmitting decision information corresponding to the processing program to the processing servers n1 to n3.

 図65は、実施例4において実施されるデータ送受信を概念的に示す図である。決定情報を受信した処理サーバn1は、データサーバn2の処理データ格納部342内のデータを取得する。処理実行部p1は取得したデータの処理を実行する。処理サーバn2は、データサーバn2の処理データ格納部342内のデータを取得する。処理実行部p2は取得したデータの処理を実行する。処理サーバn3は、データサーバn4の処理データ格納部342内のデータを取得する。処理実行部p3は取得したデータの処理を実行する。 FIG. 65 is a diagram conceptually illustrating data transmission / reception performed in the fourth embodiment. The processing server n1 that has received the decision information acquires data in the processing data storage unit 342 of the data server n2. The process execution unit p1 executes the process of the acquired data. The processing server n2 acquires data in the processing data storage unit 342 of the data server n2. The process execution unit p2 executes the process of the acquired data. The processing server n3 acquires data in the processing data storage unit 342 of the data server n4. The process execution unit p3 executes the process of the acquired data.

 実施例5では、上述の第5実施形態の具体例を示す。実施例5における分散システム350の構成は実施例1と同様である(図34参照)。また、実施例5における入出力通信路情報格納部3080の状態についても実施例1と同様である(図36参照)。実施例5におけるサーバ状態格納部3060の状態については実施例2と同様である(図44参照)。実施例5におけるデータ所在格納部3070の状態については実施例2と同様である(図45参照)。 Example 5 shows a specific example of the fifth embodiment described above. The configuration of the distributed system 350 in the fifth embodiment is the same as that in the first embodiment (see FIG. 34). The state of the input / output communication path information storage unit 3080 in the fifth embodiment is also the same as that in the first embodiment (see FIG. 36). The state of the server state storage unit 3060 in the fifth embodiment is the same as that in the second embodiment (see FIG. 44). The state of the data location storage unit 3070 in the fifth embodiment is the same as that in the second embodiment (see FIG. 45).

 図66は、実施例5におけるジョブ情報格納部3040に格納される情報を示す図である。実施例5では、プログラムを実行する単位として、ジョブMyJob1とジョブMyJob2が投入されている。実施例5では、ジョブMyJob1及びMyJob2には最低単位処理量及び最大単位処理量が設定されていない。 FIG. 66 is a diagram illustrating information stored in the job information storage unit 3040 according to the fifth embodiment. In the fifth embodiment, a job MyJob1 and a job MyJob2 are input as units for executing a program. In the fifth embodiment, the minimum unit processing amount and the maximum unit processing amount are not set for the jobs MyJob1 and MyJob2.

 マスタサーバ300のジョブ情報格納部3040、サーバ状態格納部3060、入出力通信路情報格納部3080、及び、データ所在格納部3070が、図66、図44、図36及び図45に示す状態である場合に、クライアント360によって、処理対象データ(MyDataSet1)を使用するジョブMyJob1及び処理対象データ(MyDataSet2)を使用するジョブMyJob2の実行を要求するための要求情報がマスタサーバ300に送信されたと仮定する。以下、この状況における分散システム350の動作について説明する。 When the job information storage unit 3040, the server state storage unit 3060, the input/output communication path information storage unit 3080, and the data location storage unit 3070 of the master server 300 are in the states shown in FIGS. 66, 44, 36, and 45, it is assumed that request information for requesting execution of the job MyJob1 using the processing target data (MyDataSet1) and the job MyJob2 using the processing target data (MyDataSet2) is transmitted to the master server 300 by the client 360. Hereinafter, the operation of the distributed system 350 in this situation will be described.

 モデル生成部301は、図66のジョブ情報格納部3040から、現在実行が指示されているジョブの集合として{MyJob1、MyJob2}を取得し、更に、各ジョブについて、ジョブが使用するデータの名称、最低単位処理量及び最大単位処理量をそれぞれ取得する。更に、モデル生成部301は、図45のデータ所在格納部3070及び図44のサーバ状態格納部3060から、処理対象データを格納するデータサーバの識別子の集合として{n2、n4}を取得し、処理可能な処理サーバ330の識別子の集合として{n1、n3}を取得する。また、モデル生成部301は、図44のサーバ状態格納部3060から、各ジョブに関するサーバn1及びサーバn3の各処理可能量情報をそれぞれ得る。 The model generation unit 301 acquires, from the job information storage unit 3040 in FIG. 66, {MyJob1, MyJob2} as the set of jobs whose execution is currently instructed, and further acquires, for each job, the name of the data used by the job, the minimum unit processing amount, and the maximum unit processing amount. Furthermore, the model generation unit 301 acquires, from the data location storage unit 3070 in FIG. 45 and the server state storage unit 3060 in FIG. 44, {n2, n4} as the set of identifiers of the data servers storing the processing target data, and {n1, n3} as the set of identifiers of the processing servers 330 capable of processing. The model generation unit 301 also obtains, from the server state storage unit 3060 in FIG. 44, the processable amount information of the servers n1 and n3 for each job.

 モデル生成部301は、取得された各集合、図36の入出力通信路情報格納部3080に格納された情報に基づいて、モデル情報を生成する。図67は、実施例5において生成されるモデル情報を示す図である。実施例5で生成されるモデル情報には、各処理サーバn1及びn3の識別子が含まれる行の次要素へのポインタに、各処理サーバn1及びn3の第2の識別子(n1'及びn3')が設定されている。また、各処理サーバn1及びn3の第2の識別子が含まれる行の次要素へのポインタに、各ジョブ(MyJob1、MyJob2)の終端を示す名称(MyJob1'、MyJob2')が設定されている。 The model generation unit 301 generates model information based on the acquired sets and the information stored in the input/output communication path information storage unit 3080 in FIG. 36. FIG. 67 is a diagram illustrating the model information generated in the fifth embodiment. In the model information generated in the fifth embodiment, the second identifiers (n1' and n3') of the processing servers n1 and n3 are set as the pointers to the next element in the rows containing the identifiers of the processing servers n1 and n3. In addition, the names (MyJob1', MyJob2') indicating the end points of the jobs (MyJob1, MyJob2) are set as the pointers to the next element in the rows containing the second identifiers of the processing servers n1 and n3.

 図68は、図67により示されるモデル情報から構築される概念モデルを示す図である。図68で示される概念モデル上の各辺に付された値は、その通信路における現在の単位時間当たりのデータ転送量の最大値(転送量制約条件の上限値)、又は、その辺の始点に対応する処理サーバにおける現在の単位時間当たりのデータ処理量の最大値(処理量制約条件の上限値)を示す。マスタサーバ300の決定部303は、図67のモデル情報に基づいてジョブの処理時間が最小となるように流量関数fを決定する。図69Aから69Gは、実施例5における最大流問題におけるフロー増加法による流量関数fの決定処理及びデータフロー情報の決定処理を概念的に示す図である。まず、決定部303は、図67のモデル情報に基づいて図69Aに示されるネットワークモデルを構築する。このネットワークモデルでは、始点s1及びs2が設定され、始点s1(MyJob1)に対応する終点t1が設定され、始点s2(MyJob2)に対応する終点t2が設定される。 FIG. 68 is a diagram illustrating a conceptual model constructed from the model information shown in FIG. 67. The value given to each side on the conceptual model shown in FIG. 68 indicates the maximum value of the current data transfer amount per unit time on the communication path (the upper limit value of the transfer amount constraint condition), or the maximum value of the current data processing amount per unit time in the processing server corresponding to the start point of the side (the upper limit value of the processing amount constraint condition). The determination unit 303 of the master server 300 determines the flow rate function f based on the model information of FIG. 67 so that the processing time of the jobs is minimized. FIGS. 69A to 69G are diagrams conceptually illustrating the determination processing of the flow rate function f and of the data flow information by the flow augmentation method for the maximum flow problem in the fifth embodiment. First, the determination unit 303 constructs the network model shown in FIG. 69A based on the model information of FIG. 67. In this network model, start points s1 and s2 are set, an end point t1 corresponding to the start point s1 (MyJob1) is set, and an end point t2 corresponding to the start point s2 (MyJob2) is set.

 決定部303は、図69Bに示されるように、経路(s1、MyJob1、n2、n1、n1'、t1)に50MB/sのフローを与える。結果、決定部303は、図69Cに示されるネットワークの残余グラフを特定する。 The determination unit 303 gives a flow of 50 MB / s to the route (s1, MyJob1, n2, n1, n1 ′, t1) as shown in FIG. 69B. As a result, the determination unit 303 identifies the network residual graph shown in FIG. 69C.

 決定部303は、図69Cに示される残余グラフからフロー増加路を特定し、図69Dに示されるように、経路(s2、MyJob2、n2、n3、n3'、t2)に50MB/sのフローを追加で与える。結果、決定部303は、図69Eに示されるネットワークの残余グラフを特定する。 The determination unit 303 identifies a flow increasing path from the residual graph illustrated in FIG. 69C and, as illustrated in FIG. 69D, gives an additional flow of 50 MB/s to the path (s2, MyJob2, n2, n3, n3', t2). As a result, the determination unit 303 identifies the residual graph of the network illustrated in FIG. 69E.

 決定部303は、図69Eに示される残余グラフからフロー増加路を特定し、図69Fに示されるように、経路(s2、MyJob2、n4、n3、n3'、t2)に100MB/sのフローを追加で与える。結果、決定部303は、図69Gに示されるネットワークの残余グラフを特定する。 The determination unit 303 identifies a flow increasing path from the residual graph illustrated in FIG. 69E and, as illustrated in FIG. 69F, gives an additional flow of 100 MB/s to the path (s2, MyJob2, n4, n3, n3', t2). As a result, the determination unit 303 identifies the residual graph of the network illustrated in FIG. 69G.

 ここで、図69Gに示されるようにこれ以上のフロー増加路は存在しないため、決定部303は処理を終了する。結果、得られた各経路及び各データ流量の組み合わせ情報がデータフロー情報となる。 Here, since there is no more flow increase path as shown in FIG. 69G, the determination unit 303 ends the process. As a result, the obtained combination information of each route and each data flow rate becomes data flow information.
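Because each job in FIG. 69A has its own start and end points (s1/t1 and s2/t2), an augmenting path must connect the terminals of the same job; simply collapsing the sources into a single super source would let flow entering at s1 leave at t2. The sketch below illustrates such a per-job augmenting-path search on an invented two-job network that shares one processing server; the vertices J1, J2, A, B, P and all capacities are hypothetical assumptions, not the model of FIG. 67:

```python
from collections import defaultdict, deque

def augment(cap, flow, src, dst):
    """Push one shortest augmenting path from src to dst through the
    residual graph; return the amount pushed (0 if no path remains)."""
    parent = {src: None}
    q = deque([src])
    while q and dst not in parent:
        u = q.popleft()
        for v in cap[u]:
            if v not in parent and cap[u][v] - flow[u][v] > 0:
                parent[v] = u
                q.append(v)
    if dst not in parent:
        return 0
    path, v = [], dst
    while parent[v] is not None:
        path.append((parent[v], v))
        v = parent[v]
    push = min(cap[u][v] - flow[u][v] for u, v in path)
    for u, v in path:
        flow[u][v] += push          # forward edge
        flow[v][u] -= push          # residual (reverse) edge
    return push

# Invented network: jobs J1 (s1 -> t1) and J2 (s2 -> t2) read data from
# servers A and B and share one processing server P (10 MB/s in total,
# modeled by the edge P -> P', like the n1 -> n1' edges of FIG. 69A).
cap = {
    "s1": {"A": 6}, "s2": {"B": 8},
    "A": {"P": 6}, "B": {"P": 8},
    "P": {"P'": 10},                # total processing capacity of P
    "P'": {"t1": 6, "t2": 8},       # per-job limits on P
}
for u in list(cap):                 # add zero-capacity reverse edges
    for v in list(cap[u]):
        cap.setdefault(v, {}).setdefault(u, 0)
flow = defaultdict(lambda: defaultdict(int))

# Augment each job only between its own terminals, as in FIGS. 69B-69F.
pushed = {"J1": 0, "J2": 0}
for job, (src, dst) in [("J1", ("s1", "t1")), ("J2", ("s2", "t2"))]:
    while (p := augment(cap, flow, src, dst)) > 0:
        pushed[job] += p
print(pushed)  # {'J1': 6, 'J2': 4}: J2 gets only what remains on P
```

The shared edge P -> P' is what makes the jobs contend for the same processing server: once J1 uses 6 of the 10 MB/s, J2 can be given at most the remaining 4 MB/s.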

 図70は、実施例5におけるデータフロー情報を示す図である。決定部303は、このように決定されたデータフロー情報に基づいて、処理プログラムを処理サーバn1及びn3に送信する。決定部303は、処理サーバn1及びn3に、処理プログラムに対応する決定情報を送信することによって、データ受信と処理実行とを指示する。 FIG. 70 is a diagram illustrating data flow information in the fifth embodiment. The determining unit 303 transmits the processing program to the processing servers n1 and n3 based on the data flow information determined in this way. The determination unit 303 instructs the data reception and processing execution by transmitting determination information corresponding to the processing program to the processing servers n1 and n3.

 図71は、実施例5において実施されるデータ送受信を概念的に示す図である。決定情報を受信した処理サーバn1は、データサーバn2の処理データ格納部342内のデータを取得する。処理実行部p1は取得されたデータの処理を実行する。処理サーバn3は、データサーバn2の処理データ格納部342内のデータ、及び、データサーバn4の処理データ格納部342内のデータを取得する。処理実行部p3は取得された各データの処理を実行する。 FIG. 71 is a diagram conceptually illustrating data transmission / reception performed in the fifth embodiment. The processing server n1 that has received the decision information acquires data in the processing data storage unit 342 of the data server n2. The process execution unit p1 executes the process of the acquired data. The processing server n3 acquires data in the processing data storage unit 342 of the data server n2 and data in the processing data storage unit 342 of the data server n4. The process execution unit p3 executes a process for each acquired data.

 実施例6では、上述の第6実施形態の具体例を示す。実施例6における分散システム350の構成は実施例1と同様である(図34参照)。また、実施例6における入出力通信路情報格納部3080の状態についても実施例1と同様である(図36参照)。実施例6におけるジョブ情報格納部3040の状態については実施例5と同様である(図66参照)。また、実施例6におけるデータ所在格納部3070の状態は実施例2と同様である(図45参照)。 Example 6 shows a specific example of the above-described sixth embodiment. The configuration of the distributed system 350 in the sixth embodiment is the same as that in the first embodiment (see FIG. 34). Further, the state of the input / output communication path information storage unit 3080 in the sixth embodiment is the same as that in the first embodiment (see FIG. 36). The state of the job information storage unit 3040 in the sixth embodiment is the same as that in the fifth embodiment (see FIG. 66). The state of the data location storage unit 3070 in the sixth embodiment is the same as that in the second embodiment (see FIG. 45).

 図72は、実施例6におけるサーバ状態格納部3060に格納される情報を示す図である。図72に示されるように、サーバn1に関しては、処理に使用できるリソースが1残っており、ジョブMyJob1及びMyJob2の各処理に対して単位処理量(1MB/s)当たり0.01及び0.02のリソースが利用される。一方、サーバn3に関しては、処理に使用できるリソースが0.5残っており、ジョブMyJob1及びMyJob2の処理に対して単位処理量(1MB/s)当たり0.002及び0.004のリソースが利用される。 FIG. 72 is a diagram illustrating information stored in the server state storage unit 3060 according to the sixth embodiment. As shown in FIG. 72, for the server n1, 1 unit of the resources usable for processing remains, and 0.01 and 0.02 of the resources are used per unit processing amount (1 MB/s) for the processing of the jobs MyJob1 and MyJob2, respectively. On the other hand, for the server n3, 0.5 of the resources usable for processing remains, and 0.002 and 0.004 of the resources are used per unit processing amount (1 MB/s) for the processing of the jobs MyJob1 and MyJob2, respectively.

 マスタサーバ300のジョブ情報格納部3040、サーバ状態格納部3060、入出力通信路情報格納部3080、及び、データ所在格納部3070が、図66、図72、図36及び図45に示す状態である場合に、クライアント360によって、処理対象データ(MyDataSet1)を使用するジョブMyJob1と、処理対象データ(MyDataSet2)を使用するジョブMyJob2との実行を要求するための要求情報がマスタサーバ300に送信されたと仮定する。以下、この状況における分散システム350の動作について説明する。 When the job information storage unit 3040, the server state storage unit 3060, the input/output communication path information storage unit 3080, and the data location storage unit 3070 of the master server 300 are in the states shown in FIGS. 66, 72, 36, and 45, it is assumed that request information for requesting execution of the job MyJob1 using the processing target data (MyDataSet1) and the job MyJob2 using the processing target data (MyDataSet2) is transmitted to the master server 300 by the client 360. Hereinafter, the operation of the distributed system 350 in this situation will be described.

 マスタサーバ300のモデル生成部301は、図66のジョブ情報格納部3040から、現在実行が指示されているジョブの集合として{MyJob1、MyJob2}を取得し、各ジョブに関し、ジョブが使用するデータの名称、最低単位処理量及び最大単位処理量をそれぞれ取得する。更に、モデル生成部301は、図45のデータ所在格納部3070及び図72のサーバ状態格納部3060から、処理対象データが格納されているデータサーバの識別子の集合として{n2、n4}を、処理可能な処理サーバ330の識別子の集合として{n1、n3}を取得する。また、モデル生成部301は、図72のサーバ状態格納部3060から、サーバn1及びサーバn3に関する、リソース残量、処理可能量情報、及び、処理負荷情報をそれぞれ取得する。 The model generation unit 301 of the master server 300 acquires, from the job information storage unit 3040 in FIG. 66, {MyJob1, MyJob2} as the set of jobs whose execution is currently instructed, and acquires, for each job, the name of the data used by the job, the minimum unit processing amount, and the maximum unit processing amount. Furthermore, the model generation unit 301 acquires, from the data location storage unit 3070 in FIG. 45 and the server state storage unit 3060 in FIG. 72, {n2, n4} as the set of identifiers of the data servers storing the processing target data, and {n1, n3} as the set of identifiers of the processing servers 330 capable of processing. The model generation unit 301 also acquires, from the server state storage unit 3060 in FIG. 72, the remaining resource amount, the processable amount information, and the processing load information for each of the servers n1 and n3.

 モデル生成部301は、このように取得された各集合及び図36の入出力通信路情報格納部3080に格納された情報に基づいて、モデル情報を生成する。図73は、実施例6において生成されるモデル情報を示す図である。図74は、図73により示されるモデル情報から構築される概念モデルを示す図である。図74で示される概念モデル上の各辺の値は、現在の単位時間当たりのデータ転送量の最大値(転送量制約条件の上限値)を示す。 The model generation unit 301 generates model information based on the sets acquired in this way and the information stored in the input/output communication path information storage unit 3080 in FIG. 36. FIG. 73 is a diagram illustrating the model information generated in the sixth embodiment. FIG. 74 is a diagram illustrating a conceptual model constructed from the model information shown in FIG. 73. The value of each side on the conceptual model shown in FIG. 74 indicates the maximum value of the current data transfer amount per unit time (the upper limit value of the transfer amount constraint condition).

 マスタサーバ300の決定部303は、図73のモデル情報と、図72のサーバ状態格納部3060から取得された、サーバn1及びn3に関するリソース残量、処理可能量情報及び処理負荷情報とに基づいて、ジョブの処理時間が最小となるように流量関数fを決定する。図75Aから図75Iは、実施例6における最大流問題におけるフロー増加法による流量関数fの決定処理及びデータフロー情報の決定処理を概念的に示す図である。 The determination unit 303 of the master server 300 is based on the model information of FIG. 73 and the remaining resource amount, processable amount information, and processing load information regarding the servers n1 and n3 acquired from the server state storage unit 3060 of FIG. The flow function f is determined so that the processing time of the job is minimized. 75A to 75I are diagrams conceptually showing a flow function f determination process and a data flow information determination process by the flow increase method in the maximum flow problem in the sixth embodiment.

 まず、決定部303は、図73のモデル情報の表を基に、図75Aに示されるネットワークモデルを構築する。このネットワークモデルでは、始点s1及びs2が設定され、始点s1に対応する終点t1が設定され、始点s2に対応する終点t2が設定される。 First, the determination unit 303 constructs the network model shown in FIG. 75A based on the model information table shown in FIG. In this network model, start points s1 and s2 are set, an end point t1 corresponding to the start point s1 is set, and an end point t2 corresponding to the start point s2 is set.

 ここで、決定部303は、図75Bに示されるように、経路(s1、MyJob1、n2、n1、t1)に50MB/sのフローを与える。結果、決定部303は、図75Cに示されるネットワークの残余グラフを特定する。また、処理サーバn1のリソース残量は、MyJob1の50MB/sのフローにより0.5(=0.01×50)のリソースが利用されるため、0.5(=1-0.5)となる。処理サーバn3のリソース残量は0.5のままである。 Here, the determination unit 303 gives a flow of 50 MB/s to the path (s1, MyJob1, n2, n1, t1) as illustrated in FIG. 75B. As a result, the determination unit 303 identifies the residual graph of the network illustrated in FIG. 75C. The remaining resource amount of the processing server n1 becomes 0.5 (= 1 − 0.5) because 0.5 (= 0.01 × 50) of the resources are used by the 50 MB/s flow of MyJob1. The remaining resource amount of the processing server n3 stays at 0.5.

 次に決定部303は、図75Cに示される残余グラフからフロー増加路を特定し、図75Dに示されるように、経路(s2、MyJob2、n4、n3、t2)に100MB/sのフローを追加で与える。結果、決定部303は、図75Eに示されるネットワークの残余グラフを特定する。このとき、処理サーバn3のリソース残量は、MyJob2の100MB/sのフローにより0.4(=0.004×100)のリソースが利用されるため、0.1(=0.5-0.4)となる。処理サーバn1のリソース残量は前回のまま0.5である。 Next, the determination unit 303 identifies a flow increasing path from the residual graph illustrated in FIG. 75C and, as illustrated in FIG. 75D, gives an additional flow of 100 MB/s to the path (s2, MyJob2, n4, n3, t2). As a result, the determination unit 303 identifies the residual graph of the network illustrated in FIG. 75E. At this time, the remaining resource amount of the processing server n3 becomes 0.1 (= 0.5 − 0.4) because 0.4 (= 0.004 × 100) of the resources are used by the 100 MB/s flow of MyJob2. The remaining resource amount of the processing server n1 stays at 0.5.

 決定部303は、図75Eに示される残余グラフからフロー増加路を特定し、図75Fに示されるように、経路(s1、MyJob1、n2、n3、t1)に50MB/sのフローを追加で与える。結果、決定部303は、図75Gに示されるネットワークの残余グラフを特定する。このとき、処理サーバn3のリソース残量は、MyJob1の50MB/sのフローにより0.1(=0.002×50)のリソースが利用されるため、0(=0.1-0.1)となる。処理サーバn1のリソース残量は前回のまま0.5である。 The determination unit 303 identifies a flow increasing path from the residual graph illustrated in FIG. 75E and, as illustrated in FIG. 75F, gives an additional flow of 50 MB/s to the path (s1, MyJob1, n2, n3, t1). As a result, the determination unit 303 identifies the residual graph of the network illustrated in FIG. 75G. At this time, the remaining resource amount of the processing server n3 becomes 0 (= 0.1 − 0.1) because 0.1 (= 0.002 × 50) of the resources are used by the 50 MB/s flow of MyJob1. The remaining resource amount of the processing server n1 stays at 0.5.

 決定部303は、図75Gに示される残余グラフからフロー増加路を特定し、図75Hに示されるように、経路(s2、MyJob2、n4、n1、t2)に25MB/sのフローを追加で与える。結果、決定部303は、図75Iに示されるネットワークの残余グラフを特定する。このとき、処理サーバn1において、MyJob2の25MB/sのフローにより0.5(=0.02×25)のリソースが利用されるため、処理サーバn1及びn3のリソース残量はそれぞれ0及び0となる。 The determination unit 303 identifies a flow increasing path from the residual graph illustrated in FIG. 75G and, as illustrated in FIG. 75H, gives an additional flow of 25 MB/s to the path (s2, MyJob2, n4, n1, t2). As a result, the determination unit 303 identifies the residual graph of the network illustrated in FIG. 75I. At this time, since 0.5 (= 0.02 × 25) of the resources of the processing server n1 are used by the 25 MB/s flow of MyJob2, the remaining resource amounts of the processing servers n1 and n3 both become 0.

 処理サーバn1及びn3のリソース残量が0になったので、頂点n1又はn3を通過するフロー増加路は存在しない。これより、終点t1又はt2に到達するフロー増加路は存在しないため、決定部303は処理を終了する。結果、得られた各経路及び各データ流量の組み合わせ情報がデータフロー情報となる。 Since the remaining resources of the processing servers n1 and n3 are 0, there is no flow increasing path that passes through the vertex n1 or n3. Accordingly, since there is no flow increasing path that reaches the end point t1 or t2, the determination unit 303 ends the process. As a result, the obtained combination information of each route and each data flow rate becomes data flow information.
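The resource accounting of FIGS. 75B to 75H can be replayed with a few lines of bookkeeping. The following illustrative sketch (not part of the embodiment) uses the per-job loads and initial resource amounts of FIG. 72 and confirms that the remaining resources of the processing servers n1 and n3 both reach 0:

```python
# load[(job, server)]: resources consumed per 1 MB/s of flow (FIG. 72).
load = {("MyJob1", "n1"): 0.01,  ("MyJob2", "n1"): 0.02,
        ("MyJob1", "n3"): 0.002, ("MyJob2", "n3"): 0.004}
remaining = {"n1": 1.0, "n3": 0.5}   # resources still available (FIG. 72)

def give_flow(job, server, mbps):
    """Account for the resources that `mbps` MB/s of `job` consumes on `server`.
    A flow increasing path through a server is usable only while the server
    has enough resources left for the added flow."""
    cost = load[(job, server)] * mbps
    assert remaining[server] >= cost - 1e-9, "no resources left on " + server
    remaining[server] -= cost

give_flow("MyJob1", "n1", 50)    # FIG. 75B: n1 drops from 1.0 to 0.5
give_flow("MyJob2", "n3", 100)   # FIG. 75D: n3 drops from 0.5 to 0.1
give_flow("MyJob1", "n3", 50)    # FIG. 75F: n3 drops from 0.1 to 0
give_flow("MyJob2", "n1", 25)    # FIG. 75H: n1 drops from 0.5 to 0
print(abs(round(remaining["n1"], 9)), abs(round(remaining["n3"], 9)))  # 0.0 0.0
```

Once both remaining amounts reach 0, no flow increasing path can pass through n1 or n3, which is exactly why the augmentation stops here.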

 図76は、実施例6におけるデータフロー情報を示す図である。決定部303は、このように決定されたデータフロー情報に基づいて、処理プログラムをサーバn1及びn3に送信する。更に、決定部303は、処理サーバn1及びn3に、処理プログラムに対応する決定情報を送信することによって、データ受信と処理実行とを指示する。 FIG. 76 is a diagram showing data flow information in the sixth embodiment. The determining unit 303 transmits the processing program to the servers n1 and n3 based on the data flow information determined as described above. Furthermore, the determination unit 303 instructs the data reception and processing execution by transmitting determination information corresponding to the processing program to the processing servers n1 and n3.

 図77は、実施例6において実施されるデータ送受信を概念的に示す図である。決定情報を受信した処理サーバn1は、データサーバn2及びn4の各処理データ格納部342内のデータをそれぞれ取得する。処理実行部p1は取得された各データの処理をそれぞれ実行する。処理サーバn3は、データサーバn2及びn4の各処理データ格納部342内のデータをそれぞれ取得する。処理実行部p3は取得された各データの処理をそれぞれ実行する。 FIG. 77 is a diagram conceptually illustrating data transmission / reception performed in the sixth embodiment. Receiving the decision information, the processing server n1 acquires data in each processing data storage unit 342 of the data servers n2 and n4. The process execution unit p1 executes each acquired data process. The processing server n3 acquires data in the processing data storage units 342 of the data servers n2 and n4, respectively. The process execution unit p3 executes each acquired data process.

 なお、上述の説明で用いた複数のフローチャートでは、複数のステップ(処理)が順番に記載されているが、本実施形態で実行される処理ステップの実行順序は、その記載の順番に制限されない。本実施形態では、図示される処理ステップの順番を内容的に支障のない範囲で変更することができる。また、上述の各実施形態及び各変形例は、内容が相反しない範囲で組み合わせることができる。 In the plurality of flowcharts used in the above description, a plurality of steps (processes) are described in order, but the execution order of the process steps executed in the present embodiment is not limited to the description order. In the present embodiment, the order of the processing steps shown in the figure can be changed within a range that does not hinder the contents. Moreover, each above-mentioned embodiment and each modification can be combined in the range with which the content does not conflict.

 上記の各実施形態及び各変形例の一部又は全部は、以下の付記のようにも特定され得る。但し、各実施形態及び各変形例が以下の記載に限定されるものではない。 Some or all of the above embodiments and modifications may be specified as in the following supplementary notes. However, each embodiment and each modification are not limited to the following description.

 (付記1)
 データを格納する複数のデータ装置を示す複数の第1頂点と、データを処理する複数の処理装置を示す複数の第2頂点と、該各データ装置から該各処理装置への単位時間当たりのデータ転送可能量を上限値として含む各転送量制約条件がそれぞれ設定される、該各第1頂点から該各第2頂点に至る複数の第1辺と、該各処理装置の単位時間当たりのデータ処理可能量を上限値として含む各処理量制約条件がそれぞれ設定される、該各第2頂点から、該各第2頂点より後段の少なくとも1つの第3頂点に至る少なくとも1つの第2辺とを含む概念モデルを構築し得るモデル情報を生成するモデル生成部と、
 前記第1頂点、前記第2頂点、前記第1辺及び前記第2辺をそれぞれ含む前記概念モデル上の各経路に関する、該経路に含まれる該第1辺及び該第2辺に設定される前記転送量制約条件及び前記処理量制約条件に応じて実行可能な単位時間当たりのデータ処理量の合計を用いて、前記概念モデル上の各辺の流量を決め、該各辺の流量を満たす該概念モデル上の各経路を選択し、選択された各経路に含まれる各頂点に応じて、前記処理装置と前記処理装置により処理されるデータを格納する前記データ装置との複数組み合わせを決定する決定部と、
 を備える管理装置。
(Appendix 1)
A management device comprising:
a model generation unit that generates model information capable of constructing a conceptual model including: a plurality of first vertices indicating a plurality of data devices that store data; a plurality of second vertices indicating a plurality of processing devices that process the data; a plurality of first sides, each extending from one of the first vertices to one of the second vertices, in each of which a transfer amount constraint condition including, as an upper limit value, an amount of data transferable per unit time from the data device to the processing device is set; and at least one second side, extending from each of the second vertices to at least one third vertex subsequent to the second vertex, in each of which a processing amount constraint condition including, as an upper limit value, an amount of data processable per unit time by the processing device is set; and
a determination unit that determines the flow rate of each side on the conceptual model by using, for each path on the conceptual model including the first vertex, the second vertex, the first side, and the second side, the total data processing amount per unit time executable according to the transfer amount constraint conditions and the processing amount constraint conditions set on the first and second sides included in the path, selects the paths on the conceptual model satisfying the flow rates of the sides, and determines, in accordance with the vertices included in each selected path, a plurality of combinations of a processing device and a data device storing the data processed by the processing device.

 (付記2)
 データの識別子と該データを格納する前記データ装置の識別子とを対応付けて格納するデータ所在格納部と、
 ジョブにより扱われる複数のデータに関する情報を格納するジョブ情報格納部と、
 を更に備え、
 前記モデル生成部は、前記ジョブ情報格納部に格納される情報に基づいて、前記ジョブを示す頂点と、該ジョブを示す頂点から該ジョブにより扱われる複数データを格納する複数のデータ装置を示す複数の頂点へそれぞれ至る複数の辺とを更に含む前記概念モデルを構築し得る前記モデル情報を生成し、
 前記決定部は、前記選択された各経路に含まれる各頂点に応じて、前記ジョブと、前記ジョブにより扱われる少なくとも1つのデータを格納する前記データ装置と、該少なくとも1つのデータを処理する前記処理装置との複数組み合わせを決定し、該複数組み合わせの情報及び前記データ所在格納部に格納される情報に基づいて、前記処理装置と、前記処理装置により処理されるデータの識別子と、前記処理装置により処理されるデータを格納する前記データ装置との対応関係を示す情報を生成する、
 付記1に記載の管理装置。
(Appendix 2)
The management device according to appendix 1, further comprising:
a data location storage unit that stores an identifier of data and an identifier of the data device storing the data in association with each other; and
a job information storage unit that stores information on a plurality of data handled by a job,
wherein the model generation unit generates, based on the information stored in the job information storage unit, the model information capable of constructing the conceptual model further including a vertex indicating the job and a plurality of sides respectively extending from the vertex indicating the job to a plurality of vertices indicating the plurality of data devices storing the plurality of data handled by the job, and
wherein the determination unit determines, in accordance with the vertices included in each selected path, a plurality of combinations of the job, the data device storing at least one piece of data handled by the job, and the processing device that processes the at least one piece of data, and generates, based on information of the plurality of combinations and the information stored in the data location storage unit, information indicating a correspondence among the processing device, an identifier of the data processed by the processing device, and the data device storing the data processed by the processing device.

 (付記3)
 複数の複製データに多重化された元データの識別子と、該各複製データを格納する各データ装置の識別子とを対応付けて格納するデータ所在格納部を更に備え、
 前記モデル生成部は、前記元データを示す頂点と、前記元データを示す頂点から前記各複製データを格納する各データ装置を示す第1頂点へそれぞれ至る複数の辺とを更に含む前記概念モデルを構築し得る前記モデル情報を生成し、
 前記決定部は、前記選択された各経路に含まれる各頂点に応じて、前記元データと、前記複数の複製データの少なくとも1つのデータを格納する前記データ装置と、前記複数の複製データの少なくとも1つを処理する前記処理装置との複数組み合わせを決定し、該複数組み合わせの情報及び前記データ所在格納部に格納される情報に基づいて、前記処理装置と、前記処理装置により処理されるデータの識別子と、前記処理装置により処理されるデータを格納する前記データ装置との対応関係を示す情報を生成する、
 付記1又は2に記載の管理装置。
(Appendix 3)
The management device according to appendix 1 or 2, further comprising a data location storage unit that stores an identifier of original data multiplexed into a plurality of pieces of duplicate data and identifiers of the data devices each storing one of the pieces of duplicate data in association with each other,
wherein the model generation unit generates the model information capable of constructing the conceptual model further including a vertex indicating the original data and a plurality of sides respectively extending from the vertex indicating the original data to the first vertices indicating the data devices storing the pieces of duplicate data, and
wherein the determination unit determines, in accordance with the vertices included in each selected path, a plurality of combinations of the original data, the data device storing at least one of the pieces of duplicate data, and the processing device that processes at least one of the pieces of duplicate data, and generates, based on information of the plurality of combinations and the information stored in the data location storage unit, information indicating a correspondence among the processing device, an identifier of the data processed by the processing device, and the data device storing the data processed by the processing device.

(Appendix 4)
The determination unit further includes, in each correspondence relationship corresponding to one of the selected paths, the data processing amount per unit time for that path, and, when a plurality of correspondence relationships sharing a common data identifier are generated, decides that each processing device indicated by those correspondence relationships processes a share of the original data indicated by the common identifier, the share being proportional to the data processing amount per unit time included in the corresponding correspondence relationship.
The management device according to Appendix 3.
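The proportional split described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the server names, data identifier, sizes, and throughputs are all invented for the example.

```python
# Two correspondence relationships share the data identifier "dataA": its
# replicas reach two servers over paths with different per-unit-time
# throughputs, so each server processes a proportional share of the data.
records = [
    {"server": "P1", "data": "dataA", "throughput": 30},  # MB/s on its path
    {"server": "P2", "data": "dataA", "throughput": 10},
]

size = 120  # MB of original data to divide between the servers
total = sum(r["throughput"] for r in records)
for r in records:
    # slice proportional to this path's data processing amount per unit time
    r["share"] = size * r["throughput"] / total

print([(r["server"], r["share"]) for r in records])  # P1: 90.0 MB, P2: 30.0 MB
```

With these numbers, the 30 MB/s path carries three quarters of the data and the 10 MB/s path one quarter, so both servers finish their shares at about the same time.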

(Appendix 5)
The model generation unit generates the model information from which the conceptual model can be constructed, the conceptual model further including one or more edges on which the processing amount constraint is set, each extending from the second vertex representing a processing device to the first vertex representing a data device, to a vertex representing data, or to a vertex representing a job that handles a plurality of pieces of data.
The management device according to any one of Appendices 1 to 4.

(Appendix 6)
The model generation unit generates the model information from which the conceptual model can be constructed, the conceptual model further including: a start vertex for each job handling a plurality of pieces of data; an end vertex for each such job; a plurality of edges each extending from a start vertex to the vertex representing the corresponding job; and a plurality of edges each extending from the second vertex representing a processing device to one of the end vertices, on each of which the upper limit of the processing amount constraint is set to the amount that the processing device can process per unit time for the corresponding job.
The management device according to any one of Appendices 1 to 4.

(Appendix 7)
The model generation unit generates the model information from which the conceptual model can be constructed, the conceptual model further including vertices representing intermediate devices through which data stored in a data device passes before being received by a processing device, and further including at least one of: an edge extending from the first vertex representing a data device to the vertex representing the intermediate device nearest to that data device, on which the upper limit of the transfer amount constraint is set to the amount transferable per unit time from the data device to that nearest intermediate device; an edge extending from a vertex representing an intermediate device to a vertex representing another intermediate device, on which the upper limit of the transfer amount constraint is set to the amount transferable per unit time between those intermediate devices; and an edge extending from the vertex representing the intermediate device nearest to a processing device to the second vertex representing that processing device, on which the upper limit of the transfer amount constraint is set to the amount transferable per unit time from that nearest intermediate device to the processing device.
The management device according to any one of Appendices 1 to 6.

(Appendix 8)
The model generation unit composes the vertex representing an intermediate device of one or more vertices representing one or more inputs of the intermediate device, one or more vertices representing one or more outputs of the intermediate device, and one or more edges connecting an input and an output between which data can be transferred.
The management device according to Appendix 7.

(Appendix 9)
A server state storage unit is further provided that stores, for each processing device, the processing load consumed per unit processing amount and the remaining load, and
the determination unit determines the flow on each edge of the conceptual model by using the total amount of data that can be processed per unit time under the transfer amount constraints and processing amount constraints set on the edges of the conceptual model and under a flow limit based on the processing load and the remaining load of each processing device.
The management device according to any one of Appendices 1 to 8.
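One plausible reading of the load-based flow limit above can be sketched as follows. This is an assumed model, not the patented formula: the function name, units, and figures are invented, and the only point illustrated is that a server's usable throughput is capped both by its rated capacity and by its remaining load budget.

```python
def effective_capacity(rated_mb_s, load_per_mb, remaining_load):
    """Usable throughput of a server: the smaller of its rated processing
    capacity and the throughput its remaining load budget can sustain
    (assumed interpretation of the per-unit-processing-amount load)."""
    return min(rated_mb_s, remaining_load / load_per_mb)

# Server rated at 50 MB/s; each MB processed costs 0.8 load units.
print(effective_capacity(50, 0.8, 24))   # -> 30.0 (load-limited)
print(effective_capacity(50, 0.8, 100))  # -> 50   (rated-capacity-limited)
```

The determination unit would then use this effective value, rather than the rated capacity alone, as the upper bound on the flow through the corresponding second edge.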

(Appendix 10)
In the transfer amount constraint or the processing amount constraint set by the model generation unit on at least one edge included in the conceptual model, a lower limit of the data transfer amount or the data processing amount is further set.
The management device according to any one of Appendices 1 to 9.

(Appendix 11)
A distributed system comprising:
the management device according to any one of Appendices 1 to 10, comprising a communication path information storage unit that stores input/output communication path information between each data device and each processing device, a data location storage unit that stores, in association with each other, an identifier of data to be processed and the data device storing that data, and a server state storage unit that stores the amount of data each processing device can process per unit time;
the processing devices, each of which processes data acquired from a data device in accordance with the combinations of processing devices and data devices determined by the determination unit of the management device; and
the data devices, each of which transmits data to a processing device in accordance with the combinations of processing devices and data devices determined by the determination unit of the management device.

(Appendix 12)
A distributed processing management method comprising, by at least one computer:
generating model information from which a conceptual model can be constructed, the conceptual model including a plurality of first vertices representing a plurality of data devices that store data, a plurality of second vertices representing a plurality of processing devices that process data, a plurality of first edges each extending from one of the first vertices to one of the second vertices, on each of which a transfer amount constraint is set whose upper limit is the amount of data transferable per unit time from the corresponding data device to the corresponding processing device, and at least one second edge extending from each second vertex to at least one third vertex downstream of that second vertex, on which a processing amount constraint is set whose upper limit is the amount of data the corresponding processing device can process per unit time;
determining the flow on each edge of the conceptual model by using, for each path on the conceptual model that includes a first vertex, a second vertex, a first edge and a second edge, the total amount of data that can be processed per unit time under the transfer amount constraint and the processing amount constraint set on the first edge and the second edge included in that path;
selecting the paths on the conceptual model that satisfy the flows of the edges; and
determining, in accordance with the vertices included in each selected path, a plurality of combinations of a processing device and the data device storing the data to be processed by that processing device.
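The conceptual model described above is, in essence, a capacitated flow network: transfer constraints bound the data-device-to-server edges and processing constraints bound the server-to-sink edges, and per-edge flows can be obtained with a standard maximum-flow computation. The sketch below is only an illustration under that reading, not the patented algorithm; the device names and capacities are invented, and Edmonds-Karp stands in for whichever flow method an implementation would actually use.

```python
from collections import deque, defaultdict

def max_flow(cap, s, t):
    """Edmonds-Karp: repeatedly push flow along shortest augmenting paths."""
    flow = defaultdict(lambda: defaultdict(int))
    total = 0
    while True:
        # BFS for an augmenting path in the residual graph
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in list(cap[u]):
                if v not in parent and cap[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:          # no augmenting path left: flow is maximal
            return total, flow
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] - flow[u][v] for u, v in path)
        for u, v in path:
            flow[u][v] += bottleneck
            flow[v][u] -= bottleneck  # residual (reverse) flow
        total += bottleneck

# Hypothetical topology: first edges D->P carry transfer limits (MB/s),
# second edges P->sink carry each server's processing limit (MB/s).
cap = defaultdict(lambda: defaultdict(int))
cap["src"]["D1"] = 80; cap["src"]["D2"] = 60      # data available per unit time
cap["D1"]["P1"] = 50; cap["D1"]["P2"] = 50        # transfer amount constraints
cap["D2"]["P2"] = 40
cap["P1"]["sink"] = 40; cap["P2"]["sink"] = 70    # processing amount constraints
for u in list(cap):                               # register zero-capacity
    for v in list(cap[u]):                        # reverse edges for residuals
        cap[v][u] += 0

total, flow = max_flow(cap, "src", "sink")
combos = [(d, p, flow[d][p]) for d in ("D1", "D2")
          for p in ("P1", "P2") if flow[d][p] > 0]
print(total)   # -> 110: aggregate data processed per unit time
print(combos)  # (data device, processing device, flow) assignments
```

Positive flow on a first edge directly yields a (data device, processing device) combination together with the per-unit-time amount that pairing should carry, which is exactly the output the determining step needs.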

(Appendix 13)
The at least one computer comprises a data location storage unit that stores, in association with each other, an identifier of data and an identifier of the data device storing that data, and a job information storage unit that stores information on a plurality of pieces of data handled by a job,
the generating of the model information generates, based on the information stored in the job information storage unit, the model information from which the conceptual model can be constructed, the conceptual model further including a vertex representing the job and a plurality of edges each extending from the vertex representing the job to one of a plurality of vertices representing the data devices storing the pieces of data handled by the job,
the determining of the plurality of combinations determines, in accordance with the vertices included in each selected path, a plurality of combinations of the job, a data device storing at least one piece of data handled by the job, and a processing device that processes that piece of data, and
the method further comprises generating, by the at least one computer, based on the information on the plurality of combinations and the information stored in the data location storage unit, information indicating the correspondence among a processing device, an identifier of the data processed by that processing device, and the data device storing the data processed by that processing device.
The distributed processing management method according to Appendix 12.

(Appendix 14)
The at least one computer comprises a data location storage unit that stores, in association with each other, an identifier of original data multiplexed into a plurality of pieces of replicated data and an identifier of each data device storing each piece of replicated data,
the generating of the model information generates the model information from which the conceptual model can be constructed, the conceptual model further including a vertex representing the original data and a plurality of edges each extending from the vertex representing the original data to a first vertex representing a data device storing one of the pieces of replicated data,
the determining of the plurality of combinations determines, in accordance with the vertices included in each selected path, a plurality of combinations of the original data, a data device storing at least one of the pieces of replicated data, and a processing device that processes at least one of the pieces of replicated data, and
the method further comprises generating, by the at least one computer, based on the information on the plurality of combinations and the information stored in the data location storage unit, information indicating the correspondence among a processing device, an identifier of the data processed by that processing device, and the data device storing the data processed by that processing device.
The distributed processing management method according to Appendix 12 or 13.

(Appendix 15)
The determining of the plurality of combinations further includes, in each correspondence relationship corresponding to one of the selected paths, the data processing amount per unit time for that path, and, when a plurality of correspondence relationships sharing a common data identifier are generated, decides that each processing device indicated by those correspondence relationships processes a share of the original data indicated by the common identifier, the share being proportional to the data processing amount per unit time included in the corresponding correspondence relationship.
The distributed processing management method according to Appendix 14.

(Appendix 16)
The generating of the model information generates the model information from which the conceptual model can be constructed, the conceptual model further including: a start vertex for each job handling a plurality of pieces of data; an end vertex for each such job; a plurality of edges each extending from a start vertex to the vertex representing the corresponding job; and a plurality of edges each extending from the second vertex representing a processing device to one of the end vertices, on each of which the upper limit of the processing amount constraint is set to the amount that the processing device can process per unit time for the corresponding job.
The distributed processing management method according to any one of Appendices 12 to 15.

(Appendix 17)
The generating of the model information generates the model information from which the conceptual model can be constructed, the conceptual model further including vertices representing intermediate devices through which data stored in a data device passes before being received by a processing device, and further including at least one of: an edge extending from the first vertex representing a data device to the vertex representing the intermediate device nearest to that data device, on which the upper limit of the transfer amount constraint is set to the amount transferable per unit time from the data device to that nearest intermediate device; an edge extending from a vertex representing an intermediate device to a vertex representing another intermediate device, on which the upper limit of the transfer amount constraint is set to the amount transferable per unit time between those intermediate devices; and an edge extending from the vertex representing the intermediate device nearest to a processing device to the second vertex representing that processing device, on which the upper limit of the transfer amount constraint is set to the amount transferable per unit time from that nearest intermediate device to the processing device.
The distributed processing management method according to any one of Appendices 12 to 16.

(Appendix 18)
The generating of the model information composes the vertex representing an intermediate device of one or more vertices representing one or more inputs of the intermediate device, one or more vertices representing one or more outputs of the intermediate device, and one or more edges connecting an input and an output between which data can be transferred.
The distributed processing management method according to Appendix 17.
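Splitting an intermediate-device vertex into input and output vertices is the standard way to express a per-device capacity in a flow network: all traffic through the device must cross the single internal edge, whose capacity bounds it. The sketch below illustrates only that transformation; the switch name, port layout, and bandwidth figures are invented for the example.

```python
# Model a switch "SW" as SW_in / SW_out vertices so that the edge between
# them carries the device's internal forwarding limit.
edges = {}  # (u, v) -> capacity in MB/s

def add_intermediate(name, internal_cap):
    """Register the split vertices and the internal capacity edge."""
    edges[(f"{name}_in", f"{name}_out")] = internal_cap

def connect(u, v, cap):
    """Attach an external link; intermediates are addressed via _in/_out."""
    edges[(u, v)] = cap

add_intermediate("SW", 100)       # switch can forward at most 100 MB/s total
connect("D1", "SW_in", 60)        # data device -> switch ingress
connect("D2", "SW_in", 60)
connect("SW_out", "P1", 80)       # switch egress -> processing server

# Although 120 MB/s could arrive at SW_in and 80 MB/s could leave SW_out,
# any flow through the switch is bounded by the 100 MB/s internal edge.
print(edges[("SW_in", "SW_out")])  # -> 100
```

With this encoding, the same maximum-flow machinery used for the rest of the conceptual model enforces the device's internal limit without any special-case logic.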

(Appendix 19)
A program that causes at least one computer to execute the management method according to any one of Appendices 12 to 18.

(Appendix 20)
A computer-readable recording medium on which the program according to Appendix 19 is recorded.

This application claims priority based on Japanese Patent Application No. 2012-082665, filed on March 30, 2012, the entire disclosure of which is incorporated herein.

Claims (10)

1. A management device comprising:
a model generation unit that generates model information from which a conceptual model can be constructed, the conceptual model including: a plurality of first vertices representing a plurality of data devices that store data; a plurality of second vertices representing a plurality of processing devices that process data; a plurality of first edges, each extending from one of the first vertices to one of the second vertices, on each of which a transfer amount constraint is set whose upper limit is the amount of data transferable per unit time from the corresponding data device to the corresponding processing device; and at least one second edge extending from each second vertex to at least one third vertex downstream of that second vertex, on which a processing amount constraint is set whose upper limit is the amount of data the corresponding processing device can process per unit time; and
a determination unit that determines the flow on each edge of the conceptual model by using, for each path on the conceptual model that includes a first vertex, a second vertex, a first edge and a second edge, the total amount of data that can be processed per unit time under the transfer amount constraint and the processing amount constraint set on the first edge and the second edge included in that path, selects the paths on the conceptual model that satisfy the flows of the edges, and determines, in accordance with the vertices included in each selected path, a plurality of combinations of a processing device and the data device storing the data to be processed by that processing device.
2. The management device according to claim 1, further comprising:
a data location storage unit that stores, in association with each other, an identifier of data and an identifier of the data device storing that data; and
a job information storage unit that stores information on a plurality of pieces of data handled by a job,
wherein the model generation unit generates, based on the information stored in the job information storage unit, the model information from which the conceptual model can be constructed, the conceptual model further including a vertex representing the job and a plurality of edges each extending from the vertex representing the job to one of a plurality of vertices representing the data devices storing the pieces of data handled by the job, and
the determination unit determines, in accordance with the vertices included in each selected path, a plurality of combinations of the job, a data device storing at least one piece of data handled by the job, and a processing device that processes that piece of data, and, based on the information on the plurality of combinations and the information stored in the data location storage unit, generates information indicating the correspondence among a processing device, an identifier of the data processed by that processing device, and the data device storing the data processed by that processing device.
3. The management device according to claim 1 or 2, further comprising a data location storage unit that stores, in association with each other, an identifier of original data multiplexed into a plurality of pieces of replicated data and an identifier of each data device storing each piece of replicated data,
wherein the model generation unit generates the model information from which the conceptual model can be constructed, the conceptual model further including a vertex representing the original data and a plurality of edges each extending from the vertex representing the original data to a first vertex representing a data device storing one of the pieces of replicated data, and
the determination unit determines, in accordance with the vertices included in each selected path, a plurality of combinations of the original data, a data device storing at least one of the pieces of replicated data, and a processing device that processes at least one of the pieces of replicated data, and, based on the information on the plurality of combinations and the information stored in the data location storage unit, generates information indicating the correspondence among a processing device, an identifier of the data processed by that processing device, and the data device storing the data processed by that processing device.
4. The management device according to claim 3, wherein the determination unit further includes, in each correspondence relationship corresponding to one of the selected paths, the data processing amount per unit time for that path, and, when a plurality of correspondence relationships sharing a common data identifier are generated, decides that each processing device indicated by those correspondence relationships processes a share of the original data indicated by the common identifier, the share being proportional to the data processing amount per unit time included in the corresponding correspondence relationship.
5. The management device according to any one of claims 1 to 4, wherein the model generation unit generates the model information from which the conceptual model can be constructed, the conceptual model further including: a start vertex for each job handling a plurality of pieces of data; an end vertex for each such job; a plurality of edges each extending from a start vertex to the vertex representing the corresponding job; and a plurality of edges each extending from the second vertex representing a processing device to one of the end vertices, on each of which the upper limit of the processing amount constraint is set to the amount that the processing device can process per unit time for the corresponding job.
6. The management device according to any one of claims 1 to 5, wherein the model generation unit generates the model information from which the conceptual model can be constructed, the conceptual model further including vertices representing intermediate devices through which data stored in a data device passes before being received by a processing device, and further including at least one of: an edge extending from the first vertex representing a data device to the vertex representing the intermediate device nearest to that data device, on which the upper limit of the transfer amount constraint is set to the amount transferable per unit time from the data device to that nearest intermediate device; an edge extending from a vertex representing an intermediate device to a vertex representing another intermediate device, on which the upper limit of the transfer amount constraint is set to the amount transferable per unit time between those intermediate devices; and an edge extending from the vertex representing the intermediate device nearest to a processing device to the second vertex representing that processing device, on which the upper limit of the transfer amount constraint is set to the amount transferable per unit time from that nearest intermediate device to the processing device.
7. The management device according to claim 6, wherein the model generation unit composes the vertex representing an intermediate device of one or more vertices representing one or more inputs of the intermediate device, one or more vertices representing one or more outputs of the intermediate device, and one or more edges connecting an input and an output between which data can be transferred.
8. The management device according to any one of claims 1 to 7, further comprising a server state storage unit that stores, for each processing device, the processing load consumed per unit processing amount and the remaining load,
wherein the determination unit determines the flow on each edge of the conceptual model by using the total amount of data that can be processed per unit time under the transfer amount constraints and processing amount constraints set on the edges of the conceptual model and under a flow limit based on the processing load and the remaining load of each processing device.
9. A distributed processing management method comprising, by at least one computer:
generating model information from which a conceptual model can be constructed, the conceptual model including a plurality of first vertices representing a plurality of data devices that store data, a plurality of second vertices representing a plurality of processing devices that process data, a plurality of first edges each extending from one of the first vertices to one of the second vertices, on each of which a transfer amount constraint is set whose upper limit is the amount of data transferable per unit time from the corresponding data device to the corresponding processing device, and at least one second edge extending from each second vertex to at least one third vertex downstream of that second vertex, on which a processing amount constraint is set whose upper limit is the amount of data the corresponding processing device can process per unit time;
determining the flow on each edge of the conceptual model by using, for each path on the conceptual model that includes a first vertex, a second vertex, a first edge and a second edge, the total amount of data that can be processed per unit time under the transfer amount constraint and the processing amount constraint set on the first edge and the second edge included in that path;
selecting the paths on the conceptual model that satisfy the flows of the edges; and
determining, in accordance with the vertices included in each selected path, a plurality of combinations of a processing device and the data device storing the data to be processed by that processing device.
A management program causing at least one computer to implement:
a model generation unit that generates model information from which a conceptual model can be constructed, the conceptual model including: a plurality of first vertices representing a plurality of data devices that store data; a plurality of second vertices representing a plurality of processing devices that process the data; a plurality of first edges, each from one of the first vertices to one of the second vertices, on each of which a transfer-amount constraint is set whose upper limit is the amount of data transferable per unit time from the corresponding data device to the corresponding processing device; and at least one second edge, from each of the second vertices to at least one third vertex downstream of that second vertex, on each of which a processing-amount constraint is set whose upper limit is the amount of data the corresponding processing device can process per unit time; and
a determination unit that determines the flow on each edge of the conceptual model by using, for each path on the conceptual model that includes a first vertex, a second vertex, a first edge, and a second edge, the total amount of data processable per unit time under the transfer-amount constraint and the processing-amount constraint set on the first edge and the second edge included in that path, selects paths on the conceptual model that satisfy the determined flow on each edge, and determines, in accordance with the vertices included in each selected path, a plurality of combinations of a processing device and a data device storing the data to be processed by that processing device.
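The claimed conceptual model is, in effect, a capacitated flow network: data devices (first vertices) feed processing devices (second vertices) over transfer-capacity edges, and each processing device drains into a downstream third vertex over a processing-capacity edge. The following is a minimal illustrative sketch, not the patented method itself: the device names, capacities, and the use of the textbook Edmonds-Karp max-flow algorithm are assumptions made for demonstration, showing how edge flows determined under such constraints yield (data device, processing device) combinations.

```python
from collections import defaultdict, deque

def add_edge(cap, u, v, c):
    """Add a directed edge u->v with capacity c (plus a 0-capacity reverse edge)."""
    cap[u][v] += c
    cap[v][u] += 0  # ensure the reverse edge exists for residual updates

def max_flow(cap, source, sink):
    """Edmonds-Karp: push flow along shortest augmenting paths until none remain."""
    flow = defaultdict(lambda: defaultdict(int))
    total = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:  # BFS on the residual graph
            u = queue.popleft()
            for v in cap[u]:
                if v not in parent and cap[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:               # no augmenting path left: done
            return total, flow
        path, v = [], sink                   # recover the augmenting path...
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] - flow[u][v] for u, v in path)
        for u, v in path:                    # ...and push flow along it
            flow[u][v] += bottleneck
            flow[v][u] -= bottleneck
        total += bottleneck

# Hypothetical conceptual model: first vertices = data devices, second
# vertices = processing devices, one third vertex acting as a common sink.
cap = defaultdict(lambda: defaultdict(int))
add_edge(cap, "src", "data1", 100)   # data available per unit time
add_edge(cap, "src", "data2", 80)
add_edge(cap, "data1", "proc1", 60)  # first edges: transfer-amount constraints
add_edge(cap, "data1", "proc2", 50)
add_edge(cap, "data2", "proc2", 70)
add_edge(cap, "proc1", "sink", 40)   # second edges: processing-amount constraints
add_edge(cap, "proc2", "sink", 90)

total, flow = max_flow(cap, "src", "sink")
print("total data processed per unit time:", total)  # 130
# Edges carrying positive flow give the chosen (data device, processing
# device) combinations and how much data each pair should handle.
pairs = [(d, p) for d in ("data1", "data2") for p in ("proc1", "proc2")
         if flow[d][p] > 0]
print("(data device, processing device) combinations:", pairs)
```

Here the maximum flow of 130 equals the total processing capacity (40 + 90), so both processing-amount edges are saturated; the positive-flow first edges then read off which data device should ship how much data to which processing device per unit time.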
PCT/JP2013/000305 2012-03-30 2013-01-23 Management device and distributed processing management method Ceased WO2013145512A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2012082665 2012-03-30
JP2012-082665 2012-03-30

Publications (1)

Publication Number Publication Date
WO2013145512A1 true WO2013145512A1 (en) 2013-10-03

Family

ID=49258840

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2013/000305 Ceased WO2013145512A1 (en) 2012-03-30 2013-01-23 Management device and distributed processing management method

Country Status (2)

Country Link
JP (1) JPWO2013145512A1 (en)
WO (1) WO2013145512A1 (en)


Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004086246A1 (en) * 2003-03-24 2004-10-07 Fujitsu Limited Decentralized processing control device, decentralized processing control method, decentralized processing control program


Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018154799A1 (en) * 2017-02-24 2018-08-30 株式会社レクサー・リサーチ Operational plan optimization device and operational plan optimization method
JP2018139041A (en) * 2017-02-24 2018-09-06 株式会社レクサー・リサーチ Business plan optimization device and business plan optimization method
CN110337659A (en) * 2017-02-24 2019-10-15 雷克萨研究有限公司 Business planning optimization device and business planning optimization method
JP2022008955A (en) * 2017-02-24 2022-01-14 株式会社レクサー・リサーチ Task plan optimizing device and task plan optimizing method
US11314238B2 (en) 2017-02-24 2022-04-26 Lexer Research Inc. Plant operational plan optimization discrete event simulator device and method
JP7244128B2 (en) 2017-02-24 2023-03-22 株式会社レクサー・リサーチ Business plan optimization device and business plan optimization method
CN109254531A (en) * 2017-11-29 2019-01-22 辽宁石油化工大学 The optimal cost control method of multistage batch process with time lag and interference
CN109254531B (en) * 2017-11-29 2021-10-22 辽宁石油化工大学 Optimal cost control method for multi-stage batch processes with time delays and disturbances

Also Published As

Publication number Publication date
JPWO2013145512A1 (en) 2015-12-10

Similar Documents

Publication Publication Date Title
JP5850054B2 (en) Distributed processing management server, distributed system, and distributed processing management method
US20240256571A1 (en) Resource management systems and methods
JP5935889B2 (en) Data processing method, information processing apparatus, and program
US9740706B2 (en) Management of intermediate data spills during the shuffle phase of a map-reduce job
US9569245B2 (en) System and method for controlling virtual-machine migrations based on processor usage rates and traffic amounts
JP6162194B2 (en) Chassis controller to convert universal flow
US10545914B2 (en) Distributed object storage
Nathan et al. Comicon: A co-operative management system for docker container images
JP5929196B2 (en) Distributed processing management server, distributed system, distributed processing management program, and distributed processing management method
US10127275B2 (en) Mapping query operations in database systems to hardware based query accelerators
US20130151707A1 (en) Scalable scheduling for distributed data processing
US20080294872A1 (en) Defragmenting blocks in a clustered or distributed computing system
CN102947796A (en) Method and apparatus for moving virtual resources in a data center environment
JP2005056077A (en) Database control method
CN102165448A (en) Storage tiers for database server system
US10990433B2 (en) Efficient distributed arrangement of virtual machines on plural host machines
WO2013145512A1 (en) Management device and distributed processing management method
CN105760391B (en) Method for dynamic data redistribution, data node, name node and system
US20230205751A1 (en) Multiple Volume Placement Based on Resource Usage and Scoring Functions
US9146694B2 (en) Distribution processing unit of shared storage
CN112714903B (en) Scalable cell-based packet processing service using client-provided decision metadata
US20120030446A1 (en) Method and system for providing distributed programming environment using distributed spaces, and computer readable recording medium
WO2016174739A1 (en) Multicomputer system, management computer, and data linkage management method
ELomari et al. New data placement strategy in the HADOOP framework
JP5031538B2 (en) Data distribution method, data distribution program, and parallel database system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13769218

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2014507351

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13769218

Country of ref document: EP

Kind code of ref document: A1