WO2017000693A1 - Performance synchronization and statistics method for cluster device and system - Google Patents
- Publication number
- WO2017000693A1 PCT/CN2016/082350
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- statistical
- node
- cluster
- record
- statistics
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/02—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
- H04L67/025—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP] for remote control or remote monitoring of applications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1095—Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
Definitions
- the present invention relates to, but is not limited to, the field of NAS (Network Attached Storage) storage of the cluster technology, and in particular, to a cluster device performance synchronization statistics method and system.
- NAS Network Attached Storage
- performance statistics are generally implemented by the network management system actively communicating with the service device agent, or by the device agent, to obtain the original performance parameter values and process them.
- This implementation mainly has the following three characteristics: first, a central host is required to run the network management system; second, the network management system and the service device agent are implemented in client/server (C/S) mode; third, the statistical objects on each device are unrelated to one another.
- Clustering technology is a relatively new technology that achieves relatively high gains in performance, reliability, and flexibility at a lower cost. A cluster is a group of independent devices connected by a high-speed network to form a whole.
- the cluster environment is also a multi-device environment, but it has its own particularities. Taking the NAS storage cluster environment as an example, the following three situations are not suited to the traditional statistical method described above: 1) the running node of a volume statistical object switches; 2) a node joins or exits the cluster; and 3) statistical objects are added or deleted on a node.
- the traditional network management system cannot dynamically detect changes in the above three cases, and it is impossible to perform normal performance statistics.
- a cluster device performance synchronization statistics method is applied to a first node of a cluster device as a master node, wherein the cluster device performance synchronization statistics method includes:
- the statistical record is synchronized to other nodes in the cluster device than the first node.
- the method further includes:
- a performance statistics database is created, and a database table is created in the database for each preset statistical object type;
- the statistical object types include: a node, a network port, a virtual disk, a volume, and a logical unit number.
- the method further includes:
- the statistical records are stored in the performance statistics database.
- the method further includes:
- if the memory capacity of the second node is smaller than the memory capacity of the first node, deleting part of the statistical records in the performance statistics database.
- the method further includes:
- the method further includes:
- if the operation result is less than zero, the absolute value of the operation result is subtracted from the statistical time of the statistical record to obtain a first result, and the first result is used as the latest statistical time of the statistical record; if the operation result is greater than or equal to zero, the absolute value of the operation result is added to the statistical time of the statistical record to obtain a second result, and the second result is used as the latest statistical time of the statistical record.
- the method further includes:
- the pre-processing operation is an adding operation or a deleting operation of the statistical object
- a cluster device performance synchronization statistics system is applied to a first node in a cluster device as a master node, and the system includes:
- the first collection module is configured to collect a counter value of a statistical object on all nodes including the first node in the cluster;
- the first statistical record generating module is configured to separately collect the counter values of the statistical objects on each of the nodes to generate a statistical record
- the first synchronization module is configured to synchronize the statistical record to other nodes in the cluster device than the first node.
- the statistical system further includes:
- the first initialization module is configured to, before the counter values of the statistical objects on all nodes including the first node in the cluster are collected, create a performance statistics database when the first node is powered on and create a database table in the database for each preset statistical object type;
- the statistical object types include: a node, a network port, a virtual disk, a volume, and a logical unit number.
- the first statistical record generating module is further configured to, after collecting the counter values of the statistical objects on each of the nodes and generating the statistical records, store the statistical records in the performance statistics database.
- the system further includes:
- a first receiving module configured to receive, from the second node that is a slave node in the cluster, data information that includes the memory capacity of the second node
- a comparing module configured to compare the memory capacity of the second node with the memory capacity of the first node
- the deleting module is configured to delete part of the statistical records in the performance statistics database if the memory capacity of the second node is smaller than the memory capacity of the first node.
- the system further includes:
- the second receiving module is configured to receive a preset request instruction, where the preset request instruction is a query request instruction or an export request instruction;
- a locating module configured to search for a first statistic record corresponding to the statistic object that matches the request parameter, according to the request parameter carried by the preset request instruction;
- the information feedback module is configured to send the first statistical record to the sender of the preset request instruction.
- the system further includes:
- the first monitoring module is configured to monitor whether the system time has changed
- the time acquisition module is configured to, if the system time has changed, acquire the current system time of the first node when the preset acquisition time arrives;
- the calculation module is configured to calculate the current system time, the last statistical time, and the preset acquisition time to obtain an operation result
- the determining module is configured to compare the absolute value of the operation result with a preset value, and if the absolute value is greater than the preset value, further determine whether the operation result is less than zero;
- a calculation and assignment module configured to: if the operation result is less than zero, subtract the absolute value of the operation result from the statistical time of the statistical record to obtain a first result, and use the first result as the latest statistical time of the statistical record; if the operation result is greater than or equal to zero, add the absolute value of the operation result to the statistical time of the statistical record to obtain a second result, and use the second result as the latest statistical time of the statistical record.
- the system also includes:
- a second monitoring module configured to monitor whether there is a pre-processing operation on the statistical object data stored in the cluster management, where the pre-processing operation is an adding operation or a deleting operation of the statistical object;
- a statistical object obtaining module configured to acquire statistical object data stored in the cluster management and index array data of the statistical object if the preprocessing operation is detected;
- the comparison processing module is configured to compare the statistical object data with the index array data, and preprocess the index array data according to the comparison result.
- a cluster device performance synchronization statistics method is applied to a second node in a cluster device as a slave node, and the method includes:
- the method further includes:
- a performance statistics database is created, and a database table is created in the database for each preset statistical object type;
- the statistical object types include: a node, a network port, a virtual disk, a volume, and a logical unit number.
- the method further includes:
- a cluster device performance synchronization statistics system is applied to a second node in a cluster device as a slave node, and the system includes:
- the obtaining module is configured to acquire a statistical record on the first node and store the statistical record.
- the system further includes:
- a second initialization module configured to create a performance statistics database when the second node is powered on, and create a database table on the database for each of the preset statistical object types
- the statistical object types include: a node, a network port, a virtual disk, a volume, and a logical unit number.
- the system further includes:
- a detecting module configured to detect whether a change signal changed from the second node to the first node has been generated
- a second collection module configured to: if the change signal has been generated, collect a counter value of a statistical object on all nodes including the second node that is powered on in the cluster;
- a second statistical record generating module is configured to separately collect the counter values of the statistical objects on each of the nodes to generate a statistical record
- a second synchronization module is configured to synchronize the statistical record to other nodes in the cluster device than the second node.
- a computer readable storage medium storing computer executable instructions that, when executed by a processor, implement the cluster device performance synchronization statistics method described above.
- the statistical values of the statistical objects on each node in the cluster are calculated by the master node in the cluster to generate statistical records, and the statistical records are synchronized to each slave node in the cluster, so that when the master node of the cluster is powered off or goes down, the slave nodes in the cluster can continue the work of the master node according to the synchronized statistical records, ensuring that the cluster performance statistics are highly sustainable.
- FIG. 1 is a general flowchart of a method for synchronizing statistical performance of a cluster device applied to a primary node according to an embodiment of the present invention
- FIG. 2 is a block diagram showing a cluster device performance synchronization statistics system applied to a master node according to an embodiment of the present invention
- FIG. 3 is a schematic diagram showing interaction between a master device and a cluster device performance synchronization statistics system on a slave node according to an embodiment of the present invention
- FIG. 4 is a flowchart of processing an abnormal situation 1 of a cluster device performance synchronization statistics system according to an embodiment of the present invention
- FIG. 5 is a flowchart of processing an abnormal situation 2 of a cluster device performance synchronization statistics system according to an embodiment of the present invention
- FIG. 6 is a flowchart 1 of processing an abnormal situation 3 of a cluster device performance synchronization statistics system according to an embodiment of the present invention
- FIG. 7 is a flowchart 2 of the process for the abnormal situation 3 of the cluster device performance synchronization statistics system according to the embodiment of the present invention.
- FIG. 8 is a flowchart of processing an abnormal situation 4 of a cluster device performance synchronization statistics system according to an embodiment of the present invention
- FIG. 9 is a flowchart of processing an abnormal situation 5 of a cluster device performance synchronization statistics system according to an embodiment of the present invention.
- the embodiments of the present invention address the problem in the related art that, when traditional statistical methods are used to collect device statistics in a cluster, changes of the nodes in the cluster cannot be monitored dynamically, so that the sustainability of cluster performance statistics is not high.
- an embodiment of the present invention provides a cluster device performance synchronization statistics method, which is applied to a first node that is a master node in a cluster device, and the method includes steps S101 to S103:
- S101 Collect a counter value of a statistical object on all nodes including the first node in the cluster.
- S102 Perform a summary process on the counter values of the statistical objects on each of the nodes to generate a statistical record.
- the master node in the cluster collects the counter values of the statistical objects on each node in the cluster and generates statistical records, and the statistical records are synchronized to each slave node in the cluster. This ensures that when the master node of the cluster is powered off or goes down, the slave nodes in the cluster can continue the work of the master node according to the synchronized statistical records, keeping the cluster performance statistics highly sustainable.
- in step S101, the statistical objects on each node are predefined by the cluster device administrator and stored on the nodes of the cluster. While running, each node in the cluster collects the data information generated for these statistical objects, processes that data information, and saves the result in a counter on the node. It should be noted that the counter can be regarded as a storage area for the data information of the statistical objects on the node. The process of processing the data information of a statistical object and storing it in the storage area is well known to those skilled in the art and is not described in detail here.
- Step S102 is a process of collecting a counter value of a statistical object on a primary node of the cluster to generate a statistical record.
- the main implementation manner includes applying the calculation that corresponds to each counter value of the statistical object. For example, traffic is calculated as traffic counter value / statistical time interval, and the cache hit ratio is calculated as number of hits / (number of hits + number of misses).
- the counter value is processed into a readable data item according to different calculation methods, and the statistical record of the statistical object is composed of a plurality of such data items.
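- the two calculations named above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `CounterSample` type, its field names, and the choice to report a hit ratio of 0.0 when there were no lookups are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """Raw counter values for one statistical object (hypothetical field names)."""
    traffic_bytes: int   # bytes counted during the statistical time interval
    cache_hits: int
    cache_misses: int


def build_record(sample: CounterSample, interval_s: float) -> dict:
    """Turn raw counter values into readable data items."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    lookups = sample.cache_hits + sample.cache_misses
    return {
        # traffic = traffic counter value / statistical time interval
        "traffic_bytes_per_s": sample.traffic_bytes / interval_s,
        # hit ratio = hits / (hits + misses); assumed 0.0 when there were no lookups
        "cache_hit_ratio": sample.cache_hits / lookups if lookups else 0.0,
    }


record = build_record(CounterSample(6000, 75, 25), interval_s=60.0)
print(record)
```

Each key in the returned dictionary corresponds to one readable data item, and the dictionary as a whole plays the role of the statistical record.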
- Step S103 is implemented to synchronize the statistical records to other nodes in the cluster.
- the implementation manner may include: each time a fixed time interval elapses, the master node sends the statistical records it has saved to the slave nodes, so that the data is shared with the slave nodes.
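- as a rough sketch of this synchronization step, the example below copies the master's records to each slave. The `Node` class and `synchronize` function are hypothetical in-memory stand-ins; a real cluster would send the records over the network on a timer, which is not shown.

```python
import copy


class Node:
    """In-memory stand-in for a cluster node (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.records = []


def synchronize(master, slaves):
    """Copy the master's saved statistical records to every slave node."""
    for slave in slaves:
        # Deep copy so later changes on the master do not leak into the slave.
        slave.records = copy.deepcopy(master.records)


master = Node("node-1")
slaves = [Node("node-2"), Node("node-3")]
master.records.append({"object": "port 1", "traffic_bytes_per_s": 100.0})
synchronize(master, slaves)  # would be triggered once per fixed time interval
```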
- the statistical records should be stored one by one according to the type of the statistical object, and a database is usually used for data storage so that data can be queried and managed conveniently. Therefore, when the cluster master node is powered on for the first time, a database needs to be created to save the statistical records.
- the implementation methods include:
- the method further includes: before collecting the counter values of the statistical objects on all nodes including the first node in the cluster (i.e., before step S101), creating a performance statistics database when the first node is powered on, and creating a database table in the database for each preset statistical object type.
- the statistical object types include: a node, a network port, a virtual disk, a volume, and a logical unit number.
- the method further includes:
- the statistical records are stored in the performance statistics database.
- a database table is created for each statistical object type according to the statistical object type.
- the statistical records are stored one by one according to the type of the statistical object to the corresponding database table of the performance statistics database.
- for example, if the statistical objects are port 1, port 2, virtual disk a, and volume 1, the statistical records of port 1 and port 2 are stored in the network port database table, the statistical records of virtual disk a are stored in the virtual disk database table, and the statistical records of volume 1 are stored in the volume database table.
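- the one-table-per-type layout can be sketched with an in-memory SQLite database. The table names, column names, and the single `traffic` column are assumptions for illustration; the source does not specify a schema or a database engine.

```python
import sqlite3

# Object types named in the text; table and column names are hypothetical.
OBJECT_TYPES = ("node", "network_port", "virtual_disk", "volume", "lun")


def init_database(conn):
    """Create one table per preset statistical object type."""
    for object_type in OBJECT_TYPES:
        # Table names come only from the fixed tuple above, never from user input.
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {object_type}_stats ("
            "object_id TEXT NOT NULL, stat_time INTEGER NOT NULL, "
            "traffic REAL, PRIMARY KEY (object_id, stat_time))"
        )


def store_record(conn, object_type, object_id, stat_time, traffic):
    """Route a statistical record to the table that matches its object type."""
    if object_type not in OBJECT_TYPES:
        raise ValueError(f"unknown statistical object type: {object_type}")
    conn.execute(
        f"INSERT INTO {object_type}_stats VALUES (?, ?, ?)",
        (object_id, stat_time, traffic),
    )


conn = sqlite3.connect(":memory:")
init_database(conn)
store_record(conn, "network_port", "port 1", 1000, 12.5)
store_record(conn, "network_port", "port 2", 1000, 7.0)
store_record(conn, "virtual_disk", "virtual disk a", 1000, 3.2)
store_record(conn, "volume", "volume 1", 1000, 1.1)
port_rows = conn.execute("SELECT COUNT(*) FROM network_port_stats").fetchone()[0]
volume_rows = conn.execute("SELECT COUNT(*) FROM volume_stats").fetchone()[0]
```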
- the statistical object type and the statistical object may be added or deleted in real time according to the statistical requirements of the cluster device during actual operation.
- the database in the embodiment of the present invention needs to be shared between the master node and the slave nodes of the cluster, that is, the same performance statistics database is saved on both the master node and the slave nodes. Therefore, it must be ensured that both the master node and the slave nodes can hold the performance statistics database.
- to this end, in the embodiment of the present invention, the method further includes steps S201 to S203:
- S201 Receive, from the second node that is a slave node in the cluster, data information that includes the memory capacity of the second node.
- if the slave node capacity is greater than or equal to the master node capacity, there is no need to delete records from the database.
- otherwise, records must be deleted from the database; optionally, the earliest generated statistical records are deleted in chronological order, and the most recent statistical records are retained.
- a maximum number of statistical records is set for the statistical objects of each statistical object type.
- before a new statistical record is saved, it is determined whether the number of statistical records of the current statistical object has reached the maximum value. If the maximum value has been reached, the oldest statistical record of the statistical object is deleted first; the new statistical record can then be saved to the database, and it is synchronized to the other slave nodes by the database synchronization mechanism.
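- the per-object limit with oldest-first deletion can be sketched as below. The `RecordStore` class, the `stat_time` field, and the limit of 3 are hypothetical; the source gives no concrete maximum and stores records in a database rather than in memory.

```python
from collections import defaultdict

MAX_RECORDS_PER_OBJECT = 3  # hypothetical limit; the source gives no value


class RecordStore:
    """Keeps at most max_records statistical records per statistical object."""

    def __init__(self, max_records=MAX_RECORDS_PER_OBJECT):
        self.max_records = max_records
        self._records = defaultdict(list)  # object_id -> list of records

    def save(self, object_id, record):
        records = self._records[object_id]
        if len(records) >= self.max_records:
            # Limit reached: drop the record with the earliest statistical time.
            oldest = min(records, key=lambda r: r["stat_time"])
            records.remove(oldest)
        records.append(record)

    def records_for(self, object_id):
        return sorted(self._records[object_id], key=lambda r: r["stat_time"])


store = RecordStore()
for t in (10, 20, 30, 40):
    store.save("volume 1", {"stat_time": t, "traffic": float(t)})
kept = [r["stat_time"] for r in store.records_for("volume 1")]
```

After the fourth save, the record with statistical time 10 has been dropped and the three most recent records remain.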
- to help the cluster administrator better manage the cluster according to the statistical records, the cluster device performance synchronization statistics method in the embodiment of the present invention further includes steps S301 to S303:
- S301 Receive a preset request instruction, where the preset request instruction is a query request instruction or an export request instruction.
- according to the request parameter carried by the preset request instruction (the request parameter may include a statistical object ID, a measurement type, or a time interval), the statistical records are searched for a first statistical record corresponding to the statistical object that matches the request parameter.
- the filter obtains the statistical records of the last hour that match the measurement type; measurement types include traffic, delay, and IOPS (Input/Output Operations Per Second). The required fields of each statistical record (for example, the traffic measurement of a node statistical object) are then extracted and returned as the query response.
- the parameters carried by the request may or may not include a statistical object ID. If a statistical object ID is present, it is first matched against the element IDs in the array of the corresponding statistical object type; if there is no match, a response indicating that the statistical object does not exist, or an error, is returned directly. Otherwise, all statistical records of that statistical object are queried from the database. If the request carries no statistical object ID, all statistical records of all statistical objects are queried. The queried statistical records are saved to a file, the file is sent to the requester, and the file is then deleted.
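- the query and export handling described above can be sketched as follows. The record layout, the `KNOWN_IDS` set standing in for the index array, the JSON file format, and reading the file back in place of sending it to the requester are all assumptions made for the example.

```python
import json
import os
import tempfile

# Hypothetical in-memory records; field names are illustrative only.
RECORDS = [
    {"object_id": "node-1", "stat_time": 100, "traffic": 5.0, "delay": 1.2, "iops": 300},
    {"object_id": "node-1", "stat_time": 4000, "traffic": 6.0, "delay": 1.1, "iops": 320},
    {"object_id": "node-2", "stat_time": 4100, "traffic": 2.0, "delay": 0.9, "iops": 150},
]
KNOWN_IDS = {"node-1", "node-2"}  # stands in for the index array of object IDs


def query(object_id, measurement, now, window_s=3600):
    """Return (stat_time, value) pairs for one object within the time window."""
    if object_id not in KNOWN_IDS:
        raise LookupError(f"statistical object {object_id!r} does not exist")
    return [
        (r["stat_time"], r[measurement])
        for r in RECORDS
        if r["object_id"] == object_id and now - window_s <= r["stat_time"] <= now
    ]


def export(object_id=None):
    """Save matching records to a file, read it back as the reply, delete the file."""
    if object_id is not None and object_id not in KNOWN_IDS:
        raise LookupError(f"statistical object {object_id!r} does not exist")
    rows = [r for r in RECORDS if object_id is None or r["object_id"] == object_id]
    fd, path = tempfile.mkstemp(suffix=".json")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(rows, f)
        with open(path) as f:
            payload = f.read()  # stands in for sending the file to the requester
    finally:
        os.remove(path)
    return payload


last_hour = query("node-1", "traffic", now=4200)
exported = json.loads(export())
```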
- the cluster device performance synchronization statistical method in the embodiment of the present invention further includes steps S401 to S405:
- the system time change mainly refers to the adjustment of the system time when the cluster administrator finds that the system time is incorrect.
- S403 Calculate the current system time, the last statistical time, and the preset collection time to obtain an operation result.
- S404 Compare an absolute value of the operation result with a preset value, and if the absolute value is greater than the preset value, further determine whether the operation result is less than zero.
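- the time-change handling can be sketched as below. The source does not state how the operation result is computed, so the formula in `clock_offset` (current time minus last statistical time minus the collection interval) is purely an assumption, as are the function names and the threshold value.

```python
def clock_offset(current_time, last_stat_time, interval_s):
    """Assumed form of the calculation: how far the clock deviates from where
    it should be one collection interval after the last statistical time."""
    return current_time - last_stat_time - interval_s


def adjust_stat_times(stat_times, offset, threshold_s):
    """Shift recorded statistical times when the clock jump exceeds the threshold."""
    if abs(offset) <= threshold_s:
        return list(stat_times)  # within the preset value: leave records alone
    if offset < 0:
        # Operation result below zero: subtract its absolute value.
        return [t - abs(offset) for t in stat_times]
    # Operation result zero or above: add its absolute value.
    return [t + abs(offset) for t in stat_times]


# Last sample at t=1000 with a 60 s interval, but the clock now reads 400.
offset = clock_offset(current_time=400, last_stat_time=1000, interval_s=60)
adjusted = adjust_stat_times([880, 940, 1000], offset, threshold_s=5)
```

Under this assumed formula, shifting every stored time by the same offset keeps the spacing between records intact after the administrator corrects the system clock.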
- the cluster device performance synchronization statistics method in the embodiment of the present invention further includes steps S501 to S503:
- when a statistical object is added, the statistical object data stored in the database is compared with the index array data of the statistical object, and a statistical object that does not exist in the index array data is a newly added object; this statistical object is then added to the index array data. When a statistical object is deleted, the statistical object data stored in the database is compared with the index array data of the statistical object; if some statistical objects in the index array data do not exist in the statistical object data stored in the database, those non-existent statistical objects need to be deleted from the index array data.
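- the comparison between stored statistical object data and the index array can be sketched as a set difference. The function name and the use of plain string IDs are assumptions; the source does not describe the layout of the index array.

```python
def reconcile_index(stored_ids, index_ids):
    """Return the updated index plus the IDs that were added and removed."""
    stored = set(stored_ids)
    indexed = set(index_ids)
    added = sorted(stored - indexed)    # in stored data but missing from the index
    removed = sorted(indexed - stored)  # in the index but no longer stored
    # Keep surviving entries in their original order, then append new ones.
    updated = [i for i in index_ids if i in stored] + added
    return updated, added, removed


updated, added, removed = reconcile_index(
    stored_ids=["port 1", "volume 1", "volume 2"],
    index_ids=["port 1", "port 2", "volume 1"],
)
```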
- the master node in the cluster is responsible for collecting the counter information of the device statistical objects from each node of the cluster, processing and storing it, and synchronizing it to the other nodes; it also provides performance statistics queries, ensuring high reliability of cluster performance statistics.
- the embodiment of the present invention further provides a cluster device performance synchronization statistics system 1 applied to a first node as a master node in a cluster device, where the cluster device performance synchronization statistics system includes:
- the first collection module 101 is configured to collect a counter value of a statistical object on all nodes including the first node in the cluster;
- the first statistical record generating module 102 is configured to separately collect the counter values of the statistical objects on each of the nodes to generate a statistical record
- the first synchronization module 103 is configured to synchronize the statistical record to other nodes in the cluster device than the first node.
- the cluster device performance synchronization statistics system further includes:
- the first initialization module 104 is configured to, before the counter values of the statistical objects on all nodes including the first node in the cluster are collected, create a performance statistics database when the first node is powered on and create a database table in the database for each preset statistical object type.
- the statistical object types include: a node, a network port, a virtual disk, a volume, and a logical unit number.
- the first statistical record generating module 102 is further configured to, after collecting the counter values of the statistical objects on each of the nodes and generating the statistical records, store the statistical records in the performance statistics database.
- the cluster device performance synchronization statistics system further includes:
- the first receiving module 105 is configured to receive, by the second node that is a slave node in the cluster, data information that includes a memory capacity of the second node.
- the comparing module 106 is configured to compare the memory capacity of the second node with the memory capacity of the first node.
- the deleting module 107 is configured to delete part of the statistical records in the performance statistics database if the memory capacity of the second node is smaller than the memory capacity of the first node.
- the cluster device performance synchronization statistics system further includes:
- the second receiving module 108 is configured to receive a preset request instruction, where the preset request instruction is a query request instruction or an export request instruction.
- the searching module 109 is configured to search, according to the request parameter carried by the preset request instruction, a first statistical record corresponding to the statistical object that matches the request parameter in the statistical record.
- the information feedback module 110 is configured to send the first statistical record to the sender of the preset request instruction.
- the cluster device performance synchronization statistics system further includes:
- the first monitoring module 111 is configured to monitor whether the system time has changed.
- the time acquisition module 112 is configured to, if the system time has changed, acquire the current system time of the first node when the preset acquisition time arrives.
- the calculation module 113 is configured to calculate the current system time, the last statistical time, and the preset acquisition time to obtain an operation result.
- the determining module 114 is configured to compare the absolute value of the operation result with a preset value, and if the absolute value is greater than the preset value, determine whether the operation result is less than zero.
- the calculation and assignment module 115 is configured to: if the operation result is less than zero, subtract the absolute value of the operation result from the statistical time of the statistical record to obtain a first result, and use the first result as the latest statistical time of the statistical record; if the operation result is greater than or equal to zero, add the absolute value of the operation result to the statistical time of the statistical record to obtain a second result, and use the second result as the latest statistical time of the statistical record.
- the cluster device performance synchronization statistics system further includes:
- the second monitoring module 116 is configured to monitor whether there is a pre-processing operation on the statistical object data stored in the cluster management, where the pre-processing operation is an adding operation or a deleting operation of the statistical object.
- the statistical object obtaining module 117 is configured to acquire the statistical object data stored in the cluster management and the index array data of the statistical object if the preprocessing operation is detected;
- the comparison processing module 118 is configured to compare the statistical object data with the index array data, and preprocess the index array data according to the comparison result.
- this cluster device performance synchronization statistics system corresponds to the foregoing cluster device performance synchronization statistics method; all implementation manners of the method are applicable to the system, and the same technical effects as the method can be achieved.
- the embodiment of the present invention further provides a cluster device performance synchronization statistics method, which is applied to a second node that is a slave node in a cluster device, and the cluster device performance synchronization statistics method includes:
- the cluster device performance synchronization statistics method further includes:
- a performance statistics database is created, and a database table is created in the database for each preset statistical object type.
- the statistical object types include: a node, a network port, a virtual disk, a volume, and a logical unit number.
- because the statistical objects corresponding to the slave node and the master node may be different, when the slave node is powered on for the first time it also needs to establish a performance statistics database for its own statistical objects; then, when interacting with the master node, it saves the statistical records obtained from the master node to its own performance statistics database.
- the cluster device performance synchronization statistics method of the present invention further includes steps S601 to S604:
- the slave node then takes over the function of the master node, performing statistical data collection and statistical record generation; with this slave node acting as the master node, it can implement all the functions of the master node described above, ensuring that the cluster performance statistics are highly sustainable.
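- the takeover behavior can be sketched as below: a slave that already holds synchronized records switches role, collects counter values from all powered-on nodes, and synchronizes the result. The `StatsNode` class, its method names, and the `iops` counter are hypothetical; how the change signal is produced and detected is not shown.

```python
class StatsNode:
    """Hypothetical node that stores synced records and can take over as master."""

    def __init__(self, name, counters):
        self.name = name
        self.counters = counters  # local raw counter values
        self.is_master = False
        self.records = []

    def on_role_change(self, peers):
        """Called when a slave-to-master change signal has been detected."""
        self.is_master = True
        self.collect_and_sync(peers)

    def collect_and_sync(self, peers):
        if not self.is_master:
            raise RuntimeError("only the master collects and synchronizes")
        # Collect counter values from every powered-on node, including itself.
        for node in [self, *peers]:
            self.records.append({"node": node.name, **node.counters})
        # Synchronize the full record list to the remaining nodes.
        for peer in peers:
            peer.records = [dict(r) for r in self.records]


slave_a = StatsNode("node-2", {"iops": 120})
slave_b = StatsNode("node-3", {"iops": 80})
slave_a.records = [{"node": "node-1", "iops": 200}]  # synced before the master failed
slave_a.on_role_change(peers=[slave_b])
```

Because the new master appends to the records it already received, the history from the failed master is preserved on every remaining node.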
- the embodiment of the present invention provides a cluster device performance synchronization statistics system 2, which is applied to a second node as a slave node in the cluster device, and the cluster device performance synchronization statistics system includes:
- the obtaining module 201 is configured to acquire a statistical record on the first node, and store the statistical record.
- the cluster device performance synchronization statistics system further includes:
- the second initialization module 202 is configured to create a performance statistics database when the second node is powered on, and create a database table on the database for each of the preset statistical object types.
- the statistical object types include: a node, a network port, a virtual disk, a volume, and a logical unit number.
- the cluster device performance synchronization statistics system further includes:
- the detecting module 204 is configured to detect whether a change signal that is changed from the second node to the first node is generated;
- the second collection module 205 is configured to: if the change signal has been generated, collect a counter value of a statistical object on all nodes including the second node that is powered on in the cluster;
- the second statistical record generating module 206 is configured to aggregate the counter values of the statistical objects on each of the nodes to generate a statistical record;
- the second synchronization module 207 is configured to synchronize the statistical record to other nodes in the cluster device than the second node.
- the cluster device performance synchronization statistics system corresponds to the foregoing cluster device performance synchronization statistics method. All implementations of the method apply to the system, and the system achieves the same technical effect as the method.
- the clustering device performance synchronization statistics system is illustrated as follows:
- the cluster device performance synchronization statistics system can be divided by function into units such as the power-on initialization unit 11.
- the cluster device performance synchronization statistics system exists on each node of the cluster, but in the actual application, the working unit of the cluster device performance synchronization statistics system on the master node is the power-on initialization unit 11,
- the working units of the synchronous statistical system are the power-on initialization unit 11, the master-slave node interaction unit 14, and the takeover performance statistics unit 15; the original counter value collection unit 12, the performance statistics data generation unit 13, and the query and export statistics unit 16 do not work.
- the power-on initialization unit 11 runs when each node of the cluster is powered on. It is responsible for creating the performance statistics database, creating a database table in the database for each statistical object type, synchronizing database data on a slave node (the slave node initiates a performance statistics database synchronization request, the data in each statistical object type table of the database on the master node is synchronized to the corresponding table on the slave node, and later data changes on the master node are synchronized incrementally to the slave node database), and registering for node status change notifications (the takeover performance statistics unit 15 is notified once the node status changes).
- the original counter value collection unit 12 runs on the cluster master node, enables the data collection timer, and the collection interval can be configured as needed, for example, 15 seconds.
- the cluster node list information is obtained through cluster management.
- using the address information of each node in the cluster node list, a request is sent to each service subsystem of the node to obtain the original counter value of the corresponding statistical object and the time interval between the previous and the current statistics.
- the original counter value and statistical time interval obtained for each statistical object are sent to the performance statistics generating unit 13. Note that the service subsystem should clear the original counter value of the statistical object after processing the request.
- the performance statistics generating unit 13 runs on the cluster master node and summarizes the collected raw counter values of the statistical objects of each cluster node, using the calculation that fits each metric: for traffic statistics, the raw traffic counter value is divided by the statistical time interval; for the cache hit rate, the number of hits is divided by (the number of hits + the number of misses).
- each original counter value is processed into a data item that is easy for the user to read.
- the statistical record of a statistical object is composed of several such data items and needs to be stored in the database.
- a statistical object has one statistical record per statistical time point. The statistical record is stored in the database, and the database synchronization mechanism then synchronizes it to all cluster slave nodes as a backup. Because the in-memory database occupies memory and cannot hold data indefinitely, a limit is needed.
- the limit is a maximum number of statistical records for each statistical object of each statistical object type, determined by how long records are to be kept. Before a statistical record is saved, the number of statistical records of the current statistical object is checked.
- if the maximum has been reached, the oldest statistical record of the statistical object is deleted first; the new statistical record is then saved to the database and synchronized to the other slave nodes by the database synchronization mechanism.
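The calculations and the record limit described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names `StatRecord`, `throughput`, `cache_hit_rate`, and `ObjectHistory` are hypothetical, and a `deque` with `maxlen` stands in for the "delete the oldest record, then save the new one" rule.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class StatRecord:
    """One statistical record of one statistical object (hypothetical layout)."""
    timestamp: float  # statistical time of this record
    items: dict       # readable data items, e.g. {"throughput": 100.0}


def throughput(raw_counter: int, interval_s: float) -> float:
    """Traffic statistic: raw traffic counter divided by the statistical interval."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    return raw_counter / interval_s


def cache_hit_rate(hits: int, misses: int) -> float:
    """Cache hit rate: hits / (hits + misses).

    Returning 0.0 when there were no accesses is an assumption of this sketch.
    """
    total = hits + misses
    return hits / total if total else 0.0


class ObjectHistory:
    """Bounded record history for a single statistical object."""

    def __init__(self, max_records: int):
        # A full deque silently discards its oldest entry on append.
        self.records = deque(maxlen=max_records)

    def save(self, record: StatRecord) -> None:
        self.records.append(record)
```

With the 15-second collection interval mentioned in the text, keeping one hour of history would correspond to `max_records = 3600 // 15`, that is, 240 records per statistical object.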
- the master-slave node interaction unit 14 runs when a cluster slave node is powered on and informs the cluster master node of the memory size of its own node device. The master node receives the request and determines, from the request parameter, how much data the database needs to retain.
- takeover performance statistics unit 15: because cluster performance statistics are performed on the master node, they cannot proceed normally if the master node is abnormal. When the cluster master node goes down because of a manual restart or for an unknown reason, cluster management elects a new master node from the cluster slave nodes. The takeover performance statistics unit 15 of the slave node newly elected as the cluster master node is notified of the change and takes over the performance statistics task: it reads all statistical objects from the database, saves them to the statistical object index array of each type, notifies the service subsystems to clear each original counter value, and enables the timer to start cluster device performance statistics.
- the query and export statistics unit 16 receives the query request sent by the user and returns the corresponding statistical records of all the statistical objects according to the parameters carried by the request; the client uses the statistical records to draw a dynamic statistical graph. The unit also receives the export request sent by the user.
- according to the request parameter, the corresponding statistical records of the selected statistical object are returned in the form of a file, which the user saves to a local directory in order to keep the overall view or the historical statistical records.
- the original counter value is held in a variable, so it will inevitably overflow if the system runs for a long time. In addition, not every request succeeds; after a failure, the performance statistics generating unit does not know the exact statistical time interval, which makes the calculation imprecise.
- the processing method for case one is steps S701 to S706 (shown in FIG. 4):
- S701: the original counter value collection unit 12 first requests the service subsystem to obtain the original counter value of the statistical object.
- S702: the service subsystem records the statistical time of the request, returns the original counter value, and clears the original counter value.
- S703: the original counter value collection unit 12 again requests the service subsystem to obtain the original counter value.
- S704: the service module is very busy and does not respond in time, so the request times out.
- S705: the original counter value collection unit 12 continues to request the service module to obtain the original counter value.
- S706: in response to the request, the service subsystem obtains the current system time and subtracts the statistical time recorded in step S702 to get the statistical time interval between the two points. It saves the current system time as the statistical time of this request, returns the original counter value and the statistical time interval to the original counter value collection unit 12, and clears the original counter value.
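The key point of steps S701 to S706 is that the service side, not the collector, tracks the time of the last successful read, so a timed-out request simply widens the next interval. The sketch below assumes a hypothetical `ServiceCounter` class with an injectable clock; neither name appears in the source.

```python
import time


class ServiceCounter:
    """Hypothetical service-subsystem counter that reports its own interval."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._value = 0
        self._last_stat_time = clock()  # statistical time of the last successful read

    def add(self, amount: int) -> None:
        self._value += amount

    def read_and_clear(self):
        """Return (raw value, seconds since the last successful read), then reset."""
        now = self._clock()
        interval = now - self._last_stat_time
        value = self._value
        self._value = 0              # clearing after each read also limits overflow
        self._last_stat_time = now   # becomes the statistical time of this request
        return value, interval
```

If one request in a 15-second cycle times out before reaching `read_and_clear`, the next successful call returns the accumulated counter together with a 30-second interval, so a rate computed from the pair stays correct.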
- a statistical record corresponds to a unique statistical time. If the device system time is initially incorrect, the incorrect dates are written into the statistical records during statistics. When the device system time is corrected later, the statistical times of the records generated earlier are not corrected.
- the processing method for case 2 is steps S801 to S803 (as shown in FIG. 5):
- the system time is abnormal, and the performance statistical data generating unit 13 saves the statistical record to the database, and records the system time as the latest statistical time.
- the statistics timer fires (the timer period is 15 seconds); the current system time is obtained and a first time value is subtracted from it.
- the first time value is the statistical time of the last statistics plus 15 seconds. The absolute value of the difference is then taken. If the absolute value is greater than a configured threshold (8 seconds here), the system time has been corrected. In that case, if the current system time minus the first time value is less than 0, the time has been set back, and the previously calculated absolute value is subtracted from the timestamp field of every statistical record and the result is saved. If the current system time minus the first time value is greater than or equal to 0, the absolute value is added to the timestamp field of every statistical record and the result is saved. Finally, the current system time is recorded as the latest statistical time.
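The correction rule above can be written compactly, because subtracting the absolute value when the difference is negative and adding it when the difference is non-negative both amount to adding the signed difference. The function name `correct_timestamps` and the plain list of timestamps are assumptions of this sketch; the 15-second period and 8-second threshold come from the text.

```python
PERIOD_S = 15     # statistics timer period from the text
THRESHOLD_S = 8   # configured threshold from the text


def correct_timestamps(timestamps, last_stat_time, now,
                       period=PERIOD_S, threshold=THRESHOLD_S):
    """Shift stored record timestamps when the system clock was corrected.

    Returns (timestamps, new_last_stat_time).
    """
    # Current system time minus the "first time value" (last statistical time + period).
    diff = now - (last_stat_time + period)
    if abs(diff) > threshold:
        # diff < 0: clock was set back, so abs(diff) is subtracted.
        # diff >= 0: clock was set forward, so abs(diff) is added.
        timestamps = [t + diff for t in timestamps]
    # The current system time always becomes the latest statistical time.
    return timestamps, now
```

For example, with records at 1000 and 1015 and the clock set back so that the next tick reads 430 instead of 1030, the difference is -600 and the records move to 400 and 415.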
- statistical objects may be added and deleted dynamically at any time. When a statistical object is added, no statistical information can yet be queried for it, so statistical information for it must be added. Similarly, when a statistical object is deleted, all of its statistics must be deleted.
- the performance statistics data generating unit 13 acquires all statistical objects of the current cluster.
- each statistical object obtained in the previous step is looked up in turn in the statistical object index array. If it is not found, it is a newly added statistical object: a new entry is added to the index array, and the statistical record of the new statistical object is calculated and saved to the database.
- the performance statistical data generating unit 13 acquires a current cluster statistical object.
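Handling added and deleted statistical objects amounts to reconciling the index with the set of objects that currently exist in the cluster. The sketch below is illustrative only: it models the "index array" as a dictionary from a hypothetical object id to that object's list of records, and the function name `reconcile` is not from the source.

```python
def reconcile(index: dict, current_ids: set) -> tuple:
    """Align the statistical-object index with the objects that currently exist.

    index maps object id -> list of statistical records.
    Returns (added ids, removed ids), both sorted.
    """
    existing = set(index)
    added = sorted(current_ids - existing)
    removed = sorted(existing - current_ids)
    for obj_id in added:
        index[obj_id] = []   # new object: start an empty record history
    for obj_id in removed:
        del index[obj_id]    # deleted object: drop all of its records
    return added, removed
```

Running this once per collection cycle keeps queries from returning records for objects that no longer exist, and ensures a new object starts accumulating records from its first cycle.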
- the processing method for case 4 includes steps S1101 to S1104 (shown in FIG. 8):
- S1102 Obtain cluster node list information, and obtain a node that is powered off according to the state of the node.
- S1104 Save the statistical record of the statistical object to the database.
- the original counter value collection unit initiates the acquisition request. Because the service module may be too busy to process the request, and the timed collection of performance statistics cannot wait for long, the original counter value is not obtained successfully every time. A fallback is therefore needed so that the user does not perceive a problem with the system.
- the processing manner for case 5 includes steps S1201 to S1204 (as shown in FIG. 9):
- S1201: the original counter value collection unit 12 requests the service subsystem to obtain the original counter value.
- S1202: the service subsystem does not respond in time, and the request times out.
- S1203: the data fields of the current statistical record of the statistical object are assigned from the data fields of the object's previous statistical record; the new statistical record differs only in its statistical time.
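The timeout fallback can be sketched as below, assuming records are dictionaries with hypothetical `time` and `data` keys and that the collector signals a timeout by raising Python's built-in `TimeoutError`. None of these names come from the source.

```python
def next_record(previous: dict, fetch, now: float) -> dict:
    """Build the next statistical record for one object.

    fetch() returns fresh data fields or raises TimeoutError. On timeout the
    previous record's data fields are reused and only the time changes.
    """
    try:
        data = fetch()
    except TimeoutError:
        data = dict(previous["data"])  # copy, so the old record is not aliased
    return {"time": now, "data": data}
```

The effect is that a chart drawn from the records shows a flat segment for the missed cycle rather than a gap.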
- the main features of the cluster device performance synchronization statistics system in the embodiment of the present invention are as follows: 1) the master node in the cluster collects the raw counter information of the statistical objects of the node devices from each node of the cluster, stores and synchronizes it, and answers performance statistics queries. In effect, the master node plays the role of the central host. When the current master node fails, performance statistics are handed over to the newly elected master node, which keeps performance statistics highly sustainable; 2) performance statistics are generated and saved on the cluster nodes, and statistical data can be obtained in both B/S mode and C/S mode; 3) when a node joins or leaves the cluster, the running node of a volume statistical object is switched, or the user adds or deletes statistical objects, the change is detected dynamically and performance statistics are performed for it.
- all or part of the steps of the above embodiments may also be implemented using an integrated circuit.
- the steps may be separately fabricated into individual integrated circuit modules, or a plurality of modules or steps may be fabricated into a single integrated circuit module.
- the devices/function modules/functional units in the above embodiments may be implemented by a general-purpose computing device, which may be centralized on a single computing device or distributed over a network of multiple computing devices.
- when the device/function module/functional unit in the above embodiment is implemented in the form of a software function module and sold or used as a stand-alone product, it can be stored in a computer readable storage medium.
- the above mentioned computer readable storage medium may be a read only memory, a magnetic disk or an optical disk or the like.
- the master node in the cluster collects the counter values of the statistical objects on each node in the cluster, generates statistical records, and synchronizes the statistical records to each slave node in the cluster.
- when the master node of the cluster is powered off or goes down, the slave nodes in the cluster can continue the work of the master node based on the synchronized statistical records, which keeps cluster performance statistics highly sustainable.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Debugging And Monitoring (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
The present application relates to, but is not limited to, the field of NAS (Network Attached Storage) storage based on cluster technology, and in particular to a cluster device performance synchronization statistics method and system.
In a traditional multi-device environment, performance statistics are generally implemented by a network management system that actively communicates with service device agents, or receives reports from them, to obtain raw performance parameter values and then processes those values. This approach has three main characteristics: first, a central host is required to run the network management system; second, the network management system and the service device agents are implemented in C/S mode; third, the statistical objects of each device are unrelated to one another.
Cluster technology is a relatively new technology that delivers relatively high gains in performance, reliability, and flexibility at a comparatively low cost. A cluster is a whole formed by a group of mutually independent devices interconnected through a high-speed network. A cluster environment is also a multi-device environment, but it has its own particularities. Taking a NAS storage cluster environment as an example, the traditional statistical approach above is unsuitable in three situations: 1) the running node of a volume statistical object is switched; 2) a node joins or leaves the cluster; 3) statistical objects are added or deleted on a node. In these three situations a traditional network management system cannot dynamically detect the change and therefore cannot perform normal performance statistics.
Summary of the Invention
The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of protection of the claims.
A cluster device performance synchronization statistics method, applied to a first node serving as the master node in a cluster device, includes:
collecting counter values of statistical objects on all nodes in the cluster, including the first node;
aggregating the counter values of the statistical objects on each of the nodes to generate statistical records; and
synchronizing the statistical records to the nodes in the cluster device other than the first node.
Optionally, the method further includes:
before collecting the counter values of the statistical objects on all nodes in the cluster, including the first node, and when the first node is powered on and running, creating a performance statistics database and creating a database table in the database for each preset statistical object type;
wherein the statistical object types include: node, network port, virtual disk, volume, and logical unit number.
Optionally, the method further includes:
after aggregating the counter values of the statistical objects on each of the nodes to generate the statistical records, storing the statistical records in the performance statistics database.
Optionally, the method further includes:
receiving data information, sent by a second node serving as a slave node in the cluster, that contains the memory capacity of the second node;
comparing the memory capacity of the second node with the memory capacity of the first node; and
if the memory capacity of the second node is smaller than the memory capacity of the first node, deleting some of the statistical records in the performance statistics database.
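The memory comparison step can be illustrated as follows. The text only states that some records are deleted when the slave node has less memory than the master node; it does not say how many. Keeping the newest records in proportion to the memory ratio is purely an assumption of this sketch, as is the function name `trim_for_slave`.

```python
def trim_for_slave(records: list, master_mem: int, slave_mem: int) -> list:
    """Return the records to retain after a slave reports its memory capacity.

    Assumed policy (not from the source): keep the newest records, scaled by
    the slave/master memory ratio. records is ordered oldest to newest.
    """
    if slave_mem >= master_mem:
        return records  # slave can hold everything the master holds
    keep = len(records) * slave_mem // master_mem
    return records[len(records) - keep:]
```

Under this assumed policy, a slave with half the master's memory causes the older half of the records to be dropped, so the database synchronized to that slave fits in its memory.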
Optionally, the method further includes:
receiving a preset request instruction, wherein the preset request instruction is a query request instruction or an export request instruction;
searching the statistical records, according to the request parameter carried by the preset request instruction, for a first statistical record corresponding to the statistical object that matches the request parameter; and
sending the first statistical record to the sender of the preset request instruction.
Optionally, the method further includes:
monitoring whether the system time has changed;
when a change in the system time is detected, obtaining the current system time of the first node when the preset collection time arrives;
performing an operation on the current system time, the last statistical time, and the preset collection time to obtain an operation result;
comparing the absolute value of the operation result with a preset value and, if the absolute value is greater than the preset value, determining whether the operation result is less than zero; and
if the operation result is less than zero, subtracting the absolute value of the operation result from the statistical time of the statistical record to obtain a first result, and using the first result as the latest statistical time of the statistical record; if the operation result is greater than or equal to zero, adding the absolute value of the operation result to the statistical time of the statistical record to obtain a second result, and using the second result as the latest statistical time of the statistical record.
Optionally, the method further includes:
monitoring whether there is a preprocessing operation on the statistical object data stored in cluster management, the preprocessing operation being an operation that adds or deletes a statistical object;
if the preprocessing operation is detected, obtaining the statistical object data stored in cluster management and the index array data of the statistical objects; and
comparing the statistical object data with the index array data, and preprocessing the index array data according to the comparison result.
A cluster device performance synchronization statistics system, applied to a first node serving as the master node in a cluster device, includes:
a first collection module, configured to collect counter values of statistical objects on all nodes in the cluster, including the first node;
a first statistical record generating module, configured to aggregate the counter values of the statistical objects on each of the nodes to generate statistical records; and
a first synchronization module, configured to synchronize the statistical records to the nodes in the cluster device other than the first node.
Optionally, the statistics system further includes:
a first initialization module, configured to, before the counter values of the statistical objects on all nodes in the cluster, including the first node, are collected and when the first node is powered on and running, create a performance statistics database and create a database table in the database for each preset statistical object type;
wherein the statistical object types include: node, network port, virtual disk, volume, and logical unit number.
Optionally, the first statistical record generating module is further configured to store the statistical records in the performance statistics database after aggregating the counter values of the statistical objects on each of the nodes to generate the statistical records.
Optionally, the system further includes:
a first receiving module, configured to receive data information, sent by a second node serving as a slave node in the cluster, that contains the memory capacity of the second node;
a comparison module, configured to compare the memory capacity of the second node with the memory capacity of the first node; and
a deletion module, configured to delete some of the statistical records in the performance statistics database if the memory capacity of the second node is smaller than the memory capacity of the first node.
Optionally, the system further includes:
a second receiving module, configured to receive a preset request instruction, wherein the preset request instruction is a query request instruction or an export request instruction;
a search module, configured to search the statistical records, according to the request parameter carried by the preset request instruction, for a first statistical record corresponding to the statistical object that matches the request parameter; and
an information feedback module, configured to send the first statistical record to the sender of the preset request instruction.
Optionally, the system further includes:
a first monitoring module, configured to monitor whether the system time has changed;
a time acquisition module, configured to, when a change in the system time is detected, obtain the current system time of the first node when the preset collection time arrives;
a calculation module, configured to perform an operation on the current system time, the last statistical time, and the preset collection time to obtain an operation result;
a determining module, configured to compare the absolute value of the operation result with a preset value and, if the absolute value is greater than the preset value, further determine whether the operation result is less than zero; and
a calculation and assignment module, configured to: if the operation result is less than zero, subtract the absolute value of the operation result from the statistical time of the statistical record to obtain a first result, and use the first result as the latest statistical time of the statistical record; if the operation result is greater than or equal to zero, add the absolute value of the operation result to the statistical time of the statistical record to obtain a second result, and use the second result as the latest statistical time of the statistical record.
The system further includes:
a second monitoring module, configured to monitor whether there is a preprocessing operation on the statistical object data stored in cluster management, the preprocessing operation being an operation that adds or deletes a statistical object;
a statistical object obtaining module, configured to, if the preprocessing operation is detected, obtain the statistical object data stored in cluster management and the index array data of the statistical objects; and
a comparison processing module, configured to compare the statistical object data with the index array data and preprocess the index array data according to the comparison result.
A cluster device performance synchronization statistics method, applied to a second node serving as a slave node in a cluster device, includes:
obtaining the statistical records on the first node and storing the statistical records.
Optionally, the method further includes:
before the step of obtaining the statistical records on the first node and storing the statistical records, and when the second node is powered on and running, creating a performance statistics database and creating a database table in the database for each preset statistical object type;
wherein the statistical object types include: node, network port, virtual disk, volume, and logical unit number.
Optionally, the method further includes:
detecting whether a change signal indicating a change from second node to first node has been generated;
if the change signal has been generated, collecting counter values of statistical objects on all powered-on nodes in the cluster, including the second node;
aggregating the counter values of the statistical objects on each of the nodes to generate statistical records; and
synchronizing the statistical records to the nodes in the cluster device other than the second node.
A cluster device performance synchronization statistics system, applied to a second node serving as a slave node in a cluster device, includes:
an obtaining module, configured to obtain the statistical records on the first node and store the statistical records.
Optionally, the system further includes:
a second initialization module, configured to, when the second node is powered on and running, create a performance statistics database and create a database table in the database for each preset statistical object type;
wherein the statistical object types include: node, network port, virtual disk, volume, and logical unit number.
Optionally, the system further includes:
a detection module, configured to detect whether a change signal indicating a change from second node to first node has been generated;
a second collection module, configured to, if the change signal has been generated, collect counter values of statistical objects on all powered-on nodes in the cluster, including the second node;
a second statistical record generating module, configured to aggregate the counter values of the statistical objects on each of the nodes to generate statistical records; and
a second synchronization module, configured to synchronize the statistical records to the nodes in the cluster device other than the second node.
A computer readable storage medium stores computer executable instructions that, when executed by a processor, implement the cluster device performance synchronization statistics method described above.
With the solution of the embodiments of the present invention, the master node in the cluster collects the counter values of the statistical objects on each node in the cluster, generates statistical records, and synchronizes the statistical records to each slave node in the cluster. This ensures that when the master node of the cluster is powered off or goes down, a slave node in the cluster can continue the work of the master node based on its synchronized statistical records, which keeps cluster performance statistics highly sustainable.
Brief Description of the Drawings
FIG. 1 is an overall flowchart of the cluster device performance synchronization statistics method applied to a master node according to an embodiment of the present invention;
FIG. 2 is a schematic block diagram of the cluster device performance synchronization statistics system applied to a master node according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the interaction between the cluster device performance synchronization statistics systems on the master node and on a slave node according to an embodiment of the present invention;
FIG. 4 is a flowchart of how the cluster device performance synchronization statistics system handles abnormal situation 1 according to an embodiment of the present invention;
FIG. 5 is a flowchart of how the cluster device performance synchronization statistics system handles abnormal situation 2 according to an embodiment of the present invention;
FIG. 6 is a first flowchart of how the cluster device performance synchronization statistics system handles abnormal situation 3 according to an embodiment of the present invention;
FIG. 7 is a second flowchart of how the cluster device performance synchronization statistics system handles abnormal situation 3 according to an embodiment of the present invention;
FIG. 8 is a flowchart of how the cluster device performance synchronization statistics system handles abnormal situation 4 according to an embodiment of the present invention;
FIG. 9 is a flowchart of how the cluster device performance synchronization statistics system handles abnormal situation 5 according to an embodiment of the present invention.
Embodiments of the present invention are described below with reference to the accompanying drawings. It should be noted that, provided there is no conflict, the embodiments in the present application and the features in the embodiments may be combined with one another arbitrarily.
The embodiments of the present invention address the following problem in related clusters: when device performance statistics are collected in the traditional way, changes in each node of the cluster cannot be monitored dynamically, so the sustainability of cluster performance statistics is low.
As shown in FIG. 1, an embodiment of the present invention provides a cluster device performance synchronization statistics method, applied to a first node serving as the master node in a cluster device. The method includes steps S101 to S103:
S101. Collect counter values of statistical objects on all nodes in the cluster, including the first node.
S102. Aggregate the counter values of the statistical objects on each of the nodes separately to generate statistical records.
S103. Synchronize the statistical records to the nodes in the cluster device other than the first node.
In the above solution of the embodiments of the present invention, the master node in the cluster collects the counter values of the statistical objects on each node in the cluster, generates statistical records, and synchronizes the statistical records to every slave node in the cluster. This ensures that when the master node of the cluster is powered off or crashes, a slave node can continue the master node's work on the basis of the statistical records synchronized to it, which keeps cluster performance statistics highly sustainable.
It should be noted that in step S101 the statistical objects on each node are predefined by the cluster device administrator and stored on the nodes of the cluster. While it runs, each node in the cluster collects the data generated for these statistical objects, processes that data, and saves the results in the node's counters. A counter can be regarded as a storage area on the node that holds the data of a statistical object. The process of taking a statistical object's data, processing it, and storing it in that storage area is well known to those skilled in the art and is not described in detail here.
Step S102 is the process of aggregating the counter values of the statistical objects on the master node of the cluster to generate statistical records. Optionally, the main implementation applies a calculation suited to each counter value. For example, traffic is obtained from the formula: traffic counter value / statistics interval; and the cache hit ratio is obtained from the formula: hits / (hits + misses). The counter values are processed into readable data items according to these different calculations, and the statistical record of a statistical object consists of multiple such data items.
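The two formulas in step S102 can be illustrated with a minimal Python sketch. The counter names (`traffic_bytes`, `cache_hits`, `cache_misses`), the data item names, and the choice to report a hit ratio of 0.0 when there were no lookups are assumptions made for illustration; the source does not specify them.

```python
def build_record(counters, interval_s):
    """Turn raw counter values into readable data items (sketch of step S102).

    `counters` is a hypothetical dict of raw counter values for one
    statistical object; `interval_s` is the statistics interval in seconds.
    """
    hits = counters["cache_hits"]
    misses = counters["cache_misses"]
    lookups = hits + misses
    return {
        # traffic = traffic counter value / statistics interval
        "traffic_bytes_per_s": counters["traffic_bytes"] / interval_s,
        # hit ratio = hits / (hits + misses); assumed 0.0 when nothing was looked up
        "cache_hit_ratio": hits / lookups if lookups else 0.0,
    }
```

A statistical record in this sketch is simply the dict of derived data items, which matches the description of a record as a collection of readable data items.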
Step S103 synchronizes the statistical records to the other nodes in the cluster. Optionally, it may be implemented as follows: each time a fixed interval elapses, the master node sends the statistical records it holds to the slave nodes, so that the data is shared with the slave nodes.
It should be noted that the statistical records should be stored separately according to statistical object type. To make querying and managing the data convenient, a database is normally used to hold it. Therefore, when the cluster master node is powered on and runs for the first time, a database for storing the statistical records must first be created. Optionally, this is implemented as follows:
The method further includes: before collecting the counter values of the statistical objects on all nodes in the cluster, including the first node (that is, before step S101), when the first node is powered on and running, creating a performance statistics database and creating a database table in the database for each preset statistical object type.
The statistical object types include: node, network port, virtual disk, volume, and logical unit number.
It should be noted that, on the basis of the created performance statistics database, the method optionally further includes:
after aggregating the counter values of the statistical objects on each of the nodes separately and generating the statistical records, storing the statistical records in the performance statistics database.
It should be noted that one database table is created for each statistical object type. When statistics are recorded, each statistical record is stored in the database table of the performance statistics database that corresponds to the type of its statistical object. For example, if the statistical objects are port 1, port 2, virtual disk a, and volume 1, then once the statistical record of each object has been obtained, the records of port 1 and port 2 are stored in the network port database table, the record of virtual disk a is stored in the virtual disk database table, and the record of volume 1 is stored in the volume database table. Statistical object types and statistical objects can be added or deleted in real time according to the statistical needs of the cluster device during actual operation. Storing the statistical record of each statistical object by statistical object type makes the statistics easier to manage.
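The per-type table layout can be sketched as follows. This is only an illustration: the source does not name a database engine, so the use of an in-memory SQLite database, the `stats_` table prefix, and the three-column schema are all assumptions.

```python
import json
import sqlite3

# The five statistical object types listed in the text (identifiers are assumed).
OBJECT_TYPES = ["node", "network_port", "virtual_disk", "volume", "lun"]

def create_stats_db():
    """Create the performance statistics database with one table per object type."""
    db = sqlite3.connect(":memory:")
    for object_type in OBJECT_TYPES:
        db.execute(
            f"CREATE TABLE stats_{object_type} "
            "(object_id TEXT, stat_time INTEGER, items TEXT)"
        )
    return db

def store_record(db, object_type, object_id, stat_time, items):
    """Store one statistical record in the table matching its object type."""
    if object_type not in OBJECT_TYPES:
        raise ValueError(f"unknown statistical object type: {object_type}")
    db.execute(
        f"INSERT INTO stats_{object_type} VALUES (?, ?, ?)",
        (object_id, stat_time, json.dumps(items)),
    )
```

With this layout, records for port 1 and port 2 would both go to `stats_network_port`, while virtual disk a and volume 1 would go to `stats_virtual_disk` and `stats_volume` respectively.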
Because the database in the embodiments of the present invention must be shared between the master node and the slave nodes of the cluster (that is, the master node and each slave node hold an identical copy of the performance statistics database), it must be ensured that both the master node and the slave nodes can accommodate the performance statistics database. Therefore, in the embodiments of the present invention, the method further includes steps S201 to S203:
S201. Receive data information, sent by a second node serving as a slave node in the cluster, that contains the memory capacity of the second node;
S202. Compare the memory capacity of the second node with the memory capacity of the first node;
S203. If the memory capacity of the second node is smaller than the memory capacity of the first node, delete some of the statistical records in the performance statistics database.
It should be noted that when the capacity of the slave node is greater than or equal to that of the master node, no records need to be deleted from the database. When records must be deleted, optionally the earliest-generated statistical records are deleted in chronological order and the most recent statistical records are retained.
Because the database occupies memory, data cannot be kept indefinitely, and a limit must be defined. A maximum number of statistical records is imposed on each statistical object of each statistical object type, and this maximum is determined by how long records are to be retained. For example, if the statistical records of a statistical object are to be kept for one day and the counters of the statistical object are collected every 15 seconds, the maximum number of statistical records is 24*60*60/15 = 5760. Before a statistical record is saved, it is determined whether the number of statistical records of the statistical object has reached the maximum. If it has, the oldest statistical record of that statistical object is deleted, and only then is the new statistical record saved to the database. The new statistical record is synchronized to the databases of the other slave nodes by the database synchronization mechanism.
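The retention limit described above can be sketched in Python. The class and function names are hypothetical, and an in-memory queue stands in for the per-object rows of the database; only the arithmetic and the delete-oldest-before-save rule come from the text.

```python
from collections import deque

def max_records(retention_s, interval_s):
    """Maximum records per statistical object: retention period / collection interval."""
    return retention_s // interval_s

class ObjectHistory:
    """Holds the statistical records of one statistical object, oldest first."""

    def __init__(self, limit):
        self.limit = limit
        self.records = deque()

    def save(self, record):
        # Before saving, check whether the maximum has been reached;
        # if so, delete the oldest record first.
        if len(self.records) >= self.limit:
            self.records.popleft()
        self.records.append(record)
```

For one day of retention at a 15-second interval, `max_records(24 * 60 * 60, 15)` gives the 5760 records worked out in the text.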
To help the cluster administrator manage the cluster better on the basis of the statistical records, the cluster device performance synchronization statistics method of the embodiments of the present invention further includes steps S301 to S303:
S301. Receive a preset request instruction, where the preset request instruction is a query request instruction or an export request instruction.
S302. According to the request parameters carried in the preset request instruction (the data contained in the request parameters may be a statistical object ID, a measurement type, or a time interval), look up, in the statistical records, a first statistical record corresponding to the statistical object that matches the request parameters.
S303. Send the first statistical record to the sender of the preset request instruction.
It should be noted that when statistical records are viewed by statistical object ID, the statistical object ID entered by the cluster administrator is first matched against the element IDs in the array of statistical objects of the relevant type. If there is no match, a "statistical object does not exist" response or an error is returned directly. If there is a match, all statistical records of that statistical object are queried from the database and filtered by the time interval (for example, a one-hour interval yields the records of the most recent hour). Then, according to the measurement type (measurement types include traffic, latency, IOPS (Input/Output Operations Per Second, i.e., the number of read and write operations per second), and so on), the required fields of each statistical record (for example, the traffic measurement of a node statistical object) are extracted and returned as the query response.
Because the space available for the database on a node is limited, while some statistical objects in the database may need to be kept for a long time, some objects in the database must be exported manually and stored on other devices. The parameters carried by an export request may or may not include a statistical object ID. If a statistical object ID is included, it is first matched against the element IDs in the array of statistical objects of the relevant type; if there is no match, a "statistical object does not exist" response or an error is returned directly, and otherwise all statistical records of that statistical object are queried from the database. If the request does not carry a statistical object ID, all statistical records of all statistical objects are queried. The statistical records found are saved to a file, the file is sent to the requester, and the file is then deleted.
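The query path (match the ID, filter by time interval, extract the fields for the requested measurement type) might look like the following Python sketch. The record layout, the parameter names, and the use of `KeyError` for the "statistical object does not exist" case are assumptions for illustration.

```python
def query_records(known_ids, records, object_id, fields, window_s, now):
    """Return the requested fields of one object's records within the time window.

    known_ids: IDs in the index array for the relevant object type (hypothetical).
    records:   list of dicts, each with "object_id", "stat_time", and data items.
    fields:    data items belonging to the requested measurement type.
    """
    # Match the ID against the index array first; fail fast if it is unknown.
    if object_id not in known_ids:
        raise KeyError(f"statistical object {object_id!r} does not exist")
    start = now - window_s
    return [
        {"stat_time": r["stat_time"], **{f: r[f] for f in fields}}
        for r in records
        if r["object_id"] == object_id and start <= r["stat_time"] <= now
    ]
```

An export request without a statistical object ID would skip the ID match and the filters and write every record to a file instead; that branch is left out of the sketch.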
Each statistical record corresponds to a unique statistics time. If the device system time is initially incorrect, the incorrect dates are written into the statistical records as statistics are collected. When the device system time is corrected later, the statistics times of the records generated earlier are not corrected, so the times of the stored statistical records become inaccurate. Therefore, the cluster device performance synchronization statistics method of the embodiments of the present invention further includes steps S401 to S405:
S401. Monitor whether the system time has changed. A change of system time mainly refers to an adjustment made by the cluster administrator on finding that the system time is incorrect.
S402. When a change of system time is detected, obtain the current system time of the first node when the preset collection time arrives.
S403. Perform an operation on the current system time, the last statistics time, and the preset collection time to obtain an operation result.
S404. Compare the absolute value of the operation result with a preset value; if the absolute value is greater than the preset value, further determine whether the operation result is less than zero.
S405. If the operation result is less than zero, subtract the absolute value of the operation result from the statistics time of the statistical record to obtain a first result, and use the first result as the latest statistics time of the statistical record; if the operation result is greater than or equal to zero, add the absolute value of the operation result to the statistics time of the statistical record to obtain a second result, and use the second result as the latest statistics time of the statistical record.
It should be noted that the above steps are performed only when the system time is adjusted. Adjusting the statistics times of the statistical records accordingly after the system time has been corrected ensures that a statistical object corresponds to exactly one statistical record at any given moment.
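Steps S403 to S405 can be sketched as below. The source does not state which operation S403 performs, so the formula in `time_offset` (current time minus last statistics time minus the collection interval) is an assumption, as is treating the preset collection time as an interval in seconds. The threshold check and the sign-dependent adjustment follow S404 and S405.

```python
def time_offset(current_time, last_stat_time, interval_s):
    """S403 (assumed formula): how far the clock moved beyond one normal interval."""
    return current_time - last_stat_time - interval_s

def corrected_stat_time(stat_time, result, threshold_s):
    """S404 and S405: shift a record's statistics time if the offset is large enough."""
    if abs(result) <= threshold_s:
        return stat_time                  # within tolerance, no correction
    if result < 0:
        return stat_time - abs(result)    # clock was set backwards
    return stat_time + abs(result)        # clock was set forwards
```

Under this assumed formula, a clock moved back from 5000 to 1000 with a 15-second interval gives an offset of -4015, and a record stamped 5000 is shifted to 985.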
A cluster stores a large number of statistical objects, and to make them easier to manage they are generally kept in a database. However, when the statistical records of a statistical object are viewed, searching the database for the statistical object is slow and time-consuming. An index array of statistical objects is therefore built from the statistical objects in the database, and when the statistical records of a statistical object are viewed, the corresponding records are located through the index array, which speeds up data reads. Whether a statistical object is deleted or added, the database is modified first. To keep the statistical objects in the database consistent with the index array of statistical objects, the cluster device performance synchronization statistics method of the embodiments of the present invention further includes steps S501 to S503:
S501. Monitor whether there is a preprocessing operation on the statistical object data stored in cluster management, the preprocessing operation being an add operation or a delete operation on a statistical object;
S502. If the preprocessing operation is detected, obtain the statistical object data stored in cluster management and the index array data of the statistical objects;
S503. Compare the statistical object data with the index array data, and preprocess the index array data according to the comparison result.
It should be noted that when a statistical object has been added, the statistical object data stored in the database is compared with the index array data of the statistical objects; any statistical object that is not present in the index array data is a newly added one and must be added to the index array data. When a statistical object has been deleted, the same comparison shows that some statistical objects in the index array data no longer exist in the statistical object data stored in the database; these nonexistent statistical objects must be deleted from the index array data.
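The comparison in step S503 amounts to reconciling two lists of IDs. A minimal Python sketch, assuming the statistical object IDs are unique strings and that the index array keeps its existing order with new objects appended (both assumptions, not stated in the source):

```python
def reconcile_index(db_object_ids, index_ids):
    """Make the index array match the statistical objects stored in the database."""
    in_db = set(db_object_ids)
    # Deleted objects: present in the index array but no longer in the database.
    kept = [oid for oid in index_ids if oid in in_db]
    already_indexed = set(kept)
    # Added objects: present in the database but missing from the index array.
    added = [oid for oid in db_object_ids if oid not in already_indexed]
    return kept + added
```

Because the database is always modified first, the database contents are treated as authoritative and the index array is the side that gets corrected.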
In the above solution, the master node in the cluster is responsible for collecting the counter information of the statistical objects of the node devices from every node of the cluster, processing it, storing it, and synchronizing it to the other nodes. It also provides performance statistics queries, which ensures that cluster performance statistics are highly reliable.
As shown in FIG. 2, an embodiment of the present invention further provides a cluster device performance synchronization statistics system 1, applied to a first node serving as the master node in a cluster device. The cluster device performance synchronization statistics system includes:
a first collection module 101, configured to collect counter values of statistical objects on all nodes in the cluster, including the first node;
a first statistical record generation module 102, configured to aggregate the counter values of the statistical objects on each of the nodes separately to generate statistical records;
a first synchronization module 103, configured to synchronize the statistical records to the nodes in the cluster device other than the first node.
Optionally, the cluster device performance synchronization statistics system further includes:
a first initialization module 104, configured to, before the counter values of the statistical objects on all nodes in the cluster, including the first node, are collected, and when the first node is powered on and running, create a performance statistics database and create a database table in the database for each preset statistical object type.
The statistical object types include: node, network port, virtual disk, volume, and logical unit number.
Optionally, the first statistical record generation module 102 is further configured to: after aggregating the counter values of the statistical objects on each of the nodes separately and generating the statistical records, store the statistical records in the performance statistics database.
Optionally, the cluster device performance synchronization statistics system further includes:
a first receiving module 105, configured to receive data information, sent by a second node serving as a slave node in the cluster, that contains the memory capacity of the second node;
a comparison module 106, configured to compare the memory capacity of the second node with the memory capacity of the first node;
a deletion module 107, configured to delete some of the statistical records in the performance statistics database if the memory capacity of the second node is smaller than the memory capacity of the first node.
Optionally, the cluster device performance synchronization statistics system further includes:
a second receiving module 108, configured to receive a preset request instruction, where the preset request instruction is a query request instruction or an export request instruction;
a lookup module 109, configured to look up, in the statistical records and according to the request parameters carried in the preset request instruction, a first statistical record corresponding to the statistical object that matches the request parameters;
an information feedback module 110, configured to send the first statistical record to the sender of the preset request instruction.
Optionally, the cluster device performance synchronization statistics system further includes:
a first monitoring module 111, configured to monitor whether the system time has changed;
a time acquisition module 112, configured to, when a change of system time is detected, obtain the current system time of the first node when the preset collection time arrives;
a calculation module 113, configured to perform an operation on the current system time, the last statistics time, and the preset collection time to obtain an operation result;
a determination module 114, configured to compare the absolute value of the operation result with a preset value and, if the absolute value is greater than the preset value, determine whether the operation result is less than zero;
a calculation and assignment module 115, configured to: if the operation result is less than zero, subtract the absolute value of the operation result from the statistics time of the statistical record to obtain a first result, and use the first result as the latest statistics time of the statistical record; and if the operation result is greater than or equal to zero, add the absolute value of the operation result to the statistics time of the statistical record to obtain a second result, and use the second result as the latest statistics time of the statistical record.
Optionally, the cluster device performance synchronization statistics system further includes:
a second monitoring module 116, configured to monitor whether there is a preprocessing operation on the statistical object data stored in cluster management, the preprocessing operation being an add operation or a delete operation on a statistical object;
a statistical object acquisition module 117, configured to, if the preprocessing operation is detected, obtain the statistical object data stored in cluster management and the index array data of the statistical objects;
a comparison processing module 118, configured to compare the statistical object data with the index array data and preprocess the index array data according to the comparison result.
It should be noted that this cluster device performance synchronization statistics system corresponds to the cluster device performance synchronization statistics method described above. All implementations of that method apply to this system and achieve the same technical effects.
An embodiment of the present invention further provides a cluster device performance synchronization statistics method, applied to a second node serving as a slave node in a cluster device. The cluster device performance synchronization statistics method includes:
obtaining the statistical records on the first node and storing the statistical records.
Optionally, the cluster device performance synchronization statistics method further includes:
before the step of obtaining the statistical records on the first node and storing the statistical records, when the second node is powered on and running, creating a performance statistics database and creating a database table in the database for each preset statistical object type.
The statistical object types include: node, network port, virtual disk, volume, and logical unit number.
It should be noted that, because the statistical objects of a slave node may differ from those of the master node, the slave node must also create a performance statistics database for its own statistical objects when it is first powered on. Then, when it interacts with the master node, it saves the statistical records obtained from the master node into its own performance statistics database.
It should be noted that while the cluster is running, the master node may lose power and go offline. Therefore, to keep the statistics of the cluster from being interrupted, the cluster device performance synchronization statistics method of the present invention further includes steps S601 to S604:
S601. Detect whether a change signal indicating that the second node has become the first node has been generated.
S602. If the change signal has been generated, collect counter values of statistical objects on all powered-on nodes in the cluster, including the second node.
S603. Aggregate the counter values of the statistical objects on each of the nodes separately to generate statistical records.
S604. Synchronize the statistical records to the nodes in the cluster device other than the second node.
It should be noted that when the master node is powered off or crashes, one of the slave nodes is selected to perform the functions of the master node, collecting statistical data and generating statistical records. This slave node is then used as the master node and can perform all of the master-node functions described above, which keeps cluster performance statistics highly sustainable.
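One collection cycle run by a promoted slave node (steps S602 to S604) can be sketched as follows. The in-memory `nodes` dict, its keys, and the `aggregate` callback are hypothetical stand-ins for the real collection, aggregation, and database synchronization mechanisms, which the source does not detail.

```python
def run_cycle_as_master(self_id, nodes, aggregate):
    """Run one collect, aggregate, and synchronize cycle on the acting master.

    nodes: dict mapping node ID to {"powered_on": bool, "counters": dict, "db": list}.
    aggregate: function turning one node's counters into a statistical record.
    """
    # S602: collect counters only from nodes that are powered on (including self).
    live = {nid: n for nid, n in nodes.items() if n["powered_on"]}
    # S603: aggregate each node's counters separately into statistical records.
    records = {nid: aggregate(n["counters"]) for nid, n in live.items()}
    # Store locally, then S604: synchronize to every other powered-on node.
    nodes[self_id]["db"].append(records)
    for nid, n in live.items():
        if nid != self_id:
            n["db"].append(records)
    return records
```

Because every slave already holds a synchronized copy of the earlier records, the promoted node in this sketch only has to start appending new cycles; nothing needs to be recovered from the failed master.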
For the above cluster device performance synchronization statistics method, an embodiment of the present invention provides a cluster device performance synchronization statistics system 2, applied to a second node acting as a slave node in the cluster device. The system includes:
An obtaining module 201, configured to obtain the statistical record on the first node and store the statistical record.
Optionally, the cluster device performance synchronization statistics system further includes:
A second initialization module 202, configured to create a performance statistics database when the second node is powered on and running, and to create a database table in that database for each preset statistical object type.
The statistical object types include: node, network port, virtual disk, volume, and logical unit number.
Optionally, the cluster device performance synchronization statistics system further includes:
A detecting module 204, configured to detect whether a change signal indicating that the second node has become the first node has been generated;
A second collection module 205, configured to collect, if the change signal has been generated, the counter values of the statistical objects on all powered-on nodes in the cluster, including the second node;
A second statistical record generating module 206, configured to aggregate the counter values of the statistical objects on each of the nodes to generate statistical records;
A second synchronization module 207, configured to synchronize the statistical records to the other nodes in the cluster device, excluding the second node.
It should be noted that this cluster device performance synchronization statistics system corresponds to the cluster device performance synchronization statistics method described above. All implementations of that method apply to this system, and the system achieves the same technical effects as the method.
Based on the above embodiments and actual usage, the cluster device performance synchronization statistics system is illustrated by the following example:
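By function, the cluster device performance synchronization statistics system can be divided into: a power-on initialization unit 11, a raw counter value collection unit 12, a performance statistics generation unit 13, a master-slave node interaction unit 14, a takeover performance statistics unit 15, and a query and export statistics unit 16. It should be noted that the system exists on every node of the cluster. In practice, however, the working units on the master node are the power-on initialization unit 11, the raw counter value collection unit 12, the performance statistics generation unit 13, the master-slave node interaction unit 14, and the query and export statistics unit 16, while the takeover performance statistics unit 15 is idle. The working units on a slave node are the power-on initialization unit 11, the master-slave node interaction unit 14, and the takeover performance statistics unit 15, while the raw counter value collection unit 12, the performance statistics generation unit 13, and the query and export statistics unit 16 are idle. When the master node is powered off or crashes and one of the slave nodes takes its place, the raw counter value collection unit 12, the performance statistics generation unit 13, and the query and export statistics unit 16 on that slave node switch from idle to working, and its takeover performance statistics unit 15 switches from working to idle. The working units of the master node and the slave nodes are shown in FIG. 3; the idle units on the master node and slave nodes are not shown in FIG. 3.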
The main functions of the above units are:
Power-on initialization unit 11: runs when each node of the cluster powers on. It creates the performance statistics database, creates a database table in that database for each statistical object type, synchronizes database data to the slave nodes (a slave node issues a request to synchronize the performance statistics database; the data in each statistical object type table of the master node's performance statistics database is synchronized to the corresponding table of the slave node's database; thereafter, every data change in the master node's database is incrementally synchronized to the slave node's database), and registers for node state change notifications (as soon as a node's state changes, the takeover performance statistics unit 15 is notified).
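A minimal sketch of the table creation performed at power-on, assuming Python's built-in `sqlite3` in-memory database as a stand-in for the unspecified performance statistics database. The table names and the three-column schema are hypothetical; only the five statistical object types come from the text.

```python
import sqlite3

# The five statistical object types named in the text; the "_stats"
# table naming and the column layout are assumptions for illustration.
OBJECT_TYPES = ("node", "network_port", "virtual_disk", "volume", "lun")


def create_performance_db():
    """Create an in-memory database with one table per statistical object type."""
    conn = sqlite3.connect(":memory:")
    for object_type in OBJECT_TYPES:
        conn.execute(
            f"CREATE TABLE {object_type}_stats ("
            " object_id TEXT NOT NULL,"
            " stat_time INTEGER NOT NULL,"   # one record per object per time point
            " data TEXT NOT NULL,"           # serialized readable data items
            " PRIMARY KEY (object_id, stat_time))"
        )
    conn.commit()
    return conn
```

The composite primary key reflects the statement that one statistical object has exactly one record per statistical time point.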
Raw counter value collection unit 12: runs on the cluster master node and starts a data collection timer. The collection interval is configurable, for example 15 seconds. When a timer message arrives, the unit first obtains the cluster node list through cluster management and then takes the address of each node in the list in turn. It sends a message to each service subsystem of that node requesting the raw counter values of the corresponding statistical objects and the time interval between the previous and the current collection, and forwards the raw counter values and intervals it obtains to the performance statistics generation unit 13. Note that after handling a request, the service subsystem should reset the statistical object's raw counter value to zero.
Performance statistics generation unit 13: runs on the cluster master node and aggregates and processes the raw counter values collected from the statistical objects of each cluster node, using the calculation appropriate to each item. For example, traffic is computed as the raw traffic counter value divided by the statistical time interval, and the cache hit rate is computed as hits / (hits + misses). Each calculation turns raw counter values into a readable data item; the statistical record of a statistical object consists of several such data items and must be stored in the database. The result is a statistical record that is easy for users to read, with one record per statistical object per statistical time point. The record is stored in the database, and the database synchronization mechanism then synchronizes it to all cluster slave nodes as a backup.
Because the in-memory database consumes memory, it cannot retain data indefinitely, so a limit is needed: each statistical object of each statistical object type is limited to a maximum number of statistical records, determined by how long records are to be kept. Before a statistical record is saved, the unit checks whether the number of records for the current statistical object has reached the maximum. If it has, the oldest record of that object is deleted first; only then is the new record saved to the database and synchronized to the other slave node databases by the database synchronization mechanism.
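The two example calculations and the record cap can be sketched as below. This is a sketch under assumptions: the function and class names are hypothetical, a plain dictionary of lists stands in for the in-memory database, and returning 0.0 for a hit rate with no accesses is a choice the text does not make.

```python
def throughput(raw_counter, interval_s):
    """Traffic = raw traffic counter value / statistical time interval."""
    if interval_s <= 0:
        raise ValueError("statistical time interval must be positive")
    return raw_counter / interval_s


def cache_hit_rate(hits, misses):
    """Cache hit rate = hits / (hits + misses); 0.0 when there were no accesses."""
    total = hits + misses
    return hits / total if total else 0.0


class RecordStore:
    """Keeps at most max_records statistical records per statistical object."""

    def __init__(self, max_records):
        self.max_records = max_records
        self.records = {}                 # object id -> list, oldest first

    def save(self, object_id, record):
        history = self.records.setdefault(object_id, [])
        if len(history) >= self.max_records:
            history.pop(0)                # delete the oldest record first
        history.append(record)            # then save the new one
```

For instance, keeping one hour of history at a 15-second collection interval would correspond to `max_records = 3600 // 15`, that is, 240 records per statistical object.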
Master-slave node interaction unit 14: runs when a cluster slave node powers on and informs the cluster master node of the memory size of its own node device. On receiving the request, the master node decides from the request parameters how much data the database needs to retain.
Takeover performance statistics unit 15: cluster performance statistics are collected on the master node, so if the master node fails, statistics cannot proceed normally. When the cluster master node goes down because of a manual restart or an unknown fault, cluster management elects one of the cluster slave nodes as the new cluster master node. The takeover performance statistics unit 15 on the newly elected node detects this change and takes over the performance statistics task: it reads all statistical objects from the database into the statistical object index array for each type, notifies the service subsystems to reset every raw counter value to zero, and starts the timer to begin collecting cluster device performance statistics.
Query and export statistics unit 16: receives a query request from a user and, according to the parameters carried in the request, returns all statistical records of the specified statistical object; the client uses these records to draw a dynamic statistics graph. The unit also receives export requests from users and, according to the request parameters, returns the statistical records of the selected statistical objects as a file. The user chooses a local directory in which to save the file, either to view the statistics as a whole or to keep a historical record.
It should be noted that the cluster device performance synchronization statistics system also needs to handle the following abnormal situations:
Case 1:
A raw counter is a variable, so it will inevitably overflow if the system runs long enough. In addition, not every request succeeds; after a failed request, the performance statistics generation unit does not know the actual statistical time interval, which makes the calculation inaccurate.
Case 1 is handled by steps S701 to S706 (as shown in FIG. 4):
S701. The raw counter value collection unit 12 requests the raw counter value of a statistical object from the service subsystem for the first time.
S702. The service subsystem records the statistical time of this request, returns the raw counter value, and resets the raw counter value to zero.
S703. The raw counter value collection unit 12 again requests the raw counter value from the service subsystem.
S704. The service module is too busy to respond in time, and the request times out.
S705. The raw counter value collection unit 12 requests the raw counter value from the service module once more.
S706. The service module responds to this request. It obtains the current system time and subtracts the statistical time recorded in step S702 to get the statistical time interval between the two points, saves the current system time as the statistical time of this request, returns the raw counter value and the statistical interval to the raw counter value collection unit 12, and resets the raw counter value to zero.
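The key idea in steps S701 to S706 is that the service side, not the collector, measures the interval and resets the counter on every successful read. A minimal sketch, with a hypothetical class name and with time passed in explicitly so the behaviour is easy to follow:

```python
class ServiceCounter:
    """Service-side raw counter that reports its own statistical interval."""

    def __init__(self):
        self.value = 0
        self.last_stat_time = None        # set by the first successful request

    def add(self, amount):
        self.value += amount

    def read_and_reset(self, now):
        """Handle one successful request (S702 / S706)."""
        # Interval since the last *successful* request; None on the first one.
        interval = None if self.last_stat_time is None else now - self.last_stat_time
        value, self.value = self.value, 0  # return the raw value and clear it
        self.last_stat_time = now          # remember this request's time
        return value, interval
```

If the request at 15 seconds times out and the next one succeeds at 30 seconds, the returned interval is 30 rather than 15, so a rate computed from the accumulated counter stays correct. Resetting on every read also keeps the counter small, which is how the scheme sidesteps overflow.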
Case 2:
Each statistical record corresponds to one unique statistical time. If the device's system time is initially wrong, the wrong timestamps are written into the statistical records; when the system time is later corrected, the statistical times of the records generated earlier are not corrected.
Case 2 is handled by steps S801 to S803 (as shown in FIG. 5):
S801. While the system time is incorrect, the performance statistics generation unit 13 saves a statistical record to the database and records the system time as the most recent statistical time.
S802. The user corrects the system time.
S803. The statistics timer, whose period is 15 seconds, fires. The unit obtains the current system time and subtracts a first time value, where the first time value = most recent statistical time + 15, and takes the absolute value of the difference. If the absolute value exceeds a threshold, configured here as 8 seconds, the system time has been corrected. In that case, if the current system time minus the first time value is less than 0, the time was moved backward, and the absolute value calculated above is subtracted from the timestamp field of every statistical record; if it is greater than or equal to 0, the absolute value is added instead. Finally, the current system time is recorded as the most recent statistical time.
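Step S803 can be sketched as a pure function over a list of timestamps. The function name and the list representation are assumptions; the 15-second period and 8-second threshold are the values given in the text.

```python
PERIOD_S = 15      # statistics timer period from the text
TOLERANCE_S = 8    # threshold above which a clock correction is assumed


def on_timer(now, last_stat_time, timestamps):
    """Return (possibly shifted timestamps, new most recent statistical time)."""
    diff = now - (last_stat_time + PERIOD_S)   # current time minus first time value
    shift = abs(diff)
    if shift > TOLERANCE_S:                    # the system clock was corrected
        if diff < 0:                           # clock moved backward
            timestamps = [t - shift for t in timestamps]
        else:                                  # clock moved forward
            timestamps = [t + shift for t in timestamps]
    return list(timestamps), now
```

Subtracting the absolute value when the difference is negative and adding it otherwise is equivalent to adding the signed difference to every timestamp; the two branches are kept here only to mirror the wording of the step.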
Case 3:
Statistical objects may be added and deleted dynamically at any time. If a statistical object is added, being unable to query its statistics is clearly unacceptable, so statistics for it must be added. Likewise, if a statistical object is deleted, all of its statistics must be deleted.
Case 3 is handled as follows:
1. Adding a statistical object, comprising steps S901 to S903 (as shown in FIG. 6):
S901. The user adds a statistical object.
S902. The performance statistics generation unit 13 obtains all statistical objects currently in the cluster.
S903. Look up each statistical object obtained in the previous step in the statistical object index array. If it is not found, the object is new: add it to the statistical object index array, compute its statistical record, and save the record to the database.
2. Deleting a statistical object, comprising steps S1001 to S1003 (as shown in FIG. 7):
S1001. The user deletes a statistical object.
S1002. The performance statistics generation unit 13 obtains the statistical objects currently in the cluster.
S1003. Look up each statistical object held in the statistical object index array, by ID, among the cluster statistical objects obtained in the previous step. If it is not found, the object has been deleted: remove it from the statistical object index array and delete all of its statistical records from the database.
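Both procedures amount to reconciling the index against the current object list. A minimal sketch, assuming a Python set in place of the index array and a dictionary in place of the database; the function name is hypothetical, and a new object simply starts with an empty history here rather than a computed first record.

```python
def reconcile(index, current_ids, db):
    """Bring the statistical object index and database in line with the cluster.

    index:       set of object IDs known to the statistics unit (modified in place)
    current_ids: iterable of object IDs currently present in the cluster
    db:          dict of object ID -> list of statistical records (modified in place)
    """
    current = set(current_ids)
    added = sorted(current - index)       # S903: in the cluster, not in the index
    removed = sorted(index - current)     # S1003: in the index, gone from the cluster
    for object_id in added:
        index.add(object_id)
        db.setdefault(object_id, [])      # records will be appended from now on
    for object_id in removed:
        index.discard(object_id)
        db.pop(object_id, None)           # delete all records of the object
    return added, removed
```

Running one reconciliation per collection cycle covers both the add and the delete case without needing a notification from the user action itself.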
Case 4:
Some statistical objects belong to the whole cluster, and others belong to a single node. A statistical object that belongs to a single node may reappear when the node runs again, as long as the node has not left the cluster. Its existing statistical records must therefore be kept, statistics must still be recorded while the node is not running, and the values for that period need to be presented in some way.
Case 4 is handled by steps S1101 to S1104 (as shown in FIG. 8):
S1101. A node is powered off.
S1102. Obtain the cluster node list and identify the powered-off node from the node states.
S1103. Obtain all statistical objects of the powered-off node. When performance statistics are collected, set every data field of the statistical records of these objects to zero.
S1104. Save the statistical records of the statistical objects to the database.
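Steps S1102 and S1103 can be sketched as below. The node and record layouts, the `"powered_off"` state string, and the function name are all assumptions made for the example.

```python
def zero_records_for_powered_off(nodes, objects_by_node, fields, stat_time):
    """Build all-zero statistical records for objects on powered-off nodes.

    nodes:           list of {"name": ..., "state": ...} entries (S1102)
    objects_by_node: dict of node name -> list of statistical object IDs
    fields:          names of the data fields in a statistical record
    stat_time:       statistical time of the current collection cycle
    """
    records = {}
    for node in nodes:
        if node["state"] != "powered_off":
            continue
        for object_id in objects_by_node.get(node["name"], []):
            record = {"stat_time": stat_time}
            record.update({field: 0 for field in fields})   # S1103: zero the data
            records[object_id] = record
    return records
```

The returned records would then be saved like any others (S1104), so a graph of the object shows a flat zero line for the time the node was down instead of a gap.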
Case 5:
The raw counter value collection unit issues a request, but the service module may be too busy to handle it, and the timed collection unit cannot wait long. Obtaining the raw counter value therefore does not succeed every time, and statistics must still be produced in some way so that the user sees no gap.
Case 5 is handled by steps S1201 to S1204 (as shown in FIG. 9):
S1201. The raw counter value collection unit 12 requests the raw counter value from the service subsystem.
S1202. The service subsystem does not respond in time, and the request times out.
S1203. The data fields of the current statistical object's new statistical record are assigned the values of the data fields of that object's previous statistical record; only the statistical time differs in the new record.
S1204. Save the new statistical record to the database.
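Step S1203 is a carry-forward of the previous record. A minimal sketch, assuming records are dictionaries with a `stat_time` key; both the layout and the function name are hypothetical.

```python
def carry_forward(last_record, stat_time):
    """Build the record for a timed-out request from the previous record (S1203)."""
    new_record = dict(last_record)        # copy every data field unchanged
    new_record["stat_time"] = stat_time   # only the statistical time is new
    return new_record
```

Copying rather than mutating leaves the earlier record intact in the database, so each time point still has its own record.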
The main features of the cluster device performance synchronization statistics system of the embodiment of the present invention are: 1) The master node in the cluster collects the raw counter information of the statistical objects from every node of the cluster, processes, stores and synchronizes it, and serves performance statistics queries. The master node in effect acts as a central host; when the current master node fails, performance statistics are continued by the newly elected master node, which keeps them highly sustainable. 2) The performance statistics are generated and stored on the cluster nodes and can be retrieved in both B/S mode and C/S mode. 3) Events such as nodes joining or leaving the cluster, volume statistical objects switching their running node, and users adding or deleting statistical objects are detected dynamically and promptly, and performance statistics are collected for them.
Those of ordinary skill in the art will understand that all or some of the steps of the above embodiments may be implemented as a computer program flow. The computer program may be stored in a computer-readable storage medium and executed on a corresponding hardware platform (such as a system, equipment, apparatus, or device); when executed, it performs one or a combination of the steps of the method embodiments.
Optionally, all or some of the steps of the above embodiments may also be implemented using integrated circuits. The steps may each be made into a separate integrated circuit module, or several of the modules or steps may be made into a single integrated circuit module.
The apparatuses, functional modules, and functional units in the above embodiments may be implemented with general-purpose computing devices. They may be concentrated on a single computing device or distributed over a network of multiple computing devices.
When the apparatuses, functional modules, and functional units in the above embodiments are implemented as software functional modules and sold or used as independent products, they may be stored in a computer-readable storage medium. The computer-readable storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like.
With the solution of the embodiment of the present invention, the master node in the cluster collects the counter values of the statistical objects on every node in the cluster, generates statistical records, and synchronizes them to every slave node in the cluster. When the master node is powered off or crashes, a slave node can therefore continue the master node's work based on its synchronized statistical records, which keeps cluster performance statistics highly sustainable.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201510385906.4A CN106331047A (en) | 2015-06-30 | 2015-06-30 | Cluster equipment performance synchronization statistical method and system |
| CN201510385906.4 | 2015-06-30 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2017000693A1 (en) | 2017-01-05 |
Family
ID=57607676
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2016/082350 (WO2017000693A1, Ceased) | 2015-06-30 | 2016-05-17 | Performance synchronization and statistics method for cluster device and system |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN106331047A (en) |
| WO (1) | WO2017000693A1 (en) |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110879825A (en) * | 2018-09-06 | 2020-03-13 | 阿里巴巴集团控股有限公司 | A data synchronization method and device |
| WO2020053290A1 (en) | 2018-09-12 | 2020-03-19 | Smith & Nephew Plc | Device, apparatus and method of determining skin perfusion pressure |
| CN113743564A (en) * | 2021-01-19 | 2021-12-03 | 北京沃东天骏信息技术有限公司 | Counting method and device, electronic equipment and storage medium |
| CN115203331A (en) * | 2022-07-25 | 2022-10-18 | 济南浪潮数据技术有限公司 | A performance data cache management method, device, device and storage medium |
Families Citing this family (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN106713487B (en) * | 2017-01-16 | 2020-10-09 | 腾讯科技(深圳)有限公司 | Data synchronization method and device |
| CN108924206B (en) * | 2018-06-26 | 2021-07-16 | 郑州云海信息技术有限公司 | Cluster event synchronization method, device and device for distributed system |
| CN111125253A (en) * | 2019-12-22 | 2020-05-08 | 北京浪潮数据技术有限公司 | Data synchronization method, device, equipment and storage medium |
| CN111694694A (en) * | 2020-05-22 | 2020-09-22 | 北京三快在线科技有限公司 | Database cluster processing method and device, storage medium and node |
| CN114138825B (en) * | 2021-11-24 | 2024-08-23 | 聚好看科技股份有限公司 | Server and method for providing data query service for application program |
| CN115348185B (en) * | 2022-08-19 | 2023-12-05 | 招银云创信息技术有限公司 | A control method and control device for a distributed query engine |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101820384A (en) * | 2010-02-05 | 2010-09-01 | 浪潮(北京)电子信息产业有限公司 | Method and device for dynamically distributing cluster services |
| CN102571452A (en) * | 2012-02-20 | 2012-07-11 | 华为技术有限公司 | Multi-node management method and system |
| US20130286893A1 (en) * | 2011-05-25 | 2013-10-31 | Huawei Technologies Co., Ltd. | Route calculation method and master node device in virtual network element |
| CN103582122A (en) * | 2012-07-19 | 2014-02-12 | 中兴通讯股份有限公司 | Group call establishment method based on digital trunked communication system, network side equipment |
| CN104320459A (en) * | 2014-10-24 | 2015-01-28 | 杭州华三通信技术有限公司 | Node management method and device |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN1591394A (en) * | 2003-09-02 | 2005-03-09 | 华为技术有限公司 | Statistic method and system for use of digit content |
| CN102833129A (en) * | 2012-08-15 | 2012-12-19 | 苏州迈科网络安全技术股份有限公司 | Website visit rate statistical method and system |
| CN104104717B (en) * | 2014-06-30 | 2017-11-03 | 广州唯品会网络技术有限公司 | Deliver channel data statistical approach and device |
- 2015-06-30: CN application CN201510385906.4A, published as CN106331047A, status: active, Pending
- 2016-05-17: WO application PCT/CN2016/082350, published as WO2017000693A1, status: not active, Ceased
Cited By (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110879825A (en) * | 2018-09-06 | 2020-03-13 | 阿里巴巴集团控股有限公司 | A data synchronization method and device |
| CN110879825B (en) * | 2018-09-06 | 2023-04-28 | 阿里巴巴集团控股有限公司 | A data synchronization method and device |
| WO2020053290A1 (en) | 2018-09-12 | 2020-03-19 | Smith & Nephew Plc | Device, apparatus and method of determining skin perfusion pressure |
| CN113743564A (en) * | 2021-01-19 | 2021-12-03 | 北京沃东天骏信息技术有限公司 | Counting method and device, electronic equipment and storage medium |
| CN113743564B (en) * | 2021-01-19 | 2023-12-05 | 北京沃东天骏信息技术有限公司 | Counting method, counting device, electronic equipment and storage medium |
| CN115203331A (en) * | 2022-07-25 | 2022-10-18 | 济南浪潮数据技术有限公司 | A performance data cache management method, device, device and storage medium |
Also Published As
| Publication number | Publication date |
|---|---|
| CN106331047A (en) | 2017-01-11 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 16817057; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 16817057; Country of ref document: EP; Kind code of ref document: A1 |