Detailed Description
Embodiments of the present invention will be described in further detail below with reference to the drawings and examples. It should be understood that the particular embodiments described herein are illustrative only and are not limiting of embodiments of the invention. It should be further noted that, for convenience of description, only some, but not all of the structures related to the embodiments of the present invention are shown in the drawings.
For ease of understanding, the main implementation concept of the embodiments of the present invention will be briefly described first.
In a traditional computing service scheduling environment, computing tasks are typically statically allocated to available computing resources, such as workstations, servers, or cloud computing nodes. However, these systems often lack fine-grained management and dynamic deployment capabilities for computing resources. Specifically, there are several key issues:
1. Resource allocation is uneven. Due to the lack of an effective resource management mechanism, some computing tasks may occupy excessive resources, leaving other tasks unable to execute in time for lack of resources. This imbalance in resource allocation severely affects the overall performance and efficiency of the system.
2. Resource utilization efficiency is low. Traditional computing service scheduling methods cannot adjust dynamically to real-time resource conditions and task requirements, so resources cannot be fully utilized. In some cases, even when resources are idle, they cannot be effectively allocated to tasks that need them, resulting in resource waste.
3. Flexibility is lacking. Traditional scheduling systems usually support only a fixed resource allocation strategy and cannot be flexibly adjusted to actual requirements. This limits the adaptability and scalability of the system across different scenarios.
4. There is a risk of resource overload. Without an effective upper-limit control mechanism on resource use, a computing task can occupy resources without limit, overloading or even crashing the system. This not only affects the performance of the current task, but may also threaten the stability of the overall system.
The inventor provides a computing service scheduling method and a computing service scheduling system by finding the defects in the prior art, and aims to realize efficient utilization of computing resources and efficient execution of tasks by customizing the upper limit of resource use of each computing service and dynamically allocating the computing resources. Specifically, the computing service scheduling method of the present invention includes the steps of:
1. Acquire the container computing task requirements. The system first obtains the requirement information of the container computing task to be executed, including the amounts of CPU, memory, storage, and other resources required, the priority of the task, and so on.
2. Add the task to the waiting execution queue. The system reads the description parameters of the container computing task requirements and adds the task to the waiting execution queue according to priority or other rules, where it awaits scheduling by the scheduler.
3. Generate an execution scheme. The scheduler generates an execution scheme according to the real-time available resources in the limited resource pool and the current allocation rule. This scheme takes into account the resource requirements of the task, the availability of resources, and the constraints of the allocation rules.
4. Execute the scheduled task. The scheduler allocates the task to suitable computing resources for execution according to the generated execution scheme. Meanwhile, the system monitors resource usage in real time to ensure that the task executes within the set upper limit of resource use.
In addition, the computing service scheduling method also supports various allocation rules, such as time sequence priority, maximum utilization resource priority and the like. This provides a more flexible selection of resource scheduling policies for users to meet the needs of different scenarios.
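The four-step flow described above can be sketched in code. The following is a minimal, hypothetical illustration only; the class and method names (`Task`, `Scheduler`, `submit`, `dispatch`) are assumptions for exposition and not the patented implementation.

```python
from dataclasses import dataclass, field
import heapq

@dataclass(order=True)
class Task:
    priority: int                      # lower value = higher priority
    name: str = field(compare=False)
    cpu: int = field(compare=False)
    mem_gb: int = field(compare=False)

class Scheduler:
    """Toy sketch: steps 1-2 enqueue a task; steps 3-4 plan and dispatch."""
    def __init__(self, cpu: int, mem_gb: int):
        self.free_cpu, self.free_mem = cpu, mem_gb
        self.queue = []                # heap ordered by priority

    def submit(self, task: Task):      # steps 1-2: acquire + enqueue
        heapq.heappush(self.queue, task)

    def dispatch(self):                # steps 3-4: plan + execute
        started = []
        while self.queue:
            t = self.queue[0]
            if t.cpu <= self.free_cpu and t.mem_gb <= self.free_mem:
                heapq.heappop(self.queue)
                self.free_cpu -= t.cpu  # reserve resources within the pool limit
                self.free_mem -= t.mem_gb
                started.append(t.name)
            else:
                break                   # head does not fit: wait for release
        return started

s = Scheduler(cpu=4, mem_gb=8)
s.submit(Task(1, "A", cpu=2, mem_gb=4))
s.submit(Task(2, "B", cpu=1, mem_gb=2))
print(s.dispatch())  # both tasks fit within the 4-CPU / 8 GB pool
```

A real system would additionally track running tasks and release their resources on completion; this sketch only shows the admission decision.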
Example 1
Fig. 1 is a flow chart of a computing service scheduling method in a first embodiment of the present invention, as shown in fig. 1, in which the first embodiment provides a computing service scheduling method, including:
step S100, obtaining the requirement of a container calculation task;
In particular, the system needs to comprehensively and accurately capture and understand the specific needs of each container computing task. The requirements of a container computing task cover multiple dimensions including, but not limited to, CPU resources, memory resources, storage resources, and the priority of the task. This demand information is the core basis for subsequent resource allocation and task scheduling, so its accuracy and completeness are critical. In the process of obtaining the container computing task requirements, the system first needs to interact with the task submitter to define the specific requirements of the task. For example, it parses task description parameters, which may be provided in the form of configuration files, command-line parameters, or API requests. The system must be able to parse these parameters correctly to extract the type and amount of resources required by the task, the priority of the task, and any specific execution constraints. After the task requirements are obtained, the system needs to further verify and process the information. On the one hand, the system needs to ensure that the acquired task demands are reasonable and achievable, i.e., that the required resources can theoretically be met in the current cluster or resource pool. On the other hand, the system also needs to sort and order the tasks according to their priorities and execution constraints, so that resources can be efficiently allocated according to established rules in the subsequent scheduling process.
In practical applications, the demands of the computing tasks may change over time and with changing circumstances. Therefore, the system needs to have the ability to capture changes in task demand in real time and dynamically adjust the resource allocation policies and task scheduling plans based on these changes to ensure efficient utilization of computing resources and efficient execution of tasks.
Based on the above analysis, obtaining the container computing task requirements is a basis and premise of a computing service scheduling method. By comprehensively and accurately acquiring and understanding the requirements of the container computing tasks, the system can provide powerful support for subsequent resource allocation and task scheduling, thereby realizing efficient utilization of computing resources and efficient execution of tasks. The accuracy and flexibility of this step has a crucial impact on the performance and efficiency of the overall computing service dispatch system.
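As a concrete sketch of step S100, the fragment below parses a JSON task description and checks it is feasible against the pool's capacity. The field names (`cpu`, `mem_gb`, `storage_gb`, `priority`) and the capacities are illustrative assumptions, not values prescribed by the method.

```python
import json

# Assumed pool capacities, for feasibility checking only.
POOL_CAPACITY = {"cpu": 8, "mem_gb": 32, "storage_gb": 500}

def parse_requirements(raw: str) -> dict:
    """Parse a JSON task description and verify it can fit in the pool."""
    req = json.loads(raw)
    for key, cap in POOL_CAPACITY.items():
        need = req.get(key, 0)
        if need < 0 or need > cap:
            # requirement is unreasonable or unachievable in this pool
            raise ValueError(f"{key} request {need} exceeds pool capacity {cap}")
    req.setdefault("priority", 10)   # assumed default priority if unspecified
    return req

req = parse_requirements('{"name": "train-job", "cpu": 2, "mem_gb": 4}')
print(req["priority"])  # -> 10
```

The same validation logic would apply regardless of whether the description arrives as a configuration file, command-line parameters, or an API request.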
Step S200, reading description parameters of the container computing task requirements, and adding the container computing task requirements into a waiting execution queue;
Specifically, the system can accurately read the description parameters of the container computing task requirements. These descriptive parameters include, for example, the type and amount of computing resources (e.g., CPU, memory, storage, etc.) required for the task, the priority of the task, the expected execution time, the required environmental configuration (e.g., operating system, dependency library, etc.), and any particular execution constraints (e.g., must be performed at a particular time or condition). By analyzing the description parameters, the system can comprehensively understand the specific requirements of each container computing task, which is the basis for subsequent resource allocation and task scheduling. After reading the description parameters, the system adds the container calculation task demands into a waiting execution queue according to certain rules and strategies. This queue is the core data structure of the scheduler for managing and scheduling tasks, which determines the order of task execution and the allocation of resources. The process of joining the waiting execution queue is not simply queuing, but needs to be comprehensively considered according to the priority of the task, the resource requirement, the current load condition of the system, the scheduling strategy set by the user and other factors. For example, for higher priority tasks, the system will place them in front of the queue to ensure that the tasks are executed as soon as possible. For tasks with larger resource demands, the system needs to wait for enough resources to release and then arrange to execute the tasks so as to avoid resource contention and system overload.
In addition, the wait for execution queue also supports dynamic adjustment. As the system state changes (e.g., new tasks join, existing tasks complete, resources are released, etc.), the order and state of the tasks in the queue will change accordingly. This dynamic adjustment capability enables the system to more flexibly cope with various complex computing scenarios and demand changes.
Based on the above analysis, by reading the description parameters of the container computing task requirements and adding the task to the waiting execution queue, an important basis is provided for subsequent resource allocation and task scheduling. Implementing this step requires the system not only to have strong parsing capability and a flexible queue-management mechanism, but also to comprehensively consider various factors, so that computing tasks can be executed efficiently and in order.
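The queue ordering described above can be illustrated with a small sketch: tasks are ordered by priority first, with a monotonic counter as a tie-breaker so that equal-priority tasks keep their submission order. The names and priority scale are assumptions for illustration.

```python
import heapq
import itertools

_seq = itertools.count()   # tie-breaker: preserves FIFO among equal priorities

def enqueue(queue, name, priority):
    # lower number = higher priority; the counter encodes submission order
    heapq.heappush(queue, (priority, next(_seq), name))

queue = []
enqueue(queue, "batch-report", priority=5)
enqueue(queue, "batch-cleanup", priority=5)
enqueue(queue, "urgent-inference", priority=1)   # submitted later, runs first

order = [heapq.heappop(queue)[2] for _ in range(len(queue))]
print(order)  # -> ['urgent-inference', 'batch-report', 'batch-cleanup']
```

Dynamic adjustment (new tasks joining, tasks completing) falls out naturally: pushing onto the heap re-evaluates the ordering without rebuilding the queue.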
Step S300, obtaining real-time available resource conditions and current allocation rules in a limited resource pool, and generating an execution scheme;
Specifically, step S300 is a core element of the computing service scheduling method: it involves obtaining the real-time available resources in the limited resource pool, understanding and applying the current allocation rules, and, as its final objective, generating an efficient and feasible execution scheme.
In the computing service scheduling system, the limited resource pool refers to a set of computing resources with preset upper usage limits; these resources comprise multiple types such as CPU, memory, GPU, video memory, and storage. Real-time available resource conditions refer to how much of these resources is available for new computing tasks at the current time. Obtaining this information typically involves monitoring and statistics on resource usage, and requires real-time, accurate data-acquisition capabilities on the part of the system.
Obtaining the current allocation rules: allocation rules are policies that the computing service scheduling system uses to decide how to allocate resources to different computing tasks. These rules may be based on a variety of factors, such as the priority of the task, the amount of resources demanded, the execution time of the task, and so forth. Different allocation rules may lead to different resource allocation results, which in turn affect the overall performance and efficiency of the system. Thus, before generating an execution scheme, the system needs to ascertain the current allocation rules in order to allocate resources reasonably according to them.
Generating an execution scheme: after acquiring the real-time available resource conditions and the current allocation rules, the system can begin generating the execution scheme. This process involves matching and optimizing the resource requirements of multiple computing tasks to ensure that the requirements of all tasks are met to the greatest extent possible given limited resources. Specifically, the system orders the tasks according to their priorities and resource demands, and then allocates resources to the tasks one by one according to the allocation rules. If the resources are insufficient to meet the needs of all tasks, the system may need to adopt policies such as waiting for resources to be released, adjusting task priorities, or finding tasks that can be performed under existing resource conditions. The system also needs to take the dynamic variability of resources into account when generating the execution scheme: because the execution of computing tasks and the use of resources are dynamic, the system must have a certain prediction and adaptation capability so as to adjust the execution scheme in time when resource conditions change.
Optimizing and adjusting the execution scheme: after the execution scheme is generated, the system also needs to optimize and adjust it. This includes checking whether the execution scheme meets the resource requirements of all tasks, whether it would result in resource overload or waste, and whether system performance could be improved by adjusting the task execution order or the resource allocation policy. Through continuous optimization and adjustment, the system can gradually arrive at a more efficient and stable resource allocation scheme, thereby realizing efficient utilization of computing resources.
Based on the above analysis, step S300 is a key element in the computing service scheduling method, and involves understanding and applying the real-time resource situation and allocation rule, and generating and optimizing the execution scheme. Through the step, the system can realize fine granularity management and dynamic allocation of the computing resources, thereby improving the utilization efficiency of the resources and the overall performance of the system.
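A minimal sketch of scheme generation under the two allocation rules discussed later in this embodiment is given below. The function signature and rule names (`fifo` for timing priority, `max-util` for maximum-utilization priority) are assumptions for illustration.

```python
def generate_scheme(tasks, free_cpu, free_mem, rule="fifo"):
    """tasks: list of (name, cpu, mem) in queue order.
    Returns (run, wait): tasks to start now and tasks that must wait."""
    run, wait = [], []
    blocked = False
    for name, cpu, mem in tasks:
        if not blocked and cpu <= free_cpu and mem <= free_mem:
            run.append(name)
            free_cpu -= cpu
            free_mem -= mem
        else:
            wait.append(name)
            if rule == "fifo":
                blocked = True   # timing priority: later tasks must also wait
            # "max-util": keep scanning for a task that fits
    return run, wait

tasks = [("A", 2, 4), ("C", 2, 4), ("B", 1, 2)]
print(generate_scheme(tasks, 3, 6, rule="fifo"))      # head C blocks the queue
print(generate_scheme(tasks, 3, 6, rule="max-util"))  # C is skipped, B runs
```

Under timing priority the first non-fitting task blocks everything behind it, whereas maximum-utilization priority skips it and keeps searching; this is exactly the trade-off between fairness and utilization described above.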
Step S400, executing a scheduling task based on the execution scheme;
Specifically, the scheduler assigns computing tasks to specific computing resources, such as workstations, servers, cloud computing nodes, or the like, according to an execution scheme. This process involves a number of aspects of considerations and operations:
First, the scheduler will look at the task allocation specified in the execution scheme, knowing which tasks need to be executed and their respective resource requirements. The scheduler will then check the status of the resources in the current pool of available resources, ensuring that there are enough resources to meet the needs of these tasks. This includes availability checking of resources such as CPU, memory, storage, etc.
The scheduler may then schedule the execution of tasks according to priority or timing rules in the execution scheme. If the allocation rule is time-sequential, then the scheduler will execute the tasks sequentially in the order in which they join the wait for execution queue. If the allocation rule is to maximize resource utilization priority, the scheduler will attempt to find a task execution order that maximizes resource utilization.
During task execution, the scheduler monitors resource usage in real time to ensure that each task executes within its set upper limit of resource use. This is accomplished by a resource manager that can perform fine-grained management and restriction of the use of computing resources. If a task exceeds its resource-usage limit, the resource manager notifies the scheduler in time, and the scheduler can take measures such as killing the process or rescheduling to ensure system stability and effective resource utilization.
In addition, step S400 involves interaction and communication with computing resources. The scheduler needs to communicate with the computing resources, assign tasks to them, and monitor their execution status via an appropriate interface or protocol.
Finally, when the task execution is completed, the scheduler updates the task's state and removes it from the wait for execution queue. Meanwhile, the scheduler can optimize future scheduling strategies according to the execution result of the task and the resource use condition in the execution process so as to improve the overall performance and efficiency of the system.
Based on the above analysis, step S400 is a key step of actually performing the scheduling task based on the execution scheme, and it relates to various aspects of task allocation, resource monitoring, interaction with computing resources, and communication. By fine management and dynamic allocation of computing resources, the present invention aims to achieve efficient utilization of computing resources and efficient execution of tasks.
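The enforcement check at the heart of step S400 can be sketched as follows. This is a hedged illustration: the dictionary shapes and the idea of returning a kill list are assumptions, and a real resource manager would obtain usage from cgroups or similar OS facilities rather than from a dict.

```python
def enforce_limits(usage, limits):
    """usage/limits: dicts mapping task name -> {"cpu": ..., "mem_gb": ...}.
    Returns the tasks that exceed any configured upper limit."""
    to_kill = []
    for task, used in usage.items():
        cap = limits.get(task, {})
        # a task with no cap for a metric is treated as unlimited here
        if any(used.get(k, 0) > cap.get(k, float("inf")) for k in used):
            to_kill.append(task)   # scheduler may kill or reschedule it
    return to_kill

usage  = {"jobA": {"cpu": 2, "mem_gb": 6}, "jobB": {"cpu": 1, "mem_gb": 2}}
limits = {"jobA": {"cpu": 2, "mem_gb": 4}, "jobB": {"cpu": 2, "mem_gb": 4}}
print(enforce_limits(usage, limits))  # -> ['jobA']  (memory over its limit)
```

Running this check on each monitoring tick gives the scheduler the timely notification described above, after which it can kill or reschedule the offending task.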
In this embodiment, the computing service scheduling method further includes:
step S500, generating an execution result based on executing the scheduling task;
Specifically, as a computing task executes on a designated computing resource, the system monitors the execution state of the task in real-time. This includes, but is not limited to, progress of execution of the task, resource usage, whether there is an exception, etc. The purpose of the monitoring is to ensure that the task can perform normally as expected and to take corresponding measures, such as resource adjustment, task restart, etc., as necessary to cope with the possible occurrence of an abnormal situation. Once the task execution is completed, the system gathers execution results. These results may include output data of the computing task, log information during execution, statistics of resource usage, and so forth. The collected result data may be stored in a designated location for subsequent analysis and processing. After the execution results are collected, the system also needs to sort the results. The collating process may include, for example, cleansing of the data, unification of formats, handling of outliers, and the like. By the arrangement, the execution result can be clearer, and the method is easy to understand and analyze. Meanwhile, the tidied result can also be used as a reference basis for subsequent task scheduling, so that the system is helped to optimize a resource allocation strategy, and the resource utilization efficiency is improved. Finally, the generated execution result is passed to the relevant caller or user. This may be achieved by a notification service, such as sending an email, a push message, etc. The caller or user may evaluate the performance of the computing task based on these results and whether further operations or adjustments are needed.
Based on the above analysis, step S500 plays an important role in the overall computing service scheduling method. It not only monitors and summarizes the task execution process, but also serves as a key link providing important references for subsequent task scheduling and resource management. By implementing this step, efficient execution of computing tasks and reasonable utilization of resources can be ensured, thereby improving the performance and efficiency of the whole computing service scheduling system.
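The collation described in step S500 amounts to normalizing heterogeneous task outcomes into a uniform record. The sketch below shows one plausible record shape; every field name here is an assumption for illustration.

```python
import time

def collect_result(task_name, status, output, cpu_seconds, peak_mem_gb):
    """Normalize a finished task's outcome into a uniform result record."""
    return {
        "task": task_name,
        "status": status,                 # e.g. "succeeded" / "failed"
        "output": output,                 # output data or a path to it
        "resources": {"cpu_seconds": cpu_seconds, "peak_mem_gb": peak_mem_gb},
        "finished_at": int(time.time()),  # for later analysis and scheduling
    }

r = collect_result("train-job", "succeeded", "model.bin", 3600, 3.7)
print(r["status"])  # -> succeeded
```

Records in this shape can be stored for later analysis and fed back into the scheduler to refine future resource allocation, as the text describes.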
Step S600, based on the execution result, sending the execution result through a notification service;
Specifically, after the scheduled task is executed, the system generates an execution result, which includes key information such as the execution state of the task, output data, and error information (if any). This information is very important to the caller or user, who can determine from it whether the task was successfully performed and whether further operations or processing are required. Various notification modes, such as mail notification, short-message notification, and WebHook notification, can be implemented by sending the execution result through the notification service. Thus, wherever the caller or user is, the execution result of the task can be received in time so that a corresponding response can be made. For example, if the task executes successfully, the caller or user can perform subsequent analysis or processing on the output data; if the task fails, the problem can be located from the error information and repaired or adjusted accordingly. In addition, the notification service may also provide a real-time status-update function: during task execution, the system can feed back the task's execution progress, resource usage, and other information to the caller or user in real time through the notification service, so that the caller or user can know the execution state of the task at any time and better control the whole computing process. First, the notification service needs to be tightly integrated with the scheduling system so that execution results can be transmitted to it in a timely and accurate manner. Second, the notification service needs a high degree of reliability and stability to ensure that notifications can still be sent stably when a large number of tasks execute simultaneously.
Finally, the notification service also needs to provide flexible configuration options to meet the notification requirements of different callers or users.
Based on the above analysis, step S600 is an important step in the method for computing service scheduling according to the present invention. The method not only ensures that the execution result of the computing task can be timely and accurately fed back to related calling parties or users, but also provides various notification modes and real-time state updating functions, thereby greatly improving the availability and user experience of computing services.
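A notification service with pluggable channels, as described for step S600, might be sketched like this. The channel interface is an assumption; real senders would wrap an SMTP client, an SMS gateway, or an HTTP POST for WebHooks.

```python
class NotificationService:
    """Toy sketch: channels are registered callables, e.g. mail/SMS/WebHook."""
    def __init__(self):
        self.channels = {}

    def register(self, name, sender):
        # sender: callable(recipient, message); assumed interface
        self.channels[name] = sender

    def notify(self, channel, recipient, result):
        message = f"task={result['task']} status={result['status']}"
        self.channels[channel](recipient, message)

sent = []
svc = NotificationService()
# a stand-in "webhook" channel that records instead of POSTing
svc.register("webhook", lambda url, msg: sent.append((url, msg)))
svc.notify("webhook", "https://example.com/hook", {"task": "A", "status": "ok"})
print(sent[0][1])  # -> task=A status=ok
```

Registering channels by name is one way to provide the flexible configuration options the text calls for: each caller can subscribe to whichever channel suits them.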
In this embodiment, the obtaining the real-time usable resource condition and the current allocation rule in the limited resource pool, and generating the execution scheme includes:
if the allocation rule is time sequence priority and the available resources are satisfied, the next calculation task is sequentially executed;
If the allocation rule is time sequence priority and available resources are not satisfied, waiting for the release of the resources and then executing the next task in priority order;
In particular, in computing service scheduling systems, efficient management of resources and rational scheduling of tasks are key to ensuring system performance. The invention realizes the preliminary management of the task by acquiring the requirement of the container calculation task and adding the requirement into the waiting execution queue. However, the execution of tasks is not a simple first-in first-out (FIFO) process, but rather needs to be dynamically decided according to the current resource conditions and allocation rules.
Wherein the system first needs to know the available resource conditions in the current resource pool, including but not limited to the remaining amount and usage state of resources such as CPU, memory, and storage. This information changes dynamically, so the system needs to acquire it in real time to ensure the accuracy of scheduling decisions.
The system then generates an execution scheme according to the preset allocation rules. The allocation rule is the basis of a scheduling decision, which determines how to prioritize the execution of tasks in case of limited resources. In the present invention, an allocation rule of "timing priority" is provided.
The "timing priority" rule means that tasks will execute in the order they were added to the waiting execution queue. Such rules ensure fairness of tasks, i.e. tasks that arrive first will be executed first. However, in the case of limited resources, simple timing priorities may result in inefficient use of resources, as certain tasks may be waiting for a long period of time because of waiting for resources.
To solve this problem, the present invention further considers the availability of resources in computing the execution scheme. Specifically, if the allocation rule is time sequence priority and the available resources meet the requirement of the current task, the system sequentially executes the next computing task. In this case, the system can fully utilize the resources while maintaining the execution order of the tasks. If the allocation rule is time sequence priority, but the available resources do not meet the requirement of the current task, the system waits for the release of the resources and then prioritizes the execution of the next task in the sequence. This strategy avoids wasting resources while ensuring sequential execution of tasks. During the waiting period, the system may perform other tasks with low demands on resources to fully utilize the resources.
Through the dynamic resource management and task scheduling strategy, the invention realizes the aim of efficiently scheduling and executing the multi-task under the condition of limited resources. The method not only improves the utilization efficiency of resources, but also ensures the timeliness and fairness of tasks.
It is assumed that in a computing resource scheduling system there are three computing tasks to be performed: task A, task B, and task C. The system currently has sufficient CPU and memory resources to perform these tasks, but depending on the priority of the tasks and their submission times, the system needs to decide their order of execution. The system selects "timing priority" as the current allocation rule.
Task list
Task A: submitted at 9:00 am; requires 2 CPU cores and 4 GB of memory; expected execution time 1 hour.
Task B: submitted at 9:15 am; requires 1 CPU core and 2 GB of memory; expected execution time 30 minutes.
Task C: submitted at 9:30 am; requires 2 CPU cores and 4 GB of memory; expected execution time 2 hours.
Resource pool status
The current available resources are 4 CPU cores and 8GB of memory.
Execution process
1. Initial state: tasks A, B, and C are added to the waiting execution queue and ordered by submission time.
2. Task scheduling: the system checks the task queue and finds that task A is the first task in the queue. It evaluates the resource pool and finds that 4 CPU cores and 8 GB of memory are currently available, which meets task A's resource requirement (2 CPU cores and 4 GB of memory). According to the timing-priority rule, the system decides to execute task A, since it is the first task in the queue and the resources meet its requirement.
3. Task A execution: task A begins executing, occupying 2 CPU cores and 4 GB of memory; the remaining available resources are 2 CPU cores and 4 GB of memory.
4. Continued scheduling: after task A begins executing, the system checks the task queue and finds that task B is the next task in the queue. It evaluates the resource pool and finds that 2 CPU cores and 4 GB of memory are currently available, which meets task B's resource requirement (1 CPU core and 2 GB of memory). According to the timing-priority rule, the system decides to execute task B, since it is the next task in the queue and the resources meet its requirement.
5. Task B execution: task B begins executing, occupying 1 CPU core and 2 GB of memory; the remaining available resources are 1 CPU core and 2 GB of memory.
6. Continued scheduling: after task B begins executing, the system checks the task queue and finds that task C is the next task in the queue. It evaluates the resource pool and finds that only 1 CPU core and 2 GB of memory are currently available, which does not meet task C's resource requirement (2 CPU cores and 4 GB of memory). According to the timing-priority rule, since the resource requirement is not met, the system does not immediately execute task C but waits for resources to be released.
7. Task B completion: after task B finishes executing, it releases 1 CPU core and 2 GB of memory, and the resource pool state is updated to 2 CPU cores and 4 GB of memory.
8. Task C execution: the current resource pool now meets task C's requirement, and task C is the next task in the queue, so the system decides to execute task C, since it is now the first waiting task and the resources meet its requirement.
From the above examples, it can be seen that under the allocation rule of "time-sequential priority", the system performs tasks strictly in the time order in which the tasks were submitted. When the resource meets the requirements of the current task, the system will immediately execute the task. If the resources do not meet the requirements of the current task, the system can wait for the release of the resources and then execute the task, so that the sequential execution of the tasks and the effective utilization of the resources are ensured. The allocation rule ensures fairness of tasks, namely, tasks which arrive first can be executed first, and meanwhile, waste of resources and long-time waiting of the tasks are avoided.
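The worked example above can be reproduced with a small time-stepped simulation. This is an illustrative sketch (the function and its return shape are assumptions); it derives each task's start time in minutes and confirms that C starts only when B releases its resources at the 30-minute mark.

```python
def simulate_fifo(tasks, cpu, mem):
    """tasks: list of (name, cpu, mem, duration_min) in submission order.
    Returns {name: start_time_min} under the timing-priority rule."""
    starts, running, t = {}, [], 0   # running: [(end_time, cpu, mem)]
    queue = list(tasks)
    while queue:
        name, c, m, d = queue[0]
        if c <= cpu and m <= mem:
            queue.pop(0)
            starts[name] = t
            running.append((t + d, c, m))
            cpu, mem = cpu - c, mem - m
        else:
            # timing priority: no skipping; advance to the earliest release
            running.sort()
            end, c2, m2 = running.pop(0)
            t, cpu, mem = end, cpu + c2, mem + m2
    return starts

starts = simulate_fifo(
    [("A", 2, 4, 60), ("B", 1, 2, 30), ("C", 2, 4, 120)], cpu=4, mem=8)
print(starts)  # A and B start immediately; C waits for B's release at t=30
```

This matches the narrated schedule: A and B are dispatched at once within the 4-CPU / 8 GB pool, and C, blocked by insufficient resources, starts when B completes.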
Based on the above analysis, by combining real-time resource information with flexible allocation rules, efficient scheduling and execution of multiple tasks is realized. The specific operation under the timing-priority rule in particular ensures the sequential execution of tasks while improving resource utilization efficiency, which is a significant innovation in the field of computing service scheduling.
In this embodiment, the obtaining the real-time usable resource condition and the current allocation rule in the limited resource pool, and generating the execution scheme includes:
If the allocation rule is the priority of the maximum utilization resource, executing the next task in the priority order under the condition that the available resource is satisfied;
If the allocation rule is maximum-utilization-of-resources priority and the available resources are not satisfied, the task is skipped, and the next task that can be executed under the existing resource conditions is found and executed preferentially;
Specifically, the allocation rule is a set of resource allocation strategies formulated by the system according to the user requirements, resource conditions, business logic and other factors. In this embodiment, the allocation rules are mainly two kinds, timing priority and maximum utilization resource priority.
When timing priorities are used as allocation rules, the system schedules execution according to the task's commit time order. If the currently available resources meet the resource requirements of the task to be performed, then the system will perform the next task in order. If the currently available resources do not meet the task requirements, the system may choose to wait until enough resources are released and then execute the next task in sequence. Such allocation rules guarantee the execution order of tasks, but may reduce the efficiency of resource utilization to some extent.
And when maximizing utilization of the resource priority as an allocation rule, the system may prioritize utilization efficiency of the resource. If the currently available resources meet the resource requirements of the task to be performed, the system will preferentially perform this task. If the currently available resources do not meet the task requirements, the system does not choose to wait, but skips the task to find the next task that can be executed under the existing resource conditions and execute the task preferentially. The allocation rule can fully utilize the existing resources and improve the utilization efficiency of the resources, but can lead to the disorder of the execution sequence of certain tasks.
Under both allocation rules, the system needs to perform resource monitoring and task scheduling continuously to ensure reasonable utilization of resources and efficient execution of tasks. Meanwhile, the system also has a certain fault tolerance capability, so as to cope with situations such as resource anomalies or task execution failures.
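The two allocation rules described above can be sketched as a task-selection function. This is a minimal illustrative sketch, not the patent's implementation: resource demands and the pool are modeled as plain dicts (e.g. `{"cpu": 2, "mem_gb": 4}`), and the rule names are hypothetical identifiers for "timing priority" and "maximum-resource-utilization priority".

```python
def fits(demand, available):
    """True if every demanded resource is covered by the pool."""
    return all(available.get(k, 0) >= v for k, v in demand.items())

def pick_next(queue, available, rule):
    """Return the task to run now, or None if the scheduler must wait.

    queue: list of (task_id, demand) pairs in commit-time order.
    """
    if not queue:
        return None
    if rule == "timing_priority":
        head_id, head_demand = queue[0]
        # Strict order: run the head task, or wait for resources to free up.
        return (head_id, head_demand) if fits(head_demand, available) else None
    if rule == "max_utilization_priority":
        # Skip tasks whose demands are unmet; take the first runnable one.
        for task_id, demand in queue:
            if fits(demand, available):
                return (task_id, demand)
        return None
    raise ValueError(f"unknown allocation rule: {rule}")
```

For example, with a queue headed by a task whose memory demand exceeds the pool, timing priority returns `None` (wait), while maximum-resource-utilization priority skips ahead to a smaller runnable task.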
In this way, the computing service scheduling method of the embodiment of the invention can customize the resource usage upper limit of each computing service and dynamically allocate resources according to real-time resource conditions and the allocation rules, thereby achieving efficient utilization of the computing resources on a workstation or server. The method not only solves the problems of uneven resource allocation and low resource utilization efficiency in traditional computing service scheduling, but also provides users with more flexible resource scheduling strategy choices to meet the demands of different scenarios.
In addition, the computing service scheduling method of the embodiment of the invention also supports various containerization technologies, such as Docker and Kubernetes, so that computing tasks can be migrated and scaled more conveniently among different computing resources. Meanwhile, through resource management tools such as a system-level self-developed VGPU library, the embodiment of the invention can achieve fine-grained management of resources such as CPU, memory, GPU, and video memory, further improving resource utilization efficiency and task execution efficiency.
It is assumed that on a cloud computing platform there are multiple computing tasks to be performed, each requiring different resources. The resource pool on the platform contains a certain amount of CPU, memory, and GPU resources. To maximize resource utilization efficiency, the platform administrator selects "maximum-resource-utilization priority" as the current allocation rule.
Task list
Task A: requires 2 CPU cores and 4 GB of memory; estimated execution time 1 hour;
Task B: requires 1 GPU and 8 GB of memory; estimated execution time 2 hours;
Task C: requires 1 CPU core and 2 GB of memory; estimated execution time 30 minutes;
Task D: requires 1 GPU and 4 GB of memory; estimated execution time 1.5 hours.
Resource pool status
The current available resources are 2 CPU cores, 8GB of memory and 1 GPU.
Execution process
The initial state is that tasks A, B, C, D are added into a waiting execution queue;
Task scheduling: the task queue is checked, and task A is found to be the first task in the queue. The resource pool is evaluated: 2 CPU cores and 8 GB of memory are currently available, which meets the resource requirements of task A (2 CPU cores and 4 GB of memory). According to the "maximum-resource-utilization priority" rule, it is decided to execute task A immediately, without waiting for other resources.
Task A executes: task A starts executing and occupies 2 CPU cores and 4 GB of memory; the remaining available resources are 0 CPU cores, 4 GB of memory, and 1 GPU.
Continued scheduling: while task A is executing, the task queue is checked, and task B is found to be the next task in the queue. The resource pool is evaluated: no CPU cores are currently available (task A is using 2), but 1 GPU and 4 GB of memory are available. Since task B requires 1 GPU and 8 GB of memory, the current resources do not meet its requirements (4 GB of memory short). According to the "maximum-resource-utilization priority" rule, it is decided to skip task B and look for the next task that can be executed under the existing resource conditions; scheduling continues.
Task C is selected: task C is found to require 1 CPU core and 2 GB of memory. Although 0 CPU cores are available at the moment, 2 cores will be released when task A finishes, and 4 GB of memory is already sufficient. It is therefore decided to execute task C as soon as task A completes, since task C can start immediately once task A releases its resources, maximizing resource utilization efficiency.
Task A completes: after task A finishes, 2 CPU cores and 4 GB of memory are released, and the resource pool state is updated to 2 CPU cores, 8 GB of memory, and 1 GPU.
Task C executes: the current resource pool meets task C's requirements, so task C is executed immediately; it starts executing and occupies 1 CPU core and 2 GB of memory, leaving 1 CPU core, 6 GB of memory, and 1 GPU available.
Continued scheduling: while task C is executing, the task queue is checked, and task B is found to still be the next task in the queue. The resource pool is evaluated: 1 GPU and 6 GB of memory are currently available, which still does not meet task B's 8 GB memory requirement, so task B is skipped again and the next task is sought.
Task D is selected: task D is found to require 1 GPU and 4 GB of memory; the current resource pool meets task D's requirements, and it is decided to execute task D after task C completes.
Task C completes: after task C finishes, 1 CPU core and 2 GB of memory are released, and the resource pool state is updated to 2 CPU cores, 8 GB of memory, and 1 GPU.
Task D executes: the current resource pool meets task D's requirements, so task D is executed immediately; it starts executing and occupies 1 GPU and 4 GB of memory.
From this example, it can be seen that under the "maximum-resource-utilization priority" allocation rule, the system dynamically evaluates the state of the resource pool and the demands of the tasks, preferentially executing tasks that can begin immediately and make good use of resources. When the available resources do not meet the current task's requirements, the system skips that task and continues searching for the next task that can be executed under the existing resource conditions, ensuring maximum utilization of resources.
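The resource bookkeeping in the walkthrough above can be reproduced with a small pool class. This is an illustrative sketch only (class and field names are assumptions, not the patent's implementation); it replays the first scheduling steps: task A acquires its resources, task B is found 4 GB short and skipped, and releasing task A restores the pool.

```python
class ResourcePool:
    """Minimal resource bookkeeping for the scheduling walkthrough."""

    def __init__(self, **capacity):
        self.free = dict(capacity)

    def fits(self, demand):
        return all(self.free.get(k, 0) >= v for k, v in demand.items())

    def allocate(self, demand):
        if not self.fits(demand):
            raise RuntimeError("insufficient resources")
        for k, v in demand.items():
            self.free[k] -= v

    def release(self, demand):
        for k, v in demand.items():
            self.free[k] += v

# Replay the example: pool of 2 CPU cores, 8 GB memory, 1 GPU.
pool = ResourcePool(cpu=2, mem_gb=8, gpu=1)
task_a = {"cpu": 2, "mem_gb": 4}
task_b = {"gpu": 1, "mem_gb": 8}

pool.allocate(task_a)                                     # task A starts
assert pool.free == {"cpu": 0, "mem_gb": 4, "gpu": 1}     # as in the walkthrough
assert not pool.fits(task_b)                              # B is 4 GB short -> skipped
pool.release(task_a)                                      # task A completes
assert pool.free == {"cpu": 2, "mem_gb": 8, "gpu": 1}     # pool fully restored
```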
Based on the above analysis, the computing service scheduling method of the embodiment of the invention achieves efficient utilization of computing resources and efficient execution of tasks by customizing resource usage upper limits, dynamically allocating computing resources, and supporting multiple allocation rules and containerization technologies, providing a new solution for the technical field of computing service scheduling.
Fig. 4 is a logic flow diagram of a computing service scheduling method according to an embodiment of the present invention. In this embodiment, a scheduler runs cyclically: the system obtains the current resource availability through the resourcer (resource manager), obtains the current queued-task state by accessing the queue, determines an optimal scheduling scheme, dispatches tasks to the corresponding executor for execution, and finally notifies the corresponding caller through a notification service. Specifically, using the CPU, memory, and storage management mechanisms of containerization technology, together with a system-level self-developed VGPU library, limits on resources such as CPU, memory, GPU, and video memory are enforced. A user or administrator is allowed to set a resource usage upper limit for each computing service, including CPU utilization, memory usage, GPU, video memory, disk I/O, and the like; these limits can be adjusted dynamically, in the form of configuration parameters, according to actual requirements, so as to adapt to different computing tasks and load conditions.
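The per-service upper limits described above might be expressed as configuration parameters along the following lines. All field names here are illustrative assumptions, not the patent's actual configuration schema; the point is only that limits are per-service data that can be updated at runtime.

```python
# Hypothetical per-service limit configuration; keys and values are
# illustrative assumptions, not the patent's schema.
service_limits = {
    "image-inference": {
        "cpu_percent_max": 200,    # up to 2 full cores
        "mem_gb_max": 8,
        "gpu_count_max": 1,
        "vram_gb_max": 12,
        "disk_io_mbps_max": 100,
    },
}

def update_limit(limits, service, **changes):
    """Dynamically adjust a service's limits, as the text describes."""
    limits.setdefault(service, {}).update(changes)
    return limits[service]

# Raise the memory cap at runtime to adapt to a heavier workload.
update_limit(service_limits, "image-inference", mem_gb_max=16)
```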
When a computing task is submitted to the scheduling service, the resource scheduling system reads the task's resource demand description parameters and adds the task to the waiting execution queue. The resource scheduling system then obtains the real-time available resource conditions in the limited resource pool and the current allocation rule. If the allocation rule is timing priority and the available resources are sufficient, the next computing task is executed in sequence; if the available resources are insufficient, the system waits for resources to be released and then executes the next task in order. If the allocation rule is maximum-resource-utilization priority and the available resources are sufficient, the next task is executed in priority order; if the available resources are insufficient, the task is skipped, and the next task that can be executed under the existing resource conditions is found and executed preferentially. The system monitors the resource usage of each computing service in real time and, through an "authorize before use" contract, ensures to the maximum extent that no resource exceeds its limit. Once a resource exceeds its configured maximum threshold, the lowest-priority occupying process is killed, and its task is put back into the queue to wait for the next opportunity when its resource requirements can be met.
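The over-limit handling at the end of the paragraph above can be sketched as follows. This is a hedged illustration: the function signature, the `(priority, task_id)` representation of occupying processes, and the queue are all assumptions introduced for the example, not the patent's interfaces.

```python
def enforce_limit(resource_usage, threshold, occupants, queue):
    """If usage exceeds the threshold, stop the lowest-priority occupant
    and requeue its task, as described in the embodiment.

    occupants: list of (priority, task_id); a lower number means lower
    priority, so min() finds the victim.
    Returns the requeued task id, or None if usage is within the threshold.
    """
    if resource_usage <= threshold:
        return None
    victim = min(occupants)            # lowest-priority occupying process
    occupants.remove(victim)           # "kill" it (illustrative stand-in)
    queue.append(victim[1])            # back into the waiting queue
    return victim[1]
```

In a real system the "kill" step would terminate a process or container; here it is reduced to list bookkeeping so the control flow stays visible.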
In another embodiment of the present invention, in order to achieve accurate prediction and efficient scheduling of computing resources, a resource prediction algorithm and an intelligent scheduling policy are further provided. The algorithm adopts a long short-term memory network (LSTM), a special type of recurrent neural network (RNN) that excels at processing time-series data and can capture long-term dependencies, making it well suited to predicting resource usage. Time-series data for key resource indicators, such as CPU utilization, memory occupancy, and disk I/O, are collected from historical data; these data may come from a monitoring system, log files, the resource manager, and so on. The collected data are preprocessed through cleaning, normalization, feature extraction, and similar operations to improve prediction accuracy. The LSTM model is then trained on the preprocessed data; during training, model parameters can be tuned by methods such as cross-validation to optimize prediction performance. After training, the model is deployed to the production environment. The system periodically collects the latest resource usage data and feeds them into the LSTM model to obtain a prediction of resource demand over a coming period. Based on the LSTM model's predictions, sufficient resources are reserved for upcoming high-demand tasks, ensuring that those tasks neither wait nor fail due to insufficient resources.
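The preprocessing stage described above (normalization and conversion of a resource time series into supervised training samples) can be sketched as follows. The window length and the choice of min-max scaling are assumptions for illustration; the patent does not specify them, and the resulting `(window, target)` pairs would feed whatever LSTM implementation is used.

```python
def minmax_normalize(series):
    """Scale a resource time series (e.g. CPU %) into [0, 1]."""
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0            # avoid division by zero on a flat series
    return [(x - lo) / span for x in series]

def sliding_windows(series, window, horizon=1):
    """Build (input_window, target) pairs for supervised LSTM training:
    each sample maps `window` consecutive readings to the reading
    `horizon` steps later."""
    pairs = []
    for i in range(len(series) - window - horizon + 1):
        pairs.append((series[i:i + window], series[i + window + horizon - 1]))
    return pairs

# Illustrative sampled CPU-utilization history (percent).
cpu_usage = [20, 35, 50, 65, 80, 65, 50, 35]
norm = minmax_normalize(cpu_usage)
dataset = sliding_windows(norm, window=3)
# each element: 3 normalized readings -> the next reading as the target
```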
According to the resource prediction results and the importance of each task, task priorities are adjusted dynamically: for important tasks with high resource demand, priority can be raised to ensure they receive resource support first. When a surge in resource demand is predicted, more computing resources (e.g., virtual machines or containers) can be started automatically to meet the demand for task execution; when resource demand falls, some resources can be released to reduce cost. The system monitors resource usage and task execution state in real time to ensure the effective execution of the resource prediction and intelligent scheduling policies, and dynamically adjusts strategies such as resource allocation and task priority according to real-time monitoring results and new resource prediction data, so as to adapt to continuously changing resource requirements and environments.
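The scale-out/scale-in decision described above can be reduced to a comparison between predicted demand and current capacity. This is a minimal sketch under stated assumptions: the 20% headroom threshold and the string return values are illustrative choices, not values from the patent.

```python
def scaling_decision(predicted_demand, capacity, headroom=0.2):
    """Decide how to react to a demand forecast.

    Returns 'scale_out' when predicted demand exceeds capacity (start more
    VMs/containers), 'scale_in' when demand falls well below capacity
    (release resources to cut cost), and 'hold' otherwise. The `headroom`
    fraction keeps a safety margin before scaling in.
    """
    if predicted_demand > capacity:
        return "scale_out"
    if predicted_demand < capacity * (1 - headroom):
        return "scale_in"
    return "hold"
```

A production policy would also rate-limit scaling actions and account for startup latency; those concerns are omitted here to keep the decision rule itself clear.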
By combining the LSTM-based resource prediction algorithm with the intelligent scheduling policy, the embodiments of the invention achieve accurate prediction and efficient scheduling of computing resources. This not only improves resource usage efficiency and task execution efficiency, but also reduces operating costs, providing a new solution for the field of computing service scheduling.
Example Two
Fig. 2 is a schematic structural diagram of a computing service scheduling system according to a second embodiment of the present invention. As shown in Fig. 2, the second embodiment provides a computing service scheduling system that includes an obtaining module 201, a joining module 202, a first generation module 203, and an execution module 204. The obtaining module 201 is configured to obtain a container computing task requirement. The joining module 202 is configured to read the description parameters of the container computing task requirement and add it to the waiting execution queue. The first generation module 203 is configured to obtain the real-time available resource conditions and the current allocation rule in the limited resource pool and generate an execution scheme. The execution module 204 is configured to execute the scheduling task based on the execution scheme.
In this embodiment, the computing service scheduling system further includes a second generation module and a sending module. The second generation module is configured to generate an execution result based on execution of the scheduling task. The sending module is configured to send the execution result through a notification service.
In this embodiment, the first generation module 203 includes a first execution unit and a second execution unit. The first execution unit is configured to execute the next computing task in sequence if the allocation rule is timing priority and the available resources are sufficient. The second execution unit is configured to wait for resources to be released and then execute the next task in order if the allocation rule is timing priority and the available resources are insufficient.
In this embodiment, the first generation module 203 further includes a third execution unit and a fourth execution unit. The third execution unit is configured to execute the next task in priority order if the allocation rule is maximum-resource-utilization priority and the available resources are sufficient. The fourth execution unit is configured to skip the task if the allocation rule is maximum-resource-utilization priority and the available resources are insufficient, and to find and preferentially execute the next task that can be executed under the existing resource conditions.
The various modifications and specific examples of the computing service scheduling method provided in the first embodiment apply equally to the computing service scheduling system provided in this embodiment. From the foregoing detailed description of the computing service scheduling method, those skilled in the art will clearly understand how the computing service scheduling system of this embodiment is implemented; for brevity of the description, it is therefore not described in detail here.
Example Three
Fig. 3 is a schematic structural diagram of an electronic device in a third embodiment of the present invention, and as shown in fig. 3, the third embodiment further provides an electronic device 300, which may include a processor 301 and a memory 302.
The memory 302 is used to store programs. The memory 302 may include volatile memory, such as random-access memory (RAM), for example static random-access memory (SRAM) or double data rate synchronous dynamic random-access memory (DDR SDRAM), as well as non-volatile memory, such as flash memory. The memory 302 is used to store computer programs (e.g., application programs or functional modules implementing the methods described above), computer instructions, and the like, and these may be stored in one or more memories 302 in a partitioned manner. The above computer programs, computer instructions, data, and the like may be invoked by the processor 301.
The processor 301 is configured to execute the computer program stored in the memory 302 to implement the steps of the method described in the above embodiments. Reference may be made in particular to the description of the method embodiments above.
The processor 301 and the memory 302 may be separate structures or may be integrated structures integrated together. When the processor 301 and the memory 302 are separate structures, the memory 302 and the processor 301 may be coupled by a bus 303.
The electronic device in this embodiment may execute the technical scheme in the above method, and the specific implementation process and the technical principle are the same, which are not described herein again.
Example Four
The fourth embodiment also provides a computer-readable storage medium including a computer program and instructions which, when run on a computer, cause the computer to perform the computing service scheduling method of any embodiment of the invention.
The computer-readable storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
The present embodiment also provides a computer program product comprising a computer program stored in a readable storage medium, from which at least one processor of an electronic device can read the computer program, the at least one processor executing the computer program causing the electronic device to perform the solution provided by any of the embodiments described above.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solution of the present disclosure are achieved, and are not limited herein.
In summary, the computing service scheduling method and system of the invention have the following beneficial effects:
1. The invention allows a user or an administrator to set a resource usage upper limit for each computing service, including CPU utilization, memory usage, GPU, video memory, disk I/O, and the like;
2. By monitoring resource usage and task execution state in real time, the invention can dynamically adjust the allocation of computing resources: the system can flexibly adjust the execution order and resource allocation of tasks according to current resource conditions and task demands, thereby maximizing resource utilization efficiency;
3. By customizing resource usage upper limits and dynamically allocating computing resources, the invention effectively solves the problems of uneven resource allocation and low resource utilization efficiency in traditional computing service scheduling, ensuring that computing resources are used efficiently in a multi-task, multi-resource environment and thereby improving the performance and efficiency of the whole system;
4. The invention provides multiple allocation rules, such as timing priority and maximum-resource-utilization priority, from which a user can select a suitable rule for resource scheduling according to actual demands;
5. By monitoring resource usage and task execution state in real time, the invention can promptly detect and handle abnormal conditions such as resource overload and task execution failure;
6. Through accurate resource demand prediction and the intelligent scheduling policy, the invention can avoid unnecessary resource waste and cost expenditure. For example, when a surge in resource demand is predicted, the system can automatically start more computing resources to meet the demand, and when resource demand falls, some resources can be released to reduce cost.
Note that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the invention, which is set forth in the following claims.