WO2017067484A1 - Virtualization data center scheduling system and method

Info

Publication number: WO2017067484A1
Application number: PCT/CN2016/102759
Authority: WO
Inventors: 张玉军
Original assignee: ZTE Corp
Current assignee: ZTE Corp
Priority date: 2015-10-23
Filing date: 2016-10-20
Publication date: 2017-04-27
Anticipated expiration: 2018-04-23
Also published as: CN106612312A

Abstract

The present invention discloses a virtualized data center scheduling system and method. The system comprises a service definition and orchestration module, a data center management platform, and a network controller. The service definition and orchestration module provides interfaces for defining a service type, a resource template, and a reliability requirement level, and receives a service orchestration policy to generate a service function chain. The data center management platform allocates data center resources according to the defined service type and resource template to generate a service node and, upon determining that the service node is abnormal, schedules data center resources according to the service reliability requirement level to restore the service node. The network controller determines the data packet forwarding path of the service node. In this way, when a service node becomes abnormal, it can be restored according to the service type, resource template, and service reliability requirement level defined by the user, which keeps the service nodes in the service function chain operating normally and improves the reliability of the services provided by the data center.

Description

Virtualized data center scheduling system and method

Technical field

The present invention relates to the field of cloud computing and to data center virtualization technology, and in particular to a virtualized data center scheduling system and method.

Background

With the rapid development of emerging applications such as cloud computing and big data, demand for data centers keeps growing. Data center virtualization is the basic trend for today's service-oriented data centers. A virtualized data center offers many advantages, including server consolidation, flexible operation and maintenance, power savings, space savings, improved disaster recovery capability, and high availability.

In the prior art, driven by the development of Software Defined Network (SDN) and Network Function Virtualization (NFV) architectures and related technologies, Service Function Chaining (SFC), in which a service function chain is a set of service nodes with a specific service path, has become an ideal solution for providing services in a virtualized data center.

However, because a service function chain has a complex composition, its reliability depends on the reliability of every service function node. Therefore, to ensure that the data center provides highly reliable services, how to maintain the reliability of the service nodes in a service function chain is a key problem that must be solved.

Summary of the invention

To solve the above technical problem, embodiments of the present invention provide a virtualized data center scheduling system and method that can maintain the normal operation of the service nodes in a service function chain, thereby providing highly reliable services for the data center.

In a first aspect, an embodiment of the present invention provides a virtualized data center scheduling system, including:

a service definition and orchestration module, a data center management platform, and a network controller;

the service definition and orchestration module is configured to provide interfaces for defining a service type, a resource template, and a reliability requirement level, and to receive a service orchestration policy to generate a service function chain, where the service reliability requirement level includes anomaly alarm, automatic single-service-node recovery, differentiated primary/standby service nodes, and equal primary/standby service nodes;

the data center management platform is configured to allocate data center resources according to the defined service type and resource template to generate a service node and, after determining that the service node is abnormal, to schedule data center resources according to the service reliability requirement level to restore the service node;

the network controller is configured to determine the data packet forwarding path of the service node.

In a second aspect, an embodiment of the present invention provides a virtualized data center scheduling method, including:

providing interfaces for defining a service type, a resource template, and a reliability requirement level, and receiving a service orchestration policy to generate a service function chain, where the service reliability requirement level includes anomaly alarm, automatic single-service-node recovery, differentiated primary/standby service nodes, and equal primary/standby service nodes;

allocating data center resources according to the defined service type and resource template to generate a service node and, after determining that the service node is abnormal, scheduling data center resources according to the service reliability requirement level to restore the service node;

determining the data packet forwarding path of the service node.

Compared with the prior art, the virtualized data center scheduling system and method provided by the embodiments of the present invention include a service definition and orchestration module, a data center management platform, and a network controller. The service definition and orchestration module provides interfaces for defining a service type, a resource template, and a reliability requirement level, and receives a service orchestration policy to generate a service function chain; the service reliability requirement level includes anomaly alarm, automatic single-service-node recovery, differentiated primary/standby service nodes, and equal primary/standby service nodes. The data center management platform allocates data center resources according to the defined service type and resource template to generate a service node and, after determining that the service node is abnormal, schedules data center resources according to the service reliability requirement level to restore the service node. The network controller determines the data packet forwarding path of the service node. As a result, when a service node becomes abnormal, it can be restored according to the service type, resource template, and service reliability requirement level defined by the user, which keeps the service nodes of the service function chain operating normally and improves the reliability of the services provided by the data center.

Other features and advantages of the invention will be set forth in the description that follows and, in part, will be apparent from the description or may be learned by practicing the invention. The objectives and other advantages of the invention may be realized and obtained by the structures particularly pointed out in the description, the claims, and the accompanying drawings.

Description of the drawings

The accompanying drawings provide a further understanding of the technical solutions of the present invention and form part of the specification. Together with the embodiments of the present application, they serve to explain the technical solutions of the present invention and do not limit them.

FIG. 1 is a functional block diagram of Embodiment 1 of a virtualized data center scheduling system provided by the present invention;

FIG. 2 is a functional block diagram of Embodiment 2 of a virtualized data center scheduling system provided by the present invention;

FIG. 3 is a functional block diagram of Embodiment 3 of a virtualized data center scheduling system provided by the present invention;

FIG. 4 is a schematic flowchart of Embodiment 1 of a virtualized data center scheduling method provided by the present invention.

Detailed description

To make the objectives, technical solutions, and advantages of the present invention clearer, embodiments of the present invention are described in detail below with reference to the accompanying drawings. It should be noted that, where no conflict arises, the embodiments in the present application and the features in those embodiments may be combined with one another arbitrarily.

The steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions. Although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one given here.

The data center referred to in the embodiments of the present invention delivers, accelerates, displays, computes, and stores data information over Internet infrastructure on the basis of virtualization. That is, multiple virtual machines can run on one computer, the resources of that computer can be shared among multiple environments, and different virtual machines can run different operating systems and multiple applications on the same computer.

The system according to the embodiments of the present invention aims to solve the technical problem in the prior art of how to maintain the normal operation of the service nodes of a service function chain so as to provide highly reliable services for the data center.

The technical solutions of the present invention are described in detail below with specific embodiments. These specific embodiments may be combined with one another, and the same or similar concepts or processes may not be repeated in some embodiments.

FIG. 1 is a functional block diagram of Embodiment 1 of a virtualized data center scheduling system provided by the present invention. As shown in FIG. 1, the system includes a service definition and orchestration module 10, a data center management platform 20, and a network controller 30.

The service definition and orchestration module 10 is configured to provide interfaces for defining a service type, a resource template, and a reliability requirement level, and to receive a service orchestration policy to generate a service function chain, where the service reliability requirement level includes anomaly alarm, automatic single-service-node recovery, differentiated primary/standby service nodes, and equal primary/standby service nodes.

Specifically, through the definition interfaces for service type, resource template, and reliability requirement level provided by the virtualized data center scheduling system, a tenant or user defines a custom service orchestration policy and selects the required service reliability requirement level. The service definition and orchestration module 10 accepts this input and generates the service function chain. The service reliability requirement levels are: anomaly alarm, automatic single-service-node recovery, differentiated primary/standby service nodes, and equal primary/standby service nodes.
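The inputs accepted by the definition interfaces can be pictured as a small data model. The following is a minimal sketch only: the names `ReliabilityLevel`, `ServiceDefinition`, and `build_service_function_chain` are hypothetical, and representing the orchestration policy as an ordered list of definitions is an assumption, since the source does not specify a policy format.

```python
from dataclasses import dataclass
from enum import Enum


class ReliabilityLevel(Enum):
    """The four service reliability requirement levels named in the text."""
    ANOMALY_ALARM = "anomaly_alarm"
    SINGLE_NODE_AUTO_RECOVERY = "single_node_auto_recovery"
    DIFFERENTIATED_PRIMARY_STANDBY = "differentiated_primary_standby"
    EQUAL_PRIMARY_STANDBY = "equal_primary_standby"


@dataclass
class ServiceDefinition:
    """One user-defined service (hypothetical structure)."""
    service_type: str            # e.g. "firewall"
    resource_template: dict      # e.g. {"vcpus": 2, "memory_mb": 4096}
    reliability: ReliabilityLevel


def build_service_function_chain(policy):
    """Assumption: the policy is an ordered sequence of ServiceDefinition
    objects, and the chain preserves that order."""
    chain = list(policy)
    for definition in chain:
        if not isinstance(definition.reliability, ReliabilityLevel):
            raise ValueError(f"invalid reliability level for {definition.service_type}")
    return chain
```

A chain of a firewall followed by a load balancer would then be built by passing two `ServiceDefinition` objects in that order.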

The data center management platform 20 is configured to allocate data center resources according to the defined service type and resource template to generate a service node and, after determining that the service node is abnormal, to schedule data center resources according to the service reliability requirement level to restore the service node.

Specifically, the data center management platform 20 allocates data center resources according to the defined service type and resource template to generate the service node, which guarantees that the service node has the resources it needs to process service traffic. After determining that the service node is abnormal, the platform schedules data center resources according to the service reliability requirement level to restore the service node.

The network controller 30 is configured to determine the data packet forwarding path of the service node.

Specifically, the network of the virtualized data center scheduling system usually separates the control plane from the forwarding plane. The network controller 30 centrally controls the forwarding-plane devices and formulates the data packet forwarding policy of the service node.

The virtualized data center scheduling system provided by this embodiment of the present invention includes a service definition and orchestration module, a data center management platform, and a network controller. The service definition and orchestration module provides interfaces for defining a service type, a resource template, and a reliability requirement level, and receives a service orchestration policy to generate a service function chain; the service reliability requirement level includes anomaly alarm, automatic single-service-node recovery, differentiated primary/standby service nodes, and equal primary/standby service nodes. The data center management platform allocates data center resources according to the defined service type and resource template to generate a service node and, after determining that the service node is abnormal, schedules data center resources according to the service reliability requirement level to restore the service node. The network controller determines the data packet forwarding path of the service node. As a result, when a service node becomes abnormal, it can be restored according to the service type, resource template, and service reliability requirement level defined by the user, which keeps the service nodes of the service function chain operating normally and improves the reliability of the services provided by the data center.

FIG. 2 is a functional block diagram of Embodiment 2 of a virtualized data center scheduling system provided by the present invention. As shown in FIG. 2, the data center management platform 20 includes a device resource management platform 210 and a virtualization service management platform 220, where:

the device resource management platform 210 is configured to allocate hardware device resources of the data center according to the defined service type and resource template, to manage and maintain the hardware device resources, to provide performance metric data for the hardware device resources, and to determine whether the service node is abnormal;

the virtualization service management platform 220 is configured to allocate virtualized resources of the data center according to the defined service type and resource template to generate a service node and, after determining that the service node is abnormal, to schedule virtualized resources of the data center according to the service reliability requirement level to restore the service node.

Specifically, the data center management platform 20 includes the device resource management platform 210 and the virtualization service management platform 220. The device resource management platform 210 is configured to manage and maintain the hardware device resources of the data center and to provide performance metric data for them, such as the storage, compute, and network performance indicators of the computer or server that hosts a service node. The running state of a service node can be checked against this performance metric data to judge whether the service node is abnormal, and a fault alarm is raised when a hardware device resource is found to be abnormal. The virtualization service management platform 220 is configured to allocate virtualized resources of the data center according to the defined service type and resource template to generate a service node. After determining that the service node is abnormal, it manages and schedules the virtualized resources built on the hardware device resources, including compute, storage, and network virtualization resources, according to the service reliability requirement level, so as to restore the service node.

Further, FIG. 3 is a functional block diagram of Embodiment 3 of a virtualized data center scheduling system provided by the present invention. As shown in FIG. 3, the system further includes a service detection module 40.

The service detection module 40 is configured to, when the service function chain contains a primary service node and a standby service node, check the consistency of the service state of the primary and standby service nodes and the timeliness of received data packets, and thereby determine whether the primary service node and the standby service node are abnormal.

Specifically, when the service function chain contains a primary service node and a standby service node, the service detection module 40 records the mapping between them. One primary service node may map to one standby service node or to several. The primary service node and its standby service node(s) together form a service set. The module checks the consistency of the service state within this set, that is, whether the data processed by the primary service node and by the standby service node are the same, and uses the result to decide whether the two nodes need to be synchronized. The module also checks the timeliness of received data packets: it judges whether the primary and standby service nodes are abnormal according to whether the data packets from either node arrive within the allowed time. For example, if the service detection module 40 receives data packet X from the primary service node at time T1 and receives data packet X from the standby service node at time T2, the primary and standby service nodes are considered normal only when the difference between T1 and T2 is less than a limit value; otherwise they are considered abnormal. Checking for such a delay makes it possible to determine whether the primary service node and the standby service node are abnormal.
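The two checks described above can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the function names are hypothetical, timestamps are assumed to be integer milliseconds, and the absolute difference |T1 - T2| is an assumption, since the text says only "time difference".

```python
def packets_timely(t_primary_ms, t_standby_ms, limit_ms):
    """True when the same packet X arrived from both nodes within the limit.

    Assumption: the comparison uses the absolute difference |T1 - T2|.
    """
    return abs(t_primary_ms - t_standby_ms) < limit_ms


def states_consistent(primary_output, standby_output):
    """True when both nodes produced the same processed data."""
    return primary_output == standby_output


def check_pair(t_primary_ms, t_standby_ms, limit_ms, primary_output, standby_output):
    """Combine both checks into one result for a primary/standby pair."""
    return {
        # Late arrival beyond the limit marks the pair as abnormal.
        "abnormal": not packets_timely(t_primary_ms, t_standby_ms, limit_ms),
        # Diverging outputs mean the standby must be synchronized.
        "needs_sync": not states_consistent(primary_output, standby_output),
    }
```

With a limit of 500 ms, arrival times of 1000 ms and 1200 ms count as normal, while 1000 ms and 2000 ms count as abnormal.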

Further, after determining that the primary service node is abnormal, the service detection module 40 is further configured to:

re-determine the primary/standby relationship, making the former standby service node the primary service node and the former primary service node the standby service node.

Specifically, after the service detection module 40 determines that the primary service node is abnormal, in order to restore service as quickly as possible, the service detection module 40 is further configured to re-determine the relationship between the primary service node and the standby service node. The former standby service node can be made the primary service node and the former primary service node the standby service node, so that the service is not interrupted.
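The role swap amounts to exchanging two fields in the recorded primary/standby mapping. A minimal sketch, assuming a one-to-one mapping and a hypothetical `ServicePair` record:

```python
from dataclasses import dataclass


@dataclass
class ServicePair:
    """Hypothetical record of one primary/standby mapping."""
    primary: str
    standby: str


def fail_over(pair):
    """Return a new pair in which the former standby is the primary and
    the former primary is the standby."""
    return ServicePair(primary=pair.standby, standby=pair.primary)
```

Returning a new record rather than mutating the old one keeps the previous mapping available, for example for logging; that choice is a design preference of this sketch, not something the source prescribes.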

Further, on the basis of the foregoing embodiment, as shown in FIG. 3, the system further includes a service backup module 50.

The service backup module 50 is configured to, if the service states of the primary service node and the standby service node are inconsistent, determine that the standby service node needs to synchronize with the primary service node, and synchronize the memory and image file of the primary service node to the standby service node.

Specifically, when the service detection module 40 detects that the service states of the primary service node and the standby service node are inconsistent, it determines that the standby service node needs to synchronize with the primary service node and sends a synchronization request to the service backup module 50. The service backup module 50 then synchronizes the memory and image file of the primary node to the standby service node incrementally. Synchronization between the primary and standby service nodes is therefore triggered on demand rather than periodically, which saves data center resources and minimizes the impact on the processing capacity of the primary service node.
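Incremental synchronization means transferring only what differs. The sketch below models memory or image content as a dictionary of block id to bytes; this block model and the function name `incremental_sync` are assumptions for illustration, and the source does not describe the actual granularity or transport.

```python
def incremental_sync(primary_blocks, standby_blocks):
    """Bring standby_blocks in line with primary_blocks, copying only
    blocks that are missing or different on the standby.

    Returns the ids of the blocks that were transferred.
    """
    changed = [
        block_id
        for block_id, data in primary_blocks.items()
        if block_id not in standby_blocks or standby_blocks[block_id] != data
    ]
    for block_id in changed:
        standby_blocks[block_id] = primary_blocks[block_id]

    # Drop blocks that no longer exist on the primary.
    for block_id in [b for b in standby_blocks if b not in primary_blocks]:
        del standby_blocks[block_id]
    return changed
```

When the two sides already match, the returned list is empty and nothing is copied, which reflects the on-demand, resource-saving behavior described above.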

Further, on the basis of the foregoing embodiment, the service backup module 50 is further configured to create snapshot storage at a preset time interval and, if a remote disaster recovery data center exists, to synchronize the image file of the standby service node to the remote disaster recovery data center at a preset time interval.

Specifically, the service backup module 50 is further configured to create snapshot storage at a certain time interval, which can be set as the situation requires. To safeguard the security and reliability of the data center's data, if a disaster recovery data center exists, the module also periodically synchronizes the image file of the standby service node to it. The disaster recovery data center can monitor the health of the data center and perform function switchover: when one site stops working because of an accident (such as a fire or an earthquake), the entire system can switch to the other site so that its functions continue to operate normally.

Further, on the basis of the foregoing embodiment, the statement that the virtualization service management platform 220 is configured to, after determining that the service node is abnormal, schedule virtualized resources of the data center according to the service reliability requirement level to restore the service node means the following:

when the service reliability requirement level is anomaly alarm, the virtualization service management platform 220 is configured to generate an alarm prompt after determining that the service node is abnormal;

when the service reliability requirement level is automatic single-service-node recovery, the platform is configured to restore the service node according to the single-service-node recovery strategy after determining that the service node is abnormal;

when the service reliability requirement level is differentiated or equal primary/standby service nodes, the platform is configured to generate a primary service node and a standby service node for the service node. After determining that the standby service node is abnormal, it restores the standby service node according to the single-service-node recovery strategy and completes synchronization from the primary service node. After determining that the primary service node is abnormal, it switches the former standby service node to primary and the former primary service node to standby, and restores the former primary service node according to the single-service-node recovery strategy.

Specifically, when the service reliability requirement level is anomaly alarm, the virtualization service management platform 220 can monitor the service node through its hardware performance metric data and generates an alarm prompt when the service is unavailable, so that a system administrator or relevant user can intervene manually to handle the anomaly.

When the service reliability requirement level is automatic single-service-node recovery, the virtualization service management platform 220 is configured to perform a restart, migration, or rebuild operation on the service node according to the hardware device resource performance metric data provided by the device resource management platform 210.

When the service reliability requirement level is differentiated or equal primary/standby service nodes, the virtualization service management platform 220 is configured to generate a primary service node and a standby service node for the service node. After determining that the standby service node is abnormal, it performs a restart, migration, or rebuild operation according to the hardware device resource performance metric data provided by the device resource management platform 210 to restore the standby service node, and completes synchronization from the primary service node. After determining that the primary service node is abnormal, it switches the former standby service node to primary and the former primary service node to standby, and performs a restart, migration, or rebuild operation according to the hardware device resource performance metric data provided by the device resource management platform 210 to restore the former primary service node.
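The level-dependent handling described above is essentially a dispatch on the reliability level and, for primary/standby levels, on which node failed. The sketch below returns an ordered list of action names; the string identifiers and the function name `plan_actions` are hypothetical labels chosen for illustration.

```python
PRIMARY_STANDBY_LEVELS = ("differentiated_primary_standby", "equal_primary_standby")


def plan_actions(level, failed_role="node"):
    """Return the ordered actions for an abnormal service node.

    level:       one of the four reliability levels (hypothetical string ids)
    failed_role: "primary" or "standby"; only used for primary/standby levels
    """
    if level == "anomaly_alarm":
        # Alarm only; a person handles the anomaly.
        return ["raise_alarm"]
    if level == "single_node_auto_recovery":
        return ["recover_node"]
    if level in PRIMARY_STANDBY_LEVELS:
        if failed_role == "standby":
            # Recover the standby, then resynchronize it from the primary.
            return ["recover_standby", "sync_from_primary"]
        if failed_role == "primary":
            # Swap roles first so service continues, then repair the old primary.
            return ["swap_roles", "recover_former_primary"]
        raise ValueError(f"unknown role: {failed_role}")
    raise ValueError(f"unknown reliability level: {level}")
```

Each "recover" action would in turn apply the single-service-node recovery strategy (restart, migrate, or rebuild).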

As mentioned above, the single-service-node recovery strategy includes:

if only the service provided by the service node is abnormal, while its virtual compute, network connection, and storage resources are normal, restarting the service node;

if the virtual compute or network resources of the service node are abnormal, performing a migration operation based on the storage resource image of the service node to restore the service function of the service node;

if the storage resources of the service node are abnormal, so that the image file of the service node cannot be used normally: when snapshot resources exist, attempting to rebuild the service node from its latest snapshot; when no snapshot resources exist, having the virtualization service management platform rebuild the service node according to its virtual resource template.

Specifically, the performance metric data can reveal the specific cause of an anomaly, such as a storage anomaly, a compute anomaly, or a network anomaly. When the service reliability requirement level is automatic single-service-node recovery: if only the service node itself is abnormal, while the virtual compute, network, and storage resources of its compute node are all normal, the virtualization service management platform 220 restarts the service node. If the virtual compute or network resources of the compute node hosting the service node are abnormal, the virtualization service management platform 220 performs a migration operation based on the node's shared storage image and attempts to restore the service function. If the storage resources of the service node are abnormal, so that its image file cannot be used normally, the virtualization service management platform 220 attempts to rebuild the service node from its latest snapshot; if no snapshot resources exist, the virtualization service management platform 220 rebuilds the service node according to its virtual resource template.
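The recovery strategy above reduces to a small decision function over the health of the three resource kinds. In this sketch the function name and return labels are hypothetical. Checking storage before compute and network is an assumption: the source does not say what to do when several resources fail at once, and storage is checked first here because migration depends on a usable storage image.

```python
def choose_recovery(compute_ok, network_ok, storage_ok, has_snapshot):
    """Pick a recovery operation for an abnormal service node."""
    if not storage_ok:
        # The image file is unusable: rebuild from the latest snapshot if
        # one exists, otherwise from the virtual resource template.
        return "rebuild_from_snapshot" if has_snapshot else "rebuild_from_template"
    if not (compute_ok and network_ok):
        # Storage image is intact, so the node can be migrated elsewhere.
        return "migrate"
    # Only the service itself is abnormal; underlying resources are fine.
    return "restart"
```

For instance, a node whose compute resources fail while storage stays healthy would be migrated rather than rebuilt.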

Further, when the service reliability requirement level is differentiated or equal service primary and standby nodes, the virtualization service management platform, after generating a service primary node and a service standby node for the service node, is further configured to:

When the service reliability requirement level is differentiated service primary and standby nodes, allocate different virtual resources to the service primary node and the service standby node, the virtual resources being distributed on different hardware devices;

When the service reliability requirement level is equal service primary and standby nodes, allocate the same virtual resources to the service primary node and the service standby node, the virtual resources being distributed on different hardware devices.

Specifically, when the service reliability requirement level is differentiated service primary and standby nodes, the virtualization service management platform 220 allocates different virtual resources to the service primary node and the service standby node: the virtual resources of the service standby node are reduced relative to those of the service primary node, and the two sets of resources are distributed on different hardware devices. This saves resources, and when an anomaly of the primary service is detected, the service can be switched automatically to the standby node so that the service is not interrupted.

Specifically, when the service reliability requirement level is equal service primary and standby nodes, the virtualization service management platform 220 allocates the same virtual resources to the service primary node and the service standby node and distributes them on different hardware devices. Because both nodes have the same virtual resources, the primary/standby switchover can be completed more quickly when an anomaly of the primary service is detected, which minimizes the impact on the entire service chain.
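The two allocation modes can be illustrated with a short sketch. The names `VirtualResources` and `allocate_standby` are hypothetical, and the `standby_ratio` of 0.5 used for the reduced standby node is purely an assumption; the description only states that the standby resources are reduced relative to the primary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualResources:
    vcpus: int
    memory_mb: int
    host: str  # hardware device on which the resources are placed


def allocate_standby(primary: VirtualResources, level: str,
                     hosts: list[str],
                     standby_ratio: float = 0.5) -> VirtualResources:
    # In both modes the standby must sit on a different hardware device.
    other_hosts = [h for h in hosts if h != primary.host]
    if not other_hosts:
        raise ValueError("no separate hardware device for the standby node")
    if level == "equal":
        # Equal mode: same virtual resources as the primary.
        return VirtualResources(primary.vcpus, primary.memory_mb,
                                other_hosts[0])
    if level == "differentiated":
        # Differentiated mode: reduced resources (ratio is an assumption).
        return VirtualResources(max(1, int(primary.vcpus * standby_ratio)),
                                max(1, int(primary.memory_mb * standby_ratio)),
                                other_hosts[0])
    raise ValueError(f"unsupported reliability level: {level}")
```

The equal mode trades resource cost for a faster switchover, whereas the differentiated mode saves resources at the price of an expansion step before the standby can take over.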

Further, on the basis of the foregoing embodiment, when the service reliability requirement level is differentiated or equal service primary and standby nodes, the virtualization service management platform switching the service primary node after determining that the service primary node is abnormal means:

Requesting the network controller to change the service port chain, so that the service standby node replaces the service primary node in processing service traffic.

Specifically, after it is determined that the service primary node is abnormal, switching from the service primary node to the service standby node requires sending a request to the network controller to change the service port chain, so that the service standby node replaces the service primary node in processing service traffic. This keeps the service uninterrupted and ensures normal operation of the entire service function chain.

Further, on the basis of the foregoing embodiment, before the virtualization service management platform replaces the service primary node with the service standby node to process service traffic, the following is further included:

When the service reliability requirement level is differentiated service primary and standby nodes, expanding the service standby node online so that it has virtualized resources equal to those of the service primary node.

Specifically, when the service reliability requirement level is differentiated service primary and standby nodes, the virtualization service management platform allocates different resources to the service primary node and the service standby node. When the service primary node becomes abnormal, service is switched from the service primary node to the service standby node; the service standby node therefore needs to be expanded online so that it has virtualized resources equal to those of the abnormal service primary node and is able to handle the traffic that the abnormal service primary node previously handled.
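The switchover sequence (online expansion of the standby in the differentiated case, followed by a change to the service port chain) might look like the sketch below. The function `switch_to_standby`, the list-of-names model of the port chain, and the single integer used as a resource size are all hypothetical simplifications; in the described system the port chain change is requested from the network controller.

```python
def switch_to_standby(port_chain: list[str], resources: dict[str, int],
                      primary: str, standby: str,
                      level: str) -> tuple[list[str], dict[str, int]]:
    """Return the updated port chain and resource map after a switchover."""
    new_resources = dict(resources)
    if level == "differentiated":
        # Expand the standby online so it equals the abnormal primary
        # before it starts handling the primary's traffic.
        new_resources[standby] = resources[primary]
    # Replace the primary with the standby in the service port chain.
    new_chain = [standby if node == primary else node for node in port_chain]
    return new_chain, new_resources
```

Expanding before rewriting the chain reflects the order given in the description: the standby is brought up to the primary's size first, then traffic is redirected to it.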

Further, on the basis of the foregoing embodiment, the virtualization service management platform recovering the service primary node according to the single service node recovery policy after determining that the service primary node is abnormal further includes:

When the service reliability requirement level is differentiated service primary and standby nodes, the service primary node, when recovered, obtains virtualized resources equal to those of the original service standby node.

Specifically, if the service reliability requirement level is differentiated service primary and standby nodes, then when the original service primary node is recovered according to the single service node recovery policy, it is allocated virtualized resources according to the virtual resource specification of the original service standby node, so that it can handle the service traffic of the original service standby node.

When the service reliability requirement level is differentiated or equal service primary and standby nodes, the network controller 30 formulates a data forwarding policy for the service primary node and the service standby node. Both the service primary node and the service standby node receive requests from the upstream service node; the data they process is sent to the downstream service node and forwarded to the service detection module. That is, during service, both the service primary node and the service standby node process data, so that when the service primary node is abnormal the service standby node can continue to provide service, ensuring normal operation of the service function chain.

Further, FIG. 4 is a schematic flowchart of Embodiment 1 of a virtualized data center scheduling method according to an embodiment of the present invention. As shown in FIG. 4, the method includes:

S101. Provide definition interfaces for the service type, the resource template, and the reliability requirement level, and receive a service orchestration policy to generate a service function chain, where the service reliability requirement level includes abnormal alarm, single service node automatic recovery, differentiated service primary and standby nodes, and equal service primary and standby nodes;

S102. Allocate resources of the data center according to the defined service type and resource template to generate a service node, and, after determining that the service node is abnormal, schedule the resources of the data center according to the service reliability requirement level to recover the service node;

S103. Formulate a data packet forwarding path for the service node.

The virtualized data center scheduling method provided by this embodiment of the present invention includes: providing definition interfaces for the service type, the resource template, and the reliability requirement level, and receiving a service orchestration policy to generate a service function chain, where the service reliability requirement level includes abnormal alarm, single service node automatic recovery, differentiated service primary and standby nodes, and equal service primary and standby nodes; allocating resources of the data center according to the defined service type and resource template to generate a service node, and, after determining that the service node is abnormal, scheduling the resources of the data center according to the service reliability requirement level to recover the service node; and formulating a data packet forwarding path for the service node. As a result, when a service node becomes abnormal, it can be recovered according to the user-defined service type, resource template, and service reliability requirement level, which ensures normal operation of the service nodes in the service function chain and improves the reliability of the data center's services.

Further, on the basis of the foregoing embodiment, S102 includes:

Allocating hardware device resources of the data center according to the defined service type and resource template, managing and maintaining the hardware device resources, providing performance metric data of the hardware device resources, and determining whether the service node is abnormal;

Allocating virtualized resources of the data center according to the defined service type and resource template to generate a service node, and, after determining that the service node is abnormal, scheduling the virtualized resources of the data center according to the service reliability requirement level to recover the service node.

The method provided by this embodiment of the present invention can be carried out by the foregoing apparatus embodiment; its implementation principle and technical effects are similar and are not repeated here.

Further, on the basis of the foregoing embodiment, after the service node is generated in S102 by allocating virtualized resources of the data center according to the defined service type and resource template, the method further includes:

When the service function chain contains a service primary node and a service standby node, recording the mapping between the service primary node and the service standby node, detecting the consistency of the service states of the service primary node and the service standby node and the timeliness of their data packets, and determining whether the service primary node and the service standby node are abnormal.
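One way to picture the consistency and timeliness check is the sketch below. The description does not say how either property is measured, so everything here is an assumption: service state is compared as an opaque value (for example a digest), timeliness is modeled as a maximum delay since the last data packet was received, and the names `check_pair` and `max_delay` are hypothetical.

```python
def check_pair(primary_state: str, standby_state: str,
               primary_last_seen: float, standby_last_seen: float,
               now: float, max_delay: float = 1.0) -> dict[str, bool]:
    """Compare service states and packet timeliness of a primary/standby pair."""
    return {
        # Inconsistent states mean the standby must synchronize with the primary.
        "needs_sync": primary_state != standby_state,
        # Assumption: a node is abnormal if its last packet is too old.
        "primary_abnormal": now - primary_last_seen > max_delay,
        "standby_abnormal": now - standby_last_seen > max_delay,
    }
```

A result with `needs_sync` set would trigger synchronization of the standby from the primary, while `primary_abnormal` would trigger the switchover handling for the configured reliability level.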

Further, on the basis of the foregoing embodiment, after it is determined that the service primary node is abnormal, the method further includes:

Further, on the basis of the foregoing embodiment, after the consistency of the service states of the service primary node and the service standby node is detected, the method further includes:

If the service states of the service primary node and the service standby node are inconsistent, determining that the service standby node needs to synchronize with the service primary node, and synchronizing the memory and image file of the service primary node to the service standby node.

Further, on the basis of the foregoing embodiment, after it is determined that the service standby node needs to synchronize with the service primary node and the memory and image file of the service primary node are synchronized to the service standby node, the method further includes:

Creating snapshot storage at a preset time interval and, if a remote disaster recovery data center exists, synchronizing the image file of the service standby node to the remote disaster recovery data center at the preset time interval.

Further, on the basis of the foregoing embodiment, S102 is specifically:

When the service reliability requirement level is abnormal alarm, the virtualization service management platform generates an alarm prompt after determining that the service node is abnormal;

When the service reliability requirement level is single service node automatic recovery, the virtualization service management platform recovers the service node according to the single service node recovery policy after determining that the service node is abnormal;

When the service reliability requirement level is differentiated or equal service primary and standby nodes, the virtualization service management platform generates a service primary node and a service standby node for the service node; after determining that the service standby node is abnormal, it recovers the service standby node according to the single service node recovery policy and completes synchronization with the service primary node; after determining that the service primary node is abnormal, it switches the service primary node and recovers the service primary node according to the single service node recovery policy.
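The per-level handling in S102 can be summarized as a small dispatch function. This is only an illustrative sketch: `ReliabilityLevel`, `handle_anomaly`, the `failed_role` argument, and the action names returned as strings are hypothetical labels for the steps named in the text.

```python
from enum import Enum


class ReliabilityLevel(Enum):
    ABNORMAL_ALARM = 1
    SINGLE_NODE_AUTO_RECOVERY = 2
    DIFFERENTIATED_PRIMARY_STANDBY = 3
    EQUAL_PRIMARY_STANDBY = 4


def handle_anomaly(level: ReliabilityLevel, failed_role: str = "node") -> list[str]:
    """Return the ordered actions for an abnormal node at the given level.

    failed_role is "primary" or "standby" for the two primary/standby
    levels and is ignored for the other levels.
    """
    if level is ReliabilityLevel.ABNORMAL_ALARM:
        return ["generate_alarm"]
    if level is ReliabilityLevel.SINGLE_NODE_AUTO_RECOVERY:
        return ["recover_node"]
    # Differentiated or equal primary/standby levels.
    if failed_role == "standby":
        return ["recover_standby", "sync_with_primary"]
    if failed_role == "primary":
        return ["switch_primary", "recover_original_primary"]
    raise ValueError(f"unknown role: {failed_role}")
```

In this sketch `"recover_node"`, `"recover_standby"`, and `"recover_original_primary"` all stand for applying the single service node recovery policy to the respective node.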

The method provided by this embodiment of the present invention can be carried out by the foregoing apparatus embodiment; its implementation principle and technical effects are similar and are not repeated here.

Specifically, the single service node recovery policy includes:

Further, on the basis of the foregoing embodiment, when the service reliability requirement level is differentiated or equal service primary and standby nodes, generating a service primary node and a service standby node for the service node includes:

Further, on the basis of the foregoing embodiment, when the service reliability requirement level is differentiated or equal service primary and standby nodes, switching the service primary node after determining that the service primary node is abnormal includes:

Further, on the basis of the foregoing embodiment, before the service primary node is replaced with the service standby node to process service traffic, the method further includes:

Further, on the basis of the foregoing embodiment, after it is determined that the service primary node is abnormal and the service primary node is recovered according to the single service node recovery policy, the method further includes:

When the service reliability requirement level is differentiated service primary and standby nodes, the service primary node, when recovered, obtains virtualized resources equal to those of the original service standby node.

Although the embodiments disclosed by the present invention are as described above, the described content is merely embodiments adopted to facilitate understanding of the present invention and is not intended to limit it. Any person skilled in the art to which the present invention pertains may make any modification or variation in the form and details of implementation without departing from the spirit and scope disclosed by the present invention, but the scope of patent protection of the present invention shall still be subject to the scope defined by the appended claims.

Claims

A virtualized data center scheduling system includes: a service definition and orchestration module, a data center management platform, and a network controller;

The service definition and orchestration module is configured to provide definition interfaces for a service type, a resource template, and a reliability requirement level, and to receive a service orchestration policy to generate a service function chain, where the service reliability requirement level includes abnormal alarm, single service node automatic recovery, differentiated service primary and standby nodes, and equal service primary and standby nodes;

The data center management platform is configured to allocate resources of the data center according to the defined service type and resource template to generate a service node, and, after determining that the service node is abnormal, to schedule the resources of the data center according to the service reliability requirement level to recover the service node;

The network controller is configured to formulate a packet forwarding path of the service node.

The system of claim 1, wherein the data center management platform comprises: a device resource management platform and a virtualization service management platform;

The device resource management platform is configured to allocate hardware device resources of the data center according to the defined service type and resource template, manage and maintain the hardware device resources, provide performance metric data of the hardware device resources, and determine whether the service node is abnormal;

The virtualization service management platform is configured to allocate virtualized resources of the data center according to the defined service type and resource template to generate a service node, and, after determining that the service node is abnormal, to schedule the virtualized resources of the data center according to the service reliability requirement level to recover the service node.

The system of claim 2, wherein the system further comprises: a service detection module;

The service detection module is configured to, when the service function chain contains a service primary node and a service standby node, detect the consistency of the service states of the service primary node and the service standby node and the timeliness of the data packets received, and determine whether the service primary node and the service standby node are abnormal.

The system according to claim 3, wherein the service detection module, after determining that the service primary node is abnormal, is further configured to:

Re-determine the service primary/standby relationship, setting the original service standby node as the service primary node and the original service primary node as the service standby node.

The system of claim 4, wherein the system further comprises: a service backup module;

The service backup module is configured to: if the service states of the service primary node and the service standby node are inconsistent, determine that the service standby node needs to synchronize with the service primary node, and synchronize the memory and image file of the service primary node to the service standby node.

The system of claim 5, wherein the service backup module is further configured to create snapshot storage at a preset time interval and, if a remote disaster recovery data center exists, to synchronize the image file of the service standby node to the remote disaster recovery data center at the preset time interval.

The system according to claim 6, wherein the virtualization service management platform being configured to, after determining that the service node is abnormal, schedule the virtualized resources of the data center according to the service reliability requirement level to recover the service node means:

When the service reliability requirement level is abnormal alarm, generating an alarm prompt after determining that the service node is abnormal;

When the service reliability requirement level is single service node automatic recovery, recovering the service node according to the single service node recovery policy after determining that the service node is abnormal;

When the service reliability requirement level is differentiated or equal service primary and standby nodes, generating a service primary node and a service standby node for the service node; after determining that the service standby node is abnormal, recovering the service standby node according to the single service node recovery policy and completing synchronization with the service primary node; and after determining that the service primary node is abnormal, switching the original service standby node to be the service primary node and the original service primary node to be the service standby node, and recovering the original service primary node according to the single service node recovery policy.

The system of claim 7, wherein the single service node recovery policy comprises:

If only the service provided by the service node is abnormal, and the corresponding virtual computing, network connection, and storage resources are normal, the service node is restarted;

If the virtual computing or network resource corresponding to the service node is abnormal, performing a migration operation according to the storage resource image of the service node to restore a service function of the service node;

If the storage resources of the service node are abnormal, so that the image file of the service node cannot be used normally, then, when a snapshot resource exists, attempting to rebuild the service node from the latest snapshot of the service node, or, when no snapshot resource exists, rebuilding the service node by the virtualization service management platform according to the virtual resource template of the service node.

The system according to claim 7, wherein the virtualization service management platform being configured to generate a service primary node and a service standby node for the service node when the service reliability requirement level is differentiated or equal service primary and standby nodes means:

When the service reliability requirement level is differentiated service primary and standby nodes, different virtual resources are allocated to the service primary node and the service standby node, and the virtual resources are distributed on different hardware devices;

When the service reliability requirement level is equal service primary and standby nodes, the same virtual resources are allocated to the service primary node and the service standby node, and the virtual resources are distributed on different hardware devices.

The system according to claim 9, wherein the virtualization service management platform being configured to switch the service primary node after determining that the service primary node is abnormal, when the service reliability requirement level is differentiated or equal service primary and standby nodes, means:

Requesting the network controller to change the service port chain, so that the service standby node replaces the service primary node in processing service traffic.

The system of claim 10, wherein the virtualization service management platform, before replacing the service primary node with the service standby node to process service traffic, is further configured to:

When the service reliability requirement level is differentiated service primary and standby nodes, expand the service standby node online so that it has virtualized resources equal to those of the service primary node.

The system according to claim 11, wherein the virtualization service management platform is configured to recover the service primary node according to the single service node recovery policy after determining that the service primary node is abnormal, and is further configured to:

When the service reliability requirement level is differentiated service primary and standby nodes, obtain, when recovering the service primary node, virtualized resources equal to those of the original service standby node.

A virtualized data center scheduling method includes:

Providing definition interfaces for a service type, a resource template, and a reliability requirement level, and receiving a service orchestration policy to generate a service function chain, where the service reliability requirement level includes abnormal alarm, single service node automatic recovery, differentiated service primary and standby nodes, and equal service primary and standby nodes;

Allocating resources of the data center according to the defined service type and resource template to generate a service node, and, after determining that the service node is abnormal, scheduling the resources of the data center according to the service reliability requirement level to recover the service node;

A packet forwarding path of the service node is formulated.

The method according to claim 13, wherein allocating resources of the data center according to the defined service type and resource template to generate a service node, and, after determining that the service node is abnormal, scheduling the resources of the data center according to the service reliability requirement level to recover the service node, comprises:

Allocating hardware device resources of the data center according to the defined service type and resource template, managing and maintaining the hardware device resources, providing performance metric data of the hardware device resources, and determining whether the service node is abnormal;

Allocating virtualized resources of the data center according to the defined service type and resource template to generate a service node, and, after determining that the service node is abnormal, scheduling the virtualized resources of the data center according to the service reliability requirement level to recover the service node.

The method of claim 14, wherein, after the service node is generated by allocating virtualized resources of the data center according to the defined service type and resource template, the method further comprises:

When the service function chain contains a service primary node and a service standby node, recording the mapping between the service primary node and the service standby node, detecting the consistency of the service states of the service primary node and the service standby node and the timeliness of their data packets, and determining whether the service primary node and the service standby node are abnormal.

The method according to claim 15, wherein after determining that the service master node is abnormal, the method further comprises:

The method according to claim 16, wherein, after detecting the consistency of the service states of the service primary node and the service standby node, the method further comprises:

If the service states of the service primary node and the service standby node are inconsistent, determining that the service standby node needs to synchronize with the service primary node, and synchronizing the memory and image file of the service primary node to the service standby node.

The method of claim 17, wherein, after determining that the service standby node needs to synchronize with the service primary node and synchronizing the memory and image file of the service primary node to the service standby node, the method further comprises:

Creating snapshot storage at a preset time interval and, if a remote disaster recovery data center exists, synchronizing the image file of the service standby node to the remote disaster recovery data center at the preset time interval.

The method of claim 18, wherein, after determining that the service node is abnormal, scheduling the virtualized resource of the data center to restore the service node according to a service reliability requirement level comprises:

When the service reliability requirement level is an abnormal alarm, the virtualization service management platform generates an alarm prompt after determining that the service node is abnormal;

When the service reliability requirement level is single service node automatic recovery, the virtualization service management platform recovers the service node according to the single service node recovery policy after determining that the service node is abnormal;

When the service reliability requirement level is differentiated or equal service primary and standby nodes, the virtualization service management platform generates a service primary node and a service standby node for the service node; after determining that the service standby node is abnormal, it recovers the service standby node according to the single service node recovery policy and completes synchronization with the service primary node; after determining that the service primary node is abnormal, it switches the service primary node and recovers the service primary node according to the single service node recovery policy.

The method of claim 19, wherein the single service node recovery policy comprises:

The method according to claim 19, wherein generating a service primary node and a service standby node for the service node when the service reliability requirement level is differentiated or equal service primary and standby nodes comprises:

The method of claim 21, wherein, when the service reliability requirement level is differentiated or equal service primary and standby nodes, switching the service primary node after determining that the service primary node is abnormal comprises:

The method according to claim 22, wherein, before the service primary node is replaced with the service standby node to process service traffic, the method further comprises:

The method of claim 23, wherein, when the service reliability requirement level is differentiated or equal service primary and standby nodes, recovering the service primary node according to the single service node recovery policy further comprises:

The virtualized resources obtained by the recovered service primary node are equal to the virtualized resources of the original service standby node.