
CN107070699A - Redundancy design method and device for controller monitoring and management in storage system - Google Patents

Redundancy design method and device for controller monitoring and management in storage system

Info

Publication number
CN107070699A
Authority
CN
China
Prior art keywords
controller
management
ncsi
monitoring
storage system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710125434.8A
Other languages
Chinese (zh)
Inventor
刘希猛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201710125434.8A
Publication of CN107070699A
Current legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0663 Performing the actions predefined by failover planning, e.g. switching to standby network elements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/22 Arrangements for detecting or preventing errors in the information received using redundant apparatus to increase reliability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0893 Assignment of logical groups to network elements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/40 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass for recovering from a failure of a protocol instance or entity, e.g. service redundancy protocols, protocol state redundancy or protocol service redirection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Hardware Redundancy (AREA)

Abstract

The invention belongs to the technical field of storage system monitoring and management and discloses a redundancy design method for controller monitoring and management in a storage system. The method comprises: connecting the baseboard management controller on the controller mainboard to different network cards through NCSI dual links; transmitting information from the baseboard management controller to the external management unit through the NCSI dual links and the network cards; and adopting an A-A or A-S redundancy strategy to run the NCSI dual-link network supported by the baseboard management controller. The invention also discloses a redundancy design device for controller monitoring and management in a storage system, comprising an NCSI dual-link sharing module, an information transmission module, and a monitoring management strategy operation module. Through NCSI dual-link sharing, the invention improves the reliability of controller monitoring and management when a network card or an external management unit fails.

Description

Redundancy design method and device for controller monitoring and management in storage system

Technical field

The invention belongs to the technical field of storage system monitoring and management, and in particular relates to a redundancy design method and device for controller monitoring and management in a storage system.

Background

Disk arrays are widely used in current data center deployments. To improve the availability of storage devices, the number of controllers on the data link has grown from dual-controller to multi-controller configurations, and redundancy has been implemented along the entire data path. At the same time, user-friendly and intelligent management of massive data has become a major requirement and an increasingly active topic in storage management. Faced with growing system reliability requirements, users expect not only full redundancy of the data path but also redundant, hot-swappable monitoring and management.

In the past, general-purpose disk array systems integrated a SAS expander, but many dual-controller disk array head products have appeared recently. Their controllers emphasize the scalability of processors, cache, and interfaces, and no longer integrate a SAS expander with SES management functions. In addition, such high-end multi-controller storage systems mostly adopt a modular design in which each module can be hot-swapped for maintenance, which places higher demands on system design. Monitoring and management of such systems cannot use traditional SES, nor can it simply rely on a single BMC chip to provide monitoring and management services as a server does.

Summary of the invention

The purpose of the invention is to provide a redundancy design method and device for controller monitoring and management in a storage system. Through NCSI dual-link sharing, the invention improves the reliability of controller monitoring and management when a network card or an external management unit fails.

To achieve the above purpose, the invention adopts the following technical solutions:

A redundancy design method for controller monitoring and management in a storage system, comprising the following steps:

The baseboard management controller on the controller mainboard is connected to different network cards through NCSI dual links;

The baseboard management controller transmits information to the external management unit through the NCSI dual links and the network cards;

An A-A or A-S redundancy strategy is adopted to run the NCSI dual-link network supported by the baseboard management controller.

Preferably, before the baseboard management controller on the controller mainboard is connected to different network cards through NCSI dual links, the method further comprises: the baseboard management controller on the controller mainboard collects and monitors information on the board.

Preferably, after the baseboard management controller transmits information to the external management unit through the NCSI dual links and the network cards, the method further comprises: using the dual network ports corresponding to the NCSI dual links, together with the external management unit, as an isolated network.

Preferably, the IP address of the storage system's internal, independent monitoring and management network is generated at initialization from the controller ID and the external management unit ID and is then fixed.

Preferably, the A-A redundancy strategy specifically comprises: when a network card fails or one of the external management units fails, the baseboard management controller transmits information externally through its two IP addresses, which are on independent network segments, at the same time.

Preferably, the A-S redundancy strategy specifically comprises: when a network card fails or one of the external management units fails, the two NCSI links connected to the baseboard management controller use a floating IP strategy. Under this strategy, when the baseboard management controller receives the external status notified by the system, it moves the IP to the correct link for monitoring and management.

The invention also provides a redundancy design device for controller monitoring and management in a storage system, comprising:

An NCSI dual-link sharing module, used to connect the baseboard management controller on the controller mainboard to different network cards through NCSI dual links;

An information transmission module, used by the baseboard management controller to transmit information to the external management unit through the NCSI dual links and the network cards;

A monitoring management strategy operation module, used to adopt an A-A or A-S redundancy strategy to run the NCSI dual-link network supported by the baseboard management controller.

Preferably, the device further comprises: an information collection and monitoring module, used by the baseboard management controller on the controller mainboard to collect and monitor information on the board.

Preferably, the device further comprises: an isolated network setting module, used to make the dual network ports corresponding to the NCSI dual links, together with the external management unit, an isolated network.

Preferably, the A-A redundancy strategy specifically comprises: when a network card fails or one of the external management units fails, the baseboard management controller transmits information externally through its two IP addresses, which are on independent network segments, at the same time. The A-S redundancy strategy specifically comprises: when a network card fails or one of the external management units fails, the two NCSI links connected to the baseboard management controller use a floating IP strategy. Under this strategy, when the baseboard management controller receives the external status notified by the system, it moves the IP to the correct link for monitoring and management.

Compared with the prior art, the invention has the following advantages:

1. The method differs from earlier storage arrays, which used a SAS expander and the SES monitoring protocol. In hardware, the baseboard management controller on the controller mainboard is connected to different network cards through NCSI dual links and transmits information to the external management unit through those links and network cards. In software, an A-A or A-S redundancy strategy is adopted to run the NCSI dual-link network supported by the baseboard management controller. Combining hardware and software in this way achieves a redundant design for controller monitoring and management, improves the availability and practicality of controller monitoring and management across the storage system, and raises the availability and maintainability level of the storage system. The method also differs from the single-link NCSI sharing commonly used in servers in that it provides redundancy of the management port.

2. The dual network ports corresponding to the NCSI dual links, together with the external management unit, form an isolated network that serves as the independent network for internal monitoring and management of the storage system. Other networks serve as the external network through which users operate the system or monitor its status. Isolating the user monitoring network from the system monitoring network improves the reliability of the system.

3. Adopting an A-A or A-S redundancy strategy to run the NCSI dual-link network supported by the baseboard management controller ensures that the system monitoring service is not interrupted when a network card or an external management unit fails, which improves the reliability of controller monitoring and management in the storage system.

Description of drawings

Fig. 1 is a schematic flowchart of the redundancy design method for controller monitoring and management in a storage system according to the invention;

Fig. 2 is a logical block diagram of NCSI dual-link sharing;

Fig. 3 is a schematic structural diagram of the redundancy design device for controller monitoring and management in a storage system according to the invention.

Detailed description

For ease of understanding, some terms used in the invention are explained below:

BMC: baseboard management controller. In server systems, a BMC is typically used to monitor and manage the health of the mainboard; important mainboard parameters such as voltage, temperature, and power consumption are monitored and recorded by the BMC.

NCSI: the Network Controller Sideband Interface is an industry standard defined by the Distributed Management Task Force (DMTF) for a sideband interface to network controllers that supports out-of-band server management. An NCSI system consists of one management controller and multiple network controllers. The DMTF defines a complete Ethernet-based command request and response standard for NCSI, and NCSI also provides mechanisms such as single-threaded operation and timeout retransmission.
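
The request/response exchange with timeout retransmission mentioned above can be illustrated with a minimal Python sketch. It does not implement the DMTF packet format; `send_command`, the `transport` callable, `FlakyTransport`, and the default timeout and retry values are hypothetical names and numbers chosen for illustration. Only one command is outstanding at a time, and it is resent if no response arrives in time.

```python
from typing import Callable, Optional


class CommandTimeout(Exception):
    """Raised when no response arrives after all retransmissions."""


def send_command(
    transport: Callable[[bytes, float], Optional[bytes]],
    request: bytes,
    timeout_s: float = 0.05,
    max_retries: int = 3,
) -> bytes:
    """Send one request and wait for its response, retransmitting on timeout.

    `transport(request, timeout_s)` is a hypothetical callable that sends the
    request and returns the response bytes, or None if the timeout expired.
    Only one command is in flight at a time.
    """
    attempts = 1 + max_retries  # first transmission plus retransmissions
    for _ in range(attempts):
        response = transport(request, timeout_s)
        if response is not None:
            return response
    raise CommandTimeout(f"no response after {attempts} attempts")


class FlakyTransport:
    """Test double that drops the first `drops` requests."""

    def __init__(self, drops: int) -> None:
        self.drops = drops
        self.calls = 0

    def __call__(self, request: bytes, timeout_s: float) -> Optional[bytes]:
        self.calls += 1
        if self.calls <= self.drops:
            return None  # simulate a lost packet or an expired timer
        return b"ack:" + request
```

With `FlakyTransport(drops=2)`, the third transmission succeeds; if every transmission is dropped, `CommandTimeout` is raised once the retry budget is used up.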

Specific embodiments of the invention are described in further detail below with reference to the drawings and examples:

This embodiment provides a redundancy design method for controller monitoring and management in a storage system, comprising:

The baseboard management controller on the controller mainboard is connected to different network cards through NCSI dual links;

The baseboard management controller transmits information to the external management unit through the NCSI dual links and the network cards;

An A-A or A-S redundancy strategy is adopted to run the NCSI dual-link network supported by the baseboard management controller.

Refer to Fig. 1 and Fig. 2. Fig. 1 is a schematic flowchart of the redundancy design method for controller monitoring and management in a storage system according to the invention, and Fig. 2 is a logical block diagram of NCSI dual-link sharing. This embodiment provides a redundancy design method for controller monitoring and management in a storage system, comprising the following steps:

Step S101: the baseboard management controller on the controller mainboard collects and monitors information on the board.

The BMCs on the mainboards of controller 0 and controller 1 collect and monitor information on the boards.

Step S102: the baseboard management controller on the controller mainboard is connected to different network cards through NCSI dual links. Each network card can be an independent network card or a system CPU sharelink network card.

The BMCs on the mainboards of controller 0 and controller 1 are connected to NIC0 and NIC1 respectively through the NCSI dual links.

Step S103: the baseboard management controller transmits information to the external management unit through the NCSI dual links and the network cards.

The BMC on the mainboard of controller 0 or controller 1 transmits information to external management unit A through one NCSI link and NIC0, and to external management unit B through the other NCSI link and NIC1.

Step S104: the dual network ports corresponding to the NCSI dual links, together with the external management unit, form an isolated network that serves as the independent network for internal monitoring and management of the storage system. Other networks serve as the external network through which users operate the system or monitor its status. Isolating the user monitoring network from the system monitoring network improves the reliability of the system. Because this network is isolated from the user access network, the IP address of the storage system's internal, independent monitoring and management network is generated at initialization from the controller ID and the external management unit ID and is then fixed, which simplifies deployment of the storage system and improves availability and reliability.

Step S105: an A-A or A-S redundancy strategy is adopted to run the NCSI dual-link network supported by the baseboard management controller.

The A-A redundancy strategy specifically comprises: when a network card fails or one of the external management units fails, the baseboard management controller transmits information externally through its two IP addresses, which are on independent network segments, at the same time. The A-S redundancy strategy specifically comprises: when a network card fails or one of the external management units fails, the two NCSI links connected to the baseboard management controller use a floating IP strategy. Under this strategy, when the baseboard management controller receives the external status notified by the system, it moves the IP to the correct link for monitoring and management. Strategy switching in the storage system is controlled by the upper-layer storage software; this control center can generate and control strategies based on the various information aggregated through serial ports, the network, and local or remote independent units.
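
The two redundancy strategies in step S105 can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: A-A and A-S are read here as active-active and active-standby, and the `Link` and `Bmc` classes, the link names, and the example addresses are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Link:
    """One NCSI link: BMC -> network card -> external management unit."""
    name: str
    nic_ok: bool = True
    unit_ok: bool = True

    @property
    def healthy(self) -> bool:
        return self.nic_ok and self.unit_ok


class Bmc:
    """Hypothetical model of a BMC with two NCSI links."""

    def __init__(self, policy: str, links: List[Link], ips: Dict[str, str]) -> None:
        if policy not in ("A-A", "A-S"):
            raise ValueError("policy must be 'A-A' or 'A-S'")
        self.policy = policy
        self.links = links
        self.ips = ips                  # per-link IPs on separate segments (A-A)
        self.floating_ip = "10.0.0.10"  # single floating IP (A-S), assumed value
        self.active = links[0].name     # link currently holding the floating IP

    def notify_status(self) -> None:
        """Called when the system reports a changed external status (A-S only)."""
        if self.policy != "A-S":
            return
        current = next(lk for lk in self.links if lk.name == self.active)
        if not current.healthy:
            for link in self.links:
                if link.healthy:
                    self.active = link.name  # move the floating IP to this link
                    break

    def send(self, message: str) -> List[str]:
        """Return '<source ip>@<link>: <message>' for each delivered transmission."""
        if self.policy == "A-A":
            # Both addresses transmit at the same time; every link that is
            # still healthy delivers its copy of the message.
            return [f"{self.ips[lk.name]}@{lk.name}: {message}"
                    for lk in self.links if lk.healthy]
        link = next(lk for lk in self.links if lk.name == self.active)
        return [f"{self.floating_ip}@{link.name}: {message}"] if link.healthy else []
```

In this sketch the A-A case needs no switchover step, because the surviving link keeps delivering, whereas the A-S case exposes a single address and delivers nothing between the failure and the `notify_status` call that moves the floating IP.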

Through the design of NCSI dual-link sharing and the redundancy strategies, the invention effectively improves the availability and reliability of controller monitoring and management across the storage system.
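
The fixed addressing of the isolated network in step S104 can be illustrated with a short sketch. The patent only states that the address is generated from the controller ID and the external management unit ID and then fixed; the `10.<unit>.0.<controller + 1>` layout, the function names, and the ID ranges below are assumptions made for illustration.

```python
import ipaddress
from typing import Dict, Tuple


def internal_ip(controller_id: int, unit_id: int) -> ipaddress.IPv4Address:
    """Derive a deterministic address for the isolated monitoring network.

    Assumed layout (hypothetical): 10.<unit_id>.0.<controller_id + 1>, so each
    external management unit gets its own network segment and each controller
    gets a stable host address within it.
    """
    if not 0 <= unit_id <= 255:
        raise ValueError("unit_id out of range")
    if not 0 <= controller_id <= 253:
        raise ValueError("controller_id out of range")
    return ipaddress.IPv4Address(f"10.{unit_id}.0.{controller_id + 1}")


def address_plan(controllers: int, units: int) -> Dict[Tuple[int, int], ipaddress.IPv4Address]:
    """Compute the address table once at initialization; it is fixed afterwards."""
    return {
        (c, u): internal_ip(c, u)
        for c in range(controllers)
        for u in range(units)
    }
```

Because the mapping depends only on the two IDs, a replaced controller in the same slot would obtain the same address without manual configuration, which is one plausible reading of why fixed addresses ease deployment.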

This embodiment also provides a redundancy design device for controller monitoring and management in a storage system, comprising:

An NCSI dual-link sharing module, used to connect the baseboard management controller on the controller mainboard to different network cards through NCSI dual links;

An information transmission module, used by the baseboard management controller to transmit information to the external management unit through the NCSI dual links and the network cards;

A monitoring management strategy operation module, used to adopt an A-A or A-S redundancy strategy to run the NCSI dual-link network supported by the baseboard management controller.

Refer to Fig. 3, which is a schematic structural diagram of the redundancy design device for controller monitoring and management in a storage system according to the invention. This embodiment provides a redundancy design device for controller monitoring and management in a storage system, comprising: an information collection and monitoring module 301, an NCSI dual-link sharing module 302, an information transmission module 303, an isolated network setting module 304, and a monitoring management strategy operation module 305. The information collection and monitoring module 301 is connected in sequence to the NCSI dual-link sharing module 302, the information transmission module 303, the isolated network setting module 304, and the monitoring management strategy operation module 305.

The information collection and monitoring module 301 is used by the baseboard management controller on the controller mainboard to collect and monitor information on the board;

The NCSI dual-link sharing module 302 is used to connect the baseboard management controller on the controller mainboard to different network cards through NCSI dual links;

The information transmission module 303 is used by the baseboard management controller to transmit information to the external management unit through the NCSI dual links and the network cards;

The isolated network setting module 304 is used to make the dual network ports corresponding to the NCSI dual links, together with the external management unit, an isolated network;

The monitoring management strategy operation module 305 is used to adopt an A-A or A-S redundancy strategy to run the NCSI dual-link network supported by the baseboard management controller. The A-A redundancy strategy specifically comprises: when a network card fails or one of the external management units fails, the baseboard management controller transmits information externally through its two IP addresses, which are on independent network segments, at the same time. The A-S redundancy strategy specifically comprises: when a network card fails or one of the external management units fails, the two NCSI links connected to the baseboard management controller use a floating IP strategy. Under this strategy, when the baseboard management controller receives the external status notified by the system, it moves the IP to the correct link for monitoring and management.
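
The sequential connection of modules 301 to 305 can be pictured as a chain of stages that each add to a shared state. The sketch below is only illustrative; the function names, the `state` dictionary, the link names, and the sample sensor values are hypothetical and do not come from the patent.

```python
from typing import Callable, Dict, List

State = Dict[str, object]


def collect_and_monitor(state: State) -> State:   # module 301
    state["sensors"] = {"voltage_v": 12.0, "temp_c": 41.0}  # sample values
    return state


def share_dual_links(state: State) -> State:      # module 302
    state["links"] = {"ncsi0": "NIC0", "ncsi1": "NIC1"}
    return state


def transmit(state: State) -> State:              # module 303
    # NIC0 reaches external management unit A, NIC1 reaches unit B.
    state["routes"] = {"NIC0": "unit A", "NIC1": "unit B"}
    return state


def isolate_network(state: State) -> State:       # module 304
    state["isolated"] = True
    return state


def run_policy(state: State) -> State:            # module 305
    state["policy"] = state.get("policy", "A-A")  # default is an assumption
    return state


PIPELINE: List[Callable[[State], State]] = [
    collect_and_monitor, share_dual_links, transmit, isolate_network, run_policy,
]


def run(initial: State) -> State:
    """Pass a copy of the initial state through modules 301 to 305 in order."""
    state = dict(initial)
    for stage in PIPELINE:
        state = stage(state)
    return state
```

Calling `run({"policy": "A-S"})` returns a state in which the links, routes, and isolation flag are set and the requested strategy is preserved.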

The above are only preferred embodiments of the invention. It should be noted that those of ordinary skill in the art can make further improvements and refinements without departing from the principles of the invention, and such improvements and refinements shall also be regarded as falling within the scope of protection of the invention.

Claims (10)

1. A redundancy design method for controller monitoring and management in a storage system, characterized by comprising the following steps:
the baseboard management controller on the controller mainboard is connected to different network cards through NCSI dual links;
the baseboard management controller transmits information to the external management unit through the NCSI dual links and the network cards;
an A-A or A-S redundancy strategy is adopted to run the NCSI dual-link network supported by the baseboard management controller.
2. The redundancy design method for controller monitoring and management in a storage system according to claim 1, characterized in that, before the baseboard management controller on the controller mainboard is connected to different network cards through NCSI dual links, the method further comprises: the baseboard management controller on the controller mainboard collects and monitors information on the board.
3. The redundancy design method for controller monitoring and management in a storage system according to claim 1, characterized in that, after the baseboard management controller transmits information to the external management unit through the NCSI dual links and the network cards, the method further comprises: using the dual network ports corresponding to the NCSI dual links, together with the external management unit, as an isolated network.
4. The redundancy design method for controller monitoring and management in a storage system according to claim 3, characterized in that the IP address of the storage system's internal, independent monitoring and management network is generated at initialization from the controller ID and the external management unit ID and is then fixed.
5. The redundancy design method for controller monitoring and management in a storage system according to claim 1, characterized in that the A-A redundancy strategy specifically comprises: when a network card fails or one of the external management units fails, the baseboard management controller transmits information externally through its two IP addresses on independent network segments at the same time.
6. The redundancy design method for controller monitoring and management in a storage system according to claim 1, characterized in that the A-S redundancy strategy specifically comprises: when a network card fails or one of the external management units fails, the two NCSI links connected to the baseboard management controller use a floating IP strategy, under which the baseboard management controller, on receiving the external status notified by the system, moves the IP to the correct link for monitoring and management.
7. A redundancy design device for controller monitoring and management in a storage system, characterized by comprising:
an NCSI dual-link sharing module, used to connect the baseboard management controller on the controller mainboard to different network cards through NCSI dual links;
an information transmission module, used by the baseboard management controller to transmit information to the external management unit through the NCSI dual links and the network cards;
a monitoring management strategy operation module, used to adopt an A-A or A-S redundancy strategy to run the NCSI dual-link network supported by the baseboard management controller.
8. The redundancy design device for controller monitoring and management in a storage system according to claim 7, characterized by further comprising:
an information collection and monitoring module, used by the baseboard management controller on the controller mainboard to collect and monitor information on the board.
9. The redundancy design device for controller monitoring and management in a storage system according to claim 7, characterized by further comprising:
an isolated network setting module, used to make the dual network ports corresponding to the NCSI dual links, together with the external management unit, an isolated network.
10. The redundancy design device for controller monitoring and management in a storage system according to claim 7, characterized in that:
the A-A redundancy strategy specifically comprises: when a network card fails or one of the external management units fails, the baseboard management controller transmits information externally through its two IP addresses on independent network segments at the same time;
the A-S redundancy strategy specifically comprises: when a network card fails or one of the external management units fails, the two NCSI links connected to the baseboard management controller use a floating IP strategy, under which the baseboard management controller, on receiving the external status notified by the system, moves the IP to the correct link for monitoring and management.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710125434.8A CN107070699A (en) 2017-03-04 2017-03-04 Redundancy design method and device for controller monitoring management in a storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710125434.8A CN107070699A (en) 2017-03-04 2017-03-04 Redundancy design method and device for controller monitoring management in a storage system

Publications (1)

Publication Number Publication Date
CN107070699A (en) 2017-08-18

Family

ID=59621876

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710125434.8A Pending CN107070699A (en) 2017-03-04 2017-03-04 Redundancy design method and device for controller monitoring management in a storage system

Country Status (1)

Country Link
CN (1) CN107070699A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107807897A (en) * 2017-11-24 2018-03-16 郑州云海信息技术有限公司 A universal network interface card architecture
CN108365998A (en) * 2018-01-03 2018-08-03 郑州云海信息技术有限公司 A method for verifying NCSI stability
CN110673710A (en) * 2019-09-12 2020-01-10 苏州浪潮智能科技有限公司 Method, apparatus, device and medium for resetting a server chassis
CN115733737A (en) * 2023-01-10 2023-03-03 苏州浪潮智能科技有限公司 Management IP drift method and storage frame

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040059862A1 (en) * 2002-09-24 2004-03-25 I-Bus Corporation Method and apparatus for providing redundant bus control
CN1819525A (en) * 2004-11-15 2006-08-16 英特尔公司 Intelligent platform management bus switch system
CN101179603A (en) * 2006-11-09 2008-05-14 上海贝尔阿尔卡特股份有限公司 Method and device for controlling user network access in IPv6 network
CN104133799A (en) * 2014-08-06 2014-11-05 曙光信息产业(北京)有限公司 Multi-network-card NCSI management system
CN104182306A (en) * 2014-08-08 2014-12-03 成都致云科技有限公司 Low-cost intelligent breakdown instant switching method for cloud host
CN204009884U (en) * 2014-08-06 2014-12-10 曙光信息产业(北京)有限公司 A multi-network-card NCSI management system
CN104731727A (en) * 2015-03-25 2015-06-24 浪潮集团有限公司 Double control monitoring and management system and method for storage system
CN105159851A (en) * 2015-07-02 2015-12-16 浪潮(北京)电子信息产业有限公司 Multi-controller storage system
CN106385366A (en) * 2016-08-31 2017-02-08 迈普通信技术股份有限公司 TRILL network management method and device
CN106406793A (en) * 2016-09-13 2017-02-15 广东威创视讯科技股份有限公司 Identifier configuration method and system and IP (Internet Protocol) address allocation method and system for node machine

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107807897A (en) * 2017-11-24 2018-03-16 郑州云海信息技术有限公司 A universal network interface card architecture
CN108365998A (en) * 2018-01-03 2018-08-03 郑州云海信息技术有限公司 A method for verifying NCSI stability
CN110673710A (en) * 2019-09-12 2020-01-10 苏州浪潮智能科技有限公司 Method, apparatus, device and medium for resetting a server chassis
CN110673710B (en) * 2019-09-12 2021-06-11 苏州浪潮智能科技有限公司 Server case resetting method, device, equipment and medium
CN115733737A (en) * 2023-01-10 2023-03-03 苏州浪潮智能科技有限公司 Management IP drift method and storage frame
WO2024148853A1 (en) * 2023-01-10 2024-07-18 苏州元脑智能科技有限公司 Management IP drift method and storage frame

Similar Documents

Publication Publication Date Title
CN101651559B (en) A method for failover of storage services in a dual-controller storage system
JP6317856B2 (en) Smooth controller change in redundant configuration between clusters
US6609213B1 (en) Cluster-based system and method of recovery from server failures
EP3158455B1 (en) Modular switched fabric for data storage systems
US6446141B1 (en) Storage server system including ranking of data source
US6553408B1 (en) Virtual device architecture having memory for storing lists of driver modules
US9916113B2 (en) System and method for mirroring data
US8266472B2 (en) Method and system to provide high availability of shared data
CN106886366B (en) Storage medium, system and method for using an extender for storage area network management
TW202041061A (en) System and method for configuration drift detection and remediation
US7434107B2 (en) Cluster network having multiple server nodes
CN102622279B (en) Redundancy control system, method and Management Controller
CN104731727B (en) A kind of dual control storage system monitoring management system and method
CN108153622A (en) The method, apparatus and equipment of a kind of troubleshooting
US9705984B2 (en) System and method for sharing data storage devices
JP2004530972A (en) Twin-connection failover for file servers that maintain full performance in the presence of failures
CN107070699A (en) Controller monitoring is managed in storage system redundancy design method and device
CN105446657A (en) Method for monitoring RAID card
CN104077199A (en) Isolation method and system for high availability cluster based on shared disk
CN103023973A (en) Cluster server designing method based on CPCI (Compact Peripheral Component Interconnect) structure
US7373546B2 (en) Cluster network with redundant communication paths
CN113342261A (en) Server and control method applied to same
US20060174085A1 (en) Storage enclosure and method for the automated configuration of a storage enclosure
US11567834B2 (en) Data center storage availability architecture using rack-level network fabric
US8565067B2 (en) Apparatus, system, and method for link maintenance

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170818