CN117687697A - Batch adaptation method and system for heterogeneous computing power cards of server - Google Patents

Info

Publication number
CN117687697A
Authority
CN
China
Prior art keywords
pcie card
bios
configuration parameters
bmc
pcie
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311658627.1A
Other languages
Chinese (zh)
Inventor
邓艳山
袁振涛
蔡财义
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fiberhome Supermicro Information And Technology Co ltd
Original Assignee
Fiberhome Supermicro Information And Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fiberhome Supermicro Information And Technology Co ltd filed Critical Fiberhome Supermicro Information And Technology Co ltd
Priority to CN202311658627.1A priority Critical patent/CN117687697A/en
Publication of CN117687697A publication Critical patent/CN117687697A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/4401 Bootstrapping
    • G06F9/4411 Configuring for operating with peripheral devices; Loading of device drivers
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/177 Initialisation or configuration control
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/60 Software deployment
    • G06F8/65 Updates
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Stored Programmes (AREA)

Abstract

The application provides a batch adaptation method and a batch adaptation system for heterogeneous computing power cards of a server, which are used for receiving a first configuration file containing PCIE card configuration parameters input by a user through centralized management software and sending the first configuration file to a corresponding BMC; the BIOS sends a first acquisition command of PCIE card configuration parameters to the BMC in the PEI phase; when the BMC receives the first acquisition command, the PCIE card configuration parameters in the first configuration file are sent to the BIOS, so that the BIOS initializes the PCIE card according to the PCIE card configuration parameters, batch adjustment of the configuration parameters of different PCIE cards under the condition that BIOS source codes are not modified is realized, maintenance cost of a server can be reduced, influence of PCIE card adaptation on service downtime is reduced, and stability of a server system is effectively improved.

Description

Batch adaptation method and system for heterogeneous computing power cards of server
Technical Field
The application relates to the technical field of computers, in particular to a batch adaptation method and a batch adaptation system for heterogeneous computing cards of a server.
Background
A data center generally deploys a large number of computing servers of different configurations. Multiple heterogeneous PCIE computing cards, such as FPGA accelerator cards, AI accelerator cards, DPU accelerator cards and NVMe storage cards, are additionally installed in these servers to run the services of an intelligent computing center.
When PCIE cards are tested at the factory, they are generally verified on one specific hardware platform; compatibility cannot be verified on every hardware platform. Because hardware platforms with different CPU architectures differ in PCIE timing, auto-negotiation and default configuration, the firmware of a PCIE card may not work stably on all of them, and problems such as PCIE link failure, abnormal PCIE link rate or bandwidth, abnormal reset of the PCIE card, failure to detect the PCIE card, and incorrect display of the PCIE card's name occur intermittently. The root cause of these problems is abnormal communication or configuration between the two sides when the CPU and the firmware of the PCIE card perform PCIE communication and configuration during the BIOS startup phase of the server. In addition, only one of the legacy and UEFI option ROMs of a PCIE card may be adapted to the hardware platform. Because the server CPU platform and the firmware design of the PCIE card are strongly coupled, and because current UEFI BIOS software initializes a PCIE device only once during PCIE device scanning and initialization, there is no error correction or fault tolerance mechanism if that initialization fails or is abnormal. This lowers the reliability of heterogeneous PCIE cards and affects the reliability and availability of computing resources.
When a PCIE computing card has an adaptation problem with a server, extensive BIOS code modification and repeated verification are needed before the card works reliably, including adjusting the card's parameter configuration and modifying setup configuration items. The workload is very large, and the wide variety of card types makes adaptation difficult. If adaptation is done by upgrading the firmware of the BIOS or of the card, each adaptation is very inefficient: erasing a 32M SPI flash takes a long time, even a simple flash data update takes 3 minutes, the upgrade may fail, and service downtime becomes too long. In addition, the startup delay of the DPU card in a bare metal server is not fixed. Adaptation is therefore currently difficult and time-consuming, which seriously hinders the deployment of heterogeneous computing power.
Therefore, how to adapt heterogeneous computing cards on a server in batches is a technical problem to be solved.
Disclosure of Invention
The application provides a batch adaptation method and a batch adaptation system for heterogeneous computing cards of a server, which can solve the technical problem that the batch adaptation of the heterogeneous computing cards on the server cannot be performed in the prior art.
In a first aspect, an embodiment of the present application provides a batch adaptation method for heterogeneous computing power cards of a server, where the method includes:
the centralized management software receives a first configuration file containing PCIE card configuration parameters input by a user and sends the first configuration file to the corresponding BMC;
the BIOS sends a first acquisition command of PCIE card configuration parameters to the BMC in the PEI phase;
and when the BMC receives the first acquisition command, the PCIE card configuration parameters in the first configuration file are sent to the BIOS, so that the BIOS performs PCIE card initialization according to the PCIE card configuration parameters.
With reference to the first aspect, in one implementation manner, the sending, by the BIOS, a first acquisition command of PCIE card configuration parameters to the BMC in a PEI phase includes:
the BIOS sends an inquiry command of whether to reconfigure PCIE card configuration parameters to the BMC in the PEI phase;
when the BMC receives the inquiry command, it returns a reconfiguration flag to the BIOS if the first configuration file has been received, and otherwise returns a reconfiguration-unnecessary flag to the BIOS;
the BIOS sends the first get command to the BMC upon receiving the reconfiguration flag.
In some embodiments, the method further comprises:
when the BIOS receives the reconfiguration-unnecessary flag, the BIOS initializes the PCIE card according to the stored PCIE card curing configuration parameters or a BIOS default flow.
In some embodiments, initializing the PCIE card according to the stored PCIE card curing configuration parameters or a BIOS default flow when the BIOS receives the reconfiguration-unnecessary flag includes:
when the BIOS receives the reconfiguration-unnecessary flag, detecting whether a flash area of the BIOS stores PCIE card curing configuration parameters; if yes, initializing the PCIE card according to the PCIE card curing configuration parameters; otherwise, sending a second acquisition command of the PCIE card curing configuration parameters to the BMC;
when the BMC receives the second acquisition command, detecting whether the PCIE card curing configuration parameters are stored in a flash area of the BMC, and if so, sending the PCIE card curing configuration parameters to the BIOS; otherwise, sending a signal without PCIE card curing configuration parameters to the BIOS;
if the BIOS receives the PCIE card curing configuration parameters sent by the BMC, initializing a PCIE card according to the PCIE card curing configuration parameters sent by the BMC; and if the signal without the PCIE card curing configuration parameters is received, initializing the PCIE card according to a BIOS default flow.
In some embodiments, after the BIOS initializes the PCIE card according to the PCIE card configuration parameters or the PCIE card curing configuration parameters sent by the BMC, the method further includes:
judging whether the PCIE card is successfully initialized;
if the PCIE card is initialized successfully, storing the current PCIE card configuration parameters in the flash area of the BIOS as new PCIE card curing configuration parameters, and sending the new PCIE card curing configuration parameters to the BMC so that the BMC stores them in the flash area of the BMC;
if the PCIE card initialization fails, the PCIE card initialization is performed according to a BIOS default flow.
In some embodiments, the method further comprises:
the centralized management software receives a query command of PCIE card configuration parameters input by a user and sends the query command to the BMC;
when the BMC receives the query command, detecting whether PCIE card curing configuration parameters are stored in a flash area of the BMC; if yes, the PCIE card curing configuration parameters stored in the flash area of the BMC are sent to the centralized management software; otherwise, sending a second configuration file of PCIE card configuration parameters in the interaction page of the BIOS to the centralized management software.
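The query path described above can be sketched as a single decision on the BMC side. This is a minimal illustration, not the applicant's implementation: the function name `handle_query`, the dictionary keys, and the way the BIOS interaction-page parameters reach the BMC are all assumptions made for the example.

```python
def handle_query(bmc_cured_params, bios_setup_params):
    """Answer a configuration query from the centralized management software.

    bmc_cured_params: curing configuration parameters held in the BMC
        flash area, or None if none are stored (hypothetical model).
    bios_setup_params: parameters taken from the BIOS interaction page,
        used as the "second configuration file" fallback.
    """
    if bmc_cured_params is not None:
        # Cured parameters exist: report them as the answer.
        return {"source": "bmc_cured", "params": bmc_cured_params}
    # Nothing cured yet: fall back to the BIOS interaction-page values.
    return {"source": "bios_setup", "params": bios_setup_params}
```

Calling `handle_query(None, {"slot": 1})` takes the fallback branch, while passing stored parameters as the first argument returns them unchanged.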
In some embodiments, the PCIE card configuration parameters include the slot number of the PCIE card, the scan delay time, the number of repeated initializations, whether to load the card's option ROM, the PCIE bandwidth, the PCIE rate, and the device name of the PCIE card.
In a second aspect, an embodiment of the present application provides a batch adaptation system for heterogeneous computing power cards of a server, the system including:
centralized management software, a baseboard management controller BMC and a basic input output system BIOS;
the centralized management software is used for receiving a first configuration file containing PCIE card configuration parameters input by a user and sending the first configuration file to the corresponding BMC;
the BIOS is used for sending a first acquisition command of PCIE card configuration parameters to the BMC in a PEI phase;
and the BMC is used for sending the PCIE card configuration parameters in the first configuration file to the BIOS when the first acquisition command is received, so that the BIOS performs PCIE card initialization according to the PCIE card configuration parameters.
With reference to the second aspect, in an implementation manner, the BIOS is further configured to send, in a PEI phase, an inquiry command to the BMC about whether to reconfigure PCIE card configuration parameters;
the BMC is further configured to, when the inquiry command is received, return a reconfiguration flag to the BIOS if the first configuration file has been received, and otherwise return a reconfiguration-unnecessary flag to the BIOS;
the BIOS is further configured to send the first get command to the BMC upon receiving the reconfiguration flag.
In some embodiments, the BIOS is further configured to perform PCIE card initialization according to the stored PCIE card curing configuration parameters or a BIOS default procedure when the reconfiguration-unnecessary flag is received.
In some embodiments, the BIOS is further configured to detect, when the reconfiguration-unnecessary flag is received, whether a flash area of the BIOS stores PCIE card curing configuration parameters; if yes, initializing the PCIE card according to the PCIE card curing configuration parameters; otherwise, a second acquisition command of the PCIE card curing configuration parameters is sent to the BMC;
the BMC is further configured to detect, when the second acquisition command is received, whether the PCIE card curing configuration parameter is stored in a flash area of the BMC, and if yes, send the PCIE card curing configuration parameter to the BIOS; otherwise, sending a signal without PCIE card curing configuration parameters to the BIOS;
the BIOS is further configured to initialize a PCIE card according to the PCIE card curing configuration parameter sent by the BMC if the PCIE card curing configuration parameter sent by the BMC is received; and if the signal without the PCIE card curing configuration parameters is received, initializing the PCIE card according to a BIOS default flow.
In some embodiments, the BIOS is further to:
after performing PCIE card initialization according to the PCIE card configuration parameters or the PCIE card curing configuration parameters sent by the BMC, judging whether the PCIE card initialization is successful or not;
if the PCIE card is initialized successfully, storing the current PCIE card configuration parameters in the flash area of the BIOS as new PCIE card curing configuration parameters, and sending the new PCIE card curing configuration parameters to the BMC so that the BMC stores them in the flash area of the BMC;
if the PCIE card initialization fails, the PCIE card initialization is performed according to a BIOS default flow.
In some embodiments, the centralized management software is further configured to receive a query command of a PCIE card configuration parameter input by a user, and send the query command to the BMC;
the BMC is further used for detecting whether PCIE card curing configuration parameters are stored in a flash area of the BMC when the query command is received; if yes, the PCIE card curing configuration parameters stored in the flash area of the BMC are sent to the centralized management software; otherwise, sending a second configuration file of PCIE card configuration parameters in the interaction page of the BIOS to the centralized management software.
In some embodiments, the PCIE card configuration parameters include the slot number of the PCIE card, the scan delay time, the number of repeated initializations, whether to load the card's option ROM, the PCIE bandwidth, the PCIE rate, and the device name of the PCIE card.
The beneficial effects of the technical solutions provided by the embodiments of the present application include at least the following. PCIE initialization parameters can be customized in batches for the various heterogeneous PCIE computing cards in data center servers, which improves the adaptation efficiency of the computing cards. Compatibility problems between the firmware or option ROM of a PCIE card and the hardware platform can be masked: through customized PCIE card parameter configuration, the configuration parameters of a PCIE card are adjusted without modifying the BIOS source code. Each PCIE card can define its own configuration file, and different PCIE cards can be adapted at the same time, which improves adaptation flexibility and robustness. Only the configuration parameters of the different PCIE cards are modified in batches, and neither the BIOS nor the PCIE card firmware needs to be upgraded, which effectively reduces maintenance cost and the impact of adaptation on service downtime; avoiding firmware upgrades also improves system reliability. Batch querying of card data across multiple nodes of the data center is supported, which improves the fault diagnosability of the system.
Drawings
FIG. 1 is a schematic flow chart of a batch adaptation method of heterogeneous computing power cards of a server;
FIG. 2 is a schematic diagram of the heterogeneous computing power card adaptation flow of the centralized management software;
FIG. 3 is a schematic diagram of the heterogeneous computing power card adaptation flow of the BMC;
FIG. 4 is a schematic diagram of the heterogeneous computing power card adaptation flow of the BIOS.
Detailed Description
In order to make the present application solution better understood by those skilled in the art, the following description will clearly and completely describe the technical solution in the embodiments of the present application with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
The terms "comprising" and "having" and any variations thereof in the description and claims of the present application and in the foregoing drawings are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus. The terms "first," "second," and "third," etc. are used to distinguish between different objects, not to describe a particular sequence or chronological order, and do not by themselves require that the objects so labeled be different.
In the description of embodiments of the present application, "exemplary," "such as," or "for example," etc., are used to indicate an example, instance, or illustration. Any embodiment or design described herein as "exemplary," "such as" or "for example" is not necessarily to be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary," "such as" or "for example," etc., is intended to present related concepts in a concrete fashion.
In the description of the embodiments of the present application, unless otherwise indicated, "/" means "or"; for example, A/B may represent A or B. The term "and/or" merely describes an association between associated objects and indicates that three relations may exist; for example, A and/or B may indicate three cases: A exists alone, A and B exist together, or B exists alone. In addition, in the description of the embodiments of the present application, "plural" means two or more.
In some of the processes described in the embodiments of the present application, a plurality of operations or steps occurring in a particular order are included, but it should be understood that these operations or steps may be performed out of the order in which they occur in the embodiments of the present application or in parallel, the sequence numbers of the operations merely serve to distinguish between the various operations, and the sequence numbers themselves do not represent any order of execution. In addition, the processes may include more or fewer operations, and the operations or steps may be performed in sequence or in parallel, and the operations or steps may be combined.
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
In a first aspect, an embodiment of the present application provides a batch adaptation method of server heterogeneous computing cards, and the batch adaptation method of server heterogeneous computing cards is described below with reference to a batch adaptation system of server heterogeneous computing cards.
The batch adaptation system of the heterogeneous computing power cards of the server comprises centralized management software, a baseboard management controller (BMC) and a basic input output system (BIOS). The BIOS is an important component of the server system; it stores the core parameters and programs for server operation and boots the server. The BMC (Baseboard Management Controller) can remotely manage the BIOS, and the BIOS can obtain corresponding data from the BMC and update it during startup. The centralized management software is a module newly added by the present application, specifically a centralized management software node newly added to the data center, which can receive configuration files input by a user and can also feed required information back to the user.
As shown in FIG. 1, the overall flow of the batch adaptation method of the heterogeneous computing power cards of the server provided in the embodiment of the application includes:
Step S101, the centralized management software receives a first configuration file including PCIE card configuration parameters input by a user, and sends the first configuration file to a corresponding BMC.
Step S102, the BIOS sends a first obtaining command of the PCIE card configuration parameter to the BMC in the PEI phase.
Step S103, when receiving the first obtaining command, the BMC sends the PCIE card configuration parameters in the first configuration file to the BIOS, so that the BIOS performs PCIE card initialization according to the PCIE card configuration parameters.
The PCIE card configuration parameters include, for each PCIE card, the slot number, the scan delay time, the number of repeated initializations, whether to load the card's option ROM, the PCIE bandwidth, the PCIE rate, and the device name. The PCIE bandwidth refers to the link width, such as x1, x2 or x4, and the PCIE rate refers to the PCIE protocol generation, such as PCIE 1.0 or PCIE 3.0.
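One way to picture a configuration file carrying these parameters is a small record per card that is validated and serialized. The sketch below is an assumption-laden illustration: the application does not specify a file format, so JSON, the field names, the millisecond unit for the scan delay, and the set of accepted link widths (the commonly used x1 to x16) are all hypothetical choices.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class PcieCardConfig:
    # Field names are hypothetical; the source only lists the parameter kinds.
    slot: int              # slot number of the PCIE card
    scan_delay_ms: int     # scan delay time (unit assumed to be milliseconds)
    init_retries: int      # number of repeated initializations
    load_option_rom: bool  # whether to load the card's option ROM
    link_width: str        # PCIE bandwidth, e.g. "x1", "x4"
    link_speed: str        # PCIE rate, e.g. "pcie3.0"
    device_name: str       # device name of the PCIE card

# Commonly used link widths; an assumption, the source names x1, x2, x4 "etc."
VALID_WIDTHS = {"x1", "x2", "x4", "x8", "x16"}

def validate(cfg):
    """Return a list of problems found in one card entry (empty if none)."""
    problems = []
    if cfg.slot < 0:
        problems.append("slot must be non-negative")
    if cfg.scan_delay_ms < 0:
        problems.append("scan_delay_ms must be non-negative")
    if cfg.init_retries < 1:
        problems.append("init_retries must be at least 1")
    if cfg.link_width not in VALID_WIDTHS:
        problems.append("unknown link width: " + cfg.link_width)
    return problems

def dump_config_file(cards):
    """Serialize card entries into the JSON text of a configuration file."""
    return json.dumps({"cards": [asdict(c) for c in cards]}, indent=2)

def load_config_file(text):
    """Parse configuration file text back into card entries."""
    return [PcieCardConfig(**entry) for entry in json.loads(text)["cards"]]
```

Keeping one entry per slot means a single file can describe several different cards, which matches the idea that one first configuration file may contain parameters for one or many PCIE cards.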
It should be noted that, when heterogeneous PCIE computing cards of a server need to be adapted, the user may input into the centralized management software a first configuration file containing the configuration parameters of one or more PCIE cards. After receiving the first configuration file, the centralized management software determines the PCIE cards to be adapted and the corresponding BMC according to PCIE card configuration parameters such as the slot number, sends the corresponding first configuration file to the BMC of the corresponding server through the Redfish protocol, and at the same time sends the BMC a reconfiguration flag indicating that the parameters need to be reconfigured. In this embodiment, the reconfiguration flag has the value "1".
As a preferred embodiment, the step in which the BIOS sends a first acquisition command of PCIE card configuration parameters to the BMC in the PEI phase includes: the BIOS sends an inquiry command of whether to reconfigure PCIE card configuration parameters to the BMC in the PEI phase; when the BMC receives the inquiry command, it returns a reconfiguration flag to the BIOS if the first configuration file has been received, and otherwise returns a reconfiguration-unnecessary flag to the BIOS; the BIOS sends the first acquisition command to the BMC upon receiving the reconfiguration flag. The PEI (Pre-EFI Initialization) phase of the BIOS is the phase in which the CPU is woken up and memory is initialized.
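The dispatch step on the centralized management software side can be sketched as grouping card entries by target BMC and attaching the reconfiguration flag. This is a hedged illustration only: the `"bmc"` key, the payload layout and the `send` callable are hypothetical, and `send` merely stands in for the Redfish transfer, whose actual resource paths the application does not give.

```python
from collections import defaultdict

RECONFIGURE = 1  # reconfiguration flag value "1" used in this embodiment

def build_dispatch_plan(entries):
    """Group card entries by target BMC and attach the reconfiguration flag.

    Each entry is a dict with a hypothetical "bmc" key (address of the
    target server's BMC) plus that card's configuration parameters.
    """
    per_bmc = defaultdict(list)
    for entry in entries:
        params = {k: v for k, v in entry.items() if k != "bmc"}
        per_bmc[entry["bmc"]].append(params)
    return {
        bmc: {"reconfig_flag": RECONFIGURE, "cards": cards}
        for bmc, cards in per_bmc.items()
    }

def dispatch(entries, send):
    """Send one payload per BMC; send(bmc, payload) stands in for Redfish."""
    plan = build_dispatch_plan(entries)
    for bmc, payload in plan.items():
        send(bmc, payload)
    return sorted(plan)  # addresses of the BMCs that were contacted
```

Injecting the transport as a callable keeps the grouping logic testable without a network, and lets the same plan be sent to many servers in one pass, which is the batch aspect of the method.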
In some embodiments, the BIOS may send a query command to the BMC through the intelligent platform management interface IPMI (Intelligent Platform Management Interface).
It should be noted that, the BMC may simultaneously monitor and receive the first configuration file and the reconfiguration flag sent by the centralized management software, and the query command sent by the BIOS.
When the BMC receives the first configuration file and the reconfiguration flag sent by the centralized management software, the reconfiguration flag and PCIE card configuration parameters in the first configuration file are stored in a flash area of the BMC.
When the BMC receives an inquiry command sent by the BIOS, it checks whether PCIE card configuration parameters sent by the centralized management software are stored in the flash area of the BMC. If they are, the parameters of the PCIE card in the server need to be reconfigured, so the BMC returns the corresponding reconfiguration flag "1" to the BIOS. If the flash area of the BMC holds no PCIE card configuration parameters sent by the centralized management software, the parameters of the PCIE card in the server do not currently need to be reconfigured, so the BMC returns the reconfiguration-unnecessary flag "0" to the BIOS. It will be appreciated that when the BMC has not received the first configuration file and the reconfiguration flag from the centralized management software, it returns the reconfiguration-unnecessary flag "0" to the BIOS by default.
Therefore, when the BIOS receives the reconfiguration flag "1" returned by the BMC, it determines that the parameters of the PCIE card need to be reconfigured and sends a first acquisition command of the PCIE card configuration parameters to the BMC through IPMI. After receiving the configuration parameters returned by the BMC, the BIOS initializes the PCIE card according to them, which completes the update of the corresponding PCIE card's configuration parameters. In this way, compatibility problems between the BIOS firmware or the PCIE card's option ROM and the hardware platform are masked; the adaptation parameters of the PCIE card are adjusted through customized PCIE card parameter configuration without modifying the BIOS source code; each card can define its own PCIE configuration file; and different PCIE cards can be adapted at the same time, which improves adaptation flexibility and robustness as well as the adaptation efficiency of the PCIE cards.
Further, when the BMC receives the inquiry command from the BIOS and has not received a first configuration file from the centralized management software, it returns the reconfiguration-unnecessary flag to the BIOS. On receiving this flag, the BIOS knows that the PCIE card parameters do not need to be reconfigured, and it initializes the PCIE card according to the stored PCIE card curing configuration parameters or the BIOS default flow.
Specifically, when the BIOS receives the reconfiguration-unnecessary flag "0", it detects whether a flash area of the BIOS stores PCIE card curing configuration parameters; if yes, it initializes the PCIE card according to the PCIE card curing configuration parameters; otherwise, the BIOS sends a second acquisition command of the PCIE card curing configuration parameters to the BMC through the IPMI channel.
It should be noted that the PCIE card curing configuration parameters are the configuration parameters with which the PCIE card last completed initialization: after the previous initialization of the PCIE card succeeded, those parameters were stored in the flash area of the BIOS to form the PCIE card curing configuration parameters. When the user does not need to update the configuration parameters of the PCIE card and the PCIE card curing configuration parameters are stored in the flash area of the BIOS, the BIOS initializes the PCIE card according to the stored PCIE card curing configuration parameters; when they are not stored in the flash area of the BIOS, they may be acquired from the BMC through the second acquisition command.
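The BMC's side of this inquiry handshake can be modeled with a small in-memory stand-in for its flash area. The sketch is illustrative and makes several assumptions: the class and method names are hypothetical, flash storage is modeled as a Python attribute, and the IPMI and Redfish transports are omitted entirely.

```python
class BmcStore:
    """Toy stand-in for the BMC flash area that holds pushed parameters."""

    def __init__(self):
        # Parameters from the first configuration file; None until pushed.
        self.pending = None

    def on_config_file(self, params):
        # Store the parameters pushed by the centralized management software.
        self.pending = params

    def on_inquiry(self):
        # Answer the BIOS inquiry: 1 = reconfigure, 0 = no reconfiguration.
        return 1 if self.pending is not None else 0

    def on_first_get(self):
        # Answer the first acquisition command with the stored parameters.
        return self.pending
```

A BIOS-side caller would first call `on_inquiry()` and only issue `on_first_get()` when the answer is 1. Whether the BMC clears the stored parameters after they are fetched is not stated in the source, so the sketch leaves them in place.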
When the BMC receives the second acquisition command, detecting whether the PCIE card curing configuration parameters are stored in a flash area of the BMC, and if so, sending the PCIE card curing configuration parameters to the BIOS; otherwise, a signal without PCIE card solidifying configuration parameters is sent to the BIOS.
If the BIOS receives the PCIE card curing configuration parameters sent by the BMC, initializing a PCIE card according to the PCIE card curing configuration parameters sent by the BMC; and if the signal without the PCIE card curing configuration parameters is received, initializing the PCIE card according to a BIOS default flow.
It should be noted that, the PCIE card curing configuration parameters may be stored in the flash area of the BIOS, and may also be stored in the flash area of the BMC, and when the PCIE card curing configuration parameters stored in the flash area of the BIOS are lost, the BIOS acquires the PCIE card curing configuration parameters from the BMC through the second acquiring command. When the BMC returns a signal without the curing configuration parameters of the PCIE card to the BIOS, the BMC is initialized by using a default flow of the BIOS, wherein the signal indicates that the BMC does not have the curing configuration parameters of the PCIE card or the BMC is not ready or fails to acquire at the moment.
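The fallback order for the no-reconfiguration case (BIOS flash first, then the BMC, then the BIOS default flow) can be written as one small function. This is a sketch under stated assumptions: the function name, the use of `None` for "absent", and the callable that models the second acquisition command are all hypothetical.

```python
def resolve_params_without_reconfig(bios_flash, bmc_second_get):
    """Pick the parameter source when the BMC answered "0".

    bios_flash: cured parameters in the BIOS flash area, or None if
        absent or lost.
    bmc_second_get: callable modeling the second acquisition command;
        returns cured parameters, or None for the "no parameters" signal
        (which also covers a BMC that is not ready or a failed fetch).
    Returns (source, params); params is None for the BIOS default flow.
    """
    if bios_flash is not None:
        # Local copy exists, so the BMC is not contacted at all.
        return "bios_flash", bios_flash
    from_bmc = bmc_second_get()
    if from_bmc is not None:
        return "bmc_flash", from_bmc
    return "bios_default", None
```

Treating "not ready" and "fetch failed" the same as "nothing stored" keeps boot moving: every failure on the BMC path ends in the default flow rather than blocking initialization.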
Further, after initializing the PCIE card according to the PCIE card configuration parameter or the PCIE card curing configuration parameter sent by the BMC, the BIOS further includes:
and judging whether the PCIE card is successfully initialized.
If the PCIE card is successfully initialized, the current PCIE card configuration parameters are solidified into the flash area of the BIOS to form new PCIE card curing configuration parameters, and the new PCIE card curing configuration parameters are sent to the BMC so that the BMC stores them in the flash area of the BMC. That is, when all PCIE cards have been initialized successfully in the BIOS stage, the BIOS solidifies the current PCIE card configuration parameters into the flash area of the BIOS and sends the solidified PCIE card configuration parameters to the BMC through the redfish protocol, so that they can be used to initialize the PCIE card when the PCIE card configuration is not updated, or for batch query of the PCIE card configuration parameters.
If the PCIE card initialization fails, the PCIE card is initialized according to a BIOS default flow, so that the PCIE card is initialized as soon as possible, and downtime of the server is reduced.
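The success and failure branches above can be illustrated with a short sketch. It assumes, hypothetically, that each flash area is a dictionary and that writing into `bmc_second_flash` stands in for sending the new curing parameters to the BMC; the function name and return labels are not from the application.

```python
def finish_initialization(init_ok: bool, params: dict,
                          bios_flash: dict, bmc_second_flash: dict) -> str:
    """Persist parameters after a successful init, or report fallback to the default flow."""
    if init_ok:
        cured = dict(params)  # copy so later edits to params do not alter the stored record
        bios_flash["curing_params"] = cured
        # Stands in for sending the new curing parameters to the BMC.
        bmc_second_flash["curing_params"] = dict(cured)
        return "cured"
    # In this sketch a failed init leaves both flash areas untouched
    # and the caller proceeds with the BIOS default flow.
    return "bios_default"
```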
It should be noted that the BMC stores the PCIE card configuration parameters sent by the centralized management software and the PCIE card curing configuration parameters sent by the BIOS in two different flash areas of the BMC, so as to avoid both sets of data being lost at the same time.
According to the method, a configuration file describing the PCIE card configuration for each PCIE slot of the server is added, and the PCI configuration parameters of the PCIE slots are defined through this file. Parsing of the file is added to the BIOS source code, and initialization and curing configuration processing of the PCIE card are performed based on the file. A management function for the file is added to the centralized management software; the centralized management software exchanges the configuration file with the BMC through redfish, and the BMC exchanges the configuration file with the BIOS through IPMI, so that batch configuration and query functions for different PCIE cards of different servers are realized.
As a preferred implementation, in addition to batch adaptation of PCIE card configuration parameters, the present application can also perform batch query of PCIE card configuration parameters.
Specifically, the user may input a PCIE card configuration parameter query command in the centralized management software; the centralized management software receives the query command input by the user and sends it to the BMC. When the BMC receives the query command, it detects whether PCIE card curing configuration parameters sent by the BIOS are stored in a flash area of the BMC. If yes, the PCIE card curing configuration parameters stored in the flash area of the BMC are sent to the centralized management software. Otherwise, this indicates that the BIOS did not initialize the PCIE card with the customized configuration parameters input by the user but with the BIOS default parameters and flow, and the second configuration file of the PCIE card configuration parameters in the setup interaction page of the BIOS is sent to the centralized management software.
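The BMC's handling of a query command described above reduces to a single branch. The sketch below is illustrative only; the dictionary model of the flash area, the `source` labels, and the function name are hypothetical.

```python
def bmc_answer_query(second_flash: dict, bios_setup_file: dict) -> dict:
    """Answer a configuration query from the centralized management software."""
    cured = second_flash.get("curing_params")
    if cured is not None:
        # Curing parameters reported by the BIOS are available: return them.
        return {"source": "curing_params", "data": cured}
    # No curing parameters: the BIOS used its defaults, so return the
    # second configuration file taken from the BIOS setup page.
    return {"source": "bios_setup", "data": bios_setup_file}
```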
It should be noted that, in this embodiment, the centralized management software and the BMC exchange instructions and files through the redfish protocol, an open standard that makes it easy to add new functions.
Preferably, the first configuration file and the second configuration file may be JSON files, JSON being a lightweight data exchange format. JSON is independent of any specific platform or language, so data can be exchanged between different programming languages and operating systems; it can also reduce the volume of data transmitted over the network and improve transmission efficiency. The first configuration file may be a first JSON file, and the second configuration file may be a second JSON file.
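As an illustration of what a first JSON file might look like and how it could be parsed, consider the sketch below. The application lists the parameters (slot number, scan delay time, number of repeated initializations, whether to load the optionrom, PCIE bandwidth, PCIE rate, device name) but does not give key names, units, or a schema, so every key, value, and function name here is an assumption.

```python
import json

# Hypothetical field names; the source lists the parameters but not their keys or units.
REQUIRED_FIELDS = {
    "slot", "scan_delay", "init_retries", "load_optionrom",
    "bandwidth", "rate", "device_name",
}

# Hypothetical example of a first JSON file covering two slots.
FIRST_JSON = """
{
  "pcie_cards": [
    {"slot": 1, "scan_delay": 200, "init_retries": 3, "load_optionrom": false,
     "bandwidth": "x16", "rate": "Gen4", "device_name": "card-a"},
    {"slot": 2, "scan_delay": 0, "init_retries": 1, "load_optionrom": true,
     "bandwidth": "x8", "rate": "Gen3", "device_name": "card-b"}
  ]
}
"""


def parse_first_config(text: str) -> dict:
    """Return a mapping of slot number to parameters, rejecting incomplete entries."""
    by_slot = {}
    for card in json.loads(text)["pcie_cards"]:
        missing = REQUIRED_FIELDS - set(card)
        if missing:
            raise ValueError(f"entry is missing fields: {sorted(missing)}")
        by_slot[card["slot"]] = card
    return by_slot
```

Keying the result by slot number reflects the per-slot nature of the configuration and makes lookup during initialization straightforward.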
In order to better explain the batch adaptation method and the query method of the server heterogeneous computing power card in this embodiment, the flow of the centralized management software, the BMC and the BIOS when performing batch adaptation and query of the server heterogeneous computing power card will be described below.
The flow of batch adaptation and query of heterogeneous computing cards of a server by the centralized management software is shown in fig. 2:
step S201, starting running of centralized management software;
step S202, monitoring PCIE card related commands input by a user;
step S203, monitoring a first JSON file;
step S204, a reconfiguration mark of the PCIE card is sent to the BMC through the redfish;
step S205, a first JSON file is sent to the BMC through a redfish;
step S206, monitoring a PCIE card configuration parameter query command;
step S207, a query command of PCIE card configuration parameters is sent to the BMC through the redfish.
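Steps S204, S205 and S207 can be sketched for a batch of servers as follows. The transport is deliberately abstracted: `send` is a hypothetical callable taking a BMC address, a message type, and a payload, and no actual redfish endpoint or client library is implied. The message type strings are likewise assumptions.

```python
from typing import Callable, Iterable

# Hypothetical transport hook: (bmc_address, message_type, payload) -> None.
Send = Callable[[str, str, object], None]


def push_config(bmcs: Iterable[str], first_json: str, send: Send) -> int:
    """Send the reconfiguration flag, then the first JSON file, to every BMC (S204, S205)."""
    count = 0
    for bmc in bmcs:
        send(bmc, "reconfiguration_flag", True)
        send(bmc, "first_json_file", first_json)
        count += 1
    return count


def push_query(bmcs: Iterable[str], send: Send) -> int:
    """Send the configuration parameter query command to every BMC (S207)."""
    count = 0
    for bmc in bmcs:
        send(bmc, "query_config", None)
        count += 1
    return count
```

Passing the transport in as a parameter keeps the batch loop independent of how messages actually reach each BMC.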
Further, the flow of batch adaptation and query of the server heterogeneous computing power card by the BMC is shown in fig. 3:
step S301, starting up the BMC;
step S302, monitoring a reconfiguration mark of a PCIE card and a first JSON file sent by centralized management software;
step S303, analyzing the first JSON file to obtain PCIE card configuration parameters;
step S304, storing a reconfiguration flag of the PCIE card and a configuration parameter of the PCIE card, which are sent by the centralized management software, in a first flash area of the BMC;
step S305, monitoring an inquiry command sent by the BIOS;
step S306, detecting whether the PCIE card configuration parameters exist in the first flash area;
step S307, a reconfiguration flag or a reconfiguration-unnecessary flag is sent to the BIOS;
step S308, monitoring an inquiry command sent by the BIOS, and sending PCIE card configuration data stored in a first flash area of the BMC to the BIOS;
step S309, monitoring the new PCIE card curing configuration parameters sent by the BIOS;
step S310, storing new PCIE card curing configuration parameters in a second flash area of the BMC;
step S311, monitoring a query command of PCIE card configuration parameters sent by centralized management software;
step S312, checking whether PCIE card curing configuration parameters are stored in a second flash area of the BMC; if yes, go to step S313; otherwise, step S314 is entered.
Step S313, the PCIE card curing configuration parameters stored in the second flash area of the BMC are sent to the centralized management software;
step S314, a second JSON file of the PCIE card configuration parameters in the interaction page setup of the BIOS is sent to the centralized management software.
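The BMC-side steps S302 to S310 can be sketched as a small class with two separate storage areas. This is an assumption-laden illustration: the class, its method names, and the returned labels are hypothetical, the flash areas are modelled as dictionaries, and the sketch omits when (or whether) the reconfiguration flag is cleared, since the description does not say.

```python
import json


class BmcSketch:
    """Hypothetical model of the BMC's two flash areas and its replies to the BIOS."""

    def __init__(self):
        self.first_flash = {}   # reconfiguration flag and user-supplied parameters
        self.second_flash = {}  # curing parameters reported by the BIOS

    def receive_first_json(self, text: str) -> None:
        # S302-S304: parse the first JSON file and store it with the flag.
        self.first_flash["params"] = json.loads(text)
        self.first_flash["reconfigure"] = True

    def answer_inquiry(self) -> str:
        # S305-S307: tell the BIOS whether reconfiguration is needed.
        if self.first_flash.get("reconfigure") and "params" in self.first_flash:
            return "reconfigure"
        return "no_reconfigure"

    def answer_first_acquisition(self):
        # S308: return the stored configuration data (None if absent).
        return self.first_flash.get("params")

    def store_curing(self, params: dict) -> None:
        # S309-S310: keep curing parameters in the second area only.
        self.second_flash["curing_params"] = params
```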
Further, the flow of batch adaptation and query of the heterogeneous computing power card of the server by the BIOS is shown in fig. 4:
step S401, BIOS is started;
step S402, an inquiry command of whether to reconfigure PCIE card configuration parameters is sent to the BMC;
step S403, judging whether the BMC returns a reconfiguration flag; if yes, go to step S404, otherwise, go to step S406;
step S404, a first acquisition command is sent to the BMC, and PCIE configuration parameters sent by the BMC are received;
step S405, initializing a PCIE card according to the received PCIE configuration parameters.
Step S406, detecting whether a flash area of the BIOS stores PCIE card curing configuration parameters, if yes, entering step S407, otherwise entering step S408;
step S407, initializing a PCIE card according to the PCIE card curing configuration parameters in the flash area of the BIOS;
step S408, a PCIE card curing configuration parameter second acquisition command is sent to the BMC;
step S409, if the BMC returns the PCIE card curing configuration parameters, initializing the PCIE card according to the PCIE card curing configuration parameters;
step S410, if the BMC returns a signal without PCIE card curing configuration parameters, initializing the PCIE card according to a BIOS default flow;
step S411, judging whether the PCIE card initialization is successful; if yes, go to step S412; otherwise, the process advances to step S410.
Step S412, solidifying the current PCIE card configuration parameters to the flash area of the BIOS to form new PCIE card solidification configuration parameters;
step S413, the new PCIE card curing configuration parameters are sent to the BMC.
In a second aspect, embodiments of the present application further provide a batch adaptation system for heterogeneous computing power cards of a server, where the system includes: centralized management software, a baseboard management controller BMC and a basic input output system BIOS;
the centralized management software is used for receiving a first configuration file containing PCIE card configuration parameters input by a user and sending the first configuration file to the corresponding BMC;
the BIOS is used for sending a first acquisition command of PCIE card configuration parameters to the BMC in a PEI phase;
and the BMC is used for sending the PCIE card configuration parameters in the first configuration file to the BIOS when the first acquisition command is received, so that the BIOS performs PCIE card initialization according to the PCIE card configuration parameters.
In some embodiments, the BIOS is further configured to send an inquiry command to the BMC in a PEI phase about whether to reconfigure PCIE card configuration parameters;
the BMC is further used for returning a reconfiguration flag to the BIOS if the first configuration file has been received when the inquiry command is received, and otherwise returning a reconfiguration-unnecessary flag to the BIOS;
the BIOS is further configured to send the first get command to the BMC upon receiving the reconfiguration flag.
In some embodiments, the BIOS is further configured to perform PCIE card initialization according to the stored PCIE card curing configuration parameters or a BIOS default procedure when the reconfiguration-unnecessary flag is received.
In some embodiments, the BIOS is further configured to detect, when the reconfiguration-unnecessary flag is received, whether a flash area of the BIOS stores PCIE card curing configuration parameters; if yes, initializing the PCIE card according to the PCIE card curing configuration parameters; otherwise, a second acquisition command of the PCIE card curing configuration parameters is sent to the BMC;
the BMC is further configured to detect, when the second acquisition command is received, whether the PCIE card curing configuration parameter is stored in a flash area of the BMC, and if yes, send the PCIE card curing configuration parameter to the BIOS; otherwise, sending a signal without PCIE card curing configuration parameters to the BIOS;
the BIOS is further configured to initialize a PCIE card according to the PCIE card curing configuration parameter sent by the BMC if the PCIE card curing configuration parameter sent by the BMC is received; and if the signal without the PCIE card curing configuration parameters is received, initializing the PCIE card according to a BIOS default flow.
In some embodiments, the BIOS is further to:
after performing PCIE card initialization according to the PCIE card configuration parameters or the PCIE card curing configuration parameters sent by the BMC, judging whether the PCIE card initialization is successful or not;
if the PCIE card is initialized successfully, solidifying the current PCIE card configuration parameters into a flash area of the BIOS to form new PCIE card solidification configuration parameters, and sending the new PCIE card solidification configuration parameters to the BMC so that the BMC stores the new PCIE card solidification configuration parameters into the flash area of the BMC;
if the PCIE card initialization fails, the PCIE card initialization is performed according to a BIOS default flow.
In some embodiments, the centralized management software is further configured to receive a query command of a PCIE card configuration parameter input by a user, and send the query command to the BMC;
the BMC is further used for detecting whether PCIE card curing configuration parameters are stored in a flash area of the BMC when the query command is received; if yes, the PCIE card curing configuration parameters stored in the flash area of the BMC are sent to the centralized management software; otherwise, sending a second configuration file of PCIE card configuration parameters in the interaction page of the BIOS to the centralized management software.
In some embodiments, the PCIE card configuration parameters include a slot number, a scan delay time, a number of repeated initializations, whether to load the optionrom of the card, a PCIE bandwidth, a PCIE rate, and a device name of the PCIE card.
The function implementation of each module in the batch adaptation system of the server heterogeneous computing power card corresponds to each step in the batch adaptation method embodiment of the server heterogeneous computing power card, and the functions and implementation processes of the modules are not described in detail herein.
It should be noted that, the foregoing embodiment numbers are merely for describing the embodiments, and do not represent the advantages and disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) as described above, comprising several instructions for causing a terminal device to perform the method described in the various embodiments of the present application.
The foregoing description is only of the preferred embodiments of the present application, and is not intended to limit the scope of the claims, and all equivalent structures or equivalent processes using the descriptions and drawings of the present application, or direct or indirect application in other related technical fields are included in the scope of the claims of the present application.

Claims (10)

1. A method for batch adaptation of heterogeneous computing cards of a server, comprising:
the centralized management software receives a first configuration file containing PCIE card configuration parameters input by a user and sends the first configuration file to a corresponding BMC;
the BIOS sends a first acquisition command of PCIE card configuration parameters to the BMC in the PEI phase;
and when the BMC receives the first acquisition command, the PCIE card configuration parameters in the first configuration file are sent to the BIOS, so that the BIOS performs PCIE card initialization according to the PCIE card configuration parameters.
2. The batch adaptation method of heterogeneous computing power cards of claim 1, wherein the BIOS sending a first get command of PCIE card configuration parameters to the BMC in PEI phase comprises:
the BIOS sends an inquiry command of whether to reconfigure PCIE card configuration parameters to the BMC in the PEI phase;
when the BMC receives the inquiry command, if the first configuration file has been received, a reconfiguration mark is returned to the BIOS, otherwise, a reconfiguration-free mark is returned to the BIOS;
the BIOS sends the first get command to the BMC upon receiving the reconfiguration flag.
3. The method for batch adaptation of server heterogeneous computing power cards of claim 2, further comprising:
and when the BIOS receives the reconfiguration-free mark, initializing the PCIE card according to the stored PCIE card curing configuration parameters or a BIOS default flow.
4. The batch adaptation method of server heterogeneous computing power cards of claim 3, wherein when the BIOS receives the reconfiguration-free flag, performing PCIE card initialization according to stored PCIE card curing configuration parameters or a BIOS default procedure includes:
when the BIOS receives the reconfiguration-free mark, detecting whether a flash area of the BIOS stores PCIE card curing configuration parameters or not; if yes, initializing the PCIE card according to the PCIE card curing configuration parameters; otherwise, a second acquisition command of the PCIE card curing configuration parameters is sent to the BMC;
when the BMC receives the second acquisition command, detecting whether the PCIE card curing configuration parameters are stored in a flash area of the BMC, and if so, sending the PCIE card curing configuration parameters to the BIOS; otherwise, sending a signal without PCIE card curing configuration parameters to the BIOS;
if the BIOS receives the PCIE card curing configuration parameters sent by the BMC, initializing a PCIE card according to the PCIE card curing configuration parameters sent by the BMC; and if the signal without the PCIE card curing configuration parameters is received, initializing the PCIE card according to a BIOS default flow.
5. The batch adaptation method of heterogeneous computing power cards of the server of claim 4, wherein the BIOS, after initializing a PCIE card according to the PCIE card configuration parameter or the PCIE card curing configuration parameter sent by the BMC, further comprises:
judging whether the PCIE card is successfully initialized;
if the PCIE card is initialized successfully, solidifying the current PCIE card configuration parameters into a flash area of the BIOS to form new PCIE card solidification configuration parameters, and sending the new PCIE card solidification configuration parameters to the BMC so that the BMC stores the new PCIE card solidification configuration parameters into the flash area of the BMC;
if the PCIE card initialization fails, the PCIE card initialization is performed according to a BIOS default flow.
6. The method for batch adaptation of server heterogeneous computing power cards of claim 2, further comprising:
the centralized management software receives a query command of PCIE card configuration parameters input by a user and sends the query command to the BMC;
when the BMC receives the query command, detecting whether PCIE card curing configuration parameters are stored in a flash area of the BMC; if yes, the PCIE card curing configuration parameters stored in the flash area of the BMC are sent to the centralized management software; otherwise, sending a second configuration file of PCIE card configuration parameters in the interaction page of the BIOS to the centralized management software.
7. The batch adaptation method of heterogeneous computing power cards of a server according to claim 2, wherein the PCIE card configuration parameters include, for each PCIE card, a slot number, a scan delay time, a number of repeated initializations, whether to load the optionrom of the card, a PCIE bandwidth, a PCIE rate, and a device name of the PCIE card.
8. A batch adaptation system for heterogeneous computing cards of a server, the system comprising: centralized management software, a baseboard management controller BMC and a basic input output system BIOS;
the centralized management software is used for receiving a first configuration file containing PCIE card configuration parameters input by a user and sending the first configuration file to the corresponding BMC;
the BIOS is used for sending a first acquisition command of PCIE card configuration parameters to the BMC in a PEI phase;
and the BMC is used for sending the PCIE card configuration parameters in the first configuration file to the BIOS when the first acquisition command is received, so that the BIOS performs PCIE card initialization according to the PCIE card configuration parameters.
9. The server heterogeneous computing power card batch adaptation system of claim 8, wherein:
the BIOS is further used for sending an inquiry command of whether to reconfigure PCIE card configuration parameters to the BMC in the PEI phase;
the BMC is further used for returning a reconfiguration mark to the BIOS if the first configuration file has been received when the inquiry command is received, and otherwise returning a reconfiguration-free mark to the BIOS;
the BIOS is further configured to send the first get command to the BMC upon receiving the reconfiguration flag.
10. The server heterogeneous computing power card batch adaptation system of claim 9, wherein:
and the BIOS is also used for initializing the PCIE card according to the stored PCIE card curing configuration parameters or BIOS default flow when the reconfiguration-free mark is received.
CN202311658627.1A 2023-12-04 2023-12-04 Batch adaptation method and system for heterogeneous computing power cards of server Pending CN117687697A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311658627.1A CN117687697A (en) 2023-12-04 2023-12-04 Batch adaptation method and system for heterogeneous computing power cards of server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311658627.1A CN117687697A (en) 2023-12-04 2023-12-04 Batch adaptation method and system for heterogeneous computing power cards of server

Publications (1)

Publication Number Publication Date
CN117687697A true CN117687697A (en) 2024-03-12

Family

ID=90125741

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311658627.1A Pending CN117687697A (en) 2023-12-04 2023-12-04 Batch adaptation method and system for heterogeneous computing power cards of server

Country Status (1)

Country Link
CN (1) CN117687697A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119201225A (en) * 2024-11-28 2024-12-27 苏州元脑智能科技有限公司 Baseboard management controller adaptation method and program product for basic input and output system
CN119201225B (en) * 2024-11-28 2025-03-14 苏州元脑智能科技有限公司 Baseboard management controller adaptation method and program product for basic input and output system

Similar Documents

Publication Publication Date Title
CN110096424B (en) Test processing method and device, electronic equipment and storage medium
EP0845742A2 (en) Methods and systems for booting a computer in a distributed computing system
CN111144839B (en) Project construction method, continuous integration system and terminal equipment
EP2456257A1 (en) Method and system for upgrading wireless data card
CN111752637B (en) Multi-service inspection management method and device, computer equipment and storage medium
CN106817241A (en) A kind of updating management method, upgrade method and device
US11539612B2 (en) Testing virtualized network functions
CN111506358B (en) Method and device for updating container configuration
EP3091435A1 (en) Resource management method and device for terminal system
CN105072398B (en) A kind of device updating method and device
CN117687697A (en) Batch adaptation method and system for heterogeneous computing power cards of server
CN110737444A (en) Remote self-adaptive dynamic deployment method and system for operating system based on firmware
GB2583904A (en) Commissioning a virtualised network function
CN108345476A (en) A kind of method and apparatus of batch processing hardware configuration information
CN112905197A (en) Information processing method, device and system, electronic equipment and storage medium
CN119728689A (en) Cluster node management method, device, equipment and medium
CN112559124A (en) Model management system and target operation instruction processing method and device
CN114564326B (en) A method and system for performing abnormal scanning on applications of a kubernetes cluster
CN112667498B (en) Server building method, device, computer equipment and readable storage medium
CN114780410A (en) APP automatic test method, device, equipment and storage medium
CN116931962A (en) Version deployment method and device, electronic equipment and storage medium
CN114244776A (en) Message sending method, system, device, equipment and medium
KR102068830B1 (en) Server Validation Automation and Management System
CN113472611A (en) Method and device for acquiring WiFi signal strength and readable storage medium
CN118626400B (en) Software testing system for core framework

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination