
CN108984120B - Storage device path error processing method and related device - Google Patents


Info

Publication number
CN108984120B
Authority
CN
China
Prior art keywords
error
decision table
code decision
processing
error code
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810713250.8A
Other languages
Chinese (zh)
Other versions
CN108984120A (en
Inventor
耿芳忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201810713250.8A
Publication of CN108984120A
Application granted
Publication of CN108984120B

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G06F3/0635 Configuration or reconfiguration of storage systems by changing the path, e.g. traffic rerouting, path reconfiguration
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computer And Data Communications (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The embodiments of the present application disclose a method for processing path errors of a storage device and a related device, which are used to improve the flexibility with which a processing device handles errors. The method includes: obtaining a second error code decision table through an input/output control (IO control) interface, where the second error code decision table is obtained by a user modifying a first error code decision table and includes a correspondence between at least one error and a rule for processing that error, the error being one that occurs while the processing device sends a request to the storage device; and processing the error according to the second error code decision table.

Figure 201810713250

Description

Storage device path error processing method and related device
Technical Field
The embodiment of the application relates to the field of data storage, in particular to a method for processing a path error of a storage device and a related device.
Background
Multipath I/O (MPIO) technology accesses a network storage device through one or more physical links and can provide fault tolerance, traffic load balancing, and fine-grained IO scheduling policies, giving network storage applications higher availability and better performance. In general, the operating system of the application host has multipath software that supports the multipath function; this software consists of a multipath driver and a multipath management tool for path management. The multipath driver is a module in the operating system kernel and implements path identification, path aggregation, path selection (load balancing), error processing, and similar functions. The multipath management tool runs in the user layer and provides path management and performance data statistics.
Under the condition that the application host accesses the storage device in a storage multipath connection mode, if an error occurs in the process of sending an IO request to the storage device, the processing device calls an error processing function to process the error.
In the prior art, the error handling mechanism by which a processing device calls an error handling function to resolve a path error is compiled kernel code, which cannot be modified. The error handling function therefore cannot be changed once the processing device has started running. To modify it, the kernel code must be recompiled and the processing device rebuilt, which reduces the flexibility with which the processing device handles errors.
Disclosure of Invention
The embodiment of the application provides a processing method and a related device for a storage device path error, which are used for improving the flexibility of processing the error by a processing device.
In a first aspect, an embodiment of the present application provides a method for processing a path error of a storage device, including:
when the processing device is ready to process an error, the processing device may obtain a second error code decision table through the IO control interface, where the second error code decision table is obtained by modifying the first error code decision table by a user, the second error code decision table includes a correspondence between at least one error and a rule for processing the error, and the error is an error occurring in a process in which the processing device sends a request to the storage device;
after the processing device obtains the second error code decision table through the IO control interface, the processing device may process the error according to the second error code decision table.
In the embodiment of the present application, the error code decision table can be transferred between the user layer and the kernel layer of the processing device through the IO control interface. Whether or not an error has occurred while the processing device sends IO requests to the storage device, the processing device may at any time receive an instruction to modify the error code decision table, receive the second error code decision table carried in that instruction through the IO control interface, and modify the first error code decision table accordingly. This increases the flexibility with which the multipath software handles errors.
According to a first aspect, in a first implementation manner of the first aspect of this embodiment of the present application, the second error code decision table includes: an operating system interface layer error, a small computer system interface SCSI command layer error, a processing rule corresponding to the operating system interface layer error or the SCSI command layer error.
In the embodiment of the present application, the contents of the second error code decision table are listed, so that the feasibility of the scheme can be improved.
According to the first aspect, in a second implementation manner of the first aspect of this embodiment of the present application, the processing the error according to the second error code decision table includes:
replacing the error code decision table of the kernel layer with the second error code decision table;
and processing the error according to the error code decision table of the kernel layer.
In the embodiment of the present application, a step of processing the error according to the second error code decision table is introduced, so that the feasibility of the scheme can be improved.
According to the second implementation manner of the first aspect, in a third implementation manner of the first aspect of this embodiment of the present application, the processing the error according to the error code decision table of the kernel layer includes:
when an error prompt message is received, determining a target error corresponding to the error prompt message;
judging whether a processing rule corresponding to the target error exists in an error code decision table of the kernel layer;
and if the error code decision table of the kernel layer has a processing rule corresponding to the target error, processing the target error according to the processing rule.
In the embodiment of the present application, a step of processing the error by the error code decision table of the kernel layer is introduced, so that the feasibility of the scheme can be improved.
According to the third implementation manner of the first aspect, in a fourth implementation manner of the first aspect of this embodiment of the present application, the method further includes:
and if the error code decision table of the kernel layer does not have the processing rule corresponding to the target error, prompting a user to modify the second error code decision table.
In the embodiment of the present application, a result obtained when it is determined that the processing rule corresponding to the target error does not exist in the error code decision table of the kernel layer is listed, so that implementation flexibility of the scheme can be enhanced.
According to the first aspect, in a fifth implementation manner of the first aspect of the embodiments of the present application, the method further includes:
when an IO request sent to the storage device has an error, acquiring an error path code, wherein the error path code is used for indicating the position of the error;
judging whether the first error code decision table can process the error, wherein the error processing comprises updating parameters of the path;
if the first error code decision table cannot handle the error, a request to modify the error code decision table is sent.
In the embodiment of the application, when an error occurs in the process of sending the IO request to the storage device, the processing device first determines whether the first error code decision table can process the error, and then executes the next operation according to the determination result, so that the implementation flexibility of the scheme can be enhanced.
According to a fifth implementation manner of the first aspect, in a sixth implementation manner of the first aspect of this embodiment of the present application, the determining whether the first error code decision table can handle the error includes:
acquiring the error path code;
querying the table entry in the first error code decision table;
judging whether the first error code decision table has the table item corresponding to the error path code;
if the entry corresponding to the error path code exists in the first error code decision table, determining that the first error code decision table can handle the error;
if the entry corresponding to the error path code is not in the first error code decision table, determining that the first error code decision table is not capable of handling the error.
In the embodiment of the application, a specific step of judging whether the first error code decision table can process the error is introduced, so that the feasibility of the scheme can be improved.
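The check described in the fifth and sixth implementation manners can be sketched as follows. This is only an illustration under stated assumptions: the table layout (a mapping from error path code to rule name), the rule names, and the functions `can_handle` and `on_io_error` are hypothetical and do not come from the application.

```python
from typing import Callable, Dict, Optional

# Hypothetical first error code decision table: error path code -> rule name.
FIRST_TABLE: Dict[int, str] = {0x01: "retry", 0x02: "failover"}


def can_handle(table: Dict[int, str], error_path_code: int) -> bool:
    """Return True if the table has an entry for the error path code."""
    return error_path_code in table


def on_io_error(
    table: Dict[int, str],
    error_path_code: int,
    request_modification: Callable[[int], None],
) -> Optional[str]:
    """Return the matching rule, or request a table modification and return None."""
    if can_handle(table, error_path_code):
        return table[error_path_code]
    # No entry: ask for the error code decision table to be modified.
    request_modification(error_path_code)
    return None


# Usage: collect modification requests in a list for inspection.
pending_requests = []
known = on_io_error(FIRST_TABLE, 0x02, pending_requests.append)
unknown = on_io_error(FIRST_TABLE, 0x7F, pending_requests.append)
```

Here `known` holds the rule for an error the table covers, while the uncovered code is recorded in `pending_requests` as a stand-in for the request to modify the table.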
In a second aspect, an embodiment of the present application provides a processing apparatus, which performs the method in the foregoing first aspect, and includes:
an obtaining unit, configured to obtain a second error code decision table through an IO control interface, where the second error code decision table is obtained by modifying a first error code decision table by a user, the second error code decision table includes a correspondence between at least one error and a rule for processing the error, and the error is an error occurring in a process of sending a request to a storage device by a processing apparatus;
and the processing unit is used for processing the error according to the second error code decision table.
In the embodiment of the present application, the error code decision table can be transferred between the user layer and the kernel layer of the processing device through the IO control interface. Whether or not an error has occurred while the processing device sends IO requests to the storage device, the processing device may at any time receive an instruction to modify the error code decision table, receive the second error code decision table carried in that instruction through the IO control interface, and modify the first error code decision table accordingly. This increases the flexibility with which the processing device handles errors.
According to a second aspect, in a first implementation manner of the second aspect of this embodiment of the present application, the second error code decision table includes:
an operating system interface layer error, a small computer system interface SCSI command layer error, a processing rule corresponding to the operating system interface layer error or the SCSI command layer error.
In the embodiment of the present application, the contents of the second error code decision table are listed, so that the feasibility of the scheme can be improved.
According to a second aspect, in a second implementation manner of the second aspect of this embodiment of the present application, the processing unit includes:
a replacement subunit, configured to replace the error code decision table of the kernel layer with the second error code decision table;
and the processing subunit is used for processing the error according to the error code decision table of the kernel layer.
In the embodiment of the present application, a step of processing the error according to the second error code decision table is introduced, so that the feasibility of the scheme can be improved.
According to a second implementation form of the second aspect, in a third implementation form of the second aspect of this application example, the processing subunit includes:
the determining module is used for determining a target error corresponding to the error prompt message when the error prompt message is received;
the judging module is used for judging whether a processing rule corresponding to the target error exists in an error code decision table of the kernel layer;
and the processing module is used for processing the target error according to the processing rule when the processing rule corresponding to the target error exists in the error code decision table of the kernel layer.
In the embodiment of the present application, a step of processing the error by the error code decision table of the kernel layer is introduced, so that the feasibility of the scheme can be improved.
According to the third implementation manner of the second aspect, in a fourth implementation manner of the second aspect of this embodiment of the present application, the processing apparatus further includes:
and the prompting unit is used for prompting a user to modify the second error code decision table when the processing rule corresponding to the target error does not exist in the error code decision table of the kernel layer.
In the embodiment of the present application, a result obtained when it is determined that the processing rule corresponding to the target error does not exist in the error code decision table of the kernel layer is listed, so that implementation flexibility of the scheme can be enhanced.
In a third aspect, an embodiment of the present application provides a processing apparatus, which performs the method in the foregoing first aspect, and includes:
a processor, a memory, a bus, and a communication interface;
the processor, the memory and the communication interface are connected with the bus;
the processor controls the communication interface to obtain a second error code decision table and stores the second error code decision table in the memory, wherein the second error code decision table is obtained by a user modifying the first error code decision table, the second error code decision table includes a correspondence between at least one error and a rule for processing the error, and the error is an error that occurs while the processing device sends a request to the storage device;
the processor processes the error according to the second error code decision table.
It should be noted that the communication interface may be an IO control interface.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing a computer program comprising program instructions that, when executed by a processor, cause the processor to perform the method of any of the preceding first aspects.
In a fifth aspect, the present application provides a computer program product, which is characterized in that when the computer program product runs on a computer, the computer is caused to execute the method according to any one of the preceding first aspects.
In a sixth aspect, the present application provides a chip system comprising a processor for enabling a network device to implement the functions referred to in the above aspects, e.g. to transmit or process data and/or information referred to in the above methods. In one possible design, the system-on-chip further includes a memory for storing program instructions and data necessary for the network device. The chip system may be formed by a chip, or may include a chip and other discrete devices.
According to the technical scheme, the embodiment of the application has the following advantages:
in the embodiment of the present application, the error code decision table can be transferred between the user layer and the kernel layer of the processing device through the IO control interface. Whether or not an error has occurred while the processing device sends IO requests to the storage device, the processing device may at any time receive an instruction to modify the error code decision table, receive the second error code decision table carried in that instruction through the IO control interface, and modify the first error code decision table accordingly. This increases the flexibility with which the multipath software handles errors.
Drawings
FIG. 1 is a diagram of a storage area network framework according to an embodiment of the present application;
FIG. 2 is a block diagram of another storage area network framework according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating a method for processing a path error of a storage device according to an embodiment of the present disclosure;
FIG. 4 is a schematic flowchart of another method for processing a path error of a storage device according to an embodiment of the present disclosure;
FIG. 5 is a schematic view of a processing apparatus according to an embodiment of the present disclosure.
Detailed Description
The embodiment of the application provides a processing method and a related device for a storage device path error, which are used for improving the flexibility of processing the error by a processing device.
Some terms referred to in the embodiments of the present application are described below.
Processing device: a device that includes a component providing the functions of multipath driver software.
Error: an indication that a fault has occurred; in general it refers to any cause or event that prevents the system from working as the user intends. Common network examples are 407, 405, 401, and 404 errors. In the embodiment of the application, an error is an exception that arises while the multipath driver sends an IO request to a storage device.
Error processing: when errors prevent a program from running normally, handling those errors so that the program runs correctly is called error processing. Error handling functions are an important aspect of compiler performance and help programmers correct programs as quickly as possible.
The following describes a system architecture to which the embodiments of the present application are adapted.
In an enterprise-level information system, an application host (Host) for processing service requests and a storage device (Storage) for storing data are connected to each other through a Storage Area Network (SAN). To improve redundancy and IO throughput, a storage multipath connection mode is usually adopted, that is, the application host accesses the storage device through multiple physical paths at the same time. As shown in FIG. 1, there are two initiator ports IP0 and IP1 on the application host and two target ports TP0 and TP1 on the storage device. When the application host accesses a Logical Unit Number (LUN) in the storage device through the SAN connection, there are 4 paths: Path0(IP0, TP0), Path1(IP0, TP1), Path2(IP1, TP0), Path3(IP1, TP1).
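The four paths are simply every pairing of a host-side port with a storage-side port. A minimal sketch of that enumeration, using the port names from FIG. 1 (the variable names are illustrative only):

```python
from itertools import product

host_ports = ["IP0", "IP1"]     # ports on the application host
target_ports = ["TP0", "TP1"]   # target ports on the storage device

# Pair every host port with every target port, numbering paths in order.
paths = {
    f"Path{i}": pair
    for i, pair in enumerate(product(host_ports, target_ports))
}

for name, (host_port, target_port) in paths.items():
    print(f"{name}({host_port}, {target_port})")
```

With two ports on each side this yields 2 x 2 = 4 paths, in the same order as the list above.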
The structure of the multipath software (MPIO) in the operating system of the application host is shown in FIG. 2; the software is part of the processing device in the embodiment of the present application. The processing device includes at least the multipath software, which consists of a multipath driver in the kernel layer and a multipath management tool in the user layer. The multipath driver is a module in the operating system kernel and implements path identification, path aggregation, path selection (load balancing), error processing, and similar functions. The multipath management tool runs in the user layer and provides path management and performance data statistics. In the embodiment of the application, an IO control interface between the multipath driver and the multipath management tool is newly defined, an error code decision table configuration file is set in the multipath management tool of the user layer, and an error code decision table is set in the multipath driver of the kernel layer, so that the error code decision table can be transferred between the user layer and the kernel layer. The configuration file contains the error code decision table and the other files a user needs in order to modify it. If an error occurs when the multipath driver sends an IO request to the storage device through the selected path, the error handling module of the multipath driver determines how to handle the IO request and the error according to the type of the error and the error code decision table.
In this embodiment, it should be noted that the processing device in this embodiment may be a server, or a processor in the server, or a chip in the server, or other devices, and this is not limited herein. In this embodiment and the following embodiments, only the processing device is taken as an example for description.
For ease of understanding, a specific flow of this embodiment is described below. FIG. 3 shows a method for processing a storage device path error provided in this embodiment, in which the processing device performs the following steps:
301. acquiring a second error code decision table through an IO control interface;
when the processing device is ready to process an error, the processing device may obtain a second error code decision table through the IO control interface, where the second error code decision table is obtained by modifying the first error code decision table by a user, the second error code decision table includes a correspondence between at least one error and a rule for processing the error, and the error is an error occurring in a process in which the processing device sends a request to the storage device.
302. The error is processed according to the second error code decision table.
After the processing device obtains the second error code decision table through the IO control interface, the processing device may process the error according to the second error code decision table.
In this embodiment, the error code decision table can be transferred between the user layer and the kernel layer of the processing device through the IO control interface. Whether or not an error has occurred while the processing device sends IO requests to the storage device, the processing device may at any time receive an instruction to modify the error code decision table, receive the second error code decision table carried in that instruction through the IO control interface, and modify the first error code decision table accordingly. This increases the flexibility with which the multipath software handles errors.
The method for processing a storage device path error in this embodiment is described above; another embodiment of the method is described below. As shown in FIG. 4, this other embodiment includes:
401. acquiring a second error code decision table through an input/output control IO control interface;
in this embodiment, when a user modifies a second error code decision table, the processing device obtains the second error code decision table through the IO control interface, where the second error code decision table is obtained by modifying the first error code decision table by the user, the second error code decision table includes a correspondence between at least one error and a rule for processing the error, and the error is an error occurring in a process in which the processing device sends a request to the storage device.
In this embodiment, a set of IO control interfaces is defined to synchronize the error code decision table between the multipath management tool of the user layer and the processing device of the kernel layer. When the multipath management tool reads the error code decision table of the processing device by calling the interface, the instruction Io_CTL(CTL_GET_ERROR_POLICY_TABLE, void) is used. When the multipath management tool updates the error code decision table of the processing device by calling the interface, the instruction Io_CTL(CTL_SET_ERROR_POLICY_TABLE, void TABLE) is used. Here, code refers to a source file written by a programmer in a language supported by a development tool; it is a well-defined system of rules for representing information in discrete form with characters, symbols, or signal elements. The principles of code design include uniqueness, standardization and versatility, extensibility and stability, ease of identification and memorization, brevity and uniform format, and ease of modification. It should be noted that the instructions used by the multipath management tool to read or update through the interface are not limited to those described above; this embodiment and the following embodiments take these two instructions as examples.
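The read and update calls can be illustrated with a small user-space simulation. This is a sketch under assumptions, not the driver interface itself: only the two command names come from the text, while the numeric command values, the `MultipathDriver` class, the `io_ctl` method, and the table layout are hypothetical.

```python
import copy

# Command names are taken from the text; the numeric values are made up.
CTL_GET_ERROR_POLICY_TABLE = 1
CTL_SET_ERROR_POLICY_TABLE = 2


class MultipathDriver:
    """Stand-in for the kernel-layer driver that holds the active table."""

    def __init__(self, table):
        self._table = copy.deepcopy(table)

    def io_ctl(self, cmd, table=None):
        """Dispatch a control command from the user-layer management tool."""
        if cmd == CTL_GET_ERROR_POLICY_TABLE:
            # Hand back a copy so the caller cannot edit the live table.
            return copy.deepcopy(self._table)
        if cmd == CTL_SET_ERROR_POLICY_TABLE:
            if table is None:
                raise ValueError("SET requires a table")
            self._table = copy.deepcopy(table)
            return None
        raise ValueError(f"unknown command: {cmd}")


# Usage: read the table, edit it in the user layer, then write it back.
driver = MultipathDriver([{"scsi_status": 0x02, "action": "retry"}])
edited = driver.io_ctl(CTL_GET_ERROR_POLICY_TABLE)
edited.append({"scsi_status": 0x08, "action": "failover"})
driver.io_ctl(CTL_SET_ERROR_POLICY_TABLE, edited)
```

Copying on both read and write keeps the user-layer edits separate from the active table until the SET command is issued, which mirrors the read, modify, update sequence described above.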
In this embodiment, a decision table is a tabular tool suited to describing situations with many processing and judgment conditions, where the conditions combine with one another and multiple decision schemes exist. The error code decision table maps multiple conditions to the actions to be performed once those conditions are satisfied, describing complex logic accurately and concisely. Unlike control statements in conventional programming languages, the error code decision table clearly expresses the direct relationship between multiple independent conditions and multiple actions. As shown in Table 1, the error code decision table includes at least: the application host status (host_status), the SCSI command layer error (SCSI_status), the error handling rule (action), the number of retransmissions (paramcount), the retransmission interval (interval), and the upper limit of the number of retransmissions (count).
TABLE 1 (error code decision table; provided as image BDA0001716969560000101 in the original)
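Since Table 1 is available only as an image, the sketch below models one row using the field names listed in the text (lower-cased here). The field types, the example values, the rule names, and the choice of seconds as the interval unit are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyEntry:
    """One row of a hypothetical error code decision table."""
    host_status: int   # application host status
    scsi_status: int   # SCSI command layer error
    action: str        # error handling rule
    paramcount: int    # number of retransmissions, as the text labels it
    interval: float    # retransmission interval (unit assumed: seconds)
    count: int         # upper limit of the number of retransmissions


# Example rows with made-up values.
example_table = [
    PolicyEntry(host_status=0, scsi_status=0x02, action="retry",
                paramcount=2, interval=0.5, count=3),
    PolicyEntry(host_status=1, scsi_status=0x00, action="failover",
                paramcount=0, interval=0.0, count=0),
]
```

Making the entries immutable is a design choice for this sketch: a table is changed by replacing it as a whole rather than by editing rows in place.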
402. Replacing the error code decision table of the kernel layer with the second error code decision table;
in this embodiment, the processing device replaces the original error code decision table in the kernel layer with the second error code decision table, which is obtained from the multipath management tool in the user layer through the IO control interface. After the replacement, the second error code decision table is the error code decision table of the kernel layer, and it serves as the basis the next time the error code decision table is updated.
403. When an error prompt message is received, determining a target error corresponding to the error prompt message;
In this embodiment, when an error occurs while the multipath driver is sending an IO request to the storage device, the processing device receives an error prompt message. There may be one error prompt message or several, which is not limited herein. The error prompt message carries an error path code that indicates the cause of the error, including the location of the error, the time of the error, or the number of occurrences of the error. The processing device may determine the corresponding target error according to the error path code in the error prompt message. The error prompt message may also carry the number of error prompts, the error warning level, or the time at which to process the error, which is not limited herein.
404. Judging whether a processing rule corresponding to the target error exists in an error code decision table of the kernel layer;
In this embodiment, after the processing device receives the error prompt message and determines the target error, it judges whether a processing rule corresponding to the target error exists in the error code decision table of the kernel layer. The content of the error code decision table of the kernel layer is the same as that of the second error code decision table described above, and a processing rule is an entry in the error code decision table of the kernel layer. As shown in Table 1, the error code decision table includes at least: application host status (host_status), SCSI command layer error (SCSI_status), error handling rule (action), number of retransmissions (paramcount), retransmission interval (interval), and upper limit of the number of retransmissions (count). At least one SCSI command layer error corresponds to at least one error handling rule. One SCSI command layer error in the error code decision table may correspond to multiple error handling rules, and one error handling rule may correspond to multiple SCSI command layer errors. This embodiment and the following embodiments describe only the case in which one SCSI command layer error corresponds to one error handling rule.
The error code decision table is explained below using the first and second rows of Table 1 as examples.
As shown in Table 1, when the application host status in the error path code corresponding to the target error is application host code 0 and the SCSI command layer error is SCSI error code 0, a corresponding processing rule (action) exists in the error code decision table.
If a processing rule corresponding to the target error exists in the error code decision table of the kernel layer, step 405 is executed;
if no processing rule corresponding to the target error exists in the error code decision table of the kernel layer, step 406 is executed.
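The judgment in step 404 and the branch to step 405 or step 406 amount to a keyed lookup. The following is a minimal sketch, assuming the table is keyed by the pair (host_status, scsi_status); the function names and the string labels for the steps are hypothetical.

```python
def find_rule(table, host_status, scsi_status):
    """Return the processing rule for the target error, or None if absent."""
    return table.get((host_status, scsi_status))


def handle_target_error(table, host_status, scsi_status):
    """Decide which step follows the judgment in step 404."""
    rule = find_rule(table, host_status, scsi_status)
    if rule is not None:
        # A rule exists: process the target error according to it (step 405).
        return ("step_405", rule)
    # No rule exists: prompt the user to modify the table (step 406).
    return ("step_406", None)


# Kernel-layer table with one illustrative entry.
kernel_table = {(0, 2): "retry_other"}
found = handle_target_error(kernel_table, 0, 2)
missing = handle_target_error(kernel_table, 1, 7)
```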
405. Processing the target error according to the processing rule;
In this embodiment, when a processing rule corresponding to the target error exists in the error code decision table of the kernel layer, the processing device processes the target error according to that rule. The error code decision table is explained below using the first and fourth rows of Table 1 as examples. As shown in Table 1, when the application host status in the error path code corresponding to the target error is application host code 0 and the SCSI command layer error is SCSI error code 2, the error code decision table contains a corresponding processing rule (action), namely retransmitting the IO request through another path (retry_other); the current number of retransmissions (paramcount) is 2, the retransmission interval (interval) is 10 unit times, and the upper limit of the number of retransmissions (count) is 3. The unit time may be 0.001 ms, 0.002 ms, or 0.0001 ms, and is not specifically limited herein.
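How the retry_other rule might use paramcount, interval, and count can be sketched as follows. The description does not spell out the retry loop, so the semantics here are an assumption: the request is resent on paths other than the failed one, waiting interval unit times before each attempt, until it succeeds or the number of retransmissions reaches count. The function signature, the `send` callback, and the path names are hypothetical.

```python
import itertools
import time


def retry_other(send, paths, failed_path, paramcount, interval, count,
                unit_seconds=1e-6, sleep=time.sleep):
    """Resend an IO request on paths other than failed_path.

    unit_seconds=1e-6 corresponds to a unit time of 0.001 ms.
    Returns (succeeded, number_of_retransmissions_so_far).
    """
    others = [p for p in paths if p != failed_path]
    if not others:
        return False, paramcount
    for path in itertools.cycle(others):
        if paramcount >= count:
            # Upper limit reached: give up.
            return False, paramcount
        sleep(interval * unit_seconds)  # wait the retransmission interval
        paramcount += 1
        if send(path):
            return True, paramcount


# Usage with the values from the description: paramcount=2, interval=10, count=3.
attempts = []


def fake_send(path):
    attempts.append(path)
    return path == "path_b"


ok, used = retry_other(fake_send, ["path_a", "path_b"], "path_a",
                       paramcount=2, interval=10, count=3,
                       sleep=lambda seconds: None)
```

With these values only one further retransmission is allowed, since two of the three permitted retransmissions have already been used.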
406. The user is prompted to modify the second error code decision table.
In this embodiment, when no processing rule corresponding to the target error exists in the error code decision table of the kernel layer, the processing device prompts the user to modify the second error code decision table. When the user is prompted, the user interface displays the entries of the second error code decision table in the kernel layer, the error path code, or other information that may indicate how the user should modify the second error code decision table, which is not limited herein. The user modifies the second error code decision table according to this information to obtain a third error code decision table, and step 401 is then executed to begin the next cycle of modifying the error code decision table.
In this embodiment, the error code decision table can be transferred between the user layer and the kernel layer of the processing device through the IO control interface. Whether or not an error occurs while the processing device is sending an IO request to the storage device, the processing device may receive an instruction to modify the error code decision table at any time, receive the second error code decision table carried in the instruction through the IO control interface, and modify the first error code decision table according to the second error code decision table. This increases the flexibility with which the multipath software handles errors.
The method for processing a storage device path error in this embodiment has been described above; a processing apparatus 500 in this embodiment is described below. As shown in FIG. 5, an embodiment of the processing apparatus 500 includes:
an obtaining unit 501, configured to obtain a second error code decision table through an IO control interface, where the second error code decision table is obtained by modifying a first error code decision table by a user, the second error code decision table includes a correspondence between at least one error and a rule for processing the error, and the error is an error occurring in a process of sending a request to a storage device by a processing apparatus;
a processing unit 502, configured to process the error according to the second error code decision table.
The processing unit 502 includes:
a replacement subunit 5021, configured to replace the error code decision table of the kernel layer with the second error code decision table;
the processing subunit 5022 is configured to process the error according to the error code decision table of the kernel layer.
The processing subunit 5022 includes:
a determining module 50221, configured to determine, when an error prompting message is received, a target error corresponding to the error prompting message;
a judging module 50222, configured to judge whether a processing rule corresponding to the target error exists in an error code decision table of the kernel layer;
the processing module 50223 is configured to process the target error according to a processing rule corresponding to the target error when the processing rule exists in the error code decision table of the kernel layer.
The processing device 500 further comprises:
a prompting unit 503, configured to prompt a user to modify the second error code decision table when the processing rule corresponding to the target error does not exist in the error code decision table of the kernel layer.
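The division of the processing apparatus 500 into units, subunits, and modules can be pictured as methods on one class. This is only a sketch of one possible logical division: the reference numerals in the comments come from the description, while the class, method names, and the message format are hypothetical.

```python
class ProcessingApparatus:
    """Illustrative model of processing apparatus 500 and its units."""

    def __init__(self, kernel_table):
        self.kernel_table = dict(kernel_table)
        self.prompts = []

    def obtain(self, second_table):
        # Obtaining unit 501: receive the second table via the IO control interface.
        return dict(second_table)

    def replace(self, second_table):
        # Replacement subunit 5021: replace the kernel-layer table.
        self.kernel_table = second_table

    def determine_target(self, message):
        # Determining module 50221: derive the target error from the message.
        return (message["host_status"], message["scsi_status"])

    def has_rule(self, target):
        # Judging module 50222: check whether a processing rule exists.
        return target in self.kernel_table

    def process(self, target):
        # Processing module 50223: apply the rule (here, just return it).
        return self.kernel_table[target]

    def prompt(self, target):
        # Prompting unit 503: record that the user must modify the table.
        self.prompts.append(target)

    def on_error(self, message):
        target = self.determine_target(message)
        if self.has_rule(target):
            return self.process(target)
        self.prompt(target)
        return None


apparatus = ProcessingApparatus({})
apparatus.replace(apparatus.obtain({(0, 2): "retry_other"}))
handled = apparatus.on_error({"host_status": 0, "scsi_status": 2})
unhandled = apparatus.on_error({"host_status": 1, "scsi_status": 7})
```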
In this embodiment, the error code decision table can be transferred between the user layer and the kernel layer of the processing device through the IO control interface. Whether or not an error occurs while the processing device is sending an IO request to the storage device, the processing device may receive an instruction to modify the error code decision table at any time, receive the second error code decision table carried in the instruction through the IO control interface, and modify the first error code decision table according to the second error code decision table. This increases the flexibility with which the multipath software handles errors.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a logical functional division, and other divisions are possible in practice: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as a stand-alone product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (9)

1. A method for processing a path error of a storage device is characterized by comprising the following steps:
acquiring a second error code decision table through an input/output (IO) control interface, wherein the second error code decision table is obtained by a user modifying a first error code decision table, the second error code decision table comprises a correspondence between at least one error and a rule for processing the error, and the error is an error occurring while a processing device sends a request to a storage device;
processing the error according to the second error code decision table;
the first error code decision table and the second error code decision table include at least: application host status, SCSI command layer error, error handling rule, number of retransmissions, retransmission interval, and upper limit of the number of retransmissions;
and the IO control interface is used for realizing the synchronization of the error code decision table between the multipath management tool of the user layer and the processing device of the kernel layer.
2. The method of claim 1, wherein the second error code decision table comprises: an operating system interface layer error, a Small Computer System Interface (SCSI) command layer error, and the operating system interface layer error or the SCSI command layer error.
3. The method of claim 1, wherein the processing the error according to the second error code decision table comprises:
replacing the error code decision table of the kernel layer with the second error code decision table;
and processing the error according to an error code decision table of the kernel layer.
4. The method of claim 3, wherein the processing the error according to the error code decision table of the kernel layer comprises:
when an error prompt message is received, determining a target error corresponding to the error prompt message;
judging whether a processing rule corresponding to the target error exists in an error code decision table of the kernel layer;
and if the processing rule corresponding to the target error exists in the error code decision table of the kernel layer, processing the target error according to the processing rule.
5. The method of claim 4, further comprising:
and if the processing rule corresponding to the target error does not exist in the error code decision table of the kernel layer, prompting a user to modify the second error code decision table.
6. A processing apparatus, comprising:
an obtaining unit, configured to obtain a second error code decision table through an input/output (IO) control interface, wherein the second error code decision table is obtained by a user modifying a first error code decision table, the second error code decision table comprises a correspondence between at least one error and a rule for processing the error, and the error is an error occurring while a processing device sends a request to a storage device;
a processing unit for processing the error according to the second error code decision table;
the first error code decision table and the second error code decision table include at least: application host status, SCSI command layer error, error handling rule, number of retransmissions, retransmission interval, and upper limit of the number of retransmissions;
and the IO control interface is used for realizing the synchronization of the error code decision table between the multipath management tool of the user layer and the processing device of the kernel layer.
7. The processing apparatus according to claim 6, wherein the processing unit comprises:
a replacement subunit, configured to replace the error code decision table of the kernel layer with the second error code decision table;
and the processing subunit is used for processing the error according to the error code decision table of the kernel layer.
8. The processing apparatus as in claim 7, wherein the processing subunit comprises:
a determining module, configured to determine a target error corresponding to an error prompt message when the error prompt message is received;
the judging module is used for judging whether a processing rule corresponding to the target error exists in an error code decision table of the kernel layer;
and the processing module is used for processing the target error according to the processing rule when the processing rule corresponding to the target error exists in the error code decision table of the kernel layer.
9. A computer-readable storage medium, characterized in that the computer storage medium stores a computer program comprising program instructions that, when executed by a processor, cause the processor to perform the method according to any of claims 1 to 5.
CN201810713250.8A 2018-06-29 2018-06-29 Storage device path error processing method and related device Active CN108984120B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810713250.8A CN108984120B (en) 2018-06-29 2018-06-29 Storage device path error processing method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810713250.8A CN108984120B (en) 2018-06-29 2018-06-29 Storage device path error processing method and related device

Publications (2)

Publication Number Publication Date
CN108984120A CN108984120A (en) 2018-12-11
CN108984120B true CN108984120B (en) 2021-11-09

Family

ID=64539897

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810713250.8A Active CN108984120B (en) 2018-06-29 2018-06-29 Storage device path error processing method and related device

Country Status (1)

Country Link
CN (1) CN108984120B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant