
WO2025232323A1 - System anomaly diagnosis method and apparatus, storage medium, and electronic device - Google Patents

System anomaly diagnosis method and apparatus, storage medium, and electronic device

Info

Publication number
WO2025232323A1
Authority
WO
WIPO (PCT)
Prior art keywords
subsystem
anomaly
joint
diagnosis
anomaly diagnosis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/CN2025/079139
Other languages
French (fr)
Chinese (zh)
Inventor
刘燚
李权威
刘子千
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen TCL Digital Technology Co Ltd
Original Assignee
Shenzhen TCL Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen TCL Digital Technology Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793 Remedial or corrective actions
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/079 Root cause analysis, i.e. error or fault diagnosis

Definitions

  • This application relates to the field of system anomaly handling technology, specifically to a system anomaly diagnosis method, device, storage medium, and electronic device.
  • Systems are widely used in various industries.
  • Anomalies in a system usually need to be detected promptly and reliably so that they can be handled and system stability ensured.
  • This application provides a system anomaly diagnosis scheme that enables rapid connection of subsystems within the system and joint anomaly diagnosis of multiple subsystems with low performance overhead, effectively improving the reliability of anomaly diagnosis and reliably ensuring system stability.
  • A system anomaly diagnosis method is provided for a system that includes multiple subsystems and an anomaly diagnosis framework.
  • the anomaly diagnosis framework includes an inter-process communication service and subsystem agents corresponding to each of the subsystems.
  • the method is applied to the anomaly diagnosis framework and includes: in response to detecting anomaly information sent by a target subsystem in the multiple subsystems through the inter-process communication service, distributing a corresponding message processing thread; receiving the anomaly information through the message processing thread and distributing the anomaly information to the subsystem agent corresponding to the target subsystem; and performing joint anomaly diagnosis through the subsystem agent corresponding to the target subsystem, in conjunction with the subsystem agents corresponding to other subsystems in the multiple subsystems, to obtain a joint anomaly diagnosis result.
  • the method further includes: making joint anomaly repair decisions for the multiple subsystems based on the joint anomaly diagnosis results to obtain a multi-subsystem joint repair strategy; and performing joint multi-subsystem repair of anomalies based on the multi-subsystem joint repair strategy.
  • Before distributing the corresponding message processing thread in response to detecting anomaly information sent by the target subsystem in the multiple subsystems through the inter-process communication service, the method further includes: detecting whether a subsystem in the multiple subsystems has connected to the anomaly diagnosis framework through the connection interface encapsulated by the anomaly diagnosis framework; and, if a subsystem in the multiple subsystems is detected to have connected to the anomaly diagnosis framework, creating a subsystem agent corresponding to that subsystem.
  • the step of performing joint anomaly diagnosis through the subsystem agent corresponding to the target subsystem, in conjunction with the subsystem agents corresponding to other subsystems in the plurality of subsystems, to obtain a joint anomaly diagnosis result includes: performing anomaly diagnosis through the subsystem agent corresponding to the target subsystem based on the anomaly information to obtain a first anomaly diagnosis result; if the subsystem agent corresponding to the target subsystem determines a first joint subsystem based on the first anomaly diagnosis result, then sending a collaborative diagnosis message to the subsystem agent corresponding to the first joint subsystem through the subsystem agent corresponding to the target subsystem; performing anomaly diagnosis through the subsystem agent corresponding to the first joint subsystem to obtain a second anomaly diagnosis result; and obtaining the joint anomaly diagnosis result based on the first anomaly diagnosis result and the second anomaly diagnosis result.
  • obtaining the joint anomaly diagnosis result based on the first anomaly diagnosis result and the second anomaly diagnosis result includes: if the subsystem agent corresponding to the first joint subsystem determines the second joint subsystem based on the second anomaly diagnosis result, then sending a collaborative diagnosis message to the subsystem agent corresponding to the second joint subsystem through the subsystem agent corresponding to the first joint subsystem; performing anomaly diagnosis through the subsystem agent corresponding to the second joint subsystem to obtain a third anomaly diagnosis result; and obtaining the joint anomaly diagnosis result based on the first anomaly diagnosis result, the second anomaly diagnosis result, and the third anomaly diagnosis result.
  • the method further includes: transmitting anomaly-related data to a preset reporting service, the preset reporting service being used to report the anomaly-related data to a cloud server; and receiving anomaly repair content delivered to the device, the anomaly repair content being obtained by the cloud server through analysis of the anomaly-related data.
  • the anomaly diagnosis framework further includes a framework performance self-test module; the method further includes: acquiring anomaly diagnosis processing related data in the system; and analyzing the anomaly diagnosis processing related data through the framework performance self-test module to obtain the framework performance in the anomaly diagnosis framework.
  • A system anomaly diagnosis device is provided for a system that includes multiple subsystems and an anomaly diagnosis framework.
  • the anomaly diagnosis framework includes an inter-process communication service and subsystem agents corresponding to each of the subsystems.
  • the device is applied to the anomaly diagnosis framework and includes: a connection processing module, configured to distribute a corresponding message processing thread in response to detecting anomaly information sent by a target subsystem in the multiple subsystems through the inter-process communication service; a data distribution module, configured to receive the anomaly information through the message processing thread and distribute the anomaly information to the subsystem agent corresponding to the target subsystem; and a joint diagnosis module, configured to perform joint anomaly diagnosis through the subsystem agent corresponding to the target subsystem and in conjunction with the subsystem agents corresponding to other subsystems in the multiple subsystems, to obtain a joint anomaly diagnosis result.
  • The device further includes an anomaly repair module that, after the joint anomaly diagnosis result has been obtained through the subsystem agent corresponding to the target subsystem in conjunction with the subsystem agents corresponding to other subsystems in the multiple subsystems, is used to: make joint anomaly repair decisions for the multiple subsystems based on the joint anomaly diagnosis results to obtain a multi-subsystem joint repair strategy; and perform joint multi-subsystem repair of anomalies based on the multi-subsystem joint repair strategy.
  • The device further includes an agent creation module that, before the corresponding message processing thread is distributed in response to detecting anomaly information sent by the target subsystem in the multiple subsystems through the inter-process communication service, is configured to: detect whether a subsystem in the multiple subsystems has connected to the anomaly diagnosis framework through the connection interface encapsulated by the anomaly diagnosis framework; and, if a subsystem in the multiple subsystems is detected to have connected to the anomaly diagnosis framework, create a subsystem agent corresponding to that subsystem.
  • the joint diagnosis module is configured to: perform anomaly diagnosis based on the anomaly information through the subsystem agent corresponding to the target subsystem to obtain a first anomaly diagnosis result; if the subsystem agent corresponding to the target subsystem determines a first joint subsystem based on the first anomaly diagnosis result, then send a collaborative diagnosis message to the subsystem agent corresponding to the first joint subsystem through the subsystem agent corresponding to the target subsystem; perform anomaly diagnosis through the subsystem agent corresponding to the first joint subsystem to obtain a second anomaly diagnosis result; and obtain the joint anomaly diagnosis result based on the first anomaly diagnosis result and the second anomaly diagnosis result.
  • the joint diagnosis module is configured to: if the subsystem agent corresponding to the first joint subsystem determines the second joint subsystem based on the second anomaly diagnosis result, then send a collaborative diagnosis message to the subsystem agent corresponding to the second joint subsystem through the subsystem agent corresponding to the first joint subsystem; perform anomaly diagnosis through the subsystem agent corresponding to the second joint subsystem to obtain a third anomaly diagnosis result; and obtain the joint anomaly diagnosis result based on the first anomaly diagnosis result, the second anomaly diagnosis result, and the third anomaly diagnosis result.
  • the device further includes a data reporting module, configured to: transmit anomaly-related data to a preset reporting service, the preset reporting service being configured to report the anomaly-related data to a cloud server; and receive anomaly repair content delivered to the device, the anomaly repair content being obtained by the cloud server through analysis of the anomaly-related data.
  • the anomaly diagnosis framework further includes a framework performance self-testing module; the framework performance self-testing module is used to: acquire anomaly diagnosis and processing related data in the system; and analyze the anomaly diagnosis and processing related data through the framework performance self-testing module to obtain the framework performance in the anomaly diagnosis framework.
  • a storage medium stores a computer program thereon, which, when executed by a computer's processor, causes the computer to perform the methods described in the embodiments of this application.
  • an electronic device may include: a memory storing a computer program; and a processor reading the computer program stored in the memory to execute the methods described in the embodiments of this application.
  • a computer program product or computer program includes computer instructions stored in a computer-readable storage medium.
  • a processor of a computer device reads the computer instructions from the computer-readable storage medium and executes the computer instructions, causing the computer device to perform the methods provided in the various optional implementations described in the embodiments of this application.
  • the system includes multiple subsystems and an anomaly diagnosis framework.
  • the anomaly diagnosis framework includes an inter-process communication service and subsystem agents corresponding to each of the subsystems.
  • the method is applied to the anomaly diagnosis framework and includes: in response to detecting anomaly information sent by a target subsystem in the multiple subsystems through the inter-process communication service, distributing a corresponding message processing thread; receiving the anomaly information through the message processing thread and distributing the anomaly information to the subsystem agent corresponding to the target subsystem; and performing joint anomaly diagnosis through the subsystem agent corresponding to the target subsystem, in conjunction with the subsystem agents corresponding to other subsystems in the multiple subsystems, to obtain a joint anomaly diagnosis result.
  • an anomaly diagnosis framework is configured in the system.
  • the framework includes an inter-process communication service and subsystem agents corresponding to each subsystem.
  • When a subsystem sends anomaly information through the inter-process communication service, the framework quickly dispatches the corresponding message processing thread to receive and distribute the anomaly information. This allows for rapid connection of the subsystems.
  • Through the subsystem agent corresponding to the target subsystem, the framework quickly links with the subsystem agents corresponding to other subsystems in the multiple subsystems to perform joint anomaly diagnosis. This combines the capabilities of multiple subsystems to accurately obtain joint anomaly diagnosis results. Overall, the scheme can achieve rapid connection of subsystems in the system and joint anomaly diagnosis across multiple subsystems with low performance overhead, effectively improving the reliability of anomaly diagnosis in the system and reliably ensuring system stability.
  • Figure 1 shows a flowchart of a system anomaly diagnosis method according to an embodiment of this application.
  • Figure 2 shows a flowchart of a subsystem agent creation process according to an embodiment of this application.
  • Figure 3 shows a flowchart of a system anomaly joint diagnosis according to an embodiment of this application.
  • Figure 4 shows a block diagram of a system anomaly diagnosis device according to an embodiment of this application.
  • Figure 5 shows a block diagram of an electronic device according to an embodiment of this application.
  • the terms “comprising,” “including,” or any other variations thereof are intended to cover non-exclusive inclusion, such that a method or apparatus that includes a list of elements includes not only the elements expressly described, but also other elements not expressly listed, or elements inherent to implementing the method or apparatus.
  • an element defined by the phrase “comprising a” does not exclude the presence of other related elements (e.g., steps in the method or units in the apparatus, such as portions of circuitry, processors, programs, or software, etc.) in the method or apparatus that includes that element.
  • The system anomaly diagnosis method provided in this disclosure includes a series of steps, but it is not limited to the steps described.
  • The system anomaly diagnosis device provided in this disclosure includes a series of units, but it is not limited to the units explicitly described and may also include units that need to be set up for obtaining relevant information or processing based on information.
  • Figure 1 schematically illustrates a flowchart of a system anomaly diagnosis method according to an embodiment of this application.
  • the subject executing this system anomaly diagnosis method can be any device with processing capabilities, such as a television, computer, mobile phone, smartwatch, and home appliance.
  • the device can install a system, which includes multiple subsystems and an anomaly diagnosis framework.
  • the anomaly diagnosis framework includes an inter-process communication service and subsystem agents corresponding to each of the subsystems.
  • the method can be applied to the anomaly diagnosis framework.
  • The system anomaly diagnosis method may include steps S110 to S130.
  • Step S110: In response to detecting anomaly information sent by a target subsystem in the multiple subsystems through the inter-process communication service, a corresponding message processing thread is dispatched.
  • Step S120: The anomaly information is received through the message processing thread and distributed to the subsystem agent corresponding to the target subsystem.
  • Step S130: The subsystem agent corresponding to the target subsystem, together with the subsystem agents corresponding to other subsystems in the multiple subsystems, performs joint anomaly diagnosis to obtain a joint anomaly diagnosis result.
  • the system includes at least multiple subsystems and an anomaly diagnosis framework.
  • These subsystems may include, for example, wireless network (Wi-Fi), audio, Bluetooth, multimedia, and kernel subsystems, which can reside at the native layer, system layer, or application layer.
  • the anomaly diagnosis framework is used to jointly diagnose anomalies across these multiple subsystems; this framework can reside at the native layer.
  • the anomaly diagnosis framework can include pre-created or dynamically created cross-process communication services such as UnixDomainSocket Server or AIDL Service.
  • This framework can monitor messages or data sent by multiple subsystems across processes using these services.
  • In response to anomaly information from a target subsystem (which can be any one of the multiple subsystems) being detected by the cross-process communication service, the framework can immediately dispatch the message processing thread corresponding to the target subsystem to receive and distribute the anomaly information, thus enabling the framework to quickly connect to that target subsystem. Since the target subsystem can be any one of the multiple subsystems, the framework can quickly connect to any subsystem within the entire network.
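  • The dispatch behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes a Unix-like host and uses Python's standard `socketserver.ThreadingUnixStreamServer`, which starts one handler thread per incoming connection. The JSON message layout, the `AnomalyHandler` and `send_anomaly` names, and the `"ack"` reply are all hypothetical.

```python
import json
import os
import socket
import socketserver
import tempfile
import threading

RECEIVED = []  # (handler thread name, decoded message); illustrative only


class AnomalyHandler(socketserver.StreamRequestHandler):
    """Runs in its own thread for each subsystem connection."""

    def handle(self):
        line = self.rfile.readline()
        if not line:
            return
        message = json.loads(line)  # hypothetical one-line JSON message
        RECEIVED.append((threading.current_thread().name, message))
        self.wfile.write(b"ack\n")


def send_anomaly(path, subsystem, detail):
    """Hypothetical client side: a subsystem reporting anomaly information."""
    payload = json.dumps({"subsystem": subsystem, "detail": detail}) + "\n"
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(path)
        client.sendall(payload.encode("utf-8"))
        with client.makefile("rb") as reply:
            return reply.readline().strip()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "diag.sock")
        with socketserver.ThreadingUnixStreamServer(path, AnomalyHandler) as server:
            threading.Thread(target=server.serve_forever, daemon=True).start()
            try:
                return send_anomaly(path, "media", "playback buffer underrun")
            finally:
                server.shutdown()
```

  • Calling `demo()` starts the server, sends one anomaly message from a simulated Media subsystem, and returns the server's reply. A production framework would hand the decoded message to the matching subsystem agent instead of appending it to a list.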
  • the anomaly diagnosis framework can also create subsystem agents corresponding to each subsystem.
  • A subsystem agent is a module that can obtain data from its corresponding subsystem and proxy that subsystem's anomaly repair capabilities.
  • Subsystem agents can interact with each other through messages to collaboratively perform anomaly diagnosis.
  • the anomaly diagnosis framework can receive anomaly information through a message processing thread and distribute the anomaly information to the subsystem agent corresponding to the target subsystem. Through the subsystem agent corresponding to the target subsystem, it can perform joint anomaly diagnosis in conjunction with the subsystem agents corresponding to other subsystems in multiple subsystems, thereby accurately obtaining the joint anomaly diagnosis results.
  • an anomaly diagnosis framework is configured in the system.
  • the anomaly diagnosis framework includes an inter-process communication service and subsystem agents corresponding to each subsystem.
  • the anomaly diagnosis framework can quickly distribute the corresponding message processing thread to receive and distribute the anomaly information, thereby quickly connecting each subsystem.
  • Through the subsystem agent corresponding to the target subsystem, the framework can quickly link with the subsystem agents corresponding to other subsystems in the multiple subsystems and combine their capabilities to perform joint anomaly diagnosis and accurately obtain the joint anomaly diagnosis result. Overall, the scheme can quickly connect the subsystems in the system and perform joint anomaly diagnosis across multiple subsystems with low performance overhead, effectively improving the reliability of anomaly diagnosis in the system and reliably ensuring system stability.
  • Before distributing the corresponding message processing thread in response to detecting anomaly information sent by the target subsystem in the multiple subsystems through the inter-process communication service, the method further includes: step S210, detecting whether a subsystem in the multiple subsystems has connected to the anomaly diagnosis framework through the connection interface encapsulated by the anomaly diagnosis framework; and step S220, if it is detected that a subsystem in the multiple subsystems has connected to the anomaly diagnosis framework, creating a subsystem agent corresponding to that subsystem.
  • the anomaly diagnosis framework encapsulates connection interfaces, which are used by subsystems to access the anomaly diagnosis framework.
  • connection interfaces are Software Development Kit (SDK) interfaces (APIs), which can include application (APP), system, and kernel SDK interfaces.
  • the anomaly diagnosis framework can detect whether a subsystem in multiple subsystems has connected to the anomaly diagnosis framework through an encapsulated connection interface. If a subsystem is detected to have connected to the anomaly diagnosis framework through an encapsulated connection interface, the anomaly diagnosis framework can create a subsystem agent corresponding to that subsystem. Through the subsystem agent, the data of that subsystem and its anomaly repair capabilities can be obtained.
  • the anomaly diagnosis framework can create subsystem agents corresponding to these three subsystems.
  • the anomaly diagnosis framework can then obtain the data of these three subsystems and their anomaly repair capabilities through the subsystem agents.
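  • As a rough sketch of steps S210 and S220, the connection interface can be modelled as a single registration call that creates an agent the first time a subsystem connects. Everything here is an assumption made for illustration: the `connect` method stands in for the encapsulated SDK interface, the `read_data` and `repair` hooks stand in for the subsystem's data and repair capabilities, and the three subsystem names are examples only.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class SubsystemAgent:
    name: str
    read_data: Callable[[], dict]   # hypothetical hook: fetch subsystem data
    repair: Callable[[str], bool]   # hypothetical hook: invoke a repair action


class DiagnosisFramework:
    def __init__(self) -> None:
        self.agents: Dict[str, SubsystemAgent] = {}

    def connect(self, name: str,
                read_data: Callable[[], dict],
                repair: Callable[[str], bool]) -> SubsystemAgent:
        """Hypothetical SDK entry point a subsystem calls to join the framework."""
        if name not in self.agents:          # S210: first connection detected
            self.agents[name] = SubsystemAgent(name, read_data, repair)  # S220
        return self.agents[name]


framework = DiagnosisFramework()
for subsystem in ("wifi", "audio", "media"):
    framework.connect(
        subsystem,
        read_data=lambda n=subsystem: {"subsystem": n, "status": "ok"},
        repair=lambda action: True,
    )
```

  • After the loop, `framework.agents` holds one agent per connected subsystem, and the framework can read subsystem data or trigger a repair through that agent rather than talking to the subsystem directly.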
  • the step of performing joint anomaly diagnosis by combining the subsystem agent corresponding to the target subsystem with the subsystem agents corresponding to other subsystems in the multiple subsystems to obtain the joint anomaly diagnosis result may specifically include:
  • Step S310: The subsystem agent corresponding to the target subsystem performs anomaly diagnosis based on the anomaly information to obtain a first anomaly diagnosis result;
  • Step S320: If the subsystem agent corresponding to the target subsystem determines a first joint subsystem based on the first anomaly diagnosis result, a collaborative diagnosis message is sent to the subsystem agent corresponding to the first joint subsystem through the subsystem agent corresponding to the target subsystem;
  • Step S330: The subsystem agent corresponding to the first joint subsystem performs anomaly diagnosis to obtain a second anomaly diagnosis result;
  • Step S340: The joint anomaly diagnosis result is obtained based on the first anomaly diagnosis result and the second anomaly diagnosis result.
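  • Steps S310 to S340 can be sketched as a single function. This is a simplified illustration under stated assumptions: each agent is modelled as a plain callable that returns a result string plus the names of any joint subsystems it wants consulted, and the "collaborative diagnosis message" is reduced to a direct function call. The agent behaviours and result strings are invented for the example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical agent signature: diagnose(info) -> (result, joint subsystem names)
Diagnose = Callable[[str], Tuple[str, List[str]]]


def joint_diagnosis(agents: Dict[str, Diagnose],
                    target: str,
                    anomaly_info: str) -> List[Tuple[str, str]]:
    # S310: the target subsystem's agent diagnoses the reported anomaly.
    first_result, joint_subsystems = agents[target](anomaly_info)
    results = [(target, first_result)]
    # S320: ask each first joint subsystem's agent for collaborative diagnosis.
    for name in joint_subsystems:
        # S330: the joint agent produces (part of) the second diagnosis result.
        second_result, _ = agents[name](first_result)
        results.append((name, second_result))
    # S340: the joint result is the collection of individual results.
    return results


agents: Dict[str, Diagnose] = {
    "media": lambda info: ("playback buffer insufficient", ["wifi", "audio"]),
    "wifi": lambda info: ("network connection quality degraded", []),
    "audio": lambda info: ("audio decoding efficiency normal", []),
}

result = joint_diagnosis(agents, "media", "audio and video out of sync")
```

  • Here `result` lists the Media agent's finding first, followed by the findings of the Wi-Fi and Audio agents it asked for help.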
  • the subsystem agent corresponding to the target subsystem can perform anomaly diagnosis based on the anomaly information to obtain the first anomaly diagnosis result.
  • For example, when a user launches a video-on-demand (VOD) application, the Media subsystem monitors buffering, playback, stuttering, and audio-visual synchronization information.
  • When the Media subsystem detects an anomaly, it sends the anomaly information to the anomaly diagnosis framework through a specific channel (a channel used by the subsystem to send anomaly information to the anomaly diagnosis framework).
  • When the anomaly diagnosis framework detects the anomaly information sent by the target subsystem through the inter-process communication service, it can immediately dispatch the message processing thread corresponding to the target subsystem to receive the anomaly information.
  • the message processing thread then distributes the anomaly information to the Media subsystem's corresponding Media subsystem agent, which in turn obtains the anomaly information.
  • The Media subsystem agent inside the anomaly diagnosis framework can analyze and diagnose the anomaly information and obtain the first anomaly diagnosis result: "the video axis and audio axis cannot be associated, and a data buffering problem caused by insufficient playback buffer is found".
  • The subsystem agent corresponding to the target subsystem determines, based on the first anomaly diagnosis result, whether collaborative diagnosis by a first joint subsystem is required. If so, a collaborative diagnosis message is sent from the subsystem agent corresponding to the target subsystem to the subsystem agent corresponding to the first joint subsystem.
  • the first joint subsystem includes one or more subsystems from multiple subsystems within the system, excluding the target subsystem.
  • the subsystem agent corresponding to the target subsystem determines the first joint subsystem based on the first anomaly diagnosis result by: querying the first anomaly category corresponding to the first anomaly diagnosis result from a preset diagnosis table, and further querying the first joint subsystem corresponding to that first anomaly category from the preset diagnosis table.
  • the preset diagnosis table may include preset anomaly categories corresponding to different anomaly diagnosis results and joint subsystems corresponding to different anomaly categories.
  • the preset diagnosis table may be configured within the anomaly diagnosis framework, and the subsystem agent corresponding to the target subsystem can obtain the preset diagnosis table from the anomaly diagnosis framework.
  • the Media subsystem agent of the Media subsystem can query the preset diagnosis table to find that the first anomaly category corresponding to the first anomaly diagnosis result is "Category 1".
  • the joint subsystems corresponding to "Category 1" in the preset diagnosis table include the Wi-Fi subsystem and the Audio subsystem. Therefore, the Media subsystem agent can further determine that other first joint subsystems need to perform collaborative diagnosis, and the first joint subsystems specifically include the Wi-Fi subsystem and the Audio subsystem.
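  • The two-step lookup in the preset diagnosis table can be sketched with two dictionaries. The table layout and the short result key are assumptions for illustration; only "Category 1" and its Wi-Fi and Audio joint subsystems come from the example above.

```python
from typing import Dict, List, Optional

# Hypothetical table layout: diagnosis result -> anomaly category.
RESULT_TO_CATEGORY: Dict[str, str] = {
    "playback buffer insufficient": "Category 1",
}

# Anomaly category -> joint subsystems to consult.
CATEGORY_TO_JOINT: Dict[str, List[str]] = {
    "Category 1": ["wifi", "audio"],
}


def lookup_category(diagnosis_result: str) -> Optional[str]:
    """First query: map a diagnosis result to its preset anomaly category."""
    return RESULT_TO_CATEGORY.get(diagnosis_result)


def lookup_joint_subsystems(diagnosis_result: str) -> List[str]:
    """Second query: map the category to the joint subsystems, if any."""
    category = lookup_category(diagnosis_result)
    if category is None:
        return []  # no collaborative diagnosis needed
    return CATEGORY_TO_JOINT.get(category, [])
```

  • An empty list means the agent found no category that calls for collaborative diagnosis, so its own result stands alone.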
  • the subsystem agent corresponding to the first joint subsystem in the anomaly diagnosis framework can perform further anomaly diagnosis to obtain a second anomaly diagnosis result.
  • the Wi-Fi subsystem agent can combine the Wi-Fi diagnosis subsystem to diagnose the Wi-Fi situation, confirming the Wi-Fi hardware status, network connection status, network connection quality, etc., thereby obtaining the diagnostic result for the Wi-Fi module;
  • the Audio subsystem agent can combine Media keyframe data to perform audio-visual analysis, such as analyzing Media decoding efficiency and Audio decoding efficiency, to obtain the diagnostic result for the Audio module. Therefore, the second anomaly diagnosis result can include the diagnostic results for both the Wi-Fi module and the Audio module.
  • the subsystem agent corresponding to the target subsystem in the anomaly diagnosis framework can obtain the first and second anomaly diagnosis results.
  • the set of the first and second anomaly diagnosis results can be used as the joint anomaly diagnosis result.
  • This joint diagnosis mechanism can quickly link multiple systems to obtain accurate joint anomaly diagnosis results, ensuring the reliability of system anomaly diagnosis.
  • Based on this joint anomaly diagnosis result, system anomalies can be reliably repaired, ensuring system operational stability.
  • obtaining the joint anomaly diagnosis result based on the first anomaly diagnosis result and the second anomaly diagnosis result may include: if the subsystem agent corresponding to the first joint subsystem determines the second joint subsystem based on the second anomaly diagnosis result, then sending a collaborative diagnosis message to the subsystem agent corresponding to the second joint subsystem through the subsystem agent corresponding to the first joint subsystem; performing anomaly diagnosis through the subsystem agent corresponding to the second joint subsystem to obtain a third anomaly diagnosis result; and obtaining the joint anomaly diagnosis result based on the first anomaly diagnosis result, the second anomaly diagnosis result, and the third anomaly diagnosis result.
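  • That is, if the subsystem agent corresponding to the first joint subsystem determines, based on the second anomaly diagnosis result, that a second joint subsystem is needed, a collaborative diagnosis message is sent to the subsystem agent corresponding to the second joint subsystem through the subsystem agent corresponding to the first joint subsystem.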
  • the subsystem agent corresponding to the first joint subsystem determines the second joint subsystem based on the second anomaly diagnosis result by: querying the second anomaly category corresponding to the second anomaly diagnosis result from a preset diagnosis table, and further querying the second joint subsystem corresponding to the second anomaly category from the preset diagnosis table.
  • the preset diagnosis table may include preset anomaly categories corresponding to different anomaly diagnosis results and joint subsystems corresponding to different anomaly categories.
  • the preset diagnosis table may be configured in the anomaly diagnosis framework, and the subsystem agent corresponding to the first joint subsystem can obtain the preset diagnosis table from the anomaly diagnosis framework.
  • the Wi-Fi subsystem agent can query the preset diagnosis table and find that the diagnosis result for the Wi-Fi module (the second anomaly diagnosis result) corresponds to the second anomaly category "Category 2-1". If the joint subsystem corresponding to "Category 2-1" in the preset diagnosis table is the kernel subsystem, the Wi-Fi subsystem agent determines that the second joint subsystem is the kernel subsystem, and can then send a collaborative diagnosis message to the subsystem agent of the kernel subsystem.
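The two-step lookup in the preset diagnosis table (diagnosis result to anomaly category, then anomaly category to joint subsystem) can be sketched as follows. This is a minimal illustration, not the patented implementation: the dictionary layout, the key `wifi_module_fault`, and the function name `find_joint_subsystem` are assumptions; only "Category 2-1" and the kernel subsystem come from the example in the text.

```python
from typing import Optional

# Hypothetical contents of the preset diagnosis table, split into the two
# mappings the text describes. Only "Category 2-1" -> kernel is from the text.
RESULT_TO_CATEGORY = {
    "wifi_module_fault": "Category 2-1",
}
CATEGORY_TO_JOINT_SUBSYSTEM = {
    "Category 2-1": "kernel",
}


def find_joint_subsystem(diagnosis_result: str) -> Optional[str]:
    """Return the joint subsystem for a diagnosis result, or None if the
    table has no entry (no collaborative diagnosis is requested)."""
    category = RESULT_TO_CATEGORY.get(diagnosis_result)
    if category is None:
        return None
    return CATEGORY_TO_JOINT_SUBSYSTEM.get(category)
```

Returning `None` for an unknown result models the case in which an agent does not determine any further joint subsystem.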
  • after receiving the collaborative diagnosis message, the subsystem agent corresponding to the second joint subsystem performs anomaly diagnosis and obtains a third anomaly diagnosis result.
  • the subsystem agent of the kernel subsystem can check memory, CPU, and other conditions to perform diagnosis and obtain a third anomaly diagnosis result.
  • the subsystem agent corresponding to the second joint subsystem can further feed back the third anomaly diagnosis result to the subsystem agent corresponding to the target subsystem.
  • the subsystem agent corresponding to the target subsystem in the anomaly diagnosis framework can obtain the first, second, and third anomaly diagnosis results.
  • the set of these three results can serve as the joint anomaly diagnosis result.
  • This joint diagnosis mechanism allows for rapid linkage of multiple systems to obtain accurate joint anomaly diagnosis results, ensuring the reliability of system anomaly diagnosis. Simultaneously, based on this joint anomaly diagnosis result, system anomalies can be reliably repaired, ensuring system operational stability.
  • the anomaly diagnosis framework can rapidly link multiple subsystem agents for multi-subsystem joint diagnosis.
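The chained collaboration described above (target agent, then first joint agent, then second joint agent, with all results collected into the joint result) can be sketched as a simple loop. The class names, the `next_subsystem` field, the `max_hops` guard, and the sample findings are assumptions made for illustration; the text only specifies that each agent may name a further joint subsystem and that the set of results forms the joint anomaly diagnosis result.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class DiagnosisResult:
    subsystem: str
    finding: str
    next_subsystem: Optional[str] = None  # joint subsystem to consult, if any


class SubsystemAgent:
    def __init__(self, name: str, diagnose: Callable[[str], DiagnosisResult]):
        self.name = name
        self._diagnose = diagnose

    def diagnose(self, message: str) -> DiagnosisResult:
        return self._diagnose(message)


def joint_diagnosis(agents: Dict[str, SubsystemAgent], target: str,
                    anomaly_info: str, max_hops: int = 3) -> List[DiagnosisResult]:
    """Collect results along the chain target -> first joint -> second joint."""
    results: List[DiagnosisResult] = []
    visited = set()
    current: Optional[str] = target
    message = anomaly_info
    while (current is not None and current in agents
           and current not in visited and len(results) < max_hops):
        visited.add(current)
        result = agents[current].diagnose(message)
        results.append(result)
        message = result.finding          # stands in for the collaborative diagnosis message
        current = result.next_subsystem   # None ends the chain
    return results


def build_example_agents() -> Dict[str, SubsystemAgent]:
    # Illustrative agents only; findings loosely follow the Media/Wi-Fi/kernel example.
    return {
        "media": SubsystemAgent("media", lambda m: DiagnosisResult("media", "insufficient CPU", "wifi")),
        "wifi": SubsystemAgent("wifi", lambda m: DiagnosisResult("wifi", "insufficient CPU", "kernel")),
        "kernel": SubsystemAgent("kernel", lambda m: DiagnosisResult("kernel", "insufficient CPU caused by I/O")),
    }
```

The `visited` set and `max_hops` limit are a defensive choice for the sketch so that two agents naming each other cannot loop indefinitely.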
  • the method further includes: making joint anomaly repair decisions for the plurality of subsystems based on the joint anomaly diagnosis results to obtain a multi-subsystem joint repair strategy; and performing joint multi-subsystem repair of anomalies based on the multi-subsystem joint repair strategy.
  • a joint repair strategy for multiple subsystems is obtained by making joint repair decisions for multiple subsystems.
  • the joint repair of anomalies in multiple subsystems based on the joint repair strategy can effectively improve the reliability of anomaly repair.
  • the specific method for making joint anomaly repair decisions for the multiple subsystems based on the joint anomaly diagnosis results to obtain a multi-subsystem joint repair strategy may include: extracting abnormal issues from the joint anomaly diagnosis results; if at least two abnormal issues are extracted, prioritizing the at least two abnormal issues to obtain the anomaly handling priority for each abnormal issue, and the multi-subsystem joint repair strategy is to perform anomaly repair according to the anomaly handling priority.
  • the step of prioritizing the at least two abnormal issues to obtain the anomaly handling priority for each abnormal issue may include: querying the anomaly handling priority of each extracted abnormal issue from a preset priority table, wherein the preset priority table includes pre-defined anomaly handling priorities for different abnormal issues, and the preset priority table can be pre-configured in the anomaly diagnosis framework.
  • an anomaly may be caused by anomalies in multiple subsystems. Without a joint anomaly repair decision for these subsystems to arrive at a multi-subsystem joint repair strategy, functional conflicts may arise from different subsystems calling the final anomaly repair capabilities, leading to inaccurate anomaly repair.
  • the joint anomaly diagnosis results include the Media subsystem agent detecting insufficient CPU, the Wi-Fi subsystem agent also detecting insufficient CPU, and the kernel subsystem agent detecting insufficient CPU due to I/O. Two abnormal issues can be extracted: "I/O problem" and "Insufficient CPU". For the "Insufficient CPU" problem, the Media and Wi-Fi subsystem agents can request CPU scheduling to repair it.
  • a multi-subsystem joint repair strategy (which repairs anomalies according to the aforementioned anomaly handling priority) can prioritize addressing the I/O problem before requesting CPU scheduling, thereby effectively improving the reliability of anomaly repair.
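The priority-based repair decision can be sketched as follows: extract the distinct abnormal issues from the joint result, then order them by a preset priority table. The table contents, the issue identifiers, and the record layout are assumptions for illustration; the ordering (I/O problem before insufficient CPU) follows the example in the text.

```python
from typing import Dict, List

# Hypothetical preset priority table: a lower number is handled first.
PRIORITY_TABLE: Dict[str, int] = {
    "io_problem": 0,
    "insufficient_cpu": 1,
}


def extract_issues(joint_result: List[dict]) -> List[str]:
    """Collect distinct abnormal issues across all agents, in first-seen order."""
    issues: List[str] = []
    for entry in joint_result:
        for issue in entry["issues"]:
            if issue not in issues:
                issues.append(issue)
    return issues


def build_repair_plan(joint_result: List[dict]) -> List[str]:
    """Return issues in handling order; prioritization applies to two or more issues."""
    issues = extract_issues(joint_result)
    if len(issues) < 2:
        return issues
    # Issues missing from the table are handled last.
    return sorted(issues, key=lambda i: PRIORITY_TABLE.get(i, len(PRIORITY_TABLE)))


EXAMPLE_JOINT_RESULT = [
    {"subsystem": "media", "issues": ["insufficient_cpu"]},
    {"subsystem": "wifi", "issues": ["insufficient_cpu"]},
    {"subsystem": "kernel", "issues": ["insufficient_cpu", "io_problem"]},
]
```

With the example input, the plan addresses the I/O problem first, so the Media and Wi-Fi agents do not both request CPU scheduling for a shortage whose cause lies elsewhere.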
  • the method further includes: transmitting anomaly-related data to a preset reporting service, the preset reporting service being used to report the anomaly-related data to a cloud server; and receiving anomaly repair content transmitted within the device, the anomaly repair content being obtained by the cloud server through analysis of the anomaly-related data and then sent to the device.
  • the anomaly diagnosis framework transmits anomaly-related data to a preset reporting service (such as a preset reporting application). This service then reports the anomaly-related data to the cloud server.
  • the anomaly diagnosis framework resides in the native layer and hands data to the preset reporting service (such as a preset reporting application) for network reporting. Because the system layer, the native layer, and app scheduling each impose limitations on reporting, routing the data through the preset reporting service avoids those limitations and enables rapid data reporting through mechanisms such as the log system.
  • the anomaly diagnosis framework can receive anomaly repair content transmitted within the device.
  • this anomaly repair content can be generated by a cloud server based on anomaly-related data analysis and transmitted to the device.
  • This anomaly repair content can be used in the anomaly diagnosis framework to make joint anomaly repair decisions, thereby further improving the reliability of anomaly repair.
  • the anomaly diagnosis framework makes joint anomaly repair decisions based on a repair knowledge graph.
  • the anomaly diagnosis framework can further adjust the repair knowledge graph based on the received anomaly repair content, and make joint anomaly repair decisions based on the adjusted repair knowledge graph to obtain a multi-subsystem joint repair strategy.
  • the anomaly diagnosis framework also includes a framework performance self-test module; the method further includes: acquiring anomaly diagnosis and processing related data in the system; and analyzing the anomaly diagnosis and processing related data through the framework performance self-test module to obtain the framework performance in the anomaly diagnosis framework.
  • the anomaly diagnosis framework is configured with a framework performance self-test module.
  • This module can acquire relevant data on anomaly diagnosis and processing in the system, analyze the data, and obtain the framework performance of the anomaly diagnosis framework. This enables the anomaly diagnosis framework to self-test its performance. Based on the test results, corresponding processing can be performed to further improve the framework performance of the anomaly diagnosis framework.
  • the framework performance self-test module may include units such as the data entry performance test unit, the data accuracy test unit, the subsystem diagnosis test unit, and the subsystem repair test unit.
  • the subsystem diagnosis test unit can self-test its anomaly diagnosis performance.
  • the subsystem repair test unit can self-test its anomaly repair performance.
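As a rough sketch of what the framework performance self-test might compute, the following aggregates handling records into a few indicators. The text does not define the metrics, so the record fields (`diagnosis_ms`, `diagnosis_correct`, `repair_succeeded`) and the three indicators are purely assumptions chosen to mirror the diagnosis and repair test units.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class HandlingRecord:
    subsystem: str
    diagnosis_ms: float        # time taken to produce a diagnosis (assumed field)
    diagnosis_correct: bool    # whether the diagnosis was later confirmed (assumed field)
    repair_succeeded: bool     # whether the repair resolved the anomaly (assumed field)


def framework_performance(records: List[HandlingRecord]) -> Dict[str, float]:
    """Summarize anomaly diagnosis and processing data into simple indicators."""
    if not records:
        return {}
    total = len(records)
    return {
        "mean_diagnosis_ms": mean(r.diagnosis_ms for r in records),
        "diagnosis_accuracy": sum(r.diagnosis_correct for r in records) / total,
        "repair_success_rate": sum(r.repair_succeeded for r in records) / total,
    }
```

Indicators of this kind could then drive the "corresponding processing" mentioned above, for example flagging a subsystem agent whose repair success rate drops.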
  • FIG. 4 shows a block diagram of a system anomaly diagnosis device according to an embodiment of this application.
  • a system anomaly diagnosis device 400 is provided.
  • the system includes multiple subsystems and an anomaly diagnosis framework.
  • the anomaly diagnosis framework includes an inter-process communication service and subsystem agents corresponding to each of the subsystems.
  • the device is applied to the anomaly diagnosis framework.
  • the system anomaly diagnosis device 400 may include: a connection processing module 410, which can be used to distribute corresponding message processing threads in response to anomaly information sent by a target subsystem in the multiple subsystems detected through the inter-process communication service; a data distribution module 420, which can be used to receive the anomaly information through the message processing threads and distribute the anomaly information to the subsystem agent corresponding to the target subsystem; and a joint diagnosis module 430, which can be used to perform joint anomaly diagnosis through the subsystem agent corresponding to the target subsystem and in conjunction with the subsystem agents corresponding to other subsystems in the multiple subsystems to obtain a joint anomaly diagnosis result.
  • the device after performing joint anomaly diagnosis by combining the subsystem agent corresponding to the target subsystem with the subsystem agents corresponding to other subsystems in the multiple subsystems to obtain joint anomaly diagnosis results, the device further includes an anomaly repair module, used to: make joint anomaly repair decisions for the multiple subsystems based on the joint anomaly diagnosis results to obtain a multi-subsystem joint repair strategy; and perform joint multi-subsystem repair of anomalies based on the multi-subsystem joint repair strategy.
  • before distributing the corresponding message processing thread in response to detecting anomaly information sent by the target subsystem in the multiple subsystems through the inter-process communication service, the device further includes an agent creation module, configured to: detect whether a subsystem in the multiple subsystems has connected to the anomaly diagnosis framework through the connection interface encapsulated by the anomaly diagnosis framework; and, if a subsystem in the multiple subsystems is detected to have connected to the anomaly diagnosis framework, create a subsystem agent corresponding to that subsystem.
  • the joint diagnosis module is configured to: perform anomaly diagnosis based on the anomaly information through the subsystem agent corresponding to the target subsystem to obtain a first anomaly diagnosis result; if the subsystem agent corresponding to the target subsystem determines a first joint subsystem based on the first anomaly diagnosis result, then send a collaborative diagnosis message to the subsystem agent corresponding to the first joint subsystem through the subsystem agent corresponding to the target subsystem; perform anomaly diagnosis through the subsystem agent corresponding to the first joint subsystem to obtain a second anomaly diagnosis result; and obtain the joint anomaly diagnosis result based on the first anomaly diagnosis result and the second anomaly diagnosis result.
  • the joint diagnosis module is configured to: if the subsystem agent corresponding to the first joint subsystem determines the second joint subsystem based on the second anomaly diagnosis result, then send a collaborative diagnosis message to the subsystem agent corresponding to the second joint subsystem through the subsystem agent corresponding to the first joint subsystem; perform anomaly diagnosis through the subsystem agent corresponding to the second joint subsystem to obtain a third anomaly diagnosis result; and obtain the joint anomaly diagnosis result based on the first anomaly diagnosis result, the second anomaly diagnosis result, and the third anomaly diagnosis result.
  • the device further includes a data reporting module, configured to: transmit anomaly-related data to a preset reporting service, the preset reporting service being configured to report the anomaly-related data to a cloud server; and receive anomaly repair content transmitted within the device, the anomaly repair content being obtained by the cloud server through analysis of the anomaly-related data and then sent to the device.
  • the anomaly diagnosis framework further includes a framework performance self-testing module; the framework performance self-testing module is used to: acquire anomaly diagnosis and processing related data in the system; and analyze the anomaly diagnosis and processing related data through the framework performance self-testing module to obtain the framework performance in the anomaly diagnosis framework.
  • FIG. 5 shows a block diagram of an electronic device according to an embodiment of this application, specifically:
  • the electronic device may include components such as a processor 501 with one or more processing cores, a memory 502 with one or more computer-readable storage media, a power supply 503, and an input unit 504.
  • the structure shown in FIG. 5 does not constitute a limitation on the electronic device, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
  • the processor 501 is the control center of the electronic device. It connects to the various parts of the electronic device via various interfaces and lines. By running or executing software programs and/or modules stored in the memory 502, and by calling data stored in the memory 502, it performs the various functions of the electronic device and processes data, thereby monitoring the electronic device as a whole.
  • the processor 501 may include one or more processing cores; preferably, the processor 501 may integrate an application processor and a modem processor, wherein the application processor mainly handles the operating system, user interface, and application programs, and the modem processor mainly handles wireless communication. It is understood that the modem processor may also not be integrated into the processor 501.
  • the memory 502 can be used to store software programs and modules.
  • the processor 501 executes various functional applications and data processing by running the software programs and modules stored in the memory 502.
  • the memory 502 may mainly include a program storage area and a data storage area.
  • the program storage area may store the operating system and the application programs required for at least one function (such as a sound playback function or an image playback function); the data storage area may store data created through use of the electronic device.
  • the memory 502 may include high-speed random access memory, and may also include non-volatile memory, such as at least one disk storage device, flash memory device, or other volatile solid-state storage device. Accordingly, the memory 502 may also include a memory controller to provide the processor 501 with access to the memory 502.
  • the electronic device also includes a power supply 503 that supplies power to various components.
  • the power supply 503 can be logically connected to the processor 501 through a power management system, thereby enabling functions such as charging, discharging, and power consumption management through the power management system.
  • the power supply 503 may also include one or more DC or AC power supplies, recharging systems, power fault detection circuits, power converters or inverters, power status indicators, and other arbitrary components.
  • the electronic device may also include an input unit 504, which can be used to receive input digital or character information and generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
  • the electronic device may also include a display unit, etc., which will not be described in detail here.
  • the processor 501 in the electronic device loads the executable files corresponding to the processes of one or more computer programs into the memory 502 according to the following instructions, and the processor 501 runs the computer programs stored in the memory 502, thereby realizing the various functions in the foregoing embodiments of this application.
  • the processor 501 can perform the following steps:
  • in response to detecting, through the inter-process communication service, anomaly information sent by a target subsystem in the multiple subsystems, a corresponding message processing thread is distributed; the anomaly information is received by the message processing thread and distributed to the subsystem agent corresponding to the target subsystem; and the subsystem agent corresponding to the target subsystem, together with the subsystem agents corresponding to other subsystems in the multiple subsystems, performs joint anomaly diagnosis to obtain a joint anomaly diagnosis result.
  • the method further includes: making joint anomaly repair decisions for the multiple subsystems based on the joint anomaly diagnosis results to obtain a multi-subsystem joint repair strategy; and performing joint multi-subsystem repair of anomalies based on the multi-subsystem joint repair strategy.
  • before distributing the corresponding message processing thread in response to detecting anomaly information sent by the target subsystem in the multiple subsystems through the inter-process communication service, the method further includes: detecting whether a subsystem in the multiple subsystems has connected to the anomaly diagnosis framework through the connection interface encapsulated by the anomaly diagnosis framework; and, if a subsystem in the multiple subsystems is detected to have connected to the anomaly diagnosis framework, creating a subsystem agent corresponding to that subsystem.
  • the step of performing joint anomaly diagnosis through the subsystem agent corresponding to the target subsystem, in conjunction with the subsystem agents corresponding to other subsystems in the plurality of subsystems, to obtain a joint anomaly diagnosis result includes: performing anomaly diagnosis through the subsystem agent corresponding to the target subsystem based on the anomaly information to obtain a first anomaly diagnosis result; if the subsystem agent corresponding to the target subsystem determines a first joint subsystem based on the first anomaly diagnosis result, then sending a collaborative diagnosis message to the subsystem agent corresponding to the first joint subsystem through the subsystem agent corresponding to the target subsystem; performing anomaly diagnosis through the subsystem agent corresponding to the first joint subsystem to obtain a second anomaly diagnosis result; and obtaining the joint anomaly diagnosis result based on the first anomaly diagnosis result and the second anomaly diagnosis result.
  • obtaining the joint anomaly diagnosis result based on the first anomaly diagnosis result and the second anomaly diagnosis result includes: if the subsystem agent corresponding to the first joint subsystem determines the second joint subsystem based on the second anomaly diagnosis result, then sending a collaborative diagnosis message to the subsystem agent corresponding to the second joint subsystem through the subsystem agent corresponding to the first joint subsystem; performing anomaly diagnosis through the subsystem agent corresponding to the second joint subsystem to obtain a third anomaly diagnosis result; and obtaining the joint anomaly diagnosis result based on the first anomaly diagnosis result, the second anomaly diagnosis result, and the third anomaly diagnosis result.
  • the method further includes: transmitting anomaly-related data to a preset reporting service, the preset reporting service being used to report the anomaly-related data to a cloud server; and receiving anomaly repair content transmitted within the device, the anomaly repair content being obtained by the cloud server through analysis of the anomaly-related data and then sent to the device.
  • the anomaly diagnosis framework further includes a framework performance self-test module; the method further includes: acquiring anomaly diagnosis processing related data in the system; and analyzing the anomaly diagnosis processing related data through the framework performance self-test module to obtain the framework performance of the anomaly diagnosis framework.
  • embodiments of this application also provide a storage medium storing a computer program that can be loaded by a processor to execute the steps in any of the methods provided in embodiments of this application.
  • the storage medium can be a computer-readable storage medium, which may include: read-only memory (ROM), random access memory (RAM), disk or optical disk, etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computer And Data Communications (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The present application relates to the technical field of system anomaly processing, and discloses a system anomaly diagnosis method and apparatus, a storage medium, and an electronic device. Upon detecting, by means of a cross-process communication service, anomaly information sent by a target subsystem, a message processing thread is distributed; the anomaly information is received by means of the message processing thread and distributed to a corresponding subsystem agent; and anomaly joint diagnosis is performed by means of the subsystem agent in conjunction with subsystem agents of other subsystems. The present application improves the reliability of anomaly diagnosis in systems.

Description

System Anomaly Diagnosis Method, Apparatus, Storage Medium, and Electronic Device

This application claims priority to Chinese Patent Application No. 202410559269.7, filed with the China Patent Office on May 7, 2024 and entitled "System Anomaly Diagnosis Method, Apparatus, Storage Medium, and Electronic Device", the entire contents of which are incorporated herein by reference.

Technical Field

This application relates to the technical field of system anomaly handling, and specifically to a system anomaly diagnosis method, apparatus, storage medium, and electronic device.

Background

Systems such as Android are widely used across industries. Anomalies in such a system usually need to be detected promptly and reliably so that the appropriate handling can be carried out to keep the system stable.

Technical Problem

Current system anomaly diagnosis methods typically diagnose each anomaly in isolation, so the reliability of the diagnosis is low and system stability is difficult to guarantee.

Technical Solution

Embodiments of this application provide a system anomaly diagnosis scheme that, with low performance overhead, quickly connects the subsystems in a system and combines the capabilities of multiple subsystems for joint anomaly diagnosis, effectively improving the reliability of anomaly diagnosis and reliably ensuring system stability.

The embodiments of this application provide the following technical solutions:

According to one embodiment of this application, a system anomaly diagnosis method is provided. The system includes multiple subsystems and an anomaly diagnosis framework. The anomaly diagnosis framework includes an inter-process communication service and subsystem agents corresponding to each of the subsystems. The method is applied to the anomaly diagnosis framework and includes: in response to detecting anomaly information sent by a target subsystem in the multiple subsystems through the inter-process communication service, distributing a corresponding message processing thread; receiving the anomaly information through the message processing thread and distributing the anomaly information to the subsystem agent corresponding to the target subsystem; and performing joint anomaly diagnosis through the subsystem agent corresponding to the target subsystem, in conjunction with the subsystem agents corresponding to other subsystems in the multiple subsystems, to obtain a joint anomaly diagnosis result.
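The first two steps of the method (a message processing thread is distributed when anomaly information is detected, and that thread delivers the information to the target subsystem's agent) can be sketched as below. This is a simplified illustration under stated assumptions: the inter-process communication service is replaced by a direct method call, one thread is started per message (the text does not say whether threads are created or taken from a pool), and all class and method names are hypothetical.

```python
import threading
from typing import Callable, Dict, List


class AnomalyDiagnosisFramework:
    def __init__(self) -> None:
        self._agents: Dict[str, Callable[[str], None]] = {}
        self._threads: List[threading.Thread] = []

    def register_agent(self, subsystem: str, handler: Callable[[str], None]) -> None:
        """Associate a subsystem with the entry point of its subsystem agent."""
        self._agents[subsystem] = handler

    def on_anomaly_detected(self, subsystem: str, anomaly_info: str) -> None:
        """Stand-in for the IPC service noticing anomaly information:
        hand the message to its own message processing thread."""
        worker = threading.Thread(
            target=self._process_message, args=(subsystem, anomaly_info)
        )
        self._threads.append(worker)
        worker.start()

    def _process_message(self, subsystem: str, anomaly_info: str) -> None:
        # Runs on the message processing thread: route to the target's agent.
        agent = self._agents.get(subsystem)
        if agent is not None:
            agent(anomaly_info)

    def wait_idle(self) -> None:
        """Block until all message processing threads have finished."""
        for worker in self._threads:
            worker.join()
```

Keeping the listening path free of diagnosis work is one plausible reading of why a separate message processing thread is distributed: the communication service stays responsive while agents run.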

In some embodiments of this application, after performing joint anomaly diagnosis by combining the subsystem agent corresponding to the target subsystem with the subsystem agents corresponding to other subsystems in the multiple subsystems to obtain a joint anomaly diagnosis result, the method further includes: making joint anomaly repair decisions for the multiple subsystems based on the joint anomaly diagnosis result to obtain a multi-subsystem joint repair strategy; and performing joint multi-subsystem repair of anomalies based on the multi-subsystem joint repair strategy.

In some embodiments of this application, before distributing the corresponding message processing thread in response to detecting anomaly information sent by the target subsystem in the multiple subsystems through the inter-process communication service, the method further includes: detecting whether a subsystem in the multiple subsystems has connected to the anomaly diagnosis framework through the connection interface encapsulated by the anomaly diagnosis framework; and, if a subsystem in the multiple subsystems is detected to have connected to the anomaly diagnosis framework, creating a subsystem agent corresponding to that subsystem.
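Agent creation on connection can be sketched as a small registry in which the framework's connection interface creates a subsystem agent the first time a subsystem connects. The names `AgentRegistry`, `connect`, and `agent_for` are hypothetical, and reusing the existing agent on a repeated connection is an assumption the text does not address.

```python
from typing import Dict, Optional


class SubsystemAgent:
    def __init__(self, subsystem: str) -> None:
        self.subsystem = subsystem


class AgentRegistry:
    def __init__(self) -> None:
        self._agents: Dict[str, SubsystemAgent] = {}

    def connect(self, subsystem: str) -> SubsystemAgent:
        """Stand-in for the encapsulated connection interface: create the
        subsystem's agent on first connection, reuse it afterwards."""
        agent = self._agents.get(subsystem)
        if agent is None:
            agent = SubsystemAgent(subsystem)
            self._agents[subsystem] = agent
        return agent

    def agent_for(self, subsystem: str) -> Optional[SubsystemAgent]:
        """Return the agent for a connected subsystem, or None if it never connected."""
        return self._agents.get(subsystem)
```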

In some embodiments of this application, the step of performing joint anomaly diagnosis through the subsystem agent corresponding to the target subsystem, in conjunction with the subsystem agents corresponding to other subsystems in the multiple subsystems, to obtain a joint anomaly diagnosis result includes: performing anomaly diagnosis through the subsystem agent corresponding to the target subsystem based on the anomaly information to obtain a first anomaly diagnosis result; if the subsystem agent corresponding to the target subsystem determines a first joint subsystem based on the first anomaly diagnosis result, sending a collaborative diagnosis message to the subsystem agent corresponding to the first joint subsystem through the subsystem agent corresponding to the target subsystem; performing anomaly diagnosis through the subsystem agent corresponding to the first joint subsystem to obtain a second anomaly diagnosis result; and obtaining the joint anomaly diagnosis result based on the first anomaly diagnosis result and the second anomaly diagnosis result.

In some embodiments of this application, obtaining the joint anomaly diagnosis result based on the first anomaly diagnosis result and the second anomaly diagnosis result includes: if the subsystem agent corresponding to the first joint subsystem determines a second joint subsystem based on the second anomaly diagnosis result, sending a collaborative diagnosis message to the subsystem agent corresponding to the second joint subsystem through the subsystem agent corresponding to the first joint subsystem; performing anomaly diagnosis through the subsystem agent corresponding to the second joint subsystem to obtain a third anomaly diagnosis result; and obtaining the joint anomaly diagnosis result based on the first anomaly diagnosis result, the second anomaly diagnosis result, and the third anomaly diagnosis result.

In some embodiments of this application, the method further includes: transmitting anomaly-related data to a preset reporting service, the preset reporting service being used to report the anomaly-related data to a cloud server; and receiving anomaly repair content transmitted within the device, the anomaly repair content being obtained by the cloud server through analysis of the anomaly-related data and then sent to the device.

In some embodiments of this application, the anomaly diagnosis framework further includes a framework performance self-test module; the method further includes: acquiring anomaly diagnosis processing related data in the system; and analyzing the anomaly diagnosis processing related data through the framework performance self-test module to obtain the framework performance of the anomaly diagnosis framework.

According to one embodiment of this application, a system anomaly diagnosis device is provided. The system includes multiple subsystems and an anomaly diagnosis framework. The anomaly diagnosis framework includes an inter-process communication service and subsystem agents corresponding to each of the subsystems. The device is applied to the anomaly diagnosis framework and includes: a connection processing module, configured to distribute a corresponding message processing thread in response to detecting anomaly information sent by a target subsystem in the multiple subsystems through the inter-process communication service; a data distribution module, configured to receive the anomaly information through the message processing thread and distribute the anomaly information to the subsystem agent corresponding to the target subsystem; and a joint diagnosis module, configured to perform joint anomaly diagnosis through the subsystem agent corresponding to the target subsystem, in conjunction with the subsystem agents corresponding to other subsystems in the multiple subsystems, to obtain a joint anomaly diagnosis result.

In some embodiments of this application, the apparatus further includes an anomaly repair module that operates after the joint anomaly diagnosis result has been obtained through the subsystem agent corresponding to the target subsystem in conjunction with the subsystem agents corresponding to the other subsystems. The anomaly repair module is configured to: make a joint anomaly repair decision for the multiple subsystems based on the joint anomaly diagnosis result, to obtain a multi-subsystem joint repair strategy; and perform multi-subsystem joint repair of the anomaly according to that strategy.

In some embodiments of this application, the apparatus further includes an agent creation module that operates before the corresponding message processing thread is dispatched in response to detecting, through the inter-process communication service, the anomaly information sent by the target subsystem. The agent creation module is configured to: detect whether a subsystem among the multiple subsystems has connected to the anomaly diagnosis framework through the connection interface encapsulated by the framework; and, if such a connection is detected, create the subsystem agent corresponding to that subsystem.

In some embodiments of this application, the joint diagnosis module is configured to: perform anomaly diagnosis based on the anomaly information through the subsystem agent corresponding to the target subsystem, to obtain a first anomaly diagnosis result; if the subsystem agent corresponding to the target subsystem determines a first joint subsystem based on the first anomaly diagnosis result, send a collaborative diagnosis message from that agent to the subsystem agent corresponding to the first joint subsystem; perform anomaly diagnosis through the subsystem agent corresponding to the first joint subsystem, to obtain a second anomaly diagnosis result; and obtain the joint anomaly diagnosis result from the first anomaly diagnosis result and the second anomaly diagnosis result.

In some embodiments of this application, the joint diagnosis module is configured to: if the subsystem agent corresponding to the first joint subsystem determines a second joint subsystem based on the second anomaly diagnosis result, send a collaborative diagnosis message from that agent to the subsystem agent corresponding to the second joint subsystem; perform anomaly diagnosis through the subsystem agent corresponding to the second joint subsystem, to obtain a third anomaly diagnosis result; and obtain the joint anomaly diagnosis result from the first, second, and third anomaly diagnosis results.

In some embodiments of this application, the apparatus further includes a data reporting module, configured to: transmit anomaly-related data to a preset reporting service, the preset reporting service being configured to report the anomaly-related data to a cloud server; and receive, at the device, anomaly repair content that the cloud server has produced by analyzing the anomaly-related data and has transmitted to the device.

In some embodiments of this application, the anomaly diagnosis framework further includes a framework performance self-test module, configured to: acquire anomaly diagnosis processing data from the system; and analyze that data to obtain the framework performance of the anomaly diagnosis framework.

According to another embodiment of this application, a storage medium is provided on which a computer program is stored. When the computer program is executed by a processor of a computer, it causes the computer to perform the method described in the embodiments of this application.

According to another embodiment of this application, an electronic device may include: a memory storing a computer program; and a processor that reads the computer program stored in the memory to perform the method described in the embodiments of this application.

According to another embodiment of this application, a computer program product or computer program is provided that includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the methods provided in the various optional implementations described in the embodiments of this application.

Beneficial effects

In the system anomaly diagnosis method of the embodiments of this application, the system includes multiple subsystems and an anomaly diagnosis framework. The anomaly diagnosis framework includes an inter-process communication service and a subsystem agent corresponding to each subsystem. The method is applied to the anomaly diagnosis framework and includes: in response to detecting, through the inter-process communication service, anomaly information sent by a target subsystem among the multiple subsystems, dispatching a corresponding message processing thread; receiving the anomaly information through the message processing thread and distributing it to the subsystem agent corresponding to the target subsystem; and performing joint anomaly diagnosis through the subsystem agent corresponding to the target subsystem, in conjunction with the subsystem agents corresponding to the other subsystems, to obtain a joint anomaly diagnosis result.

In this way, an anomaly diagnosis framework is configured in the system, and the framework includes an inter-process communication service and a subsystem agent corresponding to each subsystem. When the framework detects, through the inter-process communication service, anomaly information sent by a target subsystem, it promptly dispatches a corresponding message processing thread to receive and distribute that information, so the framework can connect to each subsystem quickly. The subsystem agent corresponding to the target subsystem then quickly coordinates with the subsystem agents corresponding to the other subsystems to perform joint anomaly diagnosis, drawing on the capabilities of multiple subsystems to obtain an accurate joint anomaly diagnosis result. Overall, subsystems can be connected quickly and diagnosed jointly at low performance overhead, which improves the reliability of anomaly diagnosis in the system and helps ensure system stability.

Description of the drawings

To illustrate the technical solutions in the embodiments of this application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of this application; those skilled in the art can derive other drawings from them without creative effort.

Figure 1 shows a flowchart of a system anomaly diagnosis method according to an embodiment of this application.

Figure 2 shows a flowchart of subsystem agent creation according to an embodiment of this application.

Figure 3 shows a flowchart of joint system anomaly diagnosis according to an embodiment of this application.

Figure 4 shows a block diagram of a system anomaly diagnosis apparatus according to an embodiment of this application.

Figure 5 shows a block diagram of an electronic device according to an embodiment of this application.

Embodiments of this application

The present disclosure is described in further detail below with reference to the drawings and embodiments. It should be understood that the embodiments provided here only explain the present disclosure and do not limit it. In addition, the embodiments provided below are some, not all, of the embodiments for implementing the present disclosure. Provided they do not conflict, the technical solutions described in the embodiments of the present disclosure can be implemented in any combination.

It should be noted that, in the embodiments of the present disclosure, the terms "comprising," "including," and any variations thereof are intended to cover non-exclusive inclusion, so that a method or apparatus that includes a series of elements includes not only the elements expressly stated but also other elements not expressly listed, or elements inherent to implementing the method or apparatus. Without further limitation, an element defined by the phrase "including a..." does not exclude the presence of additional related elements in the method or apparatus that includes that element (for example, steps in a method or units in an apparatus, where a unit may be part of a circuit, part of a processor, part of a program or software, and so on).

For example, the system anomaly diagnosis method provided in the embodiments of the present disclosure includes a series of steps but is not limited to the steps described. Similarly, the system anomaly diagnosis apparatus provided in the embodiments of the present disclosure includes a series of units but is not limited to the units expressly described; it may also include units needed to obtain relevant information or to perform processing based on that information.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field to which the present disclosure belongs. The terms used herein are only for describing specific embodiments and are not intended to limit the present disclosure.

Figure 1 schematically shows a flowchart of a system anomaly diagnosis method according to an embodiment of this application. The method can be executed by any device with processing capability, such as a television, computer, mobile phone, smartwatch, or home appliance.

A system can be installed on the device. The system includes multiple subsystems and an anomaly diagnosis framework. The anomaly diagnosis framework includes an inter-process communication service and a subsystem agent corresponding to each subsystem. The method can be applied to the anomaly diagnosis framework.

As shown in Figure 1, the system anomaly diagnosis method may include steps S110 to S130.

Step S110: in response to detecting, through the inter-process communication service, anomaly information sent by a target subsystem among the multiple subsystems, dispatch a corresponding message processing thread. Step S120: receive the anomaly information through the message processing thread and distribute it to the subsystem agent corresponding to the target subsystem. Step S130: perform joint anomaly diagnosis through the subsystem agent corresponding to the target subsystem, in conjunction with the subsystem agents corresponding to the other subsystems, to obtain a joint anomaly diagnosis result.

The system includes at least multiple subsystems and an anomaly diagnosis framework. The subsystems may include, for example, wireless network (Wi-Fi), audio (Audio), Bluetooth, multimedia (Media), and kernel (Kernel) subsystems, and may reside in the native layer, the system layer, or the application layer. The anomaly diagnosis framework is a framework for coordinating multiple subsystems in joint anomaly diagnosis, and may reside in the native layer.

The anomaly diagnosis framework may include an inter-process communication service, such as a UnixDomainSocket Server or an AIDL Service, created in advance or dynamically. Through this service the framework can monitor, across process boundaries, the messages or data sent by the subsystems. When the framework detects, through the inter-process communication service, anomaly information sent by a target subsystem, it can immediately dispatch a message processing thread corresponding to the target subsystem to receive and distribute that information, so the framework connects to the target subsystem quickly. Because the target subsystem can be any one of the multiple subsystems, the framework can quickly connect to any subsystem in the system.
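The thread-per-message dispatch described above can be sketched with a Unix domain socket server. This is a minimal sketch, not the framework's implementation: the JSON message format, the names `AnomalyHandler`, `start_server`, and `report_anomaly`, and the `received` queue (which stands in for handing the message to a subsystem agent) are all assumptions for illustration. It runs only on platforms that support Unix domain sockets.

```python
import json
import queue
import socket
import socketserver
import threading

# Stand-in for "distribute the anomaly information to the subsystem agent".
received = queue.Queue()


class AnomalyHandler(socketserver.StreamRequestHandler):
    """Runs in its own thread for each incoming connection."""

    def handle(self):
        line = self.rfile.readline()  # one newline-terminated JSON message
        if line:
            message = json.loads(line)
            received.put((threading.current_thread().name, message))


def start_server(path):
    """Create the listening service and serve it from a background thread."""
    server = socketserver.ThreadingUnixStreamServer(path, AnomalyHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def report_anomaly(path, subsystem, detail):
    """What a subsystem process might do to send anomaly information."""
    payload = json.dumps({"subsystem": subsystem, "detail": detail}) + "\n"
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(path)
        client.sendall(payload.encode("utf-8"))
```

`ThreadingUnixStreamServer` starts a new handler thread per connection, which is one simple way to approximate dispatching a message processing thread on demand; a thread pool would be an alternative if thread creation cost matters.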

The anomaly diagnosis framework can also create a subsystem agent corresponding to each subsystem. A subsystem agent is an agent module that can obtain the data of its subsystem and that has the anomaly repair capability of that subsystem. Subsystem agents can exchange messages with one another to diagnose anomalies collaboratively.

The anomaly diagnosis framework can receive the anomaly information through the message processing thread and distribute it to the subsystem agent corresponding to the target subsystem. That agent then performs joint anomaly diagnosis in conjunction with the subsystem agents corresponding to the other subsystems, so that an accurate joint anomaly diagnosis result is obtained.

In this way, based on steps S110 to S130, an anomaly diagnosis framework is configured in the system, and the framework includes an inter-process communication service and a subsystem agent corresponding to each subsystem. When the framework detects, through the inter-process communication service, anomaly information sent by a target subsystem, it promptly dispatches a corresponding message processing thread to receive and distribute that information, so the framework can connect to each subsystem quickly. The subsystem agent corresponding to the target subsystem then quickly coordinates with the subsystem agents corresponding to the other subsystems to perform joint anomaly diagnosis, drawing on the capabilities of multiple subsystems to obtain an accurate joint anomaly diagnosis result. Overall, subsystems can be connected quickly and diagnosed jointly at low performance overhead, which improves the reliability of anomaly diagnosis in the system and helps ensure system stability.

The following describes further optional embodiments of the individual steps performed during system anomaly diagnosis under the embodiment of Figure 1.

In one embodiment, referring to Figure 2, before the corresponding message processing thread is dispatched in response to detecting, through the inter-process communication service, the anomaly information sent by the target subsystem, the method further includes: step S210, detecting whether a subsystem among the multiple subsystems has connected to the anomaly diagnosis framework through the connection interface encapsulated by the framework; and step S220, if such a connection is detected, creating the subsystem agent corresponding to that subsystem.

The anomaly diagnosis framework encapsulates a connection interface through which subsystems attach to the framework. The connection interface is a software development kit (SDK) application programming interface (API) and may include three kinds of SDK interface: application (APP), system, and kernel. Each subsystem can therefore attach to the anomaly diagnosis framework through the connection interface.

The anomaly diagnosis framework can detect whether a subsystem has connected to it through the encapsulated connection interface. If a subsystem is detected to have connected in this way, the framework can create the subsystem agent corresponding to that subsystem. Through the subsystem agent, the framework can obtain that subsystem's data and use its anomaly repair capability.

For example, after the Media, Audio, and Wi-Fi subsystems have each connected to the anomaly diagnosis framework through the connection interface, the framework can create a subsystem agent for each of them. The framework can then obtain the data of these three subsystems, and use their anomaly repair capabilities, through the subsystem agents.
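Steps S210 and S220 can be pictured as a small registry that creates an agent the first time a subsystem connects. The sketch below is illustrative only: `SubsystemAgent`, `DiagnosisFramework`, `connect`, and the two callbacks `read_data` and `repair` are hypothetical names, and the real connection interface is an SDK API whose shape is not given in the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class SubsystemAgent:
    """Framework-side proxy for one subsystem (hypothetical shape)."""
    name: str
    read_data: Callable[[], dict]   # how the agent obtains subsystem data
    repair: Callable[[str], bool]   # the subsystem's repair capability


class DiagnosisFramework:
    def __init__(self) -> None:
        self.agents: Dict[str, SubsystemAgent] = {}

    def connect(self, name: str,
                read_data: Callable[[], dict],
                repair: Callable[[str], bool]) -> SubsystemAgent:
        """Connection interface: create the agent on first connection."""
        if name not in self.agents:
            self.agents[name] = SubsystemAgent(name, read_data, repair)
        return self.agents[name]


framework = DiagnosisFramework()
for subsystem_name in ("Media", "Audio", "Wi-Fi"):
    framework.connect(subsystem_name,
                      read_data=lambda: {},
                      repair=lambda issue: True)
```

Creating the agent lazily on connection means subsystems that never attach cost nothing, and a repeated connection reuses the existing agent rather than replacing it.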

In one embodiment, referring to Figure 3, performing joint anomaly diagnosis through the subsystem agent corresponding to the target subsystem, in conjunction with the subsystem agents corresponding to the other subsystems, to obtain the joint anomaly diagnosis result may specifically include:

Step S310: perform anomaly diagnosis based on the anomaly information through the subsystem agent corresponding to the target subsystem, to obtain a first anomaly diagnosis result. Step S320: if the subsystem agent corresponding to the target subsystem determines a first joint subsystem based on the first anomaly diagnosis result, send a collaborative diagnosis message from that agent to the subsystem agent corresponding to the first joint subsystem. Step S330: perform anomaly diagnosis through the subsystem agent corresponding to the first joint subsystem, to obtain a second anomaly diagnosis result. Step S340: obtain the joint anomaly diagnosis result from the first anomaly diagnosis result and the second anomaly diagnosis result.

First, the subsystem agent corresponding to the target subsystem performs anomaly diagnosis based on the anomaly information to obtain the first anomaly diagnosis result.

For example, take the Media subsystem as the target subsystem. When a user launches a film-and-TV application to watch video on demand (VOD), the Media subsystem monitors buffering, playback, stuttering, and audio-video synchronization axis information. When a monitoring point in the Media subsystem detects an anomaly, the Media subsystem sends the anomaly information into the anomaly diagnosis framework through a specific channel, namely a channel that subsystems use to send anomaly information to the framework. After the framework detects this anomaly information through the inter-process communication service, it can immediately dispatch the message processing thread corresponding to the target subsystem to receive the information. The message processing thread then distributes the information to the Media subsystem agent corresponding to the Media subsystem, which thereby obtains the anomaly information.

Suppose the anomaly information reports stuttering together with audio and video out of sync. The Media subsystem agent inside the anomaly diagnosis framework can analyze this information and obtain the first anomaly diagnosis result: "the video axis and the audio axis cannot be correlated, and a data buffering problem caused by an insufficient playback buffer is found".

Next, the subsystem agent corresponding to the target subsystem decides, based on the first anomaly diagnosis result, whether collaborative diagnosis by a first joint subsystem is required. If so, it sends a collaborative diagnosis message to the subsystem agent corresponding to the first joint subsystem. The first joint subsystem consists of one or more of the subsystems in the system other than the target subsystem. One specific way for the agent to determine the first joint subsystem is to look up, in a preset diagnosis table, the first anomaly category corresponding to the first anomaly diagnosis result, and then to look up, in the same table, the first joint subsystem corresponding to that category. The preset diagnosis table may contain the anomaly categories corresponding to different preset anomaly diagnosis results and the joint subsystems corresponding to different anomaly categories. The table may be configured in the anomaly diagnosis framework, and the subsystem agent corresponding to the target subsystem can obtain it from the framework.

Take the first anomaly diagnosis result "the video axis and the audio axis cannot be correlated, and a data buffering problem caused by an insufficient playback buffer is found" as an example. The Media subsystem agent can look up in the preset diagnosis table that the first anomaly category corresponding to this result is "Category 1", and that the joint subsystems corresponding to "Category 1" are the Wi-Fi subsystem and the Audio subsystem. The Media subsystem agent can therefore determine that collaborative diagnosis is required and that the first joint subsystem consists of the Wi-Fi subsystem and the Audio subsystem.
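The two-step lookup in the preset diagnosis table can be sketched as two dictionaries. The category names follow the examples in the text, but the result keys, the dictionary layout, and the function name `joint_subsystems_for` are assumptions; the text does not specify how the table is stored.

```python
from typing import Dict, List

# Diagnosis result -> anomaly category (result keys are hypothetical labels).
RESULT_TO_CATEGORY: Dict[str, str] = {
    "av_axes_uncorrelated_and_buffer_insufficient": "Category 1",
    "wifi_module_result": "Category 2-1",
}

# Anomaly category -> joint subsystems that should be consulted next.
CATEGORY_TO_SUBSYSTEMS: Dict[str, List[str]] = {
    "Category 1": ["Wi-Fi", "Audio"],
    "Category 2-1": ["Kernel"],
}


def joint_subsystems_for(result_key: str) -> List[str]:
    """Return the joint subsystems for a result, or [] if none are needed."""
    category = RESULT_TO_CATEGORY.get(result_key)
    if category is None:
        return []
    # Copy the list so callers cannot mutate the shared table.
    return list(CATEGORY_TO_SUBSYSTEMS.get(category, []))
```

An empty return value corresponds to the case where the agent decides that no collaborative diagnosis is required.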

After the subsystem agent corresponding to the first joint subsystem receives the collaborative diagnosis message, it can perform further anomaly diagnosis to obtain the second anomaly diagnosis result. For example, the Wi-Fi subsystem agent can work with the Wi-Fi diagnosis subsystem to check the Wi-Fi hardware status, network connection status, network connection quality, and so on, and thereby obtain a diagnosis result for the Wi-Fi module. The Audio subsystem agent can combine Media keyframe data to perform audio-video analysis, such as analysis of Media decoding efficiency and Audio decoding efficiency, and thereby obtain a diagnosis result for the Audio module. The second anomaly diagnosis result can thus include the diagnosis results for both the Wi-Fi module and the Audio module.

The subsystem agent corresponding to the target subsystem can then obtain both the first and the second anomaly diagnosis results, and the set of the two can serve as the joint anomaly diagnosis result. This joint diagnosis mechanism quickly coordinates multiple subsystems to obtain an accurate joint anomaly diagnosis result, which supports the reliability of system anomaly diagnosis. The joint result also provides a sound basis for repairing the anomaly and keeping the system running stably.

In a further embodiment, obtaining the joint anomaly diagnosis result from the first anomaly diagnosis result and the second anomaly diagnosis result may include: if the subsystem agent corresponding to the first joint subsystem determines a second joint subsystem based on the second anomaly diagnosis result, sending a collaborative diagnosis message from that agent to the subsystem agent corresponding to the second joint subsystem; performing anomaly diagnosis through the subsystem agent corresponding to the second joint subsystem, to obtain a third anomaly diagnosis result; and obtaining the joint anomaly diagnosis result from the first, second, and third anomaly diagnosis results.

If the subsystem agent corresponding to the first joint subsystem determines a second joint subsystem based on the second anomaly diagnosis result, it sends a collaborative diagnosis message to the subsystem agent corresponding to the second joint subsystem. One specific way for the agent to determine the second joint subsystem is to look up, in the preset diagnosis table, the second anomaly category corresponding to the second anomaly diagnosis result, and then to look up, in the same table, the second joint subsystem corresponding to that category. The preset diagnosis table may contain the anomaly categories corresponding to different preset anomaly diagnosis results and the joint subsystems corresponding to different anomaly categories. The table may be configured in the anomaly diagnosis framework, and the subsystem agent corresponding to the first joint subsystem can obtain it from the framework.

For example, the Wi-Fi subsystem agent may find in the preset diagnosis table that the diagnosis result of the Wi-Fi module (which is a second anomaly diagnosis result) corresponds to the second anomaly category "Category 2-1". If the joint subsystem corresponding to "Category 2-1" in the preset diagnosis table is the kernel subsystem, the Wi-Fi subsystem agent can determine that the second joint subsystem is the kernel subsystem, and can then send a collaborative diagnosis message to the subsystem agent of the kernel subsystem.
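The two-step lookup in the preset diagnosis table can be sketched as below. This is a minimal illustration under stated assumptions, not the implementation of this application: the result key `wifi_module_fault`, the two dictionary names, and the function `find_joint_subsystem` are hypothetical; only "Category 2-1" and the kernel subsystem are taken from the example in the text.

```python
from typing import Optional

# Hypothetical preset diagnosis table, split into its two mappings.
# The concrete keys and values are illustrative assumptions.
RESULT_TO_CATEGORY = {"wifi_module_fault": "Category 2-1"}
CATEGORY_TO_JOINT_SUBSYSTEM = {"Category 2-1": "kernel"}


def find_joint_subsystem(diagnosis_result: str) -> Optional[str]:
    """Map a diagnosis result to its anomaly category, then to a joint subsystem.

    Returns None when the table has no entry, in which case no
    collaborative diagnosis message would be sent.
    """
    category = RESULT_TO_CATEGORY.get(diagnosis_result)
    if category is None:
        return None
    return CATEGORY_TO_JOINT_SUBSYSTEM.get(category)


print(find_joint_subsystem("wifi_module_fault"))
```

Returning `None` for an unknown result is one way to model the "if ... determines" condition: the chain of collaborative diagnosis simply stops when the table yields no joint subsystem.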

After receiving the collaborative diagnosis message, the subsystem agent corresponding to the second joint subsystem performs anomaly diagnosis and obtains a third anomaly diagnosis result. For example, the subsystem agent of the kernel subsystem can inspect memory, CPU, and other conditions to produce the third anomaly diagnosis result. The subsystem agent corresponding to the second joint subsystem can then feed the third anomaly diagnosis result back to the subsystem agent corresponding to the target subsystem.

The subsystem agent corresponding to the target subsystem in the anomaly diagnosis framework can thus obtain the first, second, and third anomaly diagnosis results, and the set of these three results can serve as the joint anomaly diagnosis result. This joint diagnosis mechanism can quickly link multiple systems to obtain an accurate joint anomaly diagnosis result, ensuring the reliability of system anomaly diagnosis. In addition, system anomalies can be reliably repaired based on the joint anomaly diagnosis result, ensuring stable system operation. By extension, the anomaly diagnosis framework can quickly link any number of subsystem agents for multi-subsystem joint diagnosis.
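The chained joint diagnosis described above can be sketched as a loop that follows each agent's result to the next joint subsystem and collects every result along the way. This is a simplified illustration: the class `SubsystemAgent`, the function `joint_diagnosis`, the choice of a media subsystem as the target, and all result strings are assumptions, and the collaborative diagnosis message is modeled as a direct function call rather than inter-process communication.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class SubsystemAgent:
    name: str
    diagnose: Callable[[str], str]  # anomaly info -> diagnosis result
    # Hypothetical per-agent view of the preset diagnosis table:
    # diagnosis result -> name of the joint subsystem to involve next.
    next_subsystem: Dict[str, str] = field(default_factory=dict)


def joint_diagnosis(agents: Dict[str, SubsystemAgent], target: str,
                    anomaly_info: str) -> List[str]:
    """Collect diagnosis results from the target agent and each joint agent."""
    results: List[str] = []
    visited = set()
    current: Optional[str] = target
    while current is not None and current in agents and current not in visited:
        visited.add(current)  # guard against cycles between agents
        agent = agents[current]
        result = agent.diagnose(anomaly_info)
        results.append(f"{agent.name}: {result}")
        # "Sending a collaborative diagnosis message" is modeled as moving on
        # to the next agent; None ends the chain.
        current = agent.next_subsystem.get(result)
    return results


agents = {
    "media": SubsystemAgent("media", lambda info: "playback_stall",
                            {"playback_stall": "wifi"}),
    "wifi": SubsystemAgent("wifi", lambda info: "wifi_module_fault",
                           {"wifi_module_fault": "kernel"}),
    "kernel": SubsystemAgent("kernel", lambda info: "io_pressure"),
}
joint_result = joint_diagnosis(agents, "media", "video stutter")
print(joint_result)
```

The `visited` set is a design choice of this sketch, not something the text specifies; it keeps two agents that name each other as joint subsystems from looping forever.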

Further, in one embodiment, after the joint anomaly diagnosis result is obtained by the subsystem agent corresponding to the target subsystem performing joint anomaly diagnosis together with the subsystem agents corresponding to other subsystems among the plurality of subsystems, the method further includes: making a joint anomaly repair decision for the plurality of subsystems based on the joint anomaly diagnosis result to obtain a multi-subsystem joint repair strategy; and performing multi-subsystem joint repair of the anomaly according to the multi-subsystem joint repair strategy.

Making a joint anomaly repair decision for the plurality of subsystems based on the joint anomaly diagnosis result yields a multi-subsystem joint repair strategy, and repairing the anomaly across subsystems according to that strategy can effectively improve the reliability of anomaly repair.

Making the joint anomaly repair decision for the plurality of subsystems based on the joint anomaly diagnosis result to obtain the multi-subsystem joint repair strategy may specifically include: extracting anomaly issues from the joint anomaly diagnosis result; and, if at least two anomaly issues are extracted, prioritizing the at least two anomaly issues to obtain an anomaly handling priority for each anomaly issue, the multi-subsystem joint repair strategy being to perform anomaly repair in order of the anomaly handling priorities. The prioritizing step may specifically be: querying a preset priority table for the anomaly handling priority of each extracted anomaly issue, where the preset priority table contains anomaly handling priorities set in advance for different anomaly issues and may be pre-configured in the anomaly diagnosis framework.

Specifically, one anomaly may be caused by anomalies in multiple subsystems. If no joint anomaly repair decision is made for those subsystems to obtain a multi-subsystem joint repair strategy, it is highly likely that the repair capabilities ultimately invoked by different subsystems will conflict with or override one another, so that the anomaly cannot be repaired accurately. In one example, the joint anomaly diagnosis result includes the Media subsystem agent finding insufficient CPU, the Wi-Fi subsystem agent also finding insufficient CPU, and the kernel subsystem agent finding that the CPU shortage is caused by IO. Two anomaly issues can be extracted from this result: "IO problem" and "Insufficient CPU". For "Insufficient CPU", the Media subsystem agent and the Wi-Fi subsystem agent could request CPU scheduling as a repair, but if the "IO problem" is not handled first and only CPU scheduling is requested, the stuttering becomes more severe. With the multi-subsystem joint repair strategy obtained from the joint anomaly repair decision (that is, repairing anomalies in order of the anomaly handling priorities), the IO problem is handled first and CPU scheduling is requested afterwards, which effectively improves the reliability of anomaly repair.
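The extract-then-prioritize step can be illustrated with the IO/CPU example. The sketch below is only one possible reading: the numeric priorities in `PRIORITY_TABLE`, the shape of the joint diagnosis result (agent name plus a list of issue labels), and the function names are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical preset priority table: a lower number is handled first.
PRIORITY_TABLE: Dict[str, int] = {"IO problem": 0, "Insufficient CPU": 1}


def extract_issues(joint_result: List[Tuple[str, List[str]]]) -> List[str]:
    """Collect distinct anomaly issues reported by all agents, in first-seen order."""
    issues: List[str] = []
    for _agent, reported in joint_result:
        for issue in reported:
            if issue not in issues:
                issues.append(issue)
    return issues


def repair_order(issues: List[str]) -> List[str]:
    """Order issues by preset priority; unknown issues are handled last."""
    if len(issues) < 2:
        return list(issues)  # nothing to prioritize
    lowest = len(PRIORITY_TABLE)
    return sorted(issues, key=lambda issue: PRIORITY_TABLE.get(issue, lowest))


joint_result = [
    ("media", ["Insufficient CPU"]),
    ("wifi", ["Insufficient CPU"]),
    ("kernel", ["IO problem", "Insufficient CPU"]),
]
issues = extract_issues(joint_result)
print(repair_order(issues))
```

Because the three agents report "Insufficient CPU" three times, deduplicating before sorting ensures CPU scheduling is requested once, after the IO problem has been handled.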

Further, in one embodiment, the method further includes: transmitting anomaly-related data to a preset reporting service, the preset reporting service being used to report the anomaly-related data to a cloud server; and receiving anomaly repair content delivered within the device, the anomaly repair content being obtained by the cloud server through analysis of the anomaly-related data and transmitted by the cloud server to the device.

The anomaly diagnosis framework transmits the anomaly-related data to a preset reporting service (for example, a preset reporting application), and the preset reporting service can report the anomaly-related data to the cloud server. The anomaly diagnosis framework resides in the native layer and hands the data to the preset reporting service for network reporting because scheduling among the system layer, the native layer, and apps is subject to restrictions. Reporting through the preset reporting service bypasses those restrictions and, through certain mechanisms (for example, the mechanism of the Log system), achieves fast data reporting.

Further, the anomaly diagnosis framework can receive the anomaly repair content delivered within the device. The anomaly repair content may be obtained by the cloud server through analysis of the anomaly-related data and transmitted to the device, and it can be used by the anomaly diagnosis framework when making joint anomaly repair decisions, further improving the reliability of anomaly repair.

In some implementations, the anomaly diagnosis framework makes joint anomaly repair decisions based on a repair knowledge graph. The framework can adjust the repair knowledge graph according to the received anomaly repair content and then make the joint anomaly repair decision based on the adjusted repair knowledge graph to obtain the multi-subsystem joint repair strategy.
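The text does not specify how the repair knowledge graph is structured, so the following is only a hypothetical sketch of the adjust-then-decide flow. It assumes the graph maps each anomaly issue to a repair action and to a list of issues that should be repaired first (the `after` field), and that the cloud's anomaly repair content arrives as partial entries to merge in. All names, fields, and actions are illustrative.

```python
from typing import Dict, List, Tuple

Graph = Dict[str, Dict[str, object]]


def adjust_graph(graph: Graph, repair_content: Graph) -> Graph:
    """Return a copy of the graph with cloud-provided fields merged over local ones."""
    adjusted: Graph = {issue: dict(entry) for issue, entry in graph.items()}
    for issue, entry in repair_content.items():
        adjusted.setdefault(issue, {}).update(entry)
    return adjusted


def plan_repairs(graph: Graph, issues: List[str]) -> List[Tuple[str, object]]:
    """Order issues so that every issue listed under "after" is repaired first."""
    ordered: List[str] = []

    def visit(issue: str, stack: frozenset) -> None:
        if issue in ordered or issue in stack:  # already planned, or a cycle
            return
        for dep in graph.get(issue, {}).get("after", []):
            if dep in issues:
                visit(dep, stack | {issue})
        ordered.append(issue)

    for issue in issues:
        visit(issue, frozenset())
    return [(issue, graph.get(issue, {}).get("action", "no-op")) for issue in ordered]


local_graph: Graph = {
    "Insufficient CPU": {"action": "request CPU scheduling", "after": []},
    "IO problem": {"action": "throttle IO"},
}
# Hypothetical repair content from the cloud: repair IO before CPU.
cloud_content: Graph = {"Insufficient CPU": {"after": ["IO problem"]}}

adjusted = adjust_graph(local_graph, cloud_content)
strategy = plan_repairs(adjusted, ["Insufficient CPU", "IO problem"])
print(strategy)
```

In this sketch the cloud content changes only the ordering constraint, which is enough to flip the resulting strategy from CPU-first to IO-first without touching the local repair actions.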

Further, the anomaly diagnosis framework also includes a framework performance self-check module, and the method further includes: acquiring data related to anomaly diagnosis handling in the system; and analyzing that data through the framework performance self-check module to obtain the framework performance of the anomaly diagnosis framework.

A framework performance self-check module is configured in the anomaly diagnosis framework. The module can acquire data related to anomaly diagnosis handling in the system and analyze it to obtain the framework performance of the anomaly diagnosis framework, so that the framework checks its own performance. Corresponding handling can then be carried out according to the check results to further improve the framework performance.

For example, the framework performance self-check module may include units such as an instrumentation-point (event tracking) performance check unit, a data accuracy check unit, a subsystem diagnosis check unit, and a subsystem repair check unit. Through the subsystem diagnosis check unit, the anomaly diagnosis framework can self-check its performance in diagnosing anomalies; through the subsystem repair check unit, it can self-check its performance in repairing subsystem anomalies.
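As a rough illustration of what the subsystem diagnosis and repair check units might compute, the sketch below derives a few summary metrics from records of anomaly diagnosis handling. The record fields (`diagnosed`, `repaired`, `latency_ms`) and the metric names are assumptions; the text does not define which data or metrics are used.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class HandlingRecord:
    """One hypothetical record of anomaly diagnosis handling."""
    diagnosed: bool     # did the framework produce a diagnosis result?
    repaired: bool      # did the subsequent repair succeed?
    latency_ms: float   # time from anomaly report to diagnosis result


def framework_performance(records: List[HandlingRecord]) -> Dict[str, float]:
    """Summarize diagnosis rate, repair rate, and mean latency."""
    if not records:
        return {"diagnosis_rate": 0.0, "repair_rate": 0.0, "mean_latency_ms": 0.0}
    n = len(records)
    return {
        "diagnosis_rate": sum(r.diagnosed for r in records) / n,
        "repair_rate": sum(r.repaired for r in records) / n,
        "mean_latency_ms": sum(r.latency_ms for r in records) / n,
    }


records = [
    HandlingRecord(diagnosed=True, repaired=True, latency_ms=10.0),
    HandlingRecord(diagnosed=True, repaired=False, latency_ms=30.0),
]
performance = framework_performance(records)
print(performance)
```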

To better implement the system anomaly diagnosis method provided in the embodiments of this application, the embodiments of this application also provide a system anomaly diagnosis apparatus based on that method. Terms have the same meanings as in the system anomaly diagnosis method above, and specific implementation details can be found in the description of the method embodiments. FIG. 4 shows a block diagram of a system anomaly diagnosis apparatus according to an embodiment of this application.

As shown in FIG. 4, a system anomaly diagnosis apparatus 400 is provided. The system includes a plurality of subsystems and an anomaly diagnosis framework, the anomaly diagnosis framework includes an inter-process communication service and a subsystem agent corresponding to each subsystem, and the apparatus is applied to the anomaly diagnosis framework. The system anomaly diagnosis apparatus 400 may include: a connection processing module 410, which may be used to dispatch a corresponding message processing thread in response to detecting, through the inter-process communication service, anomaly information sent by a target subsystem among the plurality of subsystems; a data distribution module 420, which may be used to receive the anomaly information through the message processing thread and distribute it to the subsystem agent corresponding to the target subsystem; and a joint diagnosis module 430, which may be used to perform joint anomaly diagnosis through the subsystem agent corresponding to the target subsystem together with the subsystem agents corresponding to other subsystems among the plurality of subsystems, to obtain a joint anomaly diagnosis result.
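The division of work between the connection processing module and the data distribution module can be sketched as follows. This is a single-process approximation under explicit assumptions: the class `DiagnosisFramework` and its method names are hypothetical, the inter-process communication service is replaced by a direct method call, and each agent is modeled as a plain callable.

```python
import queue
import threading
from typing import Callable, Dict, Tuple


class DiagnosisFramework:
    def __init__(self) -> None:
        # subsystem name -> agent callable(anomaly_info) -> diagnosis result
        self.agents: Dict[str, Callable[[str], str]] = {}
        self.results: "queue.Queue[Tuple[str, str]]" = queue.Queue()

    def register_agent(self, subsystem: str, agent: Callable[[str], str]) -> None:
        self.agents[subsystem] = agent

    def on_anomaly(self, subsystem: str, anomaly_info: str) -> threading.Thread:
        """Connection processing: dispatch one message processing thread per message."""
        worker = threading.Thread(target=self._handle, args=(subsystem, anomaly_info))
        worker.start()
        return worker

    def _handle(self, subsystem: str, anomaly_info: str) -> None:
        """Data distribution: pass the anomaly info to the target subsystem's agent."""
        agent = self.agents[subsystem]
        self.results.put((subsystem, agent(anomaly_info)))


framework = DiagnosisFramework()
framework.register_agent("wifi", lambda info: f"diagnosed:{info}")
worker = framework.on_anomaly("wifi", "link drop")
worker.join()
first_result = framework.results.get(timeout=1)
print(first_result)
```

Handling each message on its own thread keeps a slow diagnosis in one subsystem from blocking anomaly reports arriving from others; a real framework would more likely use a bounded thread pool, which this sketch omits for brevity.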

In some embodiments of this application, the apparatus further includes an anomaly repair module used, after the joint anomaly diagnosis result is obtained by the subsystem agent corresponding to the target subsystem performing joint anomaly diagnosis together with the subsystem agents corresponding to other subsystems among the plurality of subsystems, to: make a joint anomaly repair decision for the plurality of subsystems based on the joint anomaly diagnosis result to obtain a multi-subsystem joint repair strategy; and perform multi-subsystem joint repair of the anomaly according to the multi-subsystem joint repair strategy.

In some embodiments of this application, the apparatus further includes an agent creation module used, before the corresponding message processing thread is dispatched in response to detecting, through the inter-process communication service, the anomaly information sent by the target subsystem among the plurality of subsystems, to: detect whether a subsystem among the plurality of subsystems has connected to the anomaly diagnosis framework through a connection interface encapsulated by the anomaly diagnosis framework; and, if a subsystem is detected to have connected to the anomaly diagnosis framework, create the subsystem agent corresponding to that subsystem.

In some embodiments of this application, the joint diagnosis module is used to: perform anomaly diagnosis based on the anomaly information through the subsystem agent corresponding to the target subsystem to obtain a first anomaly diagnosis result; if the subsystem agent corresponding to the target subsystem determines a first joint subsystem based on the first anomaly diagnosis result, send a collaborative diagnosis message from the subsystem agent corresponding to the target subsystem to the subsystem agent corresponding to the first joint subsystem; perform anomaly diagnosis through the subsystem agent corresponding to the first joint subsystem to obtain a second anomaly diagnosis result; and obtain the joint anomaly diagnosis result based on the first anomaly diagnosis result and the second anomaly diagnosis result.

In some embodiments of this application, the joint diagnosis module is used to: if the subsystem agent corresponding to the first joint subsystem determines a second joint subsystem based on the second anomaly diagnosis result, send a collaborative diagnosis message from the subsystem agent corresponding to the first joint subsystem to the subsystem agent corresponding to the second joint subsystem; perform anomaly diagnosis through the subsystem agent corresponding to the second joint subsystem to obtain a third anomaly diagnosis result; and obtain the joint anomaly diagnosis result based on the first anomaly diagnosis result, the second anomaly diagnosis result, and the third anomaly diagnosis result.

In some embodiments of this application, the apparatus further includes a data reporting module used to: transmit anomaly-related data to a preset reporting service, the preset reporting service being used to report the anomaly-related data to a cloud server; and receive anomaly repair content delivered within the device, the anomaly repair content being obtained by the cloud server through analysis of the anomaly-related data and transmitted by the cloud server to the device.

In some embodiments of this application, the anomaly diagnosis framework further includes a framework performance self-check module, which is used to: acquire data related to anomaly diagnosis handling in the system; and analyze that data to obtain the framework performance of the anomaly diagnosis framework.

It should be noted that although the detailed description above mentions several modules or units of the device used to perform actions, this division is not mandatory. In fact, according to the embodiments of this application, the features and functions of two or more of the modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of one module or unit described above may be further divided among multiple modules or units.

In addition, the embodiments of this application also provide an electronic device. FIG. 5 shows a block diagram of an electronic device according to an embodiment of this application. Specifically:

The electronic device may include components such as a processor 501 with one or more processing cores, a memory 502 comprising one or more computer-readable storage media, a power supply 503, and an input unit 504. Those skilled in the art will understand that the structure shown in FIG. 5 does not limit the electronic device, which may include more or fewer components than shown, combine certain components, or arrange the components differently. Specifically:

The processor 501 is the control center of the electronic device. It connects the various parts of the entire computer device through various interfaces and lines, and performs the functions of the computer device and processes data by running or executing software programs and/or modules stored in the memory 502 and by calling data stored in the memory 502, thereby monitoring the electronic device as a whole. Optionally, the processor 501 may include one or more processing cores. Preferably, the processor 501 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It is understood that the modem processor may also not be integrated into the processor 501.

The memory 502 can be used to store software programs and modules, and the processor 501 executes various functional applications and data processing by running the software programs and modules stored in the memory 502. The memory 502 may mainly include a program storage area and a data storage area. The program storage area may store the operating system and the application programs required for at least one function (such as a sound playback function or an image playback function); the data storage area may store data created through use of the computer device. In addition, the memory 502 may include high-speed random access memory and may also include non-volatile memory, such as at least one disk storage device, flash memory device, or other volatile solid-state storage device. Accordingly, the memory 502 may also include a memory controller to provide the processor 501 with access to the memory 502.

The electronic device also includes a power supply 503 that powers the various components. Preferably, the power supply 503 may be logically connected to the processor 501 through a power management system, so that charging, discharging, and power consumption management are handled through the power management system. The power supply 503 may also include any of the following components: one or more DC or AC power sources, a recharging system, a power failure detection circuit, a power converter or inverter, and a power status indicator.

The electronic device may also include an input unit 504, which can be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control.

Although not shown, the electronic device may also include a display unit and the like, which are not described here. Specifically, in this embodiment, the processor 501 in the electronic device loads the executable files corresponding to the processes of one or more computer programs into the memory 502 according to the following instructions, and runs the computer programs stored in the memory 502 to implement the functions of the foregoing embodiments of this application. For example, the processor 501 can perform the following steps:

In response to detecting, through the inter-process communication service, anomaly information sent by a target subsystem among the plurality of subsystems, dispatching a corresponding message processing thread; receiving the anomaly information through the message processing thread and distributing it to the subsystem agent corresponding to the target subsystem; and performing joint anomaly diagnosis through the subsystem agent corresponding to the target subsystem together with the subsystem agents corresponding to other subsystems among the plurality of subsystems, to obtain a joint anomaly diagnosis result.

In some embodiments of this application, after the joint anomaly diagnosis result is obtained by the subsystem agent corresponding to the target subsystem performing joint anomaly diagnosis together with the subsystem agents corresponding to other subsystems among the plurality of subsystems, the steps further include: making a joint anomaly repair decision for the plurality of subsystems based on the joint anomaly diagnosis result to obtain a multi-subsystem joint repair strategy; and performing multi-subsystem joint repair of the anomaly according to the multi-subsystem joint repair strategy.

In some embodiments of this application, before the corresponding message processing thread is dispatched in response to detecting, through the inter-process communication service, the anomaly information sent by the target subsystem among the plurality of subsystems, the steps further include: detecting whether a subsystem among the plurality of subsystems has connected to the anomaly diagnosis framework through a connection interface encapsulated by the anomaly diagnosis framework; and, if a subsystem is detected to have connected to the anomaly diagnosis framework, creating the subsystem agent corresponding to that subsystem.

In some embodiments of this application, performing joint anomaly diagnosis through the subsystem agent corresponding to the target subsystem together with the subsystem agents corresponding to other subsystems among the plurality of subsystems, to obtain the joint anomaly diagnosis result, includes: performing anomaly diagnosis based on the anomaly information through the subsystem agent corresponding to the target subsystem to obtain a first anomaly diagnosis result; if the subsystem agent corresponding to the target subsystem determines a first joint subsystem based on the first anomaly diagnosis result, sending a collaborative diagnosis message from the subsystem agent corresponding to the target subsystem to the subsystem agent corresponding to the first joint subsystem; performing anomaly diagnosis through the subsystem agent corresponding to the first joint subsystem to obtain a second anomaly diagnosis result; and obtaining the joint anomaly diagnosis result based on the first anomaly diagnosis result and the second anomaly diagnosis result.

In some embodiments of this application, obtaining the joint anomaly diagnosis result based on the first anomaly diagnosis result and the second anomaly diagnosis result includes: if the subsystem agent corresponding to the first joint subsystem determines a second joint subsystem based on the second anomaly diagnosis result, sending a collaborative diagnosis message from the subsystem agent corresponding to the first joint subsystem to the subsystem agent corresponding to the second joint subsystem; performing anomaly diagnosis through the subsystem agent corresponding to the second joint subsystem to obtain a third anomaly diagnosis result; and obtaining the joint anomaly diagnosis result based on the first anomaly diagnosis result, the second anomaly diagnosis result, and the third anomaly diagnosis result.

In some embodiments of this application, the steps further include: transmitting anomaly-related data to a preset reporting service, the preset reporting service being used to report the anomaly-related data to a cloud server; and receiving anomaly repair content delivered within the device, the anomaly repair content being obtained by the cloud server through analysis of the anomaly-related data and transmitted by the cloud server to the device.

In some embodiments of this application, the anomaly diagnosis framework further includes a framework performance self-check module, and the steps further include: acquiring data related to anomaly diagnosis handling in the system; and analyzing that data through the framework performance self-check module to obtain the framework performance of the anomaly diagnosis framework.

Those of ordinary skill in the art will understand that all or some of the steps in the methods of the above embodiments can be completed by a computer program, or by a computer program controlling the relevant hardware. The computer program can be stored in a computer-readable storage medium and loaded and executed by a processor.

To this end, the embodiments of this application also provide a storage medium storing a computer program that can be loaded by a processor to perform the steps of any of the methods provided in the embodiments of this application.

The storage medium may be a computer-readable storage medium, which may include read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and the like.

Because the computer program stored in the storage medium can perform the steps of any of the methods provided in the embodiments of this application, it can achieve the beneficial effects of those methods. See the preceding embodiments for details, which are not repeated here.

Other embodiments of this application will readily occur to those skilled in the art upon consideration of the specification and practice of the embodiments disclosed here. This application is intended to cover any variations, uses, or adaptations of this application that follow its general principles and include common knowledge or customary technical means in the art that are not disclosed in this application.

It should be understood that this application is not limited to the embodiments described above and shown in the accompanying drawings, and that various modifications and changes can be made without departing from its scope.

Claims (20)

一种系统异常诊断方法,其中,所述系统包括多个子系统及异常诊断框架,所述异常诊断框架中包括跨进程通信服务以及各所述子系统对应的子系统代理,所述方法应用于所述异常诊断框架,所述方法包括:A system anomaly diagnosis method, wherein the system includes multiple subsystems and an anomaly diagnosis framework, the anomaly diagnosis framework includes an inter-process communication service and subsystem agents corresponding to each of the subsystems, the method is applied to the anomaly diagnosis framework, and the method includes: 响应于通过所述跨进程通信服务监测到所述多个子系统中目标子系统发送的异常信息,分发对应的消息处理线程;In response to detecting abnormal information sent by the target subsystem in the multiple subsystems through the cross-process communication service, a corresponding message processing thread is distributed; 通过所述消息处理线程接收所述异常信息,并将所述异常信息分发至所述目标子系统对应的子系统代理;The exception information is received by the message processing thread and distributed to the subsystem agent corresponding to the target subsystem. 通过所述目标子系统对应的子系统代理,联合所述多个子系统中其它子系统对应的子系统代理进行异常联合诊断,得到异常联合诊断结果。By using the subsystem agent corresponding to the target subsystem, and in conjunction with the subsystem agents corresponding to other subsystems in the multiple subsystems, anomaly joint diagnosis is performed to obtain the anomaly joint diagnosis result. 根据权利要求1所述的方法,其中,在所述通过所述目标子系统对应的子系统代理,联合所述多个子系统中其它子系统对应的子系统代理进行异常联合诊断,得到异常联合诊断结果之后,所述方法还包括:According to the method of claim 1, after performing joint anomaly diagnosis by combining the subsystem agent corresponding to the target subsystem with the subsystem agents corresponding to other subsystems in the plurality of subsystems to obtain the joint anomaly diagnosis result, the method further includes: 根据所述异常联合诊断结果,对所述多个子系统进行异常联合修复决策,得到多子系统联合修复策略;Based on the joint diagnosis results of the anomalies, joint anomaly repair decisions are made for the multiple subsystems to obtain a joint repair strategy for multiple subsystems. 根据所述多子系统联合修复策略进行异常的多子系统联合修复。Perform joint repair of abnormal multi-subsystems according to the aforementioned multi-subsystem joint repair strategy. 
3. The method according to claim 1, wherein, before the message processing thread is dispatched in response to detecting, through the inter-process communication service, the anomaly information sent by the target subsystem, the method further comprises:
detecting whether a subsystem among the plurality of subsystems has connected to the anomaly diagnosis framework through a connection interface encapsulated by the anomaly diagnosis framework; and
if a subsystem among the plurality of subsystems is detected to have connected to the anomaly diagnosis framework, creating the subsystem agent corresponding to that subsystem.
4. The method according to claim 1, wherein performing the joint anomaly diagnosis by the subsystem agent corresponding to the target subsystem in conjunction with the subsystem agents corresponding to the other subsystems, to obtain the joint anomaly diagnosis result, comprises:
performing anomaly diagnosis according to the anomaly information by the subsystem agent corresponding to the target subsystem, to obtain a first anomaly diagnosis result;
if the subsystem agent corresponding to the target subsystem determines a first joint subsystem according to the first anomaly diagnosis result, sending a collaborative diagnosis message from the subsystem agent corresponding to the target subsystem to the subsystem agent corresponding to the first joint subsystem;
performing anomaly diagnosis by the subsystem agent corresponding to the first joint subsystem, to obtain a second anomaly diagnosis result; and
obtaining the joint anomaly diagnosis result according to the first anomaly diagnosis result and the second anomaly diagnosis result.
5. The method according to claim 4, wherein obtaining the joint anomaly diagnosis result according to the first anomaly diagnosis result and the second anomaly diagnosis result comprises:
if the subsystem agent corresponding to the first joint subsystem determines a second joint subsystem according to the second anomaly diagnosis result, sending a collaborative diagnosis message from the subsystem agent corresponding to the first joint subsystem to the subsystem agent corresponding to the second joint subsystem;
performing anomaly diagnosis by the subsystem agent corresponding to the second joint subsystem, to obtain a third anomaly diagnosis result; and
obtaining the joint anomaly diagnosis result according to the first anomaly diagnosis result, the second anomaly diagnosis result, and the third anomaly diagnosis result.
6. The method according to claim 1, wherein the method further comprises:
transmitting anomaly-related data to a preset reporting service, the preset reporting service being configured to report the anomaly-related data to a cloud server; and
receiving anomaly repair content transmitted in the device, the anomaly repair content being obtained by the cloud server through analysis of the anomaly-related data and transmitted to the device.
7. The method according to claim 1, wherein the anomaly diagnosis framework further comprises a framework performance self-check module, and the method further comprises:
obtaining data related to anomaly diagnosis processing in the system; and
analyzing the data related to anomaly diagnosis processing through the framework performance self-check module, to obtain the framework performance of the anomaly diagnosis framework.
8. The method according to claim 4, wherein determining the first joint subsystem according to the first anomaly diagnosis result comprises:
querying a preset diagnosis table for a first anomaly category corresponding to the first anomaly diagnosis result; and
querying the preset diagnosis table for the first joint subsystem corresponding to the first anomaly category.
9. The method according to claim 2, wherein making the joint anomaly repair decision for the plurality of subsystems according to the joint anomaly diagnosis result, to obtain the multi-subsystem joint repair strategy, comprises:
extracting anomaly problems from the joint anomaly diagnosis result; and
if at least two anomaly problems are extracted, ranking the at least two anomaly problems by priority to obtain an anomaly handling priority for each anomaly problem, the multi-subsystem joint repair strategy being to repair the anomalies in order of the anomaly handling priority.
10. The method according to claim 9, wherein ranking the at least two anomaly problems by priority to obtain the anomaly handling priority for each anomaly problem comprises:
querying a preset priority table for the anomaly handling priority of each extracted anomaly problem, wherein the preset priority table comprises anomaly handling priorities set in advance for different anomaly problems.
11. The method according to claim 6, wherein the anomaly diagnosis framework makes joint anomaly repair decisions based on a repair knowledge graph, and, after the anomaly repair content transmitted in the device is received, the method further comprises:
adjusting the repair knowledge graph based on the anomaly repair content; and
making a joint anomaly repair decision based on the adjusted repair knowledge graph, to obtain a multi-subsystem joint repair strategy.
12. The method according to claim 7, wherein the framework performance self-check module comprises a subsystem diagnosis detection unit, and analyzing the data related to anomaly diagnosis processing through the framework performance self-check module, to obtain the framework performance of the anomaly diagnosis framework, comprises:
analyzing the data related to anomaly diagnosis processing through the subsystem diagnosis detection unit, to obtain the diagnosis performance of the anomaly diagnosis framework with respect to anomalies.
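The chained joint diagnosis, the preset diagnosis table lookup, and the priority-ordered repair strategy recited in the method claims can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the application: the `Agent` class, the table contents, the subsystem names, and the convention that a lower priority number is repaired first.

```python
from dataclasses import dataclass

# Hypothetical preset diagnosis table (illustrative contents only):
# diagnosis result -> anomaly category, and category -> joint subsystem.
RESULT_TO_CATEGORY = {
    "player: no video frames": "decode_failure",
    "decoder: input starved": "network_stall",
    "network: link down": "link_failure",
}
CATEGORY_TO_JOINT = {
    "decode_failure": "decoder",
    "network_stall": "network",
    "link_failure": None,  # no further subsystem to consult
}
# Hypothetical preset priority table: a lower number is repaired first.
PRIORITY = {"link_failure": 0, "network_stall": 1, "decode_failure": 2}


@dataclass
class Agent:
    """Stand-in for a subsystem agent; it returns a canned diagnosis result."""
    name: str
    finding: str

    def diagnose(self, info: str) -> str:
        return self.finding


def joint_diagnose(agents, target, info):
    """Follow the chain target -> first joint subsystem -> second joint subsystem ..."""
    results = []
    seen = set()
    current = target
    while current in agents and current not in seen:
        seen.add(current)  # guard against cycles in the table
        result = agents[current].diagnose(info)
        results.append((current, result))
        category = RESULT_TO_CATEGORY.get(result)
        current = CATEGORY_TO_JOINT.get(category)
    return results


def repair_order(results):
    """Extract anomaly problems and sort them by the preset priority table."""
    problems = {RESULT_TO_CATEGORY[r] for _, r in results if r in RESULT_TO_CATEGORY}
    return sorted(problems, key=lambda c: PRIORITY.get(c, len(PRIORITY)))


agents = {
    "player": Agent("player", "player: no video frames"),
    "decoder": Agent("decoder", "decoder: input starved"),
    "network": Agent("network", "network: link down"),
}
results = joint_diagnose(agents, "player", "black screen")
order = repair_order(results)
```

In this sketch the collaborative diagnosis message is modeled as a direct call to the next agent; a real framework would presumably pass a message between agents instead.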
13. A system anomaly diagnosis apparatus, wherein the system comprises a plurality of subsystems and an anomaly diagnosis framework, the anomaly diagnosis framework comprises an inter-process communication service and a subsystem agent corresponding to each of the subsystems, and the apparatus is applied to the anomaly diagnosis framework, the apparatus comprising:
a connection processing module, configured to dispatch a corresponding message processing thread in response to detecting, through the inter-process communication service, anomaly information sent by a target subsystem among the plurality of subsystems;
a data distribution module, configured to receive the anomaly information through the message processing thread and distribute the anomaly information to the subsystem agent corresponding to the target subsystem; and
a joint diagnosis module, configured to perform joint anomaly diagnosis by the subsystem agent corresponding to the target subsystem in conjunction with the subsystem agents corresponding to other subsystems among the plurality of subsystems, to obtain a joint anomaly diagnosis result.
14. The apparatus according to claim 13, wherein the apparatus further comprises an anomaly repair module configured to, after the joint anomaly diagnosis is performed by the subsystem agent corresponding to the target subsystem in conjunction with the subsystem agents corresponding to the other subsystems to obtain the joint anomaly diagnosis result: make a joint anomaly repair decision for the plurality of subsystems according to the joint anomaly diagnosis result, to obtain a multi-subsystem joint repair strategy; and perform multi-subsystem joint repair of the anomaly according to the multi-subsystem joint repair strategy.
15. The apparatus according to claim 13, wherein the apparatus further comprises an agent creation module configured to, before the message processing thread is dispatched in response to detecting, through the inter-process communication service, the anomaly information sent by the target subsystem: detect whether a subsystem among the plurality of subsystems has connected to the anomaly diagnosis framework through a connection interface encapsulated by the anomaly diagnosis framework; and, if a subsystem among the plurality of subsystems is detected to have connected to the anomaly diagnosis framework, create the subsystem agent corresponding to that subsystem.
16. The apparatus according to claim 13, wherein the joint diagnosis module is configured to: perform anomaly diagnosis according to the anomaly information by the subsystem agent corresponding to the target subsystem, to obtain a first anomaly diagnosis result; if the subsystem agent corresponding to the target subsystem determines a first joint subsystem according to the first anomaly diagnosis result, send a collaborative diagnosis message from the subsystem agent corresponding to the target subsystem to the subsystem agent corresponding to the first joint subsystem; perform anomaly diagnosis by the subsystem agent corresponding to the first joint subsystem, to obtain a second anomaly diagnosis result; and obtain the joint anomaly diagnosis result according to the first anomaly diagnosis result and the second anomaly diagnosis result.
17. The apparatus according to claim 16, wherein the joint diagnosis module is configured to: if the subsystem agent corresponding to the first joint subsystem determines a second joint subsystem according to the second anomaly diagnosis result, send a collaborative diagnosis message from the subsystem agent corresponding to the first joint subsystem to the subsystem agent corresponding to the second joint subsystem; perform anomaly diagnosis by the subsystem agent corresponding to the second joint subsystem, to obtain a third anomaly diagnosis result; and obtain the joint anomaly diagnosis result according to the first anomaly diagnosis result, the second anomaly diagnosis result, and the third anomaly diagnosis result.
18. The apparatus according to claim 13, wherein the apparatus further comprises a data reporting module configured to: transmit anomaly-related data to a preset reporting service, the preset reporting service being configured to report the anomaly-related data to a cloud server; and receive anomaly repair content transmitted in the device, the anomaly repair content being obtained by the cloud server through analysis of the anomaly-related data and transmitted to the device.
19. A storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor of a computer, causes the computer to perform the method according to any one of claims 1 to 12.
20. An electronic device, comprising: a memory storing a computer program; and a processor configured to read the computer program stored in the memory to perform the method according to any one of claims 1 to 12.
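As a rough illustration of how the connection processing, data distribution, and agent creation modules of the apparatus claims might fit together, the sketch below uses an in-process `queue.Queue` as a stand-in for the inter-process communication service and one `threading.Thread` per message as the message processing thread. The class name `DiagnosisFramework`, its methods, and the `None` shutdown sentinel are hypothetical and are not taken from the application.

```python
import queue
import threading


class DiagnosisFramework:
    """Illustrative stand-in for the anomaly diagnosis framework."""

    def __init__(self):
        self.ipc = queue.Queue()   # stand-in for the inter-process communication service
        self.agents = {}           # subsystem name -> agent callable
        self.results = []          # (subsystem, diagnosis result) pairs
        self._lock = threading.Lock()

    def connect(self, subsystem, agent):
        # Agent creation: register an agent when a subsystem connects.
        self.agents[subsystem] = agent

    def report(self, subsystem, info):
        # A subsystem sends anomaly information over the IPC stand-in.
        self.ipc.put((subsystem, info))

    def _handle(self, subsystem, info):
        # Data distribution: pass the information to the target subsystem's agent.
        agent = self.agents.get(subsystem)
        if agent is None:
            return  # no agent was created for this subsystem
        result = agent(info)
        with self._lock:
            self.results.append((subsystem, result))

    def serve(self):
        # Connection processing: dispatch one message processing thread per
        # message until the None sentinel (an assumption of this sketch) arrives.
        workers = []
        while True:
            message = self.ipc.get()
            if message is None:
                break
            worker = threading.Thread(target=self._handle, args=message)
            worker.start()
            workers.append(worker)
        for worker in workers:
            worker.join()


framework = DiagnosisFramework()
framework.connect("player", lambda info: f"player diagnosed: {info}")
framework.report("player", "black screen")
framework.report("tuner", "no signal")   # no agent registered, so it is ignored
framework.ipc.put(None)                  # stop serving
framework.serve()
```

A thread per message keeps the sketch short; a production framework would more likely use a bounded worker pool and a real IPC mechanism.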
Application PCT/CN2025/079139, priority date 2024-05-07, filing date 2025-02-25: System anomaly diagnosis method and apparatus, storage medium, and electronic device. Status: Pending. Published as WO2025232323A1 (en).

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202410559269.7 2024-05-07
CN202410559269.7A CN118567891A (en) 2024-05-07 2024-05-07 System abnormality diagnosis method, device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
WO2025232323A1 (en) 2025-11-13

Family

ID=92477489

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2025/079139 Pending WO2025232323A1 (en) 2024-05-07 2025-02-25 System anomaly diagnosis method and apparatus, storage medium, and electronic device

Country Status (2)

Country Link
CN (1) CN118567891A (en)
WO (1) WO2025232323A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118567891A (en) * 2024-05-07 2024-08-30 深圳Tcl数字技术有限公司 System abnormality diagnosis method, device, storage medium and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160330067A1 (en) * 2014-01-21 2016-11-10 Huawei Technologies Co., Ltd. Network Service Fault Handling Method, Service Management System, and System Management Module
CN111130934A (en) * 2019-12-20 2020-05-08 国铁吉讯科技有限公司 Monitoring method, device and system of communication system
CN111831512A (en) * 2020-07-15 2020-10-27 北京百度网讯科技有限公司 Method, device, electronic device and storage medium for troubleshooting abnormal operation and maintenance
CN113343912A (en) * 2021-06-29 2021-09-03 北京奇艺世纪科技有限公司 Method and device for determining play abnormality reason and electronic equipment
CN118567891A (en) * 2024-05-07 2024-08-30 深圳Tcl数字技术有限公司 System abnormality diagnosis method, device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN118567891A (en) 2024-08-30

Similar Documents

Publication Publication Date Title
US11218392B2 (en) Media stream monitor with heartbeat timer
US9497095B2 (en) Dynamic control over tracing of messages received by a message broker
CN111083225A (en) Data processing method and device in Internet of things platform and Internet of things platform
CN108521353B (en) Processing method and device for positioning performance bottleneck and readable storage medium
US11838384B2 (en) Intelligent scheduling apparatus and method
US8756314B2 (en) Selective registration for remote event notifications in processing node clusters
CN109656782A (en) Visual scheduling monitoring method, device and server
CN102360362A (en) Seating screen recording method, quality inspection method and related equipment
CN111190755A (en) Application program function exception handling method and device
WO2025232323A1 (en) System anomaly diagnosis method and apparatus, storage medium, and electronic device
CN106155828A (en) For play-back application resource control method and equipment
EP3691261A1 (en) Method and device for locating video service fault, and storage medium
CN106789209B (en) Exception handling method and device
US9819933B2 (en) Automated testing of media devices
US9372722B2 (en) Reliable asynchronous processing of a synchronous request
CN110554929A (en) Data verification method and device, computer equipment and storage medium
CN115658500A (en) A Vue-based front-end error log upload method and system in hybrid development
CN113595814A (en) Message delay detection method and device, electronic equipment and storage medium
CN114598894B (en) Interactive message processing method, device, equipment and medium
CN112995648B (en) Internet TV full-process fault diagnosis method, device and computing equipment
CN112764914B (en) Intelligent task distribution system and intelligent task analysis method, device and electronic equipment
CN116980331A (en) Device testing method and apparatus, electronic device, and computer-readable storage medium
CN114007090B (en) Video live broadcast establishing method and device, storage medium and electronic equipment
CN112950447A (en) Resource scheduling method, device, server and storage medium
CN109039770A (en) A kind of method for refreshing, device and the relevant device of server CMC