CN102375775A - System unrecoverable error indication signal detection circuit - Google Patents

Publication number
CN102375775A
Authority
CN
China
Prior art keywords
indication signal
unrecoverable error
error indication
programmable logic
logic device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2010102532546A
Other languages
Chinese (zh)
Other versions
CN102375775B (en)
Inventor
蔡育生
范文纲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Tonglu Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
State Grid Corp of China SGCC
Original Assignee
Inventec Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inventec Corp
Priority to CN201010253254.6A
Publication of CN102375775A
Application granted
Publication of CN102375775B
Legal status: Active
Anticipated expiration

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The present invention discloses a system unrecoverable error indication signal detection circuit for use in a computer system, comprising: a plurality of central processing units, at least one of which outputs an unrecoverable error indication signal when an unrecoverable error occurs in the computer system; a complex programmable logic device electrically coupled to the central processing units; a south bridge electrically coupled to the complex programmable logic device; and a baseboard management controller electrically coupled to the complex programmable logic device. When the unrecoverable error indication signal is a single pulse signal or a plurality of consecutive pulse signals, the complex programmable logic device transmits it to the south bridge or to the baseboard management controller, respectively, to restart the computer system. The detection circuit of the present invention can thus route the indication signal to the south bridge or the baseboard management controller according to the type of the signal, so that the south bridge or the baseboard management controller outputs a restart command and restarts the system.

Description

System unrecoverable error indication signal detection circuit
Technical Field
The present invention relates to computer systems, and more particularly, to a system unrecoverable error indication signal detection circuit in a computer system.
Background
A running server system occasionally encounters system errors, which are generally divided into recoverable errors and unrecoverable errors. To improve reliability when recoverable errors occur, a server system is typically configured to capture recoverable (correctable) errors and write them to an error log as they occur. This capture-and-log handler gives the user an opportunity to replace a defective memory cell before the entire system crashes, so that the system can resume normal operation. When an unrecoverable error occurs, however, the server system can no longer continue to operate and must be restarted.
In view of the above, there is an urgent need in the art for a detection circuit that, when an unrecoverable error occurs in the system, can issue a restart command to restart the system and record specific information about the CPU involved.
Disclosure of Invention
To address the shortcomings of the prior art in detecting unrecoverable errors in a computer system, the invention provides a novel system unrecoverable error indication signal detection circuit.
According to one aspect of the present invention, there is provided a system unrecoverable error indication signal detection circuit for use in a computer system, the system unrecoverable error indication signal detection circuit comprising:
a plurality of central processing units, at least one of which outputs an unrecoverable error indication signal when an unrecoverable error occurs in the computer system;
a complex programmable logic device electrically coupled to the central processing units and receiving the unrecoverable error indication signal;
a south bridge electrically coupled to the complex programmable logic device; and
a baseboard management controller electrically coupled to the complex programmable logic device;
when the unrecoverable error indication signal is a single pulse signal, the complex programmable logic device transmits the unrecoverable error indication signal to the south bridge, and the south bridge outputs a restart command to restart the computer system; when the unrecoverable error indication signal is two or more consecutive pulse signals, the complex programmable logic device transmits the unrecoverable error indication signal to the baseboard management controller, and the baseboard management controller outputs a restart command to restart the computer system.
Preferably, the complex programmable logic device transmits the received unrecoverable error indication signal to the south bridge or the baseboard management controller, and the baseboard management controller generates an error log. Further, the error log includes the number of the central processing unit associated with the unrecoverable error indication signal.
Preferably, the system unrecoverable error indication signal detection circuit further includes a plurality of voltage conversion modules. Each voltage conversion module is electrically coupled to a corresponding central processing unit, receives the unrecoverable error indication signal output from that central processing unit, and amplifies the strength of the signal. According to an embodiment of the present invention, the voltage conversion module includes an NPN transistor having an emitter coupled to the central processing unit and a collector coupled to the complex programmable logic device. According to another embodiment of the present invention, the voltage conversion module includes a CMOS transistor having a source coupled to the central processing unit and a drain coupled to the complex programmable logic device.
Preferably, when an unrecoverable critical error occurs in the computer system, the unrecoverable error indication signal from the corresponding central processing unit is active low.
Preferably, the baseboard management controller records, through one of a plurality of general-purpose input/output (GPIO) ports, the system unrecoverable error indication signal sent by one of the central processing units.
Preferably, the computer system is a server.
With this system unrecoverable error indication signal detection circuit, when the complex programmable logic device receives the unrecoverable error indication signal, it can transmit the signal to the south bridge or the baseboard management controller according to the type of the signal, so that the system is restarted through the south bridge or the baseboard management controller. In addition, when the complex programmable logic device receives the unrecoverable error indication signal, the baseboard management controller can generate an error log that explicitly records which central processing unit the signal originated from.
Drawings
The various aspects of the present invention will become more apparent to the reader after reading the detailed description with reference to the attached drawings, in which:
FIG. 1 is a diagram illustrating the overall architecture of a detection circuit for detecting a system unrecoverable error indication signal in a computer system, in accordance with one embodiment of the present invention; and
FIG. 2 further illustrates a circuit schematic of the voltage conversion module in the system unrecoverable error indication signal detection circuit shown in FIG. 1.
Detailed Description
Embodiments of the present invention will be described in further detail below with reference to the accompanying drawings.
FIG. 1 is a diagram illustrating the overall architecture of a detection circuit for detecting a system unrecoverable error indication signal in a computer system, in accordance with one embodiment of the present invention. The computer system here may be, for example, a server. Referring to FIG. 1, the system unrecoverable error indication signal detection circuit (hereinafter referred to as the detection circuit for simplicity) includes central processing units (CPUs) 101 and 103, a complex programmable logic device 109, a south bridge 111, and a baseboard management controller 113. When an unrecoverable error occurs in the computer system, the central processing unit 101 or the central processing unit 103 outputs an unrecoverable error indication signal. The complex programmable logic device 109 is electrically coupled to the central processing units 101 and 103 and receives the system unrecoverable error indication signal from them.
In addition, the south bridge 111 and the baseboard management controller 113 are each electrically coupled to the complex programmable logic device 109. When the unrecoverable error indication signal from a central processing unit is a single pulse signal, the complex programmable logic device 109 transmits the signal to the south bridge 111, and the south bridge 111 outputs a restart command, thereby restarting the computer system. When the unrecoverable error indication signal is two or more consecutive pulse signals, the complex programmable logic device 109 transmits the signal to the baseboard management controller 113, and the baseboard management controller 113 outputs a restart command to restart the computer system. It should be noted that two restart paths are provided because the south bridge hangs when the unrecoverable error indication signal is a plurality of consecutive pulse signals; in that case the baseboard management controller 113 controls the system restart instead.
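The routing rule described above can be sketched in a few lines. The following Python model is only an illustration under stated assumptions, not the device's actual logic (which would normally be written in a hardware description language): it assumes an active-low signal sampled at a fixed rate, treats each high-to-low transition as one pulse, and returns the restart path. The names `Target`, `count_low_pulses`, and `route_error_signal` are hypothetical.

```python
from enum import Enum


class Target(Enum):
    """Hypothetical labels for where the CPLD forwards the signal."""
    NONE = "none"
    SOUTH_BRIDGE = "south bridge"
    BMC = "baseboard management controller"


def count_low_pulses(samples):
    """Count active-low pulses in a list of 0/1 samples.

    Assumption: the line idles high, so every high-to-low
    transition marks the start of one pulse.
    """
    pulses = 0
    previous = 1
    for level in samples:
        if previous == 1 and level == 0:
            pulses += 1
        previous = level
    return pulses


def route_error_signal(samples):
    """Pick the restart path from the number of pulses observed."""
    pulses = count_low_pulses(samples)
    if pulses == 0:
        return Target.NONE          # no error indicated
    if pulses == 1:
        return Target.SOUTH_BRIDGE  # single pulse: south bridge restarts
    return Target.BMC               # consecutive pulses: BMC restarts
```

Under these assumptions, `route_error_signal([1, 0, 0, 1])` selects the south bridge, while `route_error_signal([1, 0, 1, 0, 1])` selects the baseboard management controller.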
It will be understood by those skilled in the art that, in a computer system with a single CPU, the CPU itself detects an unrecoverable critical error and then notifies the south bridge to issue a restart command. In this case the source of the unrecoverable error indication signal is unambiguous. In prior-art computer systems having two or more CPUs, however, the unrecoverable error indication signals are transmitted to a control unit together, and the control unit requests the south bridge to output a restart command. As a result, it cannot be determined which CPU sent the unrecoverable error indication signal, and when the signal is a series of consecutive pulses, the south bridge hangs and the system cannot be restarted in time.
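The loss of source information in the prior-art arrangement can be illustrated with a small model. This is a minimal sketch assuming active-low signals and a simple merge of all CPU lines into one; the function names are hypothetical, and the keys 101 and 103 are the reference numerals of the two central processing units in FIG. 1.

```python
def combined_line(cpu_lines):
    """Prior-art style: all active-low lines are merged into one.

    Returns 0 (asserted) if any CPU drives its line low.
    """
    return 0 if any(level == 0 for level in cpu_lines.values()) else 1


def failing_cpus(cpu_lines):
    """Per-CPU inputs: the asserting CPU can be read back directly."""
    return sorted(cpu for cpu, level in cpu_lines.items() if level == 0)


# Two different faults look identical on the merged line...
fault_on_101 = {101: 0, 103: 1}
fault_on_103 = {101: 1, 103: 0}
same_on_merged_line = combined_line(fault_on_101) == combined_line(fault_on_103)

# ...but remain distinguishable when each CPU has its own input.
sources = (failing_cpus(fault_on_101), failing_cpus(fault_on_103))
```

Here `same_on_merged_line` is true, whereas `sources` keeps the two faults apart, which is the property the per-CPU inputs of the complex programmable logic device are meant to provide.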
According to an embodiment of the present invention, the detection circuit further includes a voltage conversion module 105 and a voltage conversion module 107, wherein the voltage conversion module 105 is electrically connected to the central processing unit 101 and the complex programmable logic device 109, and the voltage conversion module 107 is electrically connected to the central processing unit 103 and the complex programmable logic device 109. It will be understood by those skilled in the art that the voltage conversion modules 105 and 107 in this embodiment are not essential to the detection circuit of the present invention; they serve only to amplify or enhance the strength of the unrecoverable error indication signal, and the present invention is not limited thereto.
From the above, it can be seen that the detection circuit of the present invention can clearly identify the number of the central processing unit associated with an unrecoverable error indication signal. Preferably, the voltage conversion modules 105 and 107 are conversion circuits having the same electronic component configuration.
It should be noted that, whether the received unrecoverable error indication signal is a single pulse or a plurality of consecutive pulses, the complex programmable logic device 109 sends identification information to the baseboard management controller 113, and the baseboard management controller 113 generates an error log. For example, the error log includes the number of the central processing unit associated with the unrecoverable error indication signal. Thus, when the central processing unit 101 or 103 sends an indication signal, the baseboard management controller 113 records a log entry stating whether the unrecoverable error indication signal was sent by the central processing unit 101 or by the central processing unit 103.
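The logging behaviour described above can be sketched as follows. This is a minimal illustration, not the baseboard management controller's firmware: the mapping `CPU_BY_INPUT`, the classes `ErrorLogEntry` and `BmcErrorLog`, and the log fields are all hypothetical, and only the CPU number is named by the description as log content.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical mapping from a CPLD input index to the CPU reference numeral.
CPU_BY_INPUT = {0: 101, 1: 103}


@dataclass(frozen=True)
class ErrorLogEntry:
    cpu_number: int         # which CPU raised the signal
    pulse_count: int        # 1 = single pulse, >1 = consecutive pulses
    restart_issued_by: str  # component that outputs the restart command


@dataclass
class BmcErrorLog:
    entries: List[ErrorLogEntry] = field(default_factory=list)

    def record(self, input_index: int, pulse_count: int) -> ErrorLogEntry:
        """Log an error for either signal type, as the description requires."""
        if pulse_count < 1:
            raise ValueError("no error pulse was observed")
        if pulse_count == 1:
            issuer = "south bridge"
        else:
            issuer = "baseboard management controller"
        entry = ErrorLogEntry(CPU_BY_INPUT[input_index], pulse_count, issuer)
        self.entries.append(entry)
        return entry
```

For instance, `BmcErrorLog().record(1, 3)` yields an entry whose `cpu_number` is 103 and whose restart is attributed to the baseboard management controller.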
FIG. 2 further illustrates a circuit schematic of the voltage conversion module in the system unrecoverable error indication signal detection circuit shown in FIG. 1. The voltage conversion module 105 corresponding to the central processing unit 101 and the voltage conversion module 107 corresponding to the central processing unit 103 can be configured identically, so only one voltage conversion module is described in detail below.
For convenience of description, the unrecoverable error indication signal is assumed to be active low. That is, a low level on the unrecoverable error indication signal indicates that the system has a serious, unrecoverable error; conversely, a high level indicates that the system is operating normally. It should be understood that the unrecoverable error indication signal may instead be defined as active high, with the component types and parameters in the voltage conversion module modified accordingly.
Taking the active-low case as an example, the unrecoverable error indication signal from the central processing unit 101 is electrically coupled to the emitter of the NPN transistor Q1. When the signal is at a high level, Q1 does not meet its turn-on condition and remains off, so the collector of Q1 is at a high level. Since the collector of Q1 is electrically connected to the complex programmable logic device 109, the general-purpose input/output (GPIO) port corresponding to the collector of Q1 also receives a high-level signal, and the complex programmable logic device 109 does not notify the south bridge or the baseboard management controller to issue a restart command. On the other hand, when the signal is at a low level, the base-emitter junction of Q1 is forward biased and Q1 conducts, so that the collector and the emitter of Q1 form an electrical path and the collector assumes a low level. Since the collector of Q1 is electrically connected to the complex programmable logic device 109, the GPIO port corresponding to the collector of Q1 receives a low-level signal, and the complex programmable logic device 109 transmits information to the baseboard management controller 113 to log the current error. Preferably, the baseboard management controller records, through one of a plurality of GPIO ports, the system unrecoverable error indication signal sent by one of the central processing units. For example, the error log shows that the central processing unit 101 sent an unrecoverable error indication signal when a serious error occurred in the system.
It should be understood by those skilled in the art that although the voltage conversion module in FIG. 2 performs level conversion mainly through an NPN transistor (e.g., converting a 1.1 V level signal to a 3.3 V level signal), the present invention is not limited thereto. For example, a CMOS transistor may be used instead of the NPN transistor to perform the same level conversion function.
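The behaviour of the NPN stage can be approximated with a simple switch model. The sketch below is illustrative only: the 1.1 V and 3.3 V levels come from the example in the description, while the base bias, the pull-up of the collector to 3.3 V, the 0.6 V turn-on threshold, the 0.2 V saturation drop, and the 0.8 V input-low threshold are assumptions, since the schematic of FIG. 2 is not reproduced in the text.

```python
V_CPU_HIGH = 1.1      # CPU-side logic high (from the description's example)
V_CPLD_HIGH = 3.3     # CPLD-side logic high (from the description's example)
V_BASE_BIAS = 1.1     # assumed: base biased from the CPU-side rail
V_BE_ON = 0.6         # assumed base-emitter turn-on voltage
V_CE_SAT = 0.2        # assumed collector-emitter drop when conducting
V_CPLD_LOW_MAX = 0.8  # assumed highest voltage the CPLD reads as low


def collector_voltage(v_emitter):
    """Idealized collector voltage of Q1 for a given emitter (CPU) voltage."""
    conducting = (V_BASE_BIAS - v_emitter) >= V_BE_ON
    if conducting:
        # Q1 is on: the collector follows the low emitter level.
        return v_emitter + V_CE_SAT
    # Q1 is off: the assumed pull-up holds the collector at the CPLD rail.
    return V_CPLD_HIGH


def error_asserted(v_emitter):
    """True when the CPLD input sees an active-low error indication."""
    return collector_voltage(v_emitter) < V_CPLD_LOW_MAX
```

With these assumed values, a 1.1 V (idle) emitter level leaves the collector at 3.3 V, and a 0 V (error) emitter level pulls the collector low enough for the complex programmable logic device to register the error.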
With this system unrecoverable error indication signal detection circuit, when the complex programmable logic device receives the unrecoverable error indication signal, it can transmit the signal to the south bridge or the baseboard management controller according to the type of the signal, so that the system is restarted through the south bridge or the baseboard management controller. In addition, when the complex programmable logic device receives the unrecoverable error indication signal, the baseboard management controller can generate an error log that explicitly records which central processing unit the signal originated from.
Hereinbefore, specific embodiments of the present invention are described with reference to the drawings. However, those skilled in the art will appreciate that various modifications and substitutions can be made to the specific embodiments of the present invention without departing from the spirit and scope of the invention. Such modifications and substitutions are intended to be included within the scope of the present invention as defined by the appended claims.

Claims (9)

1. A system unrecoverable error indication signal detection circuit for use in a computer system, the system unrecoverable error indication signal detection circuit comprising:
a plurality of central processing units, at least one of which outputs an unrecoverable error indication signal when an unrecoverable error occurs in the computer system;
a complex programmable logic device electrically coupled to the central processing units and receiving the unrecoverable error indication signal;
a south bridge electrically coupled to the complex programmable logic device; and
a baseboard management controller electrically coupled to the complex programmable logic device;
when the unrecoverable error indication signal is a single pulse signal, the complex programmable logic device transmits the unrecoverable error indication signal to the south bridge, and the south bridge outputs a restart command to restart the computer system; when the unrecoverable error indication signal is two or more consecutive pulse signals, the complex programmable logic device transmits the unrecoverable error indication signal to the baseboard management controller, and the baseboard management controller outputs a restart command to restart the computer system.
2. The system unrecoverable error indication signal detection circuit according to claim 1, wherein said complex programmable logic device transmits the received unrecoverable error indication signal to said south bridge or said baseboard management controller, and an error log is generated by said baseboard management controller.
3. The system unrecoverable error indication signal detection circuit according to claim 2, wherein said error log includes the number of the central processing unit associated with the unrecoverable error indication signal.
4. The system unrecoverable error indication signal detection circuit according to claim 1, further comprising a plurality of voltage conversion modules, each of the voltage conversion modules being electrically coupled to a corresponding central processing unit, receiving the unrecoverable error indication signal output from that central processing unit, and amplifying the strength of the unrecoverable error indication signal.
5. The system unrecoverable error indication signal detection circuit according to claim 4, wherein said voltage conversion module includes an NPN transistor having an emitter coupled to said central processing unit and a collector coupled to said complex programmable logic device.
6. The system unrecoverable error indication signal detection circuit according to claim 4, wherein said voltage conversion module includes a CMOS transistor having a source coupled to said central processing unit and a drain coupled to said complex programmable logic device.
7. The system unrecoverable error indication signal detection circuit according to claim 1, wherein the unrecoverable error indication signal from the corresponding central processing unit is active low when an unrecoverable critical error occurs in said computer system.
8. The system unrecoverable error indication signal detection circuit according to claim 1, wherein the baseboard management controller records, through one of a plurality of general-purpose input/output (GPIO) ports, the system unrecoverable error indication signal sent by one of the central processing units.
9. The system unrecoverable error indication signal detecting circuit according to claim 1, wherein said computer system is a server.
CN201010253254.6A 2010-08-11 2010-08-11 A kind of computer system with detection system unrecoverable error indication signal Active CN102375775B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010253254.6A CN102375775B (en) 2010-08-11 2010-08-11 A kind of computer system with detection system unrecoverable error indication signal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010253254.6A CN102375775B (en) 2010-08-11 2010-08-11 A kind of computer system with detection system unrecoverable error indication signal

Publications (2)

Publication Number Publication Date
CN102375775A true CN102375775A (en) 2012-03-14
CN102375775B CN102375775B (en) 2014-08-20

Family

ID=45794409

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010253254.6A Active CN102375775B (en) 2010-08-11 2010-08-11 A kind of computer system with detection system unrecoverable error indication signal

Country Status (1)

Country Link
CN (1) CN102375775B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1949182A (en) * 2005-10-14 2007-04-18 戴尔产品有限公司 Detecting correctable errors and logging information relating to their location in memory
CN101630278A (en) * 2008-07-18 2010-01-20 深圳富泰宏精密工业有限公司 Method for recording crash abnormal information of electronic device and electronic device

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105393178A (en) * 2014-04-25 2016-03-09 三菱电机株式会社 Programmable logic controller
CN105393178B (en) * 2014-04-25 2017-05-24 三菱电机株式会社 Programmable logic controller
US10353763B2 (en) 2014-06-24 2019-07-16 Huawei Technologies Co., Ltd. Fault processing method, related apparatus, and computer
US11360842B2 (en) 2014-06-24 2022-06-14 Huawei Technologies Co., Ltd. Fault processing method, related apparatus, and computer
CN104503947A (en) * 2014-12-16 2015-04-08 华为技术有限公司 Multi-channel server and its signal processing method
CN106201961A (en) * 2016-07-01 2016-12-07 英业达科技有限公司 Control calculator system and the method for processor working frequency
CN106919490A (en) * 2017-02-19 2017-07-04 郑州云海信息技术有限公司 Server failure detection method and device
CN109932995A (en) * 2017-12-18 2019-06-25 鸿富锦精密电子(天津)有限公司 Electronic device
CN109932995B (en) * 2017-12-18 2021-06-15 鸿富锦精密电子(天津)有限公司 Electronic device
CN111949457A (en) * 2020-07-27 2020-11-17 中国长城科技集团股份有限公司 Server fault chip detection method and device

Also Published As

Publication number Publication date
CN102375775B (en) 2014-08-20

Similar Documents

Publication Publication Date Title
CN102375775B (en) A kind of computer system with detection system unrecoverable error indication signal
CN104991629B (en) Power-fail detecting system and its method
WO2021169260A1 (en) System board card power supply test method, apparatus and device, and storage medium
CN104320308B (en) A kind of method and device of server exception detection
KR101558687B1 (en) Serial communication test device, system including the same and method thereof
US8689059B2 (en) System and method for handling system failure
CN102467440A (en) Internal memory error detection system and method
CN102467417B (en) computer system
CN104572362A (en) Electronic device capable of detecting hard disk state
US9026685B2 (en) Memory module communication control
US9626241B2 (en) Watchdogable register-based I/O
CN201859389U (en) Reset management chip and reset system
US9158646B2 (en) Abnormal information output system for a computer system
US20140143597A1 (en) Computer system and operating method thereof
US20130151746A1 (en) Electronic device with general purpose input output expander and signal detection method
US7890831B2 (en) Processor test system utilizing functional redundancy
US20120144245A1 (en) Computing device and method for detecting pci system errors in the computing device
TW201423390A (en) Computer system and operating method thereof
TW201422923A (en) Fan rotation speed control system and method for controlling rotation speed of fan
CN102681928A (en) Abnormal information output system of computer system
JP4299634B2 (en) Information processing apparatus and clock abnormality detection program for information processing apparatus
TWI584114B (en) Power failure detection system and method thereof
JP5561790B2 (en) Hardware failure suspect identification device, hardware failure suspect identification method, and program
CN103186443B (en) Signal control method and system thereof
JP2014059685A (en) Programmable logic device, information processor, suspect place pointing-out method and program

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: HANGZHOU POWER SUPPLY COMPANY,STATE GRID ZHEJIANG

Effective date: 20141127

Owner name: STATE GRID CORPORATION OF CHINA

Free format text: FORMER OWNER: YINGYEDA CO., LTD., TAIWAN

Effective date: 20141127

C41 Transfer of patent application or patent right or utility model
C53 Correction of patent of invention or patent application
CB03 Change of inventor or designer information

Inventor after: Xu Rongyong

Inventor after: Zhang Wei

Inventor after: Hua Chengjun

Inventor after: Hong Jie

Inventor after: Zhan Lei

Inventor after: Wu Xiaohui

Inventor after: Zheng Jianjun

Inventor after: Nie Haitao

Inventor after: Liu Lei

Inventor after: Wei Yu

Inventor before: Cai Yusheng

Inventor before: Fan Wengang

COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: TAIWAN, CHINA TO: 100031 XICHENG, BEIJING

Free format text: CORRECT: INVENTOR; FROM: CAI YUSHENG FAN WENGANG TO: XU RONGYONG HUA CHENGJUN HONG JIE ZHAN LEI YU XIAOHUI ZHENG JIANJUN NIE HAITAO LIU LEI WEI YU ZHANG WEI

TR01 Transfer of patent right

Effective date of registration: 20141127

Address after: 100031 Xicheng District West Chang'an Avenue, No. 86, Beijing

Patentee after: State Grid Corporation of China

Patentee after: Hangzhou Power Supply Company, State Grid Zhejiang Electric Power Company

Patentee after: STATE GRID ZHEJIANG TONGLU POWER SUPPLY COMPANY

Address before: Taipei City, Taiwan Chinese Shilin District Hougang Street No. sixty-six

Patentee before: Inventec Corporation