US20100287413A1 - Error detection and/or correction through coordinated computations - Google Patents
- Publication number
- US20100287413A1 (application US12/463,979)
- Authority
- US
- United States
- Prior art keywords
- computing platform
- computation
- variables
- cut
- time
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1629—Error detection by comparing the output of redundant processing systems
- G06F11/1637—Error detection by comparing the output of redundant processing systems using additional compare functionality in one or some but not all of the redundant processing components
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1629—Error detection by comparing the output of redundant processing systems
- G06F11/165—Error detection by comparing the output of redundant processing systems with continued operation after detection of the error
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/22—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
- G06F11/2205—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested
- G06F11/2236—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested to test CPU or processors
Definitions
- FIG. 1 illustrates an overview of computation error detection and/or correction through coordinated computation, in accordance with various examples
- FIG. 2 illustrates a method to address computing errors, in accordance with various examples
- FIGS. 3a-3c illustrate an example computation cut and variable observation, in accordance with various examples
- FIG. 4 illustrates an example processor with implementation decisions that facilitate addressing errors, including error detection and correction through coordinated computation, in accordance with various examples
- FIG. 5 illustrates a computing system configured in accordance with various examples
- FIG. 6 illustrates a computing program product in accordance with various examples, all arranged in accordance with the present disclosure.
- Embodiments of the present disclosure include various techniques to address the issue of computation errors through coordinated computation.
- a technique for addressing computation errors comprises methods for error detection, diagnosis, characterization, and correction.
- another technique includes synthesis and implementation techniques that facilitate addressing errors.
- the techniques may be employed to correct permanent errors due to manufacturing variability.
- FIG. 1 wherein an overview of the present disclosure of addressing computing errors through coordinated computation, in accordance with various embodiments, is illustrated.
- two different processors or computing platforms 101 and 103 may be configured to coordinate execution of the same or similar functionality.
- At least one of these executions, e.g., execution on processor or computing platform 103, may be conducted in such a way that it may be significantly more robust against a large set of errors. Therefore, this type of execution may be used as a standard for correcting errors in the other execution, e.g., execution on processor or computing platform 101, done under energy constraints or in any other setting where errors may be more likely due to reasons such as radiation or high temperature.
- processors or computer systems 101 and 103 may be independently provided with the program and input data to be executed in a coordinated manner.
- processor or computing platform 103 may be configured to instruct processor or computing platform 101 on where and when to take a cut of the computation and observe one or more variables at the cut, to be described more fully below.
- processor or computing platform 101 may be configured to send real time variables that may be used for error detection to processor or computing platform 103 that executes the same or similar program using the same or similar input data.
- processor or computing platform 103 may analyze the received variable values, and detect errors based at least in part on the result of the analysis.
- processor or computing platform 103 further may diagnose the errors, create corrections to correct one or more of the detected errors, and/or may send instructions to processor or computing platform 101 on how to correct the detected errors.
- the processor or computing platform 103 may characterize processor or computing platform 101 in terms of its faults, simulate or emulate its execution, and send the corrective instructions to processor or computing platform 101.
- the correction may be either data dependent or generic for a pertinent program.
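The exchange between the two platforms described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names (`run_and_observe`, `detect_errors`), the toy computation `2 * x + 1`, and the injected fault are hypothetical and do not come from the disclosure; both "platforms" are simulated in one process.

```python
from typing import Callable, Dict, List

Cut = Dict[str, float]  # variable name -> value observed at one cut


def run_and_observe(step: Callable[[float], float], x0: float, n: int) -> List[Cut]:
    """Run n iterations and record the state variable at a cut after each one."""
    cuts, x = [], x0
    for _ in range(n):
        x = step(x)
        cuts.append({"x": x})
    return cuts


def detect_errors(observed: List[Cut], reference: List[Cut]) -> List[int]:
    """Return indices of cuts where the observed variables differ from the reference."""
    return [i for i, (o, r) in enumerate(zip(observed, reference)) if o != r]


def reliable_step(x: float) -> float:
    # Stand-in for the execution on the more robust platform (103).
    return 2 * x + 1


def faulty_step(x: float) -> float:
    # Hypothetical fault on the less reliable platform (101):
    # the result is off by one whenever the true result exceeds 10.
    y = 2 * x + 1
    return y - 1 if y > 10 else y


observed = run_and_observe(faulty_step, 1, 4)      # values sent by platform 101
reference = run_and_observe(reliable_step, 1, 4)   # values computed on platform 103
bad_cuts = detect_errors(observed, reference)      # cuts 2 and 3 differ
```

In a real deployment the observed cut values would travel over a wired or wireless link rather than being computed in the same process.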
- the present disclosure may be practiced in an off-line manner; in other embodiments, the present disclosure may be practiced in an on-line manner.
- the data may be transferred between the involved devices using either wired or wireless communication.
- one of the executions may be conducted using tight clock cycle time of a fixed point processor and the other may be done with lax clock cycle time on a floating point processor.
- errors may not be limited to errors on the execution units, in the relevant datapaths, or datapaths themselves, but may include errors of all types of interconnect, memory elements, clock circuitry, power distribution network and any other device that participates in data processing, storage, communication, or acquisition.
- error diagnosis and/or creation of correction may be performed on another computing platform other than the computing platform performing the coordinated computation and error detection.
- computing platform 101 may be a wireless computing device such as a mobile phone, a media player, a laptop computer, a personal digital assistant, and so forth.
- Computing platform 103 on the other hand may be any one of a number of servers.
- FIG. 2 illustrates a method 200 of the present disclosure to address computing errors in further detail, in accordance with various embodiments.
- method 200 may comprise all or a subset of the following operations: (i) Test input data generation 202 ; (ii) Error detection 204 ; (iii) Error diagnosis 206 ; (iv) Errors characterization 208 ; (v) Error corrections 210 ; and/or (vi) System verification 212 .
- the goal may be to generate test vectors that cover some or all possible errors in a systematic way so that the more likely and higher impact errors may be identified first. These test inputs may be either identical or different for different phases.
- error detection may be through cut-based comparison, described more fully below.
- the process may start with cuts that may be far from each other and gradually introduce more dense cuts in order to facilitate error diagnosis.
- the process may diagnose the errors by tracing the variables from one cut to another, described more fully below.
- a spectrum of corrections may be generated, so that a user may select one that may be perceived as beneficial.
- different corrections may be created for different quality of service (QoS) requirements of different applications.
- overall system verification may be conducted using e.g. learn-and-test methodology.
- FIGS. 3a-3c illustrate the concept of cut and variable observation, and their employment for error detection, diagnosis and correction, in accordance with various embodiments of the present disclosure. More specifically, FIGS. 3a-3c illustrate an example execution of the first three iterations of a small program that may be executed in a very long loop.
- inputs In1, In2, In3, and In4 are correspondingly multiplied with constants C1, C2, C3, and C4 using multipliers M1, M2, M3, and M4 respectively.
- the products of the first two multiplications (In1 with C1 and In2 with C2) and the products of the last two multiplications (In3 with C3 and In4 with C4) are correspondingly added together using accumulators A1 and A2 respectively.
- the results of the additions are then further added together using accumulator A3.
- the result is then outputted as well as provided as In2 for the next iteration. Further, the output of accumulator A3 is combined with another value using accumulator A4 and provided as input In4 for the next iteration.
- a cut may be taken at the point of the computation where the outputs of the multiplications have been initially pairwise added together (i.e., at the outputs of accumulators A1 and A2) in each of the three iterations.
- various variables may be observed, and the observations may be used for subsequent error detection, diagnosis and correction.
- any error between two cuts may be detected by checking only the variable values of the later cut.
- one may further reduce the average number of observed variables by checking the cuts only every k-th iteration.
- while a cut may be taken at each of the three iterations of the example execution, the present disclosure is not so restricted; in alternate embodiments, the cuts do not have to be taken periodically.
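The loop of FIGS. 3a-3c and the cut at the outputs of A1 and A2 can be modeled as follows. This is a sketch, not the disclosed implementation: the function names, the parameter `extra` (standing in for the unspecified "another value" added at A4), the initial values of In2 and In4, and the sample inputs are all assumptions made for illustration.

```python
def iterate(in1, in2, in3, in4, consts, extra):
    """One loop iteration; returns the cut (A1, A2) and the fed-back values."""
    c1, c2, c3, c4 = consts
    a1 = in1 * c1 + in2 * c2   # multipliers M1, M2 feeding accumulator A1
    a2 = in3 * c3 + in4 * c4   # multipliers M3, M4 feeding accumulator A2
    cut = (a1, a2)             # variables observed at the cut
    a3 = a1 + a2               # output, also In2 of the next iteration
    a4 = a3 + extra            # In4 of the next iteration
    return cut, a3, a4


def run(inputs, consts, extra, in2=0, in4=0, k=1):
    """Return {iteration: cut}, observing the cut only every k-th iteration."""
    cuts = {}
    for i, (in1, in3) in enumerate(inputs):
        cut, in2, in4 = iterate(in1, in2, in3, in4, consts, extra)
        if i % k == 0:
            cuts[i] = cut
    return cuts


inputs = [(1, 1), (1, 1), (1, 1)]                              # fresh (In1, In3) per iteration
cuts_all = run(inputs, consts=(1, 1, 1, 1), extra=1)           # a cut in every iteration
cuts_sparse = run(inputs, consts=(1, 1, 1, 1), extra=1, k=2)   # a cut every 2nd iteration
```

In this model A1 and A2 form a cut because A3, A4, and therefore every later value depend on the earlier computation only through them (together with the fresh inputs), so comparing the pair against the trusted platform covers everything computed before the cut.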
- the basic scheme for error detection may be to compare the variable values observed at the cut of the less reliable (e.g. lower energy, less resource) processor or computing platform with the variable values observed at the cut of the more reliable (e.g. higher energy, more resource) processor or computing platform.
- the reliability and the accuracy of the trusted processor or computing platform may be further improved by employing high precision arithmetic, enforcement of favorable operational and environmental conditions, and double, triple or higher redundancy.
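One common way to exploit double, triple, or higher redundancy on the trusted platform is majority voting over the redundant results. The disclosure does not say how redundant results are combined, so the voter below is an assumption shown only for illustration; `majority_vote` is a hypothetical name.

```python
from collections import Counter


def majority_vote(values):
    """Return the value produced by a strict majority of redundant executions."""
    value, count = Counter(values).most_common(1)[0]
    if count <= len(values) // 2:
        raise ValueError("no majority among redundant results")
    return value


trusted_value = majority_vote([5, 5, 6])  # two of three executions agree on 5
```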
- variable values of each cut may be observed.
- the cut variables represent a set of states and/or inputs that completely define some or all consequent outputs.
- the location of a cut, and the frequency of taking cuts may be application dependent.
- the cuts may be improved or optimized according to a variety of objectives including but not limited to their cardinality, difficulty of correcting the cut variables, and suitability for compression and/or decompression.
- error diagnosis may be conducted between two consecutive cuts where some or all variables in the earlier cut may be correct. An error may be considered to be detected if at least one variable in a later cut is not correct. In various embodiments, error diagnosis further comprises backward tracing of the incorrect variables starting from the later cut. In instances where the correct value on one of the inputs is detected, the search along that direction may be terminated. Accordingly, the source for each detected error may be determined.
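Backward tracing can be sketched on a small dataflow graph. The sketch assumes that the diagnosing platform has both the observed and the reference value of every intermediate variable (for example from simulation or emulation), and it makes the simplifying assumption that a node with an incorrect input does not introduce a further error of its own. The graph encoding and the name `trace_error_sources` are hypothetical.

```python
def trace_error_sources(graph, observed, reference, bad_vars):
    """Trace incorrect variables backward from the later cut to their sources.

    graph maps each variable to the variables it is computed from.
    The search along an input stops as soon as that input is correct.
    """
    sources, seen = set(), set()
    stack = list(bad_vars)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        wrong_inputs = [p for p in graph.get(node, []) if observed[p] != reference[p]]
        if wrong_inputs:
            stack.extend(wrong_inputs)   # keep tracing through incorrect inputs
        else:
            sources.add(node)            # all inputs correct: error introduced here
    return sources


# Node names follow FIGS. 3a-3c; the values are made up for illustration.
graph = {"A3": ["A1", "A2"], "A1": ["M1", "M2"], "A2": ["M3", "M4"]}
reference = {"M1": 2, "M2": 3, "M3": 4, "M4": 5, "A1": 5, "A2": 9, "A3": 14}
observed = {"M1": 2, "M2": 4, "M3": 4, "M4": 5, "A1": 6, "A2": 9, "A3": 15}
sources = trace_error_sources(graph, observed, reference, ["A3"])
```

Here the incorrect value at A3 is traced through A1 to multiplier M2, while the branch through A2 is abandoned immediately because its value is correct.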
- error diagnosis may not necessarily be performed for error correction, as errors may or may not have to be corrected at the same places where they may be introduced.
- the first adder for example, increases the result by seven, and the second reduces the sum by nine systematically. It is easy to see that it may be sufficient to increase one of the inputs by two in order to always correctly rectify the two chained adders.
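The arithmetic behind the chained-adder example is that the two systematic offsets combine to a net error of +7 - 9 = -2, so adding 2 to a single input restores the correct sum. A small sketch, with hypothetical function names:

```python
def faulty_add1(a, b):
    return a + b + 7   # first adder: systematically 7 too high


def faulty_add2(a, b):
    return a + b - 9   # second adder: systematically 9 too low


def chained(a, b, c):
    """Two chained faulty adders; the net error is +7 - 9 = -2."""
    return faulty_add2(faulty_add1(a, b), c)


def chained_corrected(a, b, c):
    """Compensate once at an input instead of repairing each adder separately."""
    return chained(a + 2, b, c)


uncorrected = chained(1, 2, 3)            # 4 instead of the correct 6
corrected = chained_corrected(1, 2, 3)    # 6
```

Neither adder is diagnosed or repaired individually here, which illustrates why diagnosis is not always a prerequisite for correction.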
- the diagnosis of errors may be illuminating for design and operation of the processors or computing systems by revealing what should be altered in order to prevent a particular error or a type of error.
- tracing may be performed by comparing the complete cuts of the program.
- the search may be conducted between the first cut where the differences between the variable values of the two computing platforms may be observed, and the previous cut. Either forward or backward search may be used.
- the assumption may be made that there are no canceling errors, since errors that cancel do not have to be addressed. Cancellation of two or more errors happens when, for example, one addition increases the output variable by the same amount that a subsequent addition reduces it.
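The text above allows either a forward or a backward search between the last matching cut and the first differing cut. As one possible alternative, the sketch below uses bisection, which is not named in the disclosure. It relies on the no-cancellation assumption in a strong form: once the two executions diverge, they are assumed to stay divergent at every later point. The accessors `observed_at` and `reference_at` are hypothetical and stand for re-running the computation with a denser cut at a chosen step.

```python
def first_divergence(observed_at, reference_at, good, bad):
    """Locate the first step at which the two executions differ.

    good: a step known to match; bad: a later step known to differ.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if observed_at(mid) == reference_at(mid):
            good = mid   # still correct here: the error is introduced later
        else:
            bad = mid    # already wrong here: the error is at or before mid
    return bad


def reference_at(i):
    return 3 * i                            # toy reference execution


def observed_at(i):
    return 3 * i if i < 37 else 3 * i + 1   # toy fault appearing at step 37


step = first_divergence(observed_at, reference_at, good=0, bad=64)
```

This mirrors the idea of starting with cuts that are far apart and gradually introducing denser ones: each probe halves the interval that still has to be examined.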
- error classification may be conducted by identifying the properties of errors and by grouping the actually observed errors with similar values according to one or more properties together.
- the error properties may include the source (e.g., bitwidth, clock cycle time, operational conditions, environment), permanency (single cycle, multiple cycles, permanent), the sequential depth of combinational logic where the error may be observed, suitability for corrections according to input data, variable, and program alterations, the impact on the accuracy in terms of the impacted output variables and their change, and the percentage of devices where the pertinent error may be likely to occur.
- error classification may be performed based at least in part on the likelihood of presence of an error type on the less reliable processor or computing platform.
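Grouping observed errors by shared properties can be sketched as below. The record type `ObservedError`, its fields, and the sample values are hypothetical; only the property names (source, permanency) are taken from the text above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ObservedError:
    variable: str     # cut variable where the error was observed
    source: str       # e.g. "bitwidth", "clock cycle time"
    permanency: str   # "single cycle", "multiple cycles", or "permanent"
    delta: float      # observed value minus reference value


def classify(errors, keys=("source", "permanency")):
    """Group errors that share the same values for the chosen properties."""
    groups = defaultdict(list)
    for e in errors:
        groups[tuple(getattr(e, k) for k in keys)].append(e)
    return dict(groups)


errors = [
    ObservedError("A1", "bitwidth", "permanent", 7.0),
    ObservedError("A2", "bitwidth", "permanent", -9.0),
    ObservedError("A3", "clock cycle time", "single cycle", 1.0),
]
groups = classify(errors)
```

Passing a different `keys` tuple groups the same observations by other properties, for example by permanency alone.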
- error correction may be conducted at the places where the errors occurred. In other embodiments, some or all errors may be corrected at the cut variables. In still other embodiments, the permanent faults may be corrected using additional program instructions inserted before or after an operation executed at a faulty execution unit or stored in a faulty register.
- correction of errors may be conducted using any combination of mechanisms including but not limited to the following five ways: (i) data corrections; (ii) constant and variable corrections; (iii) program or computational structure corrections; (iv) operational conditions corrections; and/or (v) environment alterations.
- the corrections may be limited to a single data set executed using a given software and/or hardware implemented functionality, may be specific or generic with respect to the source of errors, and may target automatic correction of specific types of errors without the need to consider specific input datasets.
- the correction may be complete in the sense that the results on the considered platforms may be identical to one with the highest reliability and accuracy or may be partial and conducted in such a way that a specific objective error norm or subjective quality of service criteria may be satisfied.
- the corrections may be conducted so that specific operational metrics such as latency and throughput may be satisfied.
- error corrections may be conducted using one or more of the mechanisms that include but are not limited to program, input, and variable alterations and condition and environment alterations. Selecting a way for compensation may be driven by design and operation metrics and the architecture, operating system, system software and other implementation and operational issues. For example, if communication is much more expensive than computation, then the primary goal may be to reduce the required amount of communication. Therefore, selecting a small cut where the corrections may be suitable for compression may be favored as compared to the simplicity of correcting the output or the inputs.
- a technique for addressing errors may additionally or alternatively include (i) systematic error correction for arbitrary inputs to an arbitrary or a specific program/functionality; and/or (ii) synthesis for error correction.
- synthesis for error correction comprises a set of synthesis and implementation decisions that support any of the earlier described operations for systematic error correction for arbitrary inputs to an arbitrary or a specific program/functionality.
- a special emphasis may be placed in error correction where hardware may be added to allow rapid correction of internal variables.
- scheduling slots may be intentionally left empty for error correction.
- built-in-self-repair techniques may be employed for correction of manufacturing variability errors.
- FIG. 4 illustrates an example processor 400 with implementation decisions that facilitate addressing errors, including error detection and correction through coordinated computation as earlier described, in accordance with various embodiments of the present disclosure.
- the processor 400 includes a widely used dedicated register file architecture where some or all registers 402 may be grouped in a number of register files RF1, RF2, RF3, RF4, RF5, and RF6 in order to share access logic and reduce interconnect, and each register file RF1, RF2, et al. may be connected to an input of one of the execution units.
- ALU arithmetic logic unit
- Register files RF1 and RF2 may be coupled to the ALU 404, register files RF3 and RF4 may be coupled to the adder 406, and register files RF5 and RF6 may be coupled to the multiplier 408.
- there may be a direct connection from the input port only to register files RF1 and RF4, and there may be a direct connection to the output only from the output of the multiplier.
- variables that are observed to be ones that are the result of one of the multiplications may be stored in register files RF1 and RF4, and the corrections may be applied to variables that are stored in register files RF1 and RF4.
- the hardware organization makes the example processor particularly suitable for practicing the earlier described approaches to error detection and/or correction.
- FIG. 5 is a block diagram illustrating an example computing device configured in accordance with the present disclosure.
- computing device 500 typically includes one or more processors 510 and system memory 520 .
- a memory bus 530 may be used for communicating between the processor 510 and the system memory 520 .
- processor 510 may be of any type including but not limited to a microprocessor (μP), a microcontroller (μC), a digital signal processor (DSP), or any combination thereof.
- Processor 510 may include one or more levels of caching, such as a level one cache 511 and a level two cache 512, a processor core 513, and registers 514.
- An example processor core 513 may include an arithmetic logic unit (ALU), a floating point unit (FPU), a digital signal processing core (DSP Core), or any combination thereof.
- An example memory controller 515 may also be used with the processor 510 , or in some implementations the memory controller 515 may be an internal part of the processor 510 .
- system memory 520 may be of any type including but not limited to volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.) or any combination thereof.
- System memory 520 may include an operating system 521 , one or more applications 522 , and program data 524 .
- Application 522 may include programming instructions providing logic to implement the above described coordinated computation based error detection and correction.
- Program Data 524 may include the applicable and related coordinated computation based error detection and correction data values.
- Computing device 500 may have additional features or functionality, and additional interfaces to facilitate communications between the basic configuration 501 and any required devices and interfaces.
- a bus/interface controller 540 may be used to facilitate communications between the basic configuration 501 and one or more data storage devices 550 via a storage interface bus 541 .
- the data storage devices 550 may be removable storage devices 551 , non-removable storage devices 552 , or a combination thereof. Examples of removable storage and non-removable storage devices include magnetic disk devices such as flexible disk drives and hard-disk drives (HDD), optical disk drives such as compact disk (CD) drives or digital versatile disk (DVD) drives, solid state drives (SSD), and tape drives to name a few.
- Example computer storage media may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data.
- Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which may be used to store the desired information and which may be accessed by computing device 500 . Any such computer storage media may be part of device 500 .
- Computing device 500 may also include an interface bus 542 for facilitating communication from various interface devices (e.g., output interfaces, peripheral interfaces, and communication interfaces) to the basic configuration 501 via the bus/interface controller 540 .
- Example output devices 560 include a graphics processing unit 561 and an audio processing unit 562 , which may be configured to communicate to various external devices such as a display or speakers via one or more A/V ports 563 .
- Example peripheral interfaces 570 include a serial interface controller 571 or a parallel interface controller 572 , which may be configured to communicate with external devices such as input devices (e.g., keyboard, mouse, pen, voice input device, touch input device, etc.) or other peripheral devices (e.g., printer, scanner, etc.) via one or more I/O ports 573 .
- An example communication device 580 includes a network controller 581 , which may be arranged to facilitate communications with one or more other computing devices 590 over a network communication link via one or more communication ports 582 .
- the network communication link may be one example of a communication media.
- Communication media may typically be embodied by computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media.
- a “modulated data signal” may be a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal.
- communication media may include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio frequency (RF), microwave, infrared (IR) and other wireless media.
- the term computer readable media as used herein may include both storage media and communication media.
- Computing device 500 may be implemented as a portion of a small-form factor portable (or mobile) electronic device such as a cell phone, a personal data assistant (PDA), a personal media player device, a wireless web-watch device, a personal headset device, an application specific device, or a hybrid device that includes any of the above functions.
- Computing device 500 may also be implemented as a personal computer including both laptop computer and non-laptop computer configurations.
- FIG. 6 illustrates a block diagram of an example computer program product 600 , all arranged in accordance with the present disclosure.
- computer program product 600 includes a signal bearing medium 602 that may also include programming instructions 604 .
- Programming instructions 604 may be for receiving one or more observations of one or more variables of one or more cuts of a computation performed on another computing platform using one or more input data.
- Programming instructions 604 may also be for detecting one or more errors of the computation performed on the another computing platform, based at least in part on the observations of the one or more variables of the cut of the computation.
- programming instructions 604 may be for determining a set of one or more corrections, and providing to the another computing platform, the set of one or more corrections.
- programming instructions 604 may also be for implementing a set of provided corrections.
- computer product 600 may include one or more of a computer readable medium 606 , a recordable medium 608 and a communications medium 610 .
- the dotted boxes around these elements depict different types of mediums included within, but not limited to, signal bearing medium 602 . These types of mediums may distribute programming instructions 604 to be executed by logic.
- Computer readable medium 606 and recordable medium 608 may include, but are not limited to, a flexible disk, a hard disk drive (HDD), a Compact Disc (CD), a Digital Video Disk (DVD), a digital tape, a computer memory, etc.
- Communications medium 610 may include, but is not limited to, a digital and/or an analog communication medium (e.g., a fiber optic cable, a waveguide, a wired communication link, a wireless communication link, etc.).
- implementations may be in hardware, such as employed to operate on a device or combination of devices, for example, whereas other implementations may be in software and/or firmware.
- some implementations may include one or more articles, such as a storage medium or storage media.
- Such storage media, for example CD-ROMs, computer disks, flash memory, or the like, may have instructions stored thereon that, when executed by a system, such as a computer system, computing platform, or other system, may result in execution of a processor in accordance with claimed subject matter, such as one of the implementations previously described.
- a computing platform may include one or more processing units or processors, one or more input/output devices, such as a display, a keyboard and/or a mouse, and one or more memories, such as static random access memory, dynamic random access memory, flash memory, and/or a hard drive.
- the implementer may opt for a mainly hardware and/or firmware vehicle; if flexibility is paramount, the implementer may opt for a mainly software implementation; or, yet again alternatively, the implementer may opt for some combination of hardware, software, and/or firmware.
- ASICs Application Specific Integrated Circuits
- FPGAs Field Programmable Gate Arrays
- DSPs digital signal processors
- aspects of the embodiments disclosed herein, in whole or in part, may be equivalently implemented in integrated circuits, as one or more computer programs running on one or more computers (e.g., as one or more programs running on one or more computer systems), as one or more programs running on one or more processors (e.g., as one or more programs running on one or more microprocessors), as firmware, or as virtually any combination thereof, and that designing the circuitry and/or writing the code for the software and/or firmware would be well within the skill of one of skill in the art in light of this disclosure.
- examples of a signal bearing medium include, but are not limited to, the following: a recordable type medium such as a floppy disk, a hard disk drive, a Compact Disc (CD), a Digital Video Disk (DVD), a digital tape, a computer memory, etc.; and a transmission type medium such as a digital and/or an analog communication medium (e.g., a fiber optic cable, a waveguide, a wired communications link, a wireless communication link, etc.).
- any two components so associated may also be viewed as being “operably connected”, or “operably coupled”, to each other to achieve the desired functionality, and any two components capable of being so associated may also be viewed as being “operably couplable”, to each other to achieve the desired functionality.
- operably couplable include but are not limited to physically mateable and/or physically interacting components and/or wirelessly interactable and/or wirelessly interacting components and/or logically interacting and/or logically interactable components.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Debugging And Monitoring (AREA)
- Detection And Correction Of Errors (AREA)
Abstract
Description
- There may be numerous potential sources of errors in computational, communicating, and data storing systems and data processing, including algorithmic and data structure errors, compilation mistakes, errors in design specification and design implementation, manufacturing faults, environmental impact, inadequate operational conditions, data alterations, and intentional malicious attacks. Essentially all technological and application trends may be likely to have additional negative impact on the error rates and their impact. For example, feature scaling may exponentially increase the likelihood of radiation errors. Also, the exponentially growing rate of integration may make it difficult to produce integrated circuits and systems that are free of manufacturing errors. Finally, the complexity of applications grows at significantly higher rates than the computational capabilities of silicon. One of the ramifications of a more efficient use of silicon may be an increased number of design errors. Even more important may be that optimization of all design metrics, including clocking speed, energy, area, manufacturing and testing cost, latency, throughput and reliability, may be inevitably related to the ability to address errors. And the ability to address errors may be particularly important to low power and/or energy minimization/reduction, and debugging, among other goals.
- Subject matter is particularly pointed out and distinctly claimed in the concluding portion of the specification. The foregoing and other features of the present disclosure will become more fully apparent from the following description and appended claims, taken in conjunction with the accompanying drawings. Understanding that these drawings depict only several embodiments in accordance with the disclosure and are, therefore, not to be considered limiting of its scope, the disclosure will be described with additional specificity and detail through use of the accompanying drawings, in which:
- FIG. 1 illustrates an overview of computation error detection and/or correction through coordinated computation, in accordance with various examples;
- FIG. 2 illustrates a method to address computing errors, in accordance with various examples;
- FIGS. 3a-3c illustrate an example computation cut and variable observation, in accordance with various examples;
- FIG. 4 illustrates an example processor with implementation decisions that facilitate addressing errors, including error detection and correction through coordinated computation, in accordance with various examples;
- FIG. 5 illustrates a computing system configured in accordance with various examples; and
- FIG. 6 illustrates a computing program product in accordance with various examples, all arranged in accordance with the present disclosure.
- The following description sets forth various examples along with specific details to provide a thorough understanding of claimed subject matter. It will be understood by those skilled in the art, however, that claimed subject matter may be practiced without some or more of the specific details disclosed herein. Further, in some circumstances, well-known methods, procedures, systems, components and/or circuits have not been described in detail in order to avoid unnecessarily obscuring claimed subject matter. In the following detailed description, reference is made to the accompanying drawings, which form a part hereof. In the drawings, similar symbols typically identify similar components, unless context dictates otherwise. The illustrative embodiments described in the detailed description, drawings, and claims are not meant to be limiting. Other embodiments may be utilized, and other changes may be made, without departing from the spirit or scope of the subject matter presented here. It will be readily understood that the aspects of the present disclosure, as generally described herein, and illustrated in the Figures, can be arranged, substituted, combined, and designed in a wide variety of different configurations, all of which are explicitly contemplated and make part of this disclosure.
- This disclosure is drawn, inter alia, to methods, apparatus, and systems related to addressing computation errors via coordinated computation on two computing platforms. Embodiments of the present disclosure include various techniques to address the issue of computation errors through coordinated computation. In various embodiments, a technique for addressing computation errors comprises methods for error detection, diagnosis, characterization, and correction. In various embodiments, another technique includes synthesis and implementation techniques for facilitating the addressing of errors. In various embodiments, the techniques may be employed to correct permanent errors due to manufacturing variability.
- In the following detailed description, reference is made to the accompanying drawings which form a part hereof, and in which are shown by way of illustration embodiments in which embodiments may be practiced. It is to be understood that other embodiments may be utilized and structural or logical changes may be made without departing from the scope of the disclosure. Therefore, the following detailed description is not to be taken in a limiting sense, and the scope of embodiments is defined by the appended claims and their equivalents.
- Referring now to
FIG. 1, wherein an overview of the present disclosure of addressing computing errors through coordinated computation, in accordance with various embodiments, is illustrated. As shown, for the embodiments, two different processors or computing platforms 101 and 103 may be configured to coordinate execution of the same or similar functionality. At least one of these executions, e.g. execution on processor or computing platform 103, may be conducted in such a way that it may be significantly more robust against a large set of errors. Therefore, this or these types of execution may be used as a standard for correcting errors in the other execution, e.g. execution on processor or computing platform 101, done under energy constraints or any other setting where the errors may be more likely due to reasons such as radiation or high temperature.
- In various embodiments, processors or computer systems 101 and 103 may be independently provided with the program and input data to be executed in a coordinated manner. At action 105 (Cut Probe), in various embodiments, processor or computing platform 103 may be configured to instruct processor or computing platform 101 on where and when to take a cut of the computation and observe one or more variables at the cut, to be described more fully below. At action 107 (Observed Variables), processor or computing platform 101 may be configured to send real time variables that may be used for error detection to processor or computing platform 103, which executes the same or similar program using the same or similar input data. In various embodiments, on receipt, processor or computing platform 103 may analyze the received variable values, and detect errors based at least in part on the result of the analysis. At action 109 (Corrections), in various embodiments, processor or computing platform 103 further may diagnose the errors, create corrections to correct one or more of the detected errors, and/or may send instructions to processor or computing platform 101 on how to correct the detected errors.
- In alternate embodiments, the processor or
computing platform 103 may characterize processor or computing platform 101 in terms of its faults, simulate or emulate its execution, and send the corrective instructions to processor or computing platform 101. The correction may be either data dependent or generic for a pertinent program.
- In various embodiments, the present disclosure may be practiced in an off-line manner; in other embodiments, the present disclosure may be practiced in an on-line manner. In an on-line embodiment, the data may be transferred between the involved devices using either wired or wireless communication. In various embodiments, one of the executions may be conducted using tight clock cycle time of a fixed point processor and the other may be done with lax clock cycle time on a floating point processor.
- Before further describing embodiments of the present disclosure, it should be noted that for the purpose of this specification, what is considered errors may not be limited to errors on the execution units, in the relevant datapaths, or datapaths themselves, but may include errors of all types of interconnect, memory elements, clock circuitry, power distribution network and any other device that participates in data processing, storage, communication, or acquisition.
- Further, in alternate embodiments, error diagnosis and/or creation of correction may be performed on another computing platform other than the computing platform performing the coordinated computation and error detection.
- In various embodiments,
computing platform 101 may be a wireless computing device such as a mobile phone, a media player, a laptop computer, a personal digital assistant, and so forth. Computing platform 103, on the other hand, may be any one of a number of servers. -
FIG. 2 illustrates a method 200 of the present disclosure to address computing errors in further detail, in accordance with various embodiments. As shown, method 200 may comprise all or a subset of the following operations: (i) Test input data generation 202; (ii) Error detection 204; (iii) Error diagnosis 206; (iv) Errors characterization 208; (v) Error corrections 210; and/or (vi) System verification 212. In block 202, in various embodiments, the goal may be to generate test vectors that cover some or all possible errors in a systematic way so that the more likely and the higher impact errors may be identified first. These test inputs may be either identical or different for different phases. At block 204, in various embodiments, error detection may be through cut-based comparison, described more fully below. At block 206, in various embodiments, the process may start with cuts that may be far from each other and gradually introduce more dense cuts in order to facilitate error diagnosis. At block 208, in various embodiments, for error characterization, the process may diagnose the errors by tracing the variables from one cut to another, described more fully below. At block 210, in various embodiments, for error correction, a spectrum of corrections may be generated, so that a user may select one that may be perceived as beneficial. In various embodiments, different corrections may be created for different quality of service (QoS) requirements of different applications. At block 212, in various embodiments, overall system verification may be conducted using e.g. a learn-and-test methodology. -
FIGS. 3 a-3 c illustrate the concept of cut and variable observation, and their employment for error detection, diagnosis and correction, in accordance with various embodiments of the present disclosure. More specifically, FIGS. 3 a-3 c illustrate an example execution of the first three iterations of a small program that may be executed in a very long loop. In this example small program, inputs In1, In2, In3, and In4 are correspondingly multiplied with constants C1, C2, C3, and C4 using multipliers M1, M2, M3, and M4 respectively. The products of the first two multiplications, In1 with C1 and In2 with C2, and the products of the last two multiplications, In3 with C3 and In4 with C4, are correspondingly added together using accumulators A1 and A2 respectively. The results of the additions are then further added together using accumulator A3. The result is then outputted as well as provided as In2 for the next iteration. Further, the output of accumulator A3 is further combined with another value using accumulator A4 and provided as input In4 for the next iteration.
- As illustrated, a cut at the point of the computation where outputs of the multiplications have been initially pairwise added together may be taken (i.e. at the outputs of accumulators A1 and A2) in each of the three iterations. At each cut, various variables may be observed, and the observations may be used for subsequent error detection, diagnosis and correction.
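- The loop of FIGS. 3 a-3 c may be sketched in Python as follows. This is an illustrative sketch only, not the disclosed implementation: the function name `run_loop`, the parameter `extra` (standing in for the unspecified "another value" combined at accumulator A4), and the assumption that In1 and In3 are supplied fresh at each iteration are all hypothetical.

```python
def run_loop(fresh_inputs, consts, extra, in2=0, in4=0):
    """Run the example loop and record the cut (A1, A2) at every iteration.

    fresh_inputs: list of (In1, In3) pairs, one per iteration (assumed external).
    consts:       (C1, C2, C3, C4).
    extra:        hypothetical stand-in for the value combined at accumulator A4.
    """
    c1, c2, c3, c4 = consts
    outputs, cuts = [], []
    for in1, in3 in fresh_inputs:
        # Multipliers M1..M4
        p1, p2, p3, p4 = in1 * c1, in2 * c2, in3 * c3, in4 * c4
        a1 = p1 + p2            # accumulator A1
        a2 = p3 + p4            # accumulator A2
        cuts.append((a1, a2))   # the cut: variables observed at A1 and A2
        a3 = a1 + a2            # accumulator A3
        outputs.append(a3)
        in2 = a3                # fed back as In2 for the next iteration
        in4 = a3 + extra        # accumulator A4, fed back as In4
    return outputs, cuts


outputs, cuts = run_loop([(1, 2), (3, 4), (5, 6)], consts=(2, 1, 3, 1), extra=1)
print(outputs)  # [8, 35, 99]
print(cuts)     # [(2, 6), (14, 21), (45, 54)]
```

In this sketch the pair (A1, A2) is small yet, together with the later inputs, determines every subsequent output, which is what may make it a convenient place to observe the computation.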
- Before further continuing with the present disclosure, it should be noted that any error between two cuts may be detected by checking only the variable values of the later cut. In addition, one may further reduce the average number of observed variables by checking the cuts only every k-th iteration. Further, while a cut may be taken at each of three iterations of the example execution, the present disclosure is not so restricted; in alternate embodiments, the cuts do not have to be taken periodically.
- In various embodiments, the basic scheme for error detection may be by comparing the variable values observed at the cut of the less reliable (e.g. lower energy, less resource) processor or computing platform with the variable values observed at the cut of the more reliable (e.g. higher energy, more resource) processor or computing platform. In various embodiments, the reliability and the accuracy of the trusted processor or computing platform may be further improved by employing high precision arithmetic, enforcement of favorable operational and environmental conditions, and double, triple or higher redundancy.
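- The comparison scheme described above, combined with checking only every k-th cut, may be sketched as follows. The function name `first_mismatched_cut` and the `tolerance` parameter (which would allow for benign differences, e.g. between fixed point and floating point executions) are assumptions made for illustration and are not part of the disclosure.

```python
def first_mismatched_cut(trusted_cuts, observed_cuts, k=1, tolerance=0):
    """Return the index of the first checked cut whose variables differ, else None.

    trusted_cuts / observed_cuts: one tuple of cut variables per iteration, from
    the more reliable and the less reliable platform respectively.
    Only every k-th cut is compared (indices k-1, 2k-1, ...).
    """
    n = min(len(trusted_cuts), len(observed_cuts))
    for i in range(k - 1, n, k):
        for t, o in zip(trusted_cuts[i], observed_cuts[i]):
            if abs(t - o) > tolerance:
                return i
    return None


trusted = [(1, 2), (6, 7), (18, 19), (40, 41)]
observed = [(1, 2), (6, 8), (18, 20), (40, 42)]   # error enters at iteration 1
print(first_mismatched_cut(trusted, observed, k=1))  # 1
print(first_mismatched_cut(trusted, observed, k=3))  # 2
```

A larger k reduces the number of variables that have to be communicated, at the price of detecting the error later; in the example above, k=3 reports the error one iteration after it was introduced.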
- In various embodiments, the variable values of each cut may be observed. The cut variables represent a set of states and/or inputs that completely define some or all consequent outputs. The location of a cut and the frequency of taking cuts may be application dependent. In various embodiments, the cuts may be improved or optimized according to a variety of objectives including but not limited to their cardinality, the difficulty of correcting the cut variables, and suitability for compression and/or decompression.
- In various embodiments, error diagnosis may be conducted between two consecutive cuts where some or all variables in the earlier cut may be correct. An error may be considered to be detected if at least one variable in a later cut is not correct. In various embodiments, error diagnosis further comprises backward tracing of the incorrect variables starting from the later cut. In instances where the correct value on one of the inputs is detected, the search along that direction may be terminated. Accordingly, the source for each detected error may be determined.
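- The backward tracing described above may be sketched as follows. This is a sketch under stated assumptions: the dataflow between two cuts is given as a hypothetical `producers` mapping from each variable to the variables it is computed from, trusted and observed values are available for every intermediate variable (in practice this may require re-execution with denser cuts), and an operation whose inputs are already incorrect is not itself examined for an additional fault.

```python
def trace_error_sources(producers, trusted, observed, later_cut):
    """Trace incorrect variables backward from the later cut to their sources.

    producers: dict mapping a variable name to the list of its input variables.
    trusted / observed: dicts mapping variable names to values on each platform.
    later_cut: names of the variables observed at the later cut.
    """
    sources, visited = set(), set()
    stack = [v for v in later_cut if observed[v] != trusted[v]]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        bad_inputs = [p for p in producers.get(node, [])
                      if observed[p] != trusted[p]]
        if bad_inputs:
            stack.extend(bad_inputs)   # keep searching along incorrect inputs
        else:
            sources.add(node)          # all inputs correct: error introduced here
    return sources


# One iteration of a FIG. 3 style dataflow with all constants equal to 2.
producers = {"p1": ["in1"], "p2": ["in2"], "p3": ["in3"], "p4": ["in4"],
             "a1": ["p1", "p2"], "a2": ["p3", "p4"]}
trusted = {"in1": 1, "in2": 2, "in3": 3, "in4": 4,
           "p1": 2, "p2": 4, "p3": 6, "p4": 8, "a1": 6, "a2": 14}
observed = dict(trusted, p2=5, a1=7)   # hypothetical fault at multiplier M2
print(trace_error_sources(producers, trusted, observed, ["a1", "a2"]))  # {'p2'}
```

The search along p1 stops immediately because its value is correct, which mirrors the termination rule in the paragraph above.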
- It should be noted that error diagnosis may not necessarily be performed for error correction, as errors may or may not have to be corrected at the same places where they may be introduced. Consider the case where two chained adders have permanent manufacturing faults. The first adder, for example, increases the result by seven, and the second systematically reduces the sum by nine. The net effect of the chain is a result that is two too small, so it may be sufficient to increase one of the inputs by two in order to always correctly rectify the two chained adders. However, the diagnosis of errors may be illuminating for design and operation of the processors or computing systems by revealing what should be altered in order to prevent a particular error or a type of error.
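- The chained adder example may be checked with a short sketch. The function names and the `pre_correction` parameter are hypothetical; the fault offsets (+7 and -9) are the ones given in the text.

```python
FAULT_1 = 7    # first adder systematically adds seven to its result
FAULT_2 = -9   # second adder systematically subtracts nine from its result


def faulty_adder_1(x, y):
    return x + y + FAULT_1


def faulty_adder_2(x, y):
    return x + y + FAULT_2


def chain(a, b, c, pre_correction=0):
    """Compute a + b + c on the two faulty adders, optionally pre-correcting input a."""
    return faulty_adder_2(faulty_adder_1(a + pre_correction, b), c)


# The net fault is 7 - 9 = -2, so a single input correction of +2 suffices.
correction = -(FAULT_1 + FAULT_2)
print(chain(1, 2, 3))                             # 4 (incorrect, expected 6)
print(chain(1, 2, 3, pre_correction=correction))  # 6
```

The correction is applied at an input rather than at either faulty adder, which illustrates why locating each fault is not a prerequisite for rectifying the output.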
- As alluded to earlier, in various embodiments, tracing may be performed by comparing the complete cuts of the program. The search may be conducted between the first cut where the differences between the variable values of the two computing platforms may be observed, and the previous cut. Either forward or backward search may be used. In various embodiments, the assumption may be made that there are no canceling errors, since errors that cancel each other do not have to be addressed. Cancellation of two or more errors happens when, for example, one addition increases the output variable by the same amount by which a subsequent addition reduces it.
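- One way to locate the first cut at which the two platforms differ, starting from widely spaced cuts and refining, is a bisection over iterations. Bisection is a technique substituted here for illustration and is not named in the disclosure. The sketch assumes that, once the cuts diverge, they stay divergent (consistent with the no-canceling-errors assumption above), and that a hypothetical callable `cuts_agree(i)` can compare the cuts of both platforms at iteration i.

```python
def first_divergent_iteration(cuts_agree, n_iterations):
    """Return the first iteration whose cuts differ, or None if the last cut agrees.

    cuts_agree: callable taking an iteration index and returning True when the
                cut variables of both platforms match at that iteration.
    Assumes divergence persists once it has appeared.
    """
    if n_iterations == 0 or cuts_agree(n_iterations - 1):
        return None
    lo, hi = 0, n_iterations - 1   # invariant: the cut at hi disagrees
    while lo < hi:
        mid = (lo + hi) // 2
        if cuts_agree(mid):
            lo = mid + 1           # divergence starts after mid
        else:
            hi = mid               # divergence starts at or before mid
    return lo


probes = []

def cuts_agree(i):
    probes.append(i)
    return i < 37                  # hypothetical: the error enters at iteration 37

print(first_divergent_iteration(cuts_agree, 100))  # 37
print(len(probes) <= 8)                            # True
```

Under these assumptions the number of cuts that have to be compared grows only logarithmically with the length of the loop, after which tracing can be confined to the interval between that cut and the previous one.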
- In various embodiments, error classification may be conducted by identifying the properties of errors and by grouping together the actually observed errors with similar values according to one or more properties. In various embodiments, the error properties may include the source (e.g., bitwidth, clock cycle time, operational conditions, environment), permanency (single cycle, multiple cycles, permanent), the sequential depth of combinational logic where the error may be observed, suitability for corrections according to input data, variable, and program alterations, the impact on the accuracy in terms of the impacted output variables and their change, and the percentage of devices where the pertinent error may be likely to occur. In various embodiments, error classification may be performed based at least in part on the likelihood of presence of an error type on the less reliable processor or computing platform.
- In various embodiments, error correction may be conducted at the places where the errors occurred. In other embodiments, some or all errors may be corrected at the cut variables. In still other embodiments, the permanent faults may be corrected using additional program instructions inserted before or after an operation executed at a faulty execution unit or stored in a faulty register.
- In various embodiments, correction of errors may be conducted using any combination of mechanisms including but not limited to the following five ways: (i) data corrections; (ii) constant and variable corrections; (iii) program or computational structure corrections; (iv) operational conditions corrections; and/or (v) environment alterations. The corrections may be limited to a single data set executed using a given software and/or hardware implemented functionality, may be specific or generic with respect to the source of errors, and may target automatic correction of specific types of errors without the need to consider specific input datasets. In various embodiments, the correction may be complete in the sense that the results on the considered platforms may be identical to the one with the highest reliability and accuracy, or may be partial and conducted in such a way that a specific objective error norm or subjective quality of service criterion may be satisfied. In addition, the corrections may be conducted so that specific operational metrics such as the latency and throughput may be satisfied.
- In various embodiments, error corrections may be conducted using one or more of the mechanisms that include but are not limited to program, input, and variable alterations and condition and environment alterations. Selecting a way for compensation may be driven by design and operation metrics and the architecture, operating system, system software and other implementation and operational issues. For example, if communication is much more expensive than computation, then the primary goal may be to reduce the required amount of communication. Therefore, selecting a small cut where the corrections may be suitable for compression may be favored as compared to the simplicity of correcting the output or the inputs.
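- As a minimal illustration of keeping communication small, corrections at a cut might be sent as sparse (index, delta) pairs covering only the variables that differ. This particular encoding and the function names below are assumptions made for the sketch; the disclosure does not prescribe a format.

```python
def encode_corrections(trusted_cut, observed_cut):
    """Return (index, delta) pairs only for the cut variables that differ."""
    return [(i, t - o)
            for i, (t, o) in enumerate(zip(trusted_cut, observed_cut))
            if t != o]


def apply_corrections(observed_cut, corrections):
    """Apply sparse (index, delta) corrections on the less reliable platform."""
    fixed = list(observed_cut)
    for i, delta in corrections:
        fixed[i] += delta
    return fixed


trusted_cut = [10, 20, 30, 40]
observed_cut = [10, 21, 30, 40]
corrections = encode_corrections(trusted_cut, observed_cut)
print(corrections)                                   # [(1, -1)]
print(apply_corrections(observed_cut, corrections))  # [10, 20, 30, 40]
```

With a small cut and few incorrect variables, the message carrying the corrections stays short even when the program state as a whole is large.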
- As described earlier, due to technological and application trends, it may be difficult to manufacture fault-free integrated circuits. In addition, the complexity of hardware, system and utility software may result in frequent error corrections. In order to address this, a technique for addressing error may additionally or alternatively include (i) systematic error correction for arbitrary inputs to an arbitrary or a specific program/functionality; and/or (ii) synthesis for error correction.
- In various embodiments, synthesis for error correction comprises a set of synthesis and implementation decisions that support any of the earlier described operations for systematic error correction for arbitrary inputs to an arbitrary or a specific program/functionality. In various embodiments, a special emphasis may be placed on error correction where hardware may be added to allow rapid correction of internal variables. In various embodiments, scheduling slots may be intentionally left empty for error correction. In other embodiments, built-in-self-repair techniques may be employed for correction of manufacturing variability errors.
-
FIG. 4 illustrates an example processor 400 with implementation decisions that facilitate addressing errors, including error detection and correction through coordinated computation as earlier described, in accordance with various embodiments of the present disclosure. As shown, the processor 400 includes a widely used dedicated register file architecture where some or all registers 402 may be grouped in a number of register files RF1, RF2, RF3, RF4, RF5, and RF6 in order to share access logic and reduce interconnect, and each register file RF1, RF2 et al. may be connected to an input of one of the execution units. In particular, for the illustrated embodiments, there may be three execution units: arithmetic logic unit (ALU) 404, adder 406, and multiplier 408. Register files RF1 and RF2 may be coupled to the ALU 404, register files RF3 and RF4 may be coupled to the adder 406, and register files RF5 and RF6 may be coupled to the multiplier 408. Note that there may be a direct connection from the input port only to register files RF1 and RF4 and that there may be a direct connection to the output only from the output of the multiplier. In various embodiments, in synthesis for error detection and correction, variables that are observed to be ones that are the result of one of the multiplications may be stored in register files RF1 and RF4, and the corrections may be applied to variables that are stored in register files RF1 and RF4. Thus, the hardware organization makes the example processor particularly suitable for practicing the earlier described approaches to error detection and/or correction. -
FIG. 5 is a block diagram illustrating an example computing device configured in accordance with the present disclosure. In a very basic configuration 501, computing device 500 typically includes one or more processors 510 and system memory 520. A memory bus 530 may be used for communicating between the processor 510 and the system memory 520.
- Depending on the desired configuration,
processor 510 may be of any type including but not limited to a microprocessor (μP), a microcontroller (μC), a digital signal processor (DSP), or any combination thereof. Processor 510 may include one or more levels of caching, such as a level one cache 511 and a level two cache 512, a processor core 513, and registers 514. An example processor core 513 may include an arithmetic logic unit (ALU), a floating point unit (FPU), a digital signal processing core (DSP Core), or any combination thereof. An example memory controller 515 may also be used with the processor 510, or in some implementations the memory controller 515 may be an internal part of the processor 510.
- Depending on the desired configuration, the
system memory 520 may be of any type including but not limited to volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.) or any combination thereof. System memory 520 may include an operating system 521, one or more applications 522, and program data 524. Application 522 may include programming instructions providing logic to implement the above described coordinated computation based error detection and correction. Program data 524 may include the applicable and related coordinated computation based error detection and correction data values. -
Computing device 500 may have additional features or functionality, and additional interfaces to facilitate communications between the basic configuration 501 and any required devices and interfaces. For example, a bus/interface controller 540 may be used to facilitate communications between the basic configuration 501 and one or more data storage devices 550 via a storage interface bus 541. The data storage devices 550 may be removable storage devices 551, non-removable storage devices 552, or a combination thereof. Examples of removable storage and non-removable storage devices include magnetic disk devices such as flexible disk drives and hard-disk drives (HDD), optical disk drives such as compact disk (CD) drives or digital versatile disk (DVD) drives, solid state drives (SSD), and tape drives to name a few. Example computer storage media may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data. -
System memory 520, removable storage 551 and non-removable storage 552 are all examples of computer storage media. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which may be used to store the desired information and which may be accessed by computing device 500. Any such computer storage media may be part of device 500. -
Computing device 500 may also include an interface bus 542 for facilitating communication from various interface devices (e.g., output interfaces, peripheral interfaces, and communication interfaces) to the basic configuration 501 via the bus/interface controller 540. Example output devices 560 include a graphics processing unit 561 and an audio processing unit 562, which may be configured to communicate to various external devices such as a display or speakers via one or more A/V ports 563. Example peripheral interfaces 570 include a serial interface controller 571 or a parallel interface controller 572, which may be configured to communicate with external devices such as input devices (e.g., keyboard, mouse, pen, voice input device, touch input device, etc.) or other peripheral devices (e.g., printer, scanner, etc.) via one or more I/O ports 573. An example communication device 580 includes a network controller 581, which may be arranged to facilitate communications with one or more other computing devices 590 over a network communication link via one or more communication ports 582.
- The network communication link may be one example of a communication media. Communication media may typically be embodied by computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media. A “modulated data signal” may be a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media may include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio frequency (RF), microwave, infrared (IR) and other wireless media. The term computer readable media as used herein may include both storage media and communication media.
-
Computing device 500 may be implemented as a portion of a small-form factor portable (or mobile) electronic device such as a cell phone, a personal data assistant (PDA), a personal media player device, a wireless web-watch device, a personal headset device, an application specific device, or a hybrid device that includes any of the above functions. Computing device 500 may also be implemented as a personal computer including both laptop computer and non-laptop computer configurations. -
FIG. 6 illustrates a block diagram of an example computer program product 600, all arranged in accordance with the present disclosure. In some examples, as shown in FIG. 6, computer program product 600 includes a signal bearing medium 602 that may also include programming instructions 604. Programming instructions 604 may be for receiving one or more observations of one or more variables of one or more cuts of a computation performed on another computing platform using one or more input data. Programming instructions 604 may also be for detecting one or more errors of the computation performed on the other computing platform, based at least in part on the observations of the one or more variables of the cut of the computation. Further, programming instructions 604 may be for determining a set of one or more corrections, and providing the set of one or more corrections to the other computing platform. In some embodiments, programming instructions 604 may also be for implementing a set of provided corrections.
- Also depicted in
FIG. 6, in some examples, computer product 600 may include one or more of a computer readable medium 606, a recordable medium 608 and a communications medium 610. The dotted boxes around these elements depict different types of mediums included within, but not limited to, signal bearing medium 602. These types of mediums may distribute programming instructions 604 to be executed by logic. Computer readable medium 606 and recordable medium 608 may include, but are not limited to, a flexible disk, a hard disk drive (HDD), a Compact Disc (CD), a Digital Video Disk (DVD), a digital tape, a computer memory, etc. Communications medium 610 may include, but is not limited to, a digital and/or an analog communication medium (e.g., a fiber optic cable, a waveguide, a wired communication link, a wireless communication link, etc.).
- Claimed subject matter is not limited in scope to the particular implementations described herein. For example, some implementations may be in hardware, such as employed to operate on a device or combination of devices, for example, whereas other implementations may be in software and/or firmware. Likewise, although claimed subject matter is not limited in scope in this respect, some implementations may include one or more articles, such as a storage medium or storage media. This storage media, such as CD-ROMs, computer disks, flash memory, or the like, for example, may have instructions stored thereon, that, when executed by a system, such as a computer system, computing platform, or other system, for example, may result in execution of a processor in accordance with claimed subject matter, such as one of the implementations previously described, for example.
As one possibility, a computing platform may include one or more processing units or processors, one or more input/output devices, such as a display, a keyboard and/or a mouse, and one or more memories, such as static random access memory, dynamic random access memory, flash memory, and/or a hard drive.
- There is little distinction left between hardware and software implementations of aspects of systems; the use of hardware or software is generally (but not always, in that in certain contexts the choice between hardware and software may become significant) a design choice representing cost vs. efficiency tradeoffs. There may be various vehicles by which processes and/or systems and/or other technologies described herein may be effected (e.g., hardware, software, and/or firmware), and that the favored vehicle may vary with the context in which the processes and/or systems and/or other technologies are deployed. For example, if an implementer determines that speed and accuracy are paramount, the implementer may opt for a mainly hardware and/or firmware vehicle; if flexibility is paramount, the implementer may opt for a mainly software implementation; or, yet again alternatively, the implementer may opt for some combination of hardware, software, and/or firmware.
- In some embodiments, several portions of the subject matter described herein may be implemented via Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), digital signal processors (DSPs), or other integrated formats. However, those skilled in the art will recognize that some aspects of the embodiments disclosed herein, in whole or in part, may be equivalently implemented in integrated circuits, as one or more computer programs running on one or more computers (e.g., as one or more programs running on one or more computer systems), as one or more programs running on one or more processors (e.g., as one or more programs running on one or more microprocessors), as firmware, or as virtually any combination thereof, and that designing the circuitry and/or writing the code for the software and/or firmware would be well within the skill of one of skill in the art in light of this disclosure. In addition, those skilled in the art will appreciate that the mechanisms of the subject matter described herein are capable of being distributed as a program product in a variety of forms, and that an illustrative embodiment of the subject matter described herein applies regardless of the particular type of signal bearing medium used to actually carry out the distribution. Examples of a signal bearing medium include, but are not limited to, the following: a recordable type medium such as a floppy disk, a hard disk drive, a Compact Disc (CD), a Digital Video Disk (DVD), a digital tape, a computer memory, etc.; and a transmission type medium such as a digital and/or an analog communication medium (e.g., a fiber optic cable, a waveguide, a wired communications link, a wireless communication link, etc.).
- The herein described subject matter sometimes illustrates different components or elements contained within, or connected with, different other components or elements. It is to be understood that such depicted architectures are merely examples, and that in fact many other architectures may be implemented which achieve the same functionality. In a conceptual sense, any arrangement of components to achieve the same functionality is effectively “associated” such that the desired functionality is achieved. Hence, any two components herein combined to achieve a particular functionality may be seen as “associated with” each other such that the desired functionality is achieved, irrespective of architectures or intermedial components. Likewise, any two components so associated may also be viewed as being “operably connected”, or “operably coupled”, to each other to achieve the desired functionality, and any two components capable of being so associated may also be viewed as being “operably couplable”, to each other to achieve the desired functionality. Specific examples of operably couplable include but are not limited to physically mateable and/or physically interacting components and/or wirelessly interactable and/or wirelessly interacting components and/or logically interacting and/or logically interactable components.
- It will be understood by those within the art that, in general, terms used herein, and especially in the appended claims (e.g., bodies of the appended claims) are generally intended as “open” terms (e.g., the term “including” should be interpreted as “including but not limited to,” the term “having” should be interpreted as “having at least,” the term “includes” should be interpreted as “includes but is not limited to,” etc.). It will be further understood by those within the art that if a specific number of an introduced claim recitation is intended, such an intent will be explicitly recited in the claim, and in the absence of such recitation no such intent is present. For example, as an aid to understanding, the following appended claims may contain usage of the introductory phrases “at least one” and “one or more” to introduce claim recitations. However, the use of such phrases should not be construed to imply that the introduction of a claim recitation by the indefinite articles “a” or “an” limits any particular claim containing such introduced claim recitation to inventions containing only one such recitation, even when the same claim includes the introductory phrases “one or more” or “at least one” and indefinite articles such as “a” or “an” (e.g., “a” and/or “an” should typically be interpreted to mean “at least one” or “one or more”); the same holds true for the use of definite articles used to introduce claim recitations. In addition, even if a specific number of an introduced claim recitation is explicitly recited, those skilled in the art will recognize that such recitation should typically be interpreted to mean at least the recited number (e.g., the bare recitation of “two recitations,” without other modifiers, typically means at least two recitations, or two or more recitations). Furthermore, in those instances where a convention analogous to “at least one of A, B, and C, etc.” is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., “a system having at least one of A, B, and C” would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.). In those instances where a convention analogous to “at least one of A, B, or C, etc.” is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., “a system having at least one of A, B, or C” would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.). It will be further understood by those within the art that virtually any disjunctive word and/or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase “A or B” will be understood to include the possibilities of “A” or “B” or “A and B.”
- Although certain embodiments have been illustrated and described herein for purposes of description of the preferred embodiment, it will be appreciated by those of ordinary skill in the art that a wide variety of alternate and/or equivalent embodiments or implementations calculated to achieve the same purposes may be substituted for the embodiments shown and described without departing from the scope of the disclosure. Those with skill in the art will readily appreciate that embodiments of the disclosure may be implemented in a very wide variety of ways. This application is intended to cover any adaptations or variations of the embodiments discussed herein. Therefore, it is manifestly intended that embodiments of the disclosure be limited only by the claims and the equivalents thereof.
Claims (39)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US12/463,979 US8255743B2 (en) | 2009-05-11 | 2009-05-11 | Error detection and/or correction through coordinated computations |
| US13/584,277 US8566638B2 (en) | 2009-05-11 | 2012-08-13 | Error detection and/or correction through coordinated computations |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US12/463,979 US8255743B2 (en) | 2009-05-11 | 2009-05-11 | Error detection and/or correction through coordinated computations |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US13/584,277 Continuation US8566638B2 (en) | 2009-05-11 | 2012-08-13 | Error detection and/or correction through coordinated computations |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US20100287413A1 (en) | 2010-11-11 |
| US8255743B2 (en) | 2012-08-28 |
Family ID: 43063079
Family Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US12/463,979 Expired - Fee Related US8255743B2 (en) | 2009-05-11 | 2009-05-11 | Error detection and/or correction through coordinated computations |
| US13/584,277 Expired - Fee Related US8566638B2 (en) | 2009-05-11 | 2012-08-13 | Error detection and/or correction through coordinated computations |
Family Applications After (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US13/584,277 Expired - Fee Related US8566638B2 (en) | 2009-05-11 | 2012-08-13 | Error detection and/or correction through coordinated computations |
Country Status (1)
| Country | Link |
|---|---|
| US (2) | US8255743B2 (en) |
Citations (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7017150B2 (en) * | 2001-08-20 | 2006-03-21 | Sun Microsystems, Inc. | Method and apparatus for automatically isolating minimal distinguishing stimuli in design verification and software development |
| US7098144B2 (en) * | 2004-10-21 | 2006-08-29 | Sharp Laboratories Of America, Inc. | Iridium oxide nanotubes and method for forming same |
| US20070174746A1 (en) * | 2005-12-20 | 2007-07-26 | Juerg Haefliger | Tuning core voltages of processors |
| US7255745B2 (en) * | 2004-10-21 | 2007-08-14 | Sharp Laboratories Of America, Inc. | Iridium oxide nanowires and method for forming same |
| US7308607B2 (en) * | 2003-08-29 | 2007-12-11 | Intel Corporation | Periodic checkpointing in a redundantly multi-threaded architecture |
| US7606695B1 (en) * | 2003-09-30 | 2009-10-20 | Sun Microsystems, Inc. | Self-checking simulations using dynamic data loading |
| US20100241892A1 (en) * | 2009-03-17 | 2010-09-23 | Miodrag Potkonjak | Energy Optimization Through Intentional Errors |
| US20100287409A1 (en) * | 2009-05-11 | 2010-11-11 | Miodrag Potkonjak | State variable-based detection and correction of errors |
| US20100287404A1 (en) * | 2009-05-11 | 2010-11-11 | Miodrag Potkonjak | Input compensated and/or overcompensated computing |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6769073B1 (en) * | 1999-04-06 | 2004-07-27 | Benjamin V. Shapiro | Method and apparatus for building an operating environment capable of degree of software fault tolerance |
| US6988264B2 (en) * | 2002-03-18 | 2006-01-17 | International Business Machines Corporation | Debugging multiple threads or processes |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20220058078A1 (en) * | 2020-08-18 | 2022-02-24 | Micron Technology, Inc. | Memory data correction using multiple error control operations |
| US11314583B2 (en) * | 2020-08-18 | 2022-04-26 | Micron Technology, Inc. | Memory data correction using multiple error control operations |
| US11726863B2 (en) | 2020-08-18 | 2023-08-15 | Micron Technology, Inc. | Memory data correction using multiple error control operations |
Also Published As
| Publication number | Publication date |
|---|---|
| US8566638B2 (en) | 2013-10-22 |
| US20120311382A1 (en) | 2012-12-06 |
| US8255743B2 (en) | 2012-08-28 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| KR102307364B1 (en) | | A vulnerability driven hybrid test system for application programs |
| US8977902B2 (en) | | Integrity checking including side channel monitoring |
| US20180183577A1 (en) | | Techniques for secure message authentication with unified hardware acceleration |
| US8527737B2 (en) | | Using addresses to detect overlapping memory regions |
| US8578311B1 (en) | | Method and system for optimal diameter bounding of designs with complex feed-forward components |
| TWI804292B (en) | | Electronic systems, computer-implemented methods and computer program products for calibrating a quantum decoder |
| Guerrero Balaguera et al. | | Understanding the effects of permanent faults in GPU's parallelism management and control units |
| US11474795B2 (en) | | Static enforcement of provable assertions at compile |
| CN105183641B (en) | | The data consistency verification method and system of a kind of kernel module |
| Donaldson et al. | | Automatic analysis of scratch-pad memory code for heterogeneous multicore processors |
| CN118642762A (en) | | Instruction processing method, device, electronic device and readable storage medium |
| US8566638B2 (en) | | Error detection and/or correction through coordinated computations |
| Anzt et al. | | Fine-grained bit-flip protection for relaxation methods |
| US8145943B2 (en) | | State variable-based detection and correction of errors |
| US9182943B2 (en) | | Methods and devices for prime number generation |
| WO2020181473A1 (en) | | CIRCUIT STRUCTURE, SYSTEM ON CHIP (SoC), AND DATA PROCESSING METHOD |
| TWI551982B (en) | | Register error protection through binary translation |
| US9928135B2 (en) | | Non-local error detection in processor systems |
| US8041992B2 (en) | | Input compensated and/or overcompensated computing |
| Hartmann et al. | | The genotypic complexity of evolved fault-tolerant and noise-robust circuits |
| CN115168912A (en) | | Hardware acceleration method for PQC digital signature algorithm |
| He | | Resilient Deep Learning Accelerators |
| Liu et al. | | QuantEM: The quantum error management compiler |
| Dutro | | Hardware acceleration of the SAMtools variant caller |
| CN119356703A (en) | | A method, device, electronic device and medium for flashing firmware of a device |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AS | Assignment | Owner name: TECHNOLOGY CURRENTS LLC, DELAWARE Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:POTKONJAK, MIODRAG;REEL/FRAME:024326/0970 Effective date: 20090627 |
| | AS | Assignment | Owner name: EMPIRE TECHNOLOGY DEVELOPMENT LLC, DELAWARE Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:TECHNOLOGY CURRENTS LLC;REEL/FRAME:027581/0849 Effective date: 20111021 |
| | ZAAA | Notice of allowance and fees due | Free format text: ORIGINAL CODE: NOA |
| | ZAAB | Notice of allowance mailed | Free format text: ORIGINAL CODE: MN/=. |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | AS | Assignment | Owner name: ARISTAEUS HERMES LLC, CALIFORNIA Free format text: REDACTED ASSIGNMENT;ASSIGNOR:POTKONJAK, MIODRAG;REEL/FRAME:029249/0429 Effective date: 20120614 Owner name: EMPIRE TECHNOLOGY DEVELOPMENT LLC, DELAWARE Free format text: REDACTED ASSIGNMENT;ASSIGNOR:ARISTAEUS HERMES LLC;REEL/FRAME:029249/0648 Effective date: 20120614 |
| | CC | Certificate of correction | |
| | FPAY | Fee payment | Year of fee payment: 4 |
| | AS | Assignment | Owner name: CRESTLINE DIRECT FINANCE, L.P., TEXAS Free format text: SECURITY INTEREST;ASSIGNOR:EMPIRE TECHNOLOGY DEVELOPMENT LLC;REEL/FRAME:048373/0217 Effective date: 20181228 |
| | FEPP | Fee payment procedure | Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | FEPP | Fee payment procedure | Free format text: 7.5 YR SURCHARGE - LATE PMT W/IN 6 MO, LARGE ENTITY (ORIGINAL EVENT CODE: M1555); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 8 |
| | AS | Assignment | Owner name: EMPIRE TECHNOLOGY DEVELOPMENT LLC, WASHINGTON Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CRESTLINE DIRECT FINANCE, L.P.;REEL/FRAME:065712/0585 Effective date: 20231004 Owner name: EMPIRE TECHNOLOGY DEVELOPMENT LLC, WASHINGTON Free format text: RELEASE OF SECURITY INTEREST;ASSIGNOR:CRESTLINE DIRECT FINANCE, L.P.;REEL/FRAME:065712/0585 Effective date: 20231004 |
| | FEPP | Fee payment procedure | Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | LAPS | Lapse for failure to pay maintenance fees | Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
| | FP | Lapsed due to failure to pay maintenance fee | Effective date: 20240828 |