Title: Method of Refreshing Memory Devices
Field of the invention
This invention relates to a method of refreshing memory devices, in particular dynamic random access memories (DRAMs).
Background to the invention
Random access memory (RAM) for computer systems is commonly implemented in two forms: static and dynamic. In static RAM technology, once the datum has been written to a storage cell no further action is required by the system to maintain the value in the cell. Static memory therefore needs no refreshing of its contents.
In dynamic RAM (DRAM) the cell datum is stored as the presence or absence of charge on a capacitor. The charge on the capacitor gradually leaks away at a rate dependent on the method of manufacture and/or on thermal and noise effects. The charge in a cell therefore has to be refreshed by reading the cell's datum before it has leaked to a level below which it cannot be sensed reliably. This refreshing operation is carried out by performing a destructive read operation followed by a write operation which restores the charge in the cell to its proper level.
Manufacturers of DRAMs specify a rate at which the memory must be refreshed in order to guarantee that no information is lost. The major mechanism which causes the charge leakage is thermal, and manufacturers generally specify a maximum refresh interval at the appropriate end of a device's operating temperature range, e.g. an 8 ms refresh period at 70 degrees Celsius. The thermal leakage rate is a function of temperature, and a well known rule of thumb is that a reduction of 10 degrees Celsius halves the leakage rate and therefore doubles the maximum refresh period, e.g. to a 16 ms refresh period at 60 degrees Celsius.
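The rule of thumb above can be expressed as a short calculation. The following is a minimal sketch only: the function name and default values are illustrative assumptions taken from the 8 ms at 70 degrees Celsius example, and extrapolation far from the specified temperature is not guaranteed by any manufacturer.

```python
def max_refresh_period_ms(temp_c, spec_period_ms=8.0, spec_temp_c=70.0):
    """Estimate the maximum refresh period at temp_c.

    Assumes the rule of thumb that every 10 degree Celsius reduction
    halves the leakage rate and so doubles the permissible period.
    """
    return spec_period_ms * 2.0 ** ((spec_temp_c - temp_c) / 10.0)


print(max_refresh_period_ms(70.0))  # 8.0
print(max_refresh_period_ms(60.0))  # 16.0
print(max_refresh_period_ms(40.0))  # 64.0
```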
Noise effects are by their nature random and can include sources such as radiation and local electrical noise. If, as in the case of radiation, the noise destroys the charge in the cell, then the level below which the charge must not fall has to be raised so that a hit on the cell does not lead to loss of data. Electrical noise makes the sensing operation more difficult, and in an electrically noisy environment the cell charge will have to be above a certain level for reliable sensing. The converse is also true; in an electrically quiet environment the cell charge can be smaller and still be sensed reliably.
In low power applications such as battery powered computer systems, there are a number of techniques that can be used to reduce power consumption. In a system which is quiescent apart from memory refresh, the power use is dependent on how often the memory is refreshed and how much current is consumed during the refresh operation. An increase in refresh period or decrease in current required during the refresh operation will therefore reduce the average power requirements.
Current generation DRAMs offer a number of refresh mechanisms to the designer: the two most important variations are known as RAS only refresh and CAS before RAS refresh. In RAS only refresh, the system designer has to inform the RAM which row is to be refreshed and typically this involves setting up a 10 bit address to the chip and pulsing the RAS line. In CAS before RAS refresh a counter internal to the RAM is used, but both RAS and CAS lines are pulsed. Typically, during a CAS before RAS refresh operation, the chip will consume about 10% more power than in a RAS only refresh operation. In a complete system this will not translate to a 10% power saving, because power is required in the RAS only case to drive the address to the chip. However, an economical implementation would still be able to show power saving by using RAS only refresh.
The invention
According to the invention, there is provided a method of refreshing DRAMs according to which a small section (S) of the DRAM is refreshed at a lower rate than the major section (M), data errors arising in section S are sensed, and the refresh rates of the two sections (S and M) are adjusted, whilst maintaining the M section refresh rate higher than the S section refresh rate, until data errors in section S are not significant.
In this connection, it will be appreciated that as section S is always refreshed at the lower rate, errors will be apparent in this section before errors occur in section M if the refresh rate of section S is inadequate. Thus, section S can reliably be employed as a sensor for the chip to determine the optimum refresh rate for existing environmental conditions.
It is well known that external temperature sensing techniques can be used to increase the refresh period with a commensurate power reduction.
The invention has a number of advantages over straightforward, external, sensing techniques, because the chip is sensing itself. Firstly, process effects can be eliminated because section S is subject to the same process variations as section M. Secondly, section S will be subject to the same, or very similar, noise as section M, so that this effect can also be eliminated. Finally, the method provides a much more accurate temperature measurement of the chip, since the measurement is made on the chip itself and not through an off chip device.
Preferred practice of the invention may be as follows. Initially, section M of the DRAM is refreshed at the manufacturer's specified rate. The refresh interval for section S is then extended in suitable increments (i.e. its refresh rate is reduced) until a rate is attained at which errors are just significant, say a refresh rate E. The refresh rate for section M is then adjusted to the last error free rate of section S multiplied by a suitable safety factor, for example of the order of 2. Section S continues to be refreshed at rate E and, if the refresh rate for this section changes due to an increase or decrease in detected significant errors in the section, then the refresh rate for section M is changed appropriately through a simple control loop. The sensitivity of the technique depends upon the errors in section S. If the refresh rate of section S were set so that errors just do not occur, then it would be impossible to sense an improvement in conditions and extend the interval. By setting the rate so that errors are just occurring, the technique is effectively setting the level at which the charge in a cell cannot be reliably sensed. The sense amplifiers in the memory are designed so that at this point they become very sensitive to a change in conditions.
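The initial calibration described above may be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the function `calibrate`, the callback `count_errors`, the step size, the error threshold and the upper limit are all hypothetical, and rates are handled as intervals (a rate multiplied by the safety factor is an interval divided by it).

```python
from typing import Callable, Tuple


def calibrate(spec_interval_ms: float,
              count_errors: Callable[[float], int],
              step: float = 1.25,
              significant: int = 1,
              safety_factor: float = 2.0,
              max_interval_ms: float = 1000.0) -> Tuple[float, float]:
    """Return (section S interval, section M interval) in milliseconds.

    count_errors(interval) is assumed to refresh section S at the given
    interval and report how many bit errors were then detected in it.
    """
    last_error_free = spec_interval_ms
    s_interval = spec_interval_ms
    # Extend the S interval in increments until errors are just significant.
    while s_interval * step <= max_interval_ms:
        s_interval *= step
        if count_errors(s_interval) >= significant:
            break
        last_error_free = s_interval
    # M rate = last error free S rate * safety factor,
    # i.e. M interval = last error free S interval / safety factor.
    m_interval = last_error_free / safety_factor
    return s_interval, m_interval


# Simulated device (assumption): errors appear beyond a 40 ms interval.
s_ms, m_ms = calibrate(8.0, lambda interval: 3 if interval > 40.0 else 0)
print(s_ms, m_ms)
```

The continuing control loop would repeat the same comparison at each section S check, shortening or lengthening both intervals while keeping the section M rate above the section S rate.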
Description of practical exemplification
The invention is further exemplified and explained below, making reference to the accompanying drawings, in which:-
Figure 1 shows a DRAM organisation for use in the invention;
Figure 2 shows a typical DRAM cell and addressing technique for the cell;
Figure 3 is a simplified flow chart exemplifying the invention; and
Figure 4 is a more detailed flow chart which includes refresh adaptation.
In a typical DRAM configuration (see Figures 1 and 2), the system is designed to use RAS only refresh on a bank of memory chips. In this configuration the memory addressing is arranged such that the address of a word in the memory is given by:
Bit address = (Row address * Column size) + Column address
Alternatively for, e.g., 1M DRAMs:
Column address = A2 - A11
Row address = A12 - A21, where An is the corresponding processor address line.
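By way of illustration, the address mapping above can be sketched in a few lines. The function and constant names are hypothetical; the only facts taken from the text are that the column address occupies processor lines A2 to A11 and the row address occupies A12 to A21, giving 10 bits each.

```python
COLUMN_SHIFT = 2     # column address starts at processor line A2
ROW_SHIFT = 12       # row address starts at processor line A12
FIELD_MASK = 0x3FF   # each field is 10 bits wide
COLUMN_SIZE = 1024   # cells per row for a 10 bit column address


def split_address(processor_address):
    """Return (row, column) selected by a processor address."""
    column = (processor_address >> COLUMN_SHIFT) & FIELD_MASK
    row = (processor_address >> ROW_SHIFT) & FIELD_MASK
    return row, column


def bit_address(row, column, column_size=COLUMN_SIZE):
    """Bit address = (Row address * Column size) + Column address."""
    return row * column_size + column


row, column = split_address(0x1004)
print(row, column, bit_address(row, column))  # 1 1 1025
```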
Refresh is then accomplished by pulsing RAS with an externally set up row address, which will refresh all cells within the row, i.e. cells normally addressed by A2 - A11. Section S is implemented as a row, or rows, of the memory and is refreshed at a lower rate than section M by not presenting its row address as often as that of section M. This scheme has a number of advantages over a more straightforward technique which would refresh one chip less often than others. Firstly, the sensing is effectively distributed across all devices, which will improve the error reporting/sensing, and secondly, the errors will be confined to a contiguous block of memory because of the column addressing. The latter means that it is easier for the system to use the memory, both in conventional use and also when error detecting or correcting.
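The row based scheme above can be sketched as a simple refresh schedule. This is a hedged illustration only: the generator name, the set `s_rows` and the integer `s_divider` (how many passes elapse between section S refreshes) are assumptions introduced for the example.

```python
def refresh_schedule(num_rows, s_rows, s_divider, passes):
    """Yield (pass_index, row) for each RAS only refresh to be issued.

    Rows in s_rows (section S) are presented only on every s_divider-th
    pass; all other rows (section M) are presented on every pass.
    """
    for pass_index in range(passes):
        for row in range(num_rows):
            if row in s_rows and pass_index % s_divider != 0:
                continue  # withhold the S row address on this pass
            yield pass_index, row


# Example: 8 rows, row 7 is section S and is refreshed every 4th pass.
issued = list(refresh_schedule(8, {7}, 4, 8))
print(sum(1 for _, row in issued if row == 7))  # 2
print(sum(1 for _, row in issued if row == 0))  # 8
```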
A flow chart for a basic refresh scheme is shown in Figure 3, and will be clear without further description. Figure 4 shows a more detailed scheme including refresh adaptation, which will also be found self explanatory.
In the complete design it is important to improve the scheme so as to guarantee data integrity in section M. In the basic scheme outlined above, conditions are sampled at the refresh rate and if conditions change faster than this, data would be lost. For instance, in taking the system from a cold environment to a hot one, if the system's thermal time constant is less than the refresh interval then data would be lost before the refresh rate is adjusted.
The scheme can be adapted to cope with such problems in a number of ways. One way is to augment the system with an external temperature sensor which is used to detect the rate of change of external temperature. Under high rates of change this would force an early re-evaluation of the system refresh rates. In addition, or alternatively, more areas of the chip may be used as sensors, enabling the refresh intervals for the sensors to be staggered so that the maximum interval between senses is always less than the thermal time constant of the system.
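The effect of staggering several sensor areas can be illustrated numerically. The sketch below assumes, purely for illustration, that every sensor area uses the same refresh interval and that the areas are offset evenly; the function names and figures are hypothetical.

```python
def sense_times(interval_ms, n_sensors, horizon_ms):
    """Return the sorted times at which any sensor area is checked.

    Each sensor area is checked every interval_ms, and the areas are
    staggered evenly by interval_ms / n_sensors.
    """
    offset = interval_ms / n_sensors
    times = []
    for sensor in range(n_sensors):
        t = sensor * offset
        while t <= horizon_ms:
            times.append(t)
            t += interval_ms
    return sorted(times)


def max_gap(times):
    """Longest time between two consecutive senses."""
    return max(later - earlier for earlier, later in zip(times, times[1:]))


# One sensor at 400 ms leaves 400 ms gaps; four staggered sensors leave 100 ms.
print(max_gap(sense_times(400.0, 1, 2000.0)))  # 400.0
print(max_gap(sense_times(400.0, 4, 2000.0)))  # 100.0
```

With such a calculation the number of sensor areas can be chosen so that the longest gap stays below the thermal time constant of the system.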
Although the data in section S is unreliable, it is not useless. If the data is protected by suitable error correction codes then the system can still store meaningful data in section S. An advantage of this is that an error detection algorithm can be used by the refresh control routine to report the error rate of section S. It may also be advantageous to incorporate some error detection and correction in section M, so that data loss can be avoided under rapidly changing conditions.
For the technique to provide a useful power saving it is necessary that the control algorithm does not, on average, consume more power than a straightforward refresh technique. Only a small amount of processing is required for error detection and the control algorithm is simple, so it is possible to achieve power saving provided the memories can be refreshed at a suitably extended rate. If significant error correction is required at the refresh rate then the method is unlikely to show any saving, because correction is a much more time consuming technique. The simplest scheme is not to use section S to store valuable data, but just to store a simple pattern in which errors are easily detected; this will lead to a very quick method for detecting the error rate. For further power reduction it is to be noted that if the refresh interval is short with respect to the thermal time constant of the system, it will not be necessary to calculate the number of errors at every section S refresh operation; a refresh operation will always restore the cell contents and, if operating near the limit, a significant number of errors could accumulate before the error rate determination was made. This means that the algorithm would overcompensate for the errors, because it would not know the error rate distribution, leading to higher power consumption than absolutely necessary. This situation is, however, failsafe.
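The simple pattern scheme mentioned above can be sketched as a comparison of the section S contents against the known pattern. The pattern value 0x55 and the function name are assumptions chosen for illustration; any pattern in which errors are easily detected would serve.

```python
PATTERN = 0x55  # hypothetical alternating bit pattern written to section S


def count_bit_errors(readback, pattern=PATTERN):
    """Count bits in readback that differ from the stored pattern."""
    return sum(bin(byte ^ pattern).count("1") for byte in readback)


# Section S of 1024 bits is 128 bytes; simulate three flipped bits.
section_s = bytearray([PATTERN] * 128)
section_s[3] = 0x54    # one bit lost
section_s[10] = 0x05   # two bits lost
print(count_bit_errors(section_s))  # 3
```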
In practical implementation it is sensible to limit the maximum refresh interval. This is because, as the refresh period is extended, the amount of charge left in the cell's capacitor becomes smaller and smaller and there will come a point at which noise and stray radiation become much more significant. To avoid problems with false error rates, it is important to stay well away from this limiting case. As a practical convenience it is sensible to limit the refresh interval to less than the thermal time constant of the system to avoid the above mentioned problems. This thermal time constant is, in any case, likely to be long with respect to the maximum allowable refresh interval.
Another important consideration is that of sample size. This very much depends on the relative sizes of sections S and M. If, for instance, section S is 1024 bits long, section M is 1048576 bits, the section S error rate is 1 in 1e6 and the section M error rate is 1 in 1e7, then on average section S would only report an error about once every 1000 samples. Although the error rate in section M is lower than that of section S, section M is about 1000 times larger, so the number of bits in error in section M over the same period could be quite large, e.g. 100. This is an unacceptably large number. There are various ways to reduce this disparity and increase the safety margin: spatial and temporal. Spatially, a better balance can be struck between the relative sizes of sections S and M; temporally, the safety factor between the S and M refresh rates can be adjusted so that the M rate is, e.g., 10 times the S rate rather than twice it.
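The sample size example above can be checked with a short calculation. The sketch assumes, for illustration only, that each error rate is a probability per bit per sample and that both sections are observed over the same number of samples; the function name is hypothetical.

```python
def errors_before_detection(s_bits, m_bits, s_rate, m_rate):
    """Return (samples per section S error, section M errors in that time)."""
    samples_per_s_error = 1.0 / (s_bits * s_rate)
    m_errors_per_sample = m_bits * m_rate
    return samples_per_s_error, samples_per_s_error * m_errors_per_sample


samples, m_errors = errors_before_detection(1024, 1048576, 1e-6, 1e-7)
print(round(samples), round(m_errors))  # about 977 samples, about 102 bit errors
```

Under these assumptions, enlarging section S or widening the safety factor between the two refresh rates both reduce the second figure.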
The error rate in section S will be dependent upon the pattern stored in the memory chip. For instance, if a cell with charge is surrounded on the chip by discharged cells then it will leak at a higher rate than one surrounded by charged cells. This effect can therefore be used to increase the sensitivity of the technique by selecting just such awkward patterns. However, the mapping between the logical address and physical cell placement on a memory chip is not normally straightforward and it would be necessary to obtain the information to derive the necessary pattern on a manufacturer by manufacturer basis.
It is, of course, possible to integrate the essence of the above-described technique onto a memory chip itself. This would be an effective solution which could be combined with existing system architectures for very low power systems, e.g. systems powered by small batteries.