# Judicious Choice of Waveform Parameters and Accurate Estimation of Critical Charge for Logic SER Estimation

Palkesh Jain and Vivian Zhu<sup>1</sup>
Texas Instruments India, Bangalore

<sup>1</sup>Texas Instruments Inc., Dallas, TX USA

palkesh@ti.com

#### Outline

- Motivation
- Simulation methodology
- Impact of waveform shapes on Qcrit
- Impact of pulse widths: circuit response time
- Impact of transistor aging on Qcrit
- Re-ordering of critical nodes
- Summary and recommendations

#### EE Times and Keynotes

- Texas Instruments
  - "Logic SER may become as significant as SRAM error rates," predicted Hans Stork, TI CTO, in a keynote speech at the International Reliability Physics Symposium, 2004.
  - Robert Baumann warned that reductions in circuit operating voltages, aggressive substrate/junction engineering and reductions in node capacitance mean radiation-induced single event effects have become a serious threat.
- Intel
  - "Soft errors are the second biggest [reliability] concern after leakage current in submicron memory design".

- IBM
  - Tim Dell: "For every 256 Mbytes of memory, you will get one soft error a month".
- Sun Microsystems
  - Encountered SEEs causing Sun server workstations to require occasional resets.
- Cisco Systems
  - Encountered SEE failures with its 12000 series router line cards, reporting failures of memory and ASICs and subsequent debugging attempts for soft errors. According to a field note, cards showed ASIC errors that may have resulted in a card reloading, with a two- or three-minute recovery.

#### SER Primer

- What are Single Event Upsets (SEUs)?
  - Caused by alpha particles and cosmic neutrons
  - A storage node flips if $Q_{collected} > Q_{crit}$

#### SEU depends on

1. Diffusion charge collection area
2. Node capacitance
3. Restoring current

Qcrit is a single metric that represents a node's sensitivity to soft errors.

## Single-Event Upset in SRAMs: The Feedback Mechanism



#### **Race Condition**

- Recovery occurs before feedback: **No Upset**
- Feedback occurs before recovery: **Cell Upsets**

## SER Modeling Flow

[Figure: SER modeling flow, relating the expected outcome to the simulation methodology at each abstraction level. Labels recoverable from the figure: System SER, Usage Model, System, Timing Activity, Logic Tools, Temporal Masking, RTL, Logical Deration, Timing Tools, Gate, Qcrit and Nominal FIT, SPICE, Circuit, Charge Collection Physics, 3D Simulator, Device, Waveforms.]

#### Critical Charge Modeling

- The critical charge at a node is the minimum charge that a single-event particle strike must deposit to create an upset.
- Generally, the absolute value of the critical charge matters less than the relative ranking of nodes in order of their criticality.
  - The designer may choose to harden the most critical nodes.

#### Critical Charge Modeling

- It is important to estimate the relative critical charge of the nodes accurately.
- Traditional methods to estimate Qcrit include:
  - Device simulation, including generation of electron-hole pairs to simulate the particle strike and the associated circuit response (entirely at device level).
  - Circuit-level techniques:
    - Inject a current source (obtained empirically or analytically) at the circuit node to represent the particle strike, and measure the charge deposited by the 'critical' current source.

#### SPICE Level Qcrit Modeling

- Three waveforms are used for the analysis:
  - Triangular / Trapezoidal
  - Rectangular
  - Double exponential
- Triangular and rectangular pulses are governed by a single peak; the exponential waveform depends on its rise time and fall time.



$$I_{exp}(t) = \frac{Q_{tot}}{\tau_f - \tau_r} \left( \exp\left(-\frac{t}{\tau_f}\right) - \exp\left(-\frac{t}{\tau_r}\right) \right)$$
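The double-exponential source above can be sketched numerically to confirm that it deposits $Q_{tot}$ and to compare its peak with rectangular and triangular pulses carrying the same charge. This is a minimal illustration, not the authors' SPICE setup. The charge, time constants and pulse width are hypothetical values, and the helper names are assumptions introduced here.

```python
import math

def double_exp_current(t, q_tot, tau_r, tau_f):
    """Double-exponential strike current (A) at time t (s); zero before the strike."""
    if t < 0.0:
        return 0.0
    return q_tot / (tau_f - tau_r) * (math.exp(-t / tau_f) - math.exp(-t / tau_r))

def integrate(f, t_end, steps=20_000):
    """Trapezoidal integral of f over [0, t_end]."""
    dt = t_end / steps
    total = 0.5 * (f(0.0) + f(t_end))
    for k in range(1, steps):
        total += f(k * dt)
    return total * dt

# Hypothetical values, for illustration only (not taken from the slides).
Q_TOT = 10e-15   # total deposited charge: 10 fC
TAU_R = 5e-12    # rise time constant: 5 ps
TAU_F = 50e-12   # fall time constant: 50 ps
WIDTH = 50e-12   # width used for the equal-charge rectangular/triangular pulses

# Integrating far past the fall time constant should recover Q_TOT.
deposited = integrate(lambda t: double_exp_current(t, Q_TOT, TAU_R, TAU_F), 40 * TAU_F)

# The peak occurs where dI/dt = 0.
t_peak = TAU_R * TAU_F / (TAU_F - TAU_R) * math.log(TAU_F / TAU_R)
peak_exp = double_exp_current(t_peak, Q_TOT, TAU_R, TAU_F)

# Peaks of equal-charge pulses: area = I*w (rectangle), area = I*w/2 (triangle).
peak_rect = Q_TOT / WIDTH
peak_tri = 2.0 * Q_TOT / WIDTH

print(f"deposited charge : {deposited:.4e} C")
print(f"exp peak         : {peak_exp:.4e} A at {t_peak:.3e} s")
print(f"rect / tri peaks : {peak_rect:.4e} A / {peak_tri:.4e} A")
```

For a fixed charge and width, the triangular pulse needs twice the peak current of the rectangular one.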

#### Circuit Response to different waveforms

- A triangular pulse leads to a significantly larger undershoot on the struck node than an exponential current source depositing the same charge.
- These undershoots alter the device properties for a transient duration, making the Qcrit result inaccurate.
- In fact, the undershoots alter the threshold voltage of the struck device and make the node more prone to flip.

## Impact of Undershoot: threshold voltage lowering

- As seen, the threshold voltage of the struck device drops by as much as 20% due to the triangular current pulse strike.
- This reduction in Vt makes the device stronger, causing the logic to flip faster.



It is therefore recommended that triangular waveforms with short pulse widths not be used for Qcrit estimation.

#### Impact of Pulse Width

- Qcrit at a node is generally summarized by the following equation:
  - $Q_{crit} = C \cdot V_{dd} + I_{on} \cdot t_{Flip}$

where $C$ is the node's parasitic capacitance, $t_{Flip}$ is the time the node takes to flip, and $I_{on}$ is the recovery current provided by the restoring pMOS.

- Typically, a particle strike generates current pulses with a wide distribution of widths.
- From a design standpoint, it is important to assess the impact of different pulse widths on the circuit response and on the circuit's SER reliability.
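The first-order Qcrit expression on this slide can be evaluated directly to see how the capacitive term and the restoring-current term trade off as the flip time grows. The sketch below uses hypothetical values for the capacitance, supply voltage and recovery current. They are assumptions for illustration, not data from this study.

```python
def qcrit(c_node, vdd, i_on, t_flip):
    """First-order critical charge (C): capacitive term plus restoring-current term."""
    return c_node * vdd + i_on * t_flip

# Hypothetical values, for illustration only.
C_NODE = 1e-15   # node parasitic capacitance: 1 fF
VDD = 1.0        # supply voltage: 1 V
I_ON = 20e-6     # recovery current of the restoring pMOS: 20 uA

for t_flip in (1e-12, 10e-12, 100e-12):
    q = qcrit(C_NODE, VDD, I_ON, t_flip)
    restoring_share = I_ON * t_flip / q
    print(f"t_flip = {t_flip:.0e} s  Qcrit = {q:.3e} C  "
          f"restoring-current share = {restoring_share:.1%}")
```

With these numbers the capacitive term dominates at a flip time of 1 ps, while the restoring-current term dominates at 100 ps.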

#### Circuit under study

- Rectangular waveforms of different pulse widths are used to characterize the Qcrit at the struck node:
  - 0.1 ps to 1e5 ps.
- We also measure the time the circuit node takes to flip because, for large pulse widths, the logic flips well before the pulse ends.
- Additionally, we measure a metric, modified Qcrit, which is the area under the current waveform up to the time when the logic flips irreparably.
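The modified Qcrit metric described above can be sketched as a truncated integral of the injected current. In the sketch below the flip time is a hypothetical input, since in practice it would be measured from the transient simulation. The pulse amplitude and width are also illustrative assumptions.

```python
def rect_pulse(amplitude, width):
    """Return a rectangular current pulse i(t) starting at t = 0."""
    def current(t):
        return amplitude if 0.0 <= t <= width else 0.0
    return current

def charge_until(current, t_stop, steps=10_000):
    """Trapezoidal area under current(t) from 0 to t_stop."""
    dt = t_stop / steps
    total = 0.5 * (current(0.0) + current(t_stop))
    for k in range(1, steps):
        total += current(k * dt)
    return total * dt

# Hypothetical values, for illustration only.
AMPLITUDE = 50e-6   # pulse amplitude: 50 uA
WIDTH = 1e-9        # pulse width: 1 ns
T_FLIP = 200e-12    # assumed flip time: 200 ps (would come from simulation)

pulse = rect_pulse(AMPLITUDE, WIDTH)
q_total = AMPLITUDE * WIDTH               # charge of the full pulse
q_modified = charge_until(pulse, T_FLIP)  # charge delivered up to the flip

print(f"full-pulse charge : {q_total:.3e} C")
print(f"modified Qcrit    : {q_modified:.3e} C")
```

When the node flips before the pulse ends, the full-pulse charge overstates what was needed to cause the upset, which is the reason for truncating the integral at the flip time.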



#### Circuit Response Time

- Modified Qcrit and flip time for the circuit node are as shown.
- As can be seen, modified Qcrit saturates for very small and very large pulse widths.
- Flip time remains constant up to a pulse width of 100 ps, at which point the device action kicks in. This causes both Qcrit and the flip time to increase.

#### Discussion: Circuit Response Time

- At very small pulse widths, Qcrit is essentially a function of the node capacitance (device action does not come into play).
- As the pulse width increases, the restoring pMOS action kicks in, causing Qcrit to increase.
- Eventually, at very large pulse widths, the logic flips well before the pulse ends and the modified Qcrit saturates.

## Impact of aging

- As seen, the restoring pMOS action is critical in improving the node's Qcrit.
- With the increasing importance of transistor degradation phenomena such as NBTI, it is of interest to assess how a node's critical charge changes with device aging.



- In a latch that stores logic '1' throughout its life, NBTI degradation occurs only in the pMOS P1.
- pMOS P1 is the restoring transistor for the 1-to-0 flip on N1, while it is a feedback transistor in the other case.

## Discussion: Impact of aging

- Comparing the Qcrit of two nodes at two different pMOS ages (t = 0 and t = end of life, EOL) shows a clear reduction in Qcrit at EOL.
- Additionally, the nodes N3 and N4 interchange in the criticality order.



- If aging weakens the restoring pMOS, the Qcrit of the flip associated with the node decreases, making the node more sensitive to particle strikes.
  - The Qcrit of a circuit node should therefore be assessed with device aging taken into account.
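A rough sketch of how aging can both lower Qcrit and reorder nodes follows, using the first-order relation Qcrit = C·Vdd + Ion·tFlip with the restoring current reduced by a per-node degradation factor. Everything here is a labeled assumption: the node names, capacitances, currents and degradation fractions are hypothetical, and the flip time is held fixed for simplicity, which a real simulation would not do.

```python
from dataclasses import dataclass

VDD = 1.0         # supply voltage (V), hypothetical
T_FLIP = 50e-12   # flip time (s), held constant as a simplification

@dataclass
class Node:
    name: str
    c_node: float       # parasitic capacitance (F)
    i_on: float         # fresh restoring current (A)
    degradation: float  # assumed fractional loss of i_on at end of life

def qcrit(node, aged):
    """First-order Qcrit (C), optionally with the aged restoring current."""
    i_on = node.i_on * (1.0 - node.degradation) if aged else node.i_on
    return node.c_node * VDD + i_on * T_FLIP

def rank(nodes, aged):
    """Node names ordered from most critical (lowest Qcrit) to least critical."""
    return [n.name for n in sorted(nodes, key=lambda n: qcrit(n, aged))]

# Hypothetical nodes: node_a's restoring pMOS is assumed to age much more.
nodes = [
    Node("node_a", c_node=1.0e-15, i_on=30e-6, degradation=0.20),
    Node("node_b", c_node=1.2e-15, i_on=22e-6, degradation=0.05),
]

fresh_order = rank(nodes, aged=False)
eol_order = rank(nodes, aged=True)

for n in nodes:
    print(f"{n.name}: fresh {qcrit(n, False):.3e} C, EOL {qcrit(n, True):.3e} C")
print("fresh ranking:", fresh_order)
print("EOL ranking  :", eol_order)
```

In this toy setup both nodes lose Qcrit at end of life, and the node with the more heavily stressed restoring pMOS overtakes the other in criticality.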

## Re-ordering of critical nodes

- Because of strong circuit effects, a node that is the most critical with one waveform may become less critical with another.
- Nodes N5 and N3 interchange their criticality order with a different choice of waveform.
- We believe that this reordering is mainly due to the restoring transistor's response, which is itself a function of the injected current pulse.
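The sketch below illustrates, under stated assumptions, how a ranking can change with the injected pulse. It applies the first-order relation Qcrit = C·Vdd + Ion·tFlip and additionally assumes that the flip time tracks the pulse width for the critical pulse. That assumption is a simplification introduced here, not a result from the slides. Node names and parameter values are hypothetical.

```python
VDD = 1.0  # supply voltage (V), hypothetical

# Hypothetical nodes: (parasitic capacitance in F, restoring current in A).
NODES = {
    "node_x": (0.8e-15, 40e-6),  # small capacitance, strong restoring device
    "node_y": (1.5e-15, 10e-6),  # large capacitance, weak restoring device
}

def qcrit(c_node, i_on, pulse_width):
    """First-order Qcrit (C), assuming the flip time equals the pulse width."""
    return c_node * VDD + i_on * pulse_width

def most_critical_first(pulse_width):
    """Node names sorted by increasing Qcrit for the given pulse width."""
    return sorted(NODES, key=lambda name: qcrit(*NODES[name], pulse_width))

for width in (1e-12, 10e-12, 100e-12):
    print(f"pulse width {width:.0e} s -> ranking {most_critical_first(width)}")
```

With short pulses the low-capacitance node is the more critical one. With long pulses its stronger restoring device raises its Qcrit above that of the other node, so the order flips.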

#### Conclusions

- An elaborate analysis of several key factors that impact the assessment of a circuit's critical charge was presented.
  - It was shown that triangular pulses and pulses with short rise times are associated with artifacts such as undershoots on the struck node, and may lead to an erroneous Qcrit result.
  - A detailed study of how a node's Qcrit changes with the pulse width of the injected current was presented, highlighting the major contributors (the node's capacitance and the device's response).
  - The Qcrit of the node was studied in the presence of device aging, and it was shown that aging may weaken the recovery action, decreasing the Qcrit.
- The study also provides a way for designers to rank-order the logic blocks/nodes by Qcrit criticality.
  - We show that the rank order is a strong function of the pulse width, which strongly motivates making the particle-strike-induced pulse-width distribution an essential parameter in such a ranking.

#### Thank You

Email Feedback : palkesh@ti.com