April 30, 2014

Report
Combination of Multiple Mechanism for Post-Silicon Reliability Prediction
Joseph B. Bernstein, Ofir Delly, Moti Gabbay
Ariel University
Yizhak Bot (BQR)
We always try learning from the past in order to improve the future.
One problem... Everyone sees the past differently!
“It is possible to fail in many
ways...while to succeed is
possible only in one way…”
Aristotle
“If we don’t learn from the past, we are condemned to repeat it…”
George Santayana, 1952
The Semiconductor Test Industry Today
We test the parts “blindly” and then “see how they run…”
Field Data Analysis Results
Cumulative data for over 10,000,000 Military Electronic Systems
[Figure: Weibull Beta Parameter Histogram. X-axis: Beta (0 to 4); y-axis: Frequency (0 to 16). Annotated regions: "MTBF Region" and "Physics of Failure". Note on plot: Rate Occurrences, Beta = 1 is Poisson.]
β = 1 ± 0.2 for all systems
Field failures are generally constant
So, we should keep MTBF and FIT
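As a minimal sketch of why Beta = 1 corresponds to a constant failure rate, the Weibull hazard function h(t) = (β/η)(t/η)^(β-1) can be evaluated directly. The characteristic life `ETA` and the sample times are illustrative assumptions, not values taken from the field data.

```python
def weibull_hazard(t_hours, beta, eta_hours):
    """Instantaneous failure rate (per hour) of a Weibull distribution."""
    return (beta / eta_hours) * (t_hours / eta_hours) ** (beta - 1)

ETA = 1000.0  # hypothetical characteristic life in hours

# Beta = 1: the hazard does not depend on age, so MTBF = ETA is meaningful.
early = weibull_hazard(10.0, 1.0, ETA)
late = weibull_hazard(5000.0, 1.0, ETA)

# Beta = 2 (wear-out): the hazard grows with age.
wearout_early = weibull_hazard(10.0, 2.0, ETA)
wearout_late = weibull_hazard(5000.0, 2.0, ETA)

print(early, late)
print(wearout_early, wearout_late)
```

With Beta = 1 the two printed rates are identical, which is the property that justifies quoting a single MTBF or FIT number.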
Some Observations:
• Modern electronics have a nearly constant failure rate
• Few (very rare) exceptions
• Keep the idea of constant rate and work within the framework of Failure-In-Time (FIT)
So what are the problems with FIT?
• Handbooks are pretty outdated
  o MIL 217 is OLD and USELESS.
  o FIDES is updated but only applies a single-mechanism approach.
  o Physics of Failure (PoF) approach looks to TTF and not FIT.
  o Probabilistic DfR requires unique distributions for each mechanism.
  o HALT/HASS cannot predict λ.
JEDEC Publication JEP 122G Rev. Oct. 2011
I bet you didn’t know JEDEC says this:
2 Terms and definitions (cont’d)
quoted failure rate: The predicted failure rate for typical operating conditions. (This is the FIT)
NOTE: The quoted failure rate is calculated from the observed failure rate under accelerated stress conditions multiplied by an acceleration factor; e.g…..
“When multiple failure mechanisms and thus multiple acceleration factors are involved, then a proper summation technique, e.g., sum-of-the-failure-rates method, is required.”
Semiconductor Industry ‘Joke’
The Magical Mysterious Decreasing FIT
Intel
Maxim
PLX
1 FIT = 1 Failure per
10,000 parts in 12 years.
If ONLY this were true!
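The rule of thumb above follows from the definition of FIT as failures per billion device-hours. A short check, assuming 8760 hours per year (the helper name `fit_from_counts` is illustrative):

```python
HOURS_PER_YEAR = 8760  # 365 days * 24 hours

def fit_from_counts(failures, parts, hours_each):
    """FIT = failures per 1e9 accumulated device-hours."""
    return failures * 1e9 / (parts * hours_each)

# 1 failure among 10,000 parts, each operating for 12 years
fit = fit_from_counts(1, 10_000, 12 * HOURS_PER_YEAR)
print(round(fit, 2))
```

The result is a little under 1 FIT, so "1 failure per 10,000 parts in 12 years" is a fair mental picture of the unit.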
Measured Component FIT (λ) vs. Year Produced
Field Return Data
[Figure: Failure Rate (FIT), ACTUAL failures per billion part-hours (log scale, 1 to 1000), vs. Year (sold), 1985 to 2020. Labeled points:
0.25 µm: ~20-50 FIT
130 nm: ~90-120 FIT
90 nm: ~150-300 FIT
65 nm: ~300-450 FIT
45-22 nm: ???
A separate trend is marked "Avionic and Military Expectation!"]
• Compared to previous avionic system data, the trend continues at a much greater than expected rate.
• Bernstein’s Law: ~10x increase in FIT every 10 years
Benefits to Accurate Prediction !!
More applications means more Sale$
Performance is designed for a required reliability specification
Suggestion: A small reduction in performance can bring a huge gain in reliability
Two products; one design
[Figure (illustrative): Reliability vs. Performance curve with two operating points, 1 and 2, each marked X]
Multiple Accelerated Test Matrix for Reliability Prediction
More customers for the same design
Performance vs. Reliability
[Figure: 21-stage inverter RO. Y-axis: Freq. (Hz), 0 to 1.2E+08; x-axis: Core Voltage (V), 0 to 3.5. Annotations: "Nominal Voltage" and "Why not operate here?"]
• I could double the speed for free. If I KNOW the reliability, maybe I CAN improve performance !?!?
Qualification TODAY
Industry ‘Standard’ FIT (failures in time) model:
Acceleration Factor (AF) is the product of Voltage and Temperature acceleration factors.
3 KILLER problems:
1. This does NOT fit with KNOWN failure models.
2. When ZERO failures are reported, there is NO statistical meaning to the acceleration factor.
3. Uncertainty is assumed for 0/1 fails, while AF has ZERO uncertainty; no accounting for error in AF !!
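For reference, the product form of the industry-standard AF can be sketched as below. The Arrhenius temperature term and the exponential voltage term are common textbook choices assumed here for illustration; the activation energy, gamma, and the use/test conditions are hypothetical and not taken from any specific qualification report.

```python
import math

K_B = 8.617e-5  # Boltzmann constant in eV/K

def af_temperature(ea_ev, t_use_c, t_test_c):
    """Arrhenius temperature acceleration factor (assumed model)."""
    t_use = t_use_c + 273.15
    t_test = t_test_c + 273.15
    return math.exp(ea_ev / K_B * (1.0 / t_use - 1.0 / t_test))

def af_voltage(gamma, v_use, v_test):
    """Exponential voltage acceleration factor (assumed model)."""
    return math.exp(gamma * (v_test - v_use))

# Hypothetical single-mechanism parameters and conditions
af_t = af_temperature(ea_ev=0.7, t_use_c=55.0, t_test_c=125.0)
af_v = af_voltage(gamma=5.0, v_use=1.2, v_test=1.8)
af_total = af_t * af_v  # the product the standard model relies on

print(round(af_t, 1), round(af_v, 1), round(af_total))
```

Note that nothing in this calculation carries an error bar, which is the third problem listed above: the product is treated as exact even though both exponents are estimates.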
Multiple Mechanisms Are Here to Stay
• Traditional Reliability approach fails to predict Field Failures.
• Modeling, Simulation and Acceleration alone will NOT yield true results without Accurate Failure Analysis.
• HOWEVER: We CAN model and PREDICT Failure Rate under Known Conditions with a more complete picture of the mechanisms ???
Multiple Mechanisms Don’t Add Up !!!
Single Mechanism Model:
– AF_system = AF_Thermal * AF_Electrical
– So, 1/MTTF_use = 1/(MTTF_test * AF_system)
Multiple Mechanism Model:
– 1/MTTF_use = P_1/(MTTF_test * AF_mech1) + P_2/(MTTF_test * AF_mech2)
– Therefore, writing 1/MTTF_use = 1/(MTTF_test * AF_MM), the effective AF for multiple mechanisms is:
$$AF_{MM} = \frac{1}{\dfrac{P_1}{AF_{mech1}} + \dfrac{P_2}{AF_{mech2}}}$$
• The true acceleration factor is the SMALLER one, not the one which exposes a failure at accelerated test.
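A minimal sketch of the effective acceleration factor for weighted mechanisms follows. The function name `effective_af` and the sample values are illustrative assumptions; the weights P_i are taken to sum to 1.

```python
def effective_af(afs, weights):
    """Effective acceleration factor: 1 / sum(P_i / AF_i)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return 1.0 / sum(p / af for p, af in zip(weights, afs))

# Two equally weighted mechanisms with very different acceleration
af_mm = effective_af([10.0, 10_000.0], [0.5, 0.5])
print(round(af_mm, 2))
```

Even though one mechanism is accelerated a thousand times more strongly, the result stays within a factor of two of the smaller AF, which is why the weakly accelerated mechanism controls the prediction.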
Traditional Methodology
• Single Mechanism Model (old JEDEC Standard):
– 77 devices tested for 1000 hours with 0 failures…
• For example: AF_T = 100 and AF_V = 130
AF_S = 100 * 130 = 13000 !!
Zero failures at high V and high T
Assume 1 failure after 1000 hours:
Thus FIT: 10^9 / (77 * 1000 * 13000) = 1 FIT !!
• NICE! Now, we have done a great job and can go home and celebrate our success !!! NOT !!!
The Reality of Multiple Mechanisms
• BUT….Multiple Mechanisms Compete !
• Same Example: AF_V from HCI and AF_T from EM
– EM has Ea = 1 eV and voltage γ ~ 1.
– HCI has Ea ~ 0 eV and voltage γ ~ 14
• NOW, AF_S = 2/(1/100 + 1/130) = 113
• So our correct calculation for the same data:
FIT: 10^9 / (77 * 1000 * 113) = 115 FIT !!
This is compared to 1 FIT based on HTOL.
Traditional FIT is ALWAYS too low as compared to considering multiple mechanisms
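The two calculations can be compared side by side. This sketch uses only the numbers from the example (77 devices, 1000 hours, one assumed failure, AF_T = 100, AF_V = 130) and evaluates 2/(1/100 + 1/130) directly; the function name `fit_estimate` is illustrative.

```python
def fit_estimate(failures, devices, hours, af):
    """FIT at use conditions from an accelerated test result."""
    return failures * 1e9 / (devices * hours * af)

AF_T, AF_V = 100.0, 130.0

# Single-mechanism assumption: multiply the two acceleration factors
af_single = AF_T * AF_V
fit_single = fit_estimate(1, 77, 1000, af_single)

# Two competing mechanisms, equally weighted (P1 = P2 = 1/2)
af_mm = 2.0 / (1.0 / AF_T + 1.0 / AF_V)
fit_mm = fit_estimate(1, 77, 1000, af_mm)

print(round(af_single), round(fit_single, 2))
print(round(af_mm, 1), round(fit_mm, 1))
```

The same test data gives roughly 1 FIT under the product model and roughly 115 FIT once the mechanisms are treated as competing, a difference of about two orders of magnitude.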
Failure Rate Estimation at System Level
New System Reliability Model
Replacement Program (collaboration)
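[Diagram: the Nth component, broken down by failure mechanisms FM1, FM2, FM3]
Each component is comprised of several sub-components in proportion to their function and relative reliability stress.
$\lambda_O = \lambda_O' \cdot P_O = (B_{1-O}\lambda_{HCI} + B_{2-O}\lambda_{TDDB} + B_{3-O}\lambda_{EM} + B_{4-O}\lambda_{NBTI}) \cdot P_O$
$\lambda_D = \lambda_D' \cdot P_D = (B_{1-D}\lambda_{HCI} + B_{2-D}\lambda_{TDDB} + B_{3-D}\lambda_{EM} + B_{4-D}\lambda_{NBTI}) \cdot P_D$
$\lambda_S = \lambda_S' \cdot P_S = (B_{1-S}\lambda_{HCI} + B_{2-S}\lambda_{TDDB} + B_{3-S}\lambda_{EM} + B_{4-S}\lambda_{NBTI}) \cdot P_S$
$\lambda_J = \lambda_J' \cdot P_J = (B_{1-J}\lambda_{HCI} + B_{2-J}\lambda_{TDDB} + B_{3-J}\lambda_{EM} + B_{4-J}\lambda_{NBTI}) \cdot P_J$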
Base Failure rate can be determined
at various accelerated conditions in
order to normalize the matrix and
make physics based reliability
assessment from test data combined
with knowledge of the application
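The per-component expressions can be sketched as a weighted sum over the four mechanisms, scaled by the component proportion P. All numbers below are hypothetical placeholders, and summing the component rates into a system rate is an added assumption (series system with constant rates), not something stated on the slide.

```python
MECHANISMS = ("HCI", "TDDB", "EM", "NBTI")

def component_rate(b_weights, base_rates, proportion):
    """lambda = (sum_k B_k * lambda_k) * P for one component type."""
    weighted = sum(b_weights[m] * base_rates[m] for m in MECHANISMS)
    return weighted * proportion

# Hypothetical base failure rates per mechanism (FIT)
base_rates = {"HCI": 10.0, "TDDB": 20.0, "EM": 30.0, "NBTI": 40.0}

# Hypothetical B weights and proportions for two component types
components = {
    "O": ({"HCI": 0.1, "TDDB": 0.2, "EM": 0.3, "NBTI": 0.4}, 0.5),
    "D": ({"HCI": 0.4, "TDDB": 0.3, "EM": 0.2, "NBTI": 0.1}, 0.5),
}

rates = {
    name: component_rate(b, base_rates, p)
    for name, (b, p) in components.items()
}
system_rate = sum(rates.values())  # assumption: series system
print(rates, system_rate)
```

In practice the B weights would come from the application profile of each sub-component, which is where knowledge of the design enters the prediction.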
FIXtress™ : A MORE ACCURATE FIT
$\lambda \sim \sum (1/MTTF_1 + 1/MTTF_2 + \dots + 1/MTTF_n)$
The manufacturers have the data, we can make the prediction (BQR Software Tool) !
[Figure: Calculated PDF (FIT) vs. Time to Fail (years), 2 to 12. Curves: λTDDB, λHCI, λNBTI, λEM, λPackage, and λ.]
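As a rough illustration of the sum-of-failure-rates idea, the sketch below adds 1/MTTF for each mechanism and converts the total to FIT. The MTTF values are hypothetical placeholders, not output of the BQR tool, and the calculation assumes each mechanism has a constant rate.

```python
HOURS_PER_YEAR = 8760

def fit_from_mttfs(mttf_years):
    """Total FIT as the sum of 1/MTTF over all mechanisms."""
    rate_per_hour = sum(
        1.0 / (years * HOURS_PER_YEAR) for years in mttf_years.values()
    )
    return rate_per_hour * 1e9

# Hypothetical per-mechanism MTTF values in years
mttf_years = {
    "TDDB": 4000.0,
    "HCI": 2500.0,
    "NBTI": 1500.0,
    "EM": 1000.0,
    "Package": 5000.0,
}

total_fit = fit_from_mttfs(mttf_years)
combined_mttf_years = 1e9 / total_fit / HOURS_PER_YEAR
print(round(total_fit, 1), round(combined_mttf_years, 1))
```

The combined MTTF is always shorter than the shortest individual MTTF, since every mechanism adds to the total rate.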
Our Guiding Principle:
“It is better to be
roughly right
than precisely
wrong.”
― John Maynard Keynes
Post-Silicon Test Strategy
How can we match data from reliability Models
with experimentally obtained AF from HTOL?
PROPOSAL: Run Multiple Tests at different
conditions while monitoring degradation.
AF from burn-in at different T, V
Physics of Failure models (JEDEC)
Matrix solution can match
Our New Approach (ARIEL)
[Flow diagram:
Input: JEDEC or TSMC physics models (24 failure mechanisms over 4 categories), giving relative AF and relative MTBF/FIT for λTDDB, λHCI, λBTI, λEM
Input: DOE burn-in at four conditions: T1,V1; T2,V2; T3,V3; T4,V4
Input: System (TEST) measurements: MTBF / FIT, DPPM per Fmax limit (real FIT at V, T test)
Matrix solution: (relative AF matrix, rows T1,V1 to T4,V4, columns TDDB, HCI, BTI, EM) X = system (TEST) measurements
Output: proportionality parameter X
Reliability solution: FIT, DPPM]
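One way to read the matrix step is as a 4x4 linear system: each row is a burn-in condition, each column a mechanism, each entry the model-predicted relative AF, and the unknown X holds one proportionality parameter per mechanism. The sketch below solves such a system in plain Python. The matrix entries and measurements are hypothetical, and this formulation is an interpretation of the diagram, not the authors' published algorithm.

```python
def solve_linear(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x

# Hypothetical relative AFs; columns: TDDB, HCI, BTI, EM
rel_af = [
    [50.0, 2.0, 5.0, 1.0],   # T1, V1
    [3.0, 40.0, 4.0, 2.0],   # T2, V2
    [2.0, 1.0, 30.0, 6.0],   # T3, V3
    [1.0, 3.0, 2.0, 60.0],   # T4, V4
]
measured_fit = [110.0, 38.0, 58.5, 245.5]  # hypothetical test results

x = solve_linear(rel_af, measured_fit)
# With relative AF normalized to 1 at use conditions (assumption),
# the use-condition FIT is the sum of the per-mechanism parameters.
fit_use = sum(x)
print([round(v, 3) for v in x], round(fit_use, 3))
```

The design choice is that each test condition should excite a different dominant mechanism, which keeps the matrix well conditioned and the solution stable.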
Contributions from JEDEC Models
45nm: Different dominant mechanism at each test condition

Temp   Volt   TDDB       HCI        BTI        EM         FIT
200    1.2    2.93E+03   8.35E+00   4.26E+04   2.40E+05   242750
140    1.2    3.71E+02   1.59E-01   4.55E+02   9.71E+03   9710
-35    2.4    3.19E+08   2.12E+13   9.08E+07   8.16E-05   9710000
140    2.4    5.10E+13   5.13E+11   2.20E+13   9.71E+03   703975

Temp   Volt   TDDB         HCI      BTI        EM
30     1.2    1.00         1.00     1.00       1.00
85     1.2    30           0.67     34         399.00
120    1.8    5305442428   739966   42398594   5362
[Row labels on the slide: Use, HTOL; a stray value "1" also appears beside them]
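The claim that each test condition has a different dominant mechanism can be checked directly from the tabulated values. The sketch below picks the largest contribution per row; treating the 1.8 V row as the HTOL condition is an assumption based on the extraction order of the slide.

```python
MECHANISMS = ("TDDB", "HCI", "BTI", "EM")

# (Temp C, Volt): contributions for TDDB, HCI, BTI, EM from the slide
test_rows = {
    (200, 1.2): (2.93e3, 8.35e0, 4.26e4, 2.40e5),
    (140, 1.2): (3.71e2, 1.59e-1, 4.55e2, 9.71e3),
    (-35, 2.4): (3.19e8, 2.12e13, 9.08e7, 8.16e-5),
    (140, 2.4): (5.10e13, 5.13e11, 2.20e13, 9.71e3),
}

def dominant(values):
    """Name of the mechanism with the largest contribution."""
    return MECHANISMS[max(range(len(values)), key=lambda i: values[i])]

dominant_by_condition = {cond: dominant(v) for cond, v in test_rows.items()}

# Row assumed to be HTOL (1.8 V): share of the total due to TDDB
htol_row = (5305442428, 739966, 42398594, 5362)
tddb_share = htol_row[0] / sum(htol_row)

print(dominant_by_condition)
print(round(tddb_share, 4))
```

EM dominates at low voltage, HCI at low temperature and high voltage, and TDDB at high temperature and high voltage, while more than 99% of the acceleration in the assumed HTOL row comes from TDDB.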
HTOL is OVERWHELMINGLY measuring only TDDB
• This is very convenient when Zero failures arise during the 1000 hour HTOL test.
• Foundries design the gate oxides very well so there WILL be NO TDDB failures during HTOL testing.
• 3 other mechanisms are just ignored during final test and qualification.
Separation of Mechanisms
• Failure mechanisms can be separated by properly selecting test conditions.
• High Temperature and Low Voltage tests for EM
• High Temperature and High Voltage tests for NBTI and for TDDB
• Low Temperature and High Voltage tests for HCI
Two Distinct Mechanisms !
• HCI: frequency dependence
• Seen at LOW T and High V
• NBTI: no frequency dependence
• Seen at High T and High V
[Figure: two panels plotted against F (MHz), y-axes from 0 to 0.006. One panel at -35°C, 2.4 V (0 to 800 MHz); the other at 140 °C, 2.4 V (0 to 500 MHz).]
Note: -35°C has >2.5X the failure rate of 140°C for the same voltage !!
TDDB from NBTI
[Figure: 21-stage RO Frequency vs. Voltage. Y-axis: Performance (freq.), 0 to 7E+08; x-axis: Voltage-core, 0 to 3.5. Annotated regions: Neg. Bias-Temperature Instability (NBTI), Time-Dependent Dielectric Breakdown (TDDB), Soft breakdown.]
Prediction for 28nm
[Figure, two panels, both showing FIT per billion gates (log scale, 1 to 1000) vs. Temperature °C (30 to 130).
Panel "FIT for f=1GHz": one curve per voltage: 0.8, 0.9, 1.0, 1.1, 1.2.
Panel "FIT for V=1.0 V": one curve per frequency: 0.1, 0.5, 1.0, 1.5, 2 GHz.]
Dominant Mechanisms are EM and BTI, so strong T
and Freq. dependence but weak V dependence.
Observation
• Increase voltage by 20%
• Increase performance by 20%
• Increases FIT by only factor of 2
• Increased customer satisfaction
• Increased sales for FREE !!!
Main Observations
1. Dominant mechanism at HTOL test is never the dominant mechanism at USE conditions
2. Acceleration factor based on a 1-mechanism model significantly overestimates reliability
3. Foundry models today are quite sophisticated and consider N- and P-MOS based on their own data AND companies trust these models.
4. The chip companies WANT to consider the true contributions of EACH mechanism.
Conclusions
• We have developed a prediction model that is based on 4 failure mechanisms
• Our model is more accurate than the single failure model currently in use
• Collaboration with industry is necessary to verify our models and to keep pace with advancing technology
Thank You