Presentation Slides

Device Considerations for Low Power VLSI Circuits
EDS Silicon Valley Chapter
David Kidd
Senior Director of Digital Design
EDS Silicon Valley October 2012
Copyright © 2011-2012 SuVolta, Inc. All rights reserved.
Overview
• Transistor Variability Limits Chips
• Impact on Mobile System on Chip (SOC)
• Limited Low Power Design Techniques
• Where does Variability come from?
• New Transistor Alternatives to Reduce Variability
• Deeply Depleted Channel (DDC) technology
• Silicon Impact
• Outlook
• Taking advantage of Deeply Depleted Channel (DDC) technology in Mobile SOC
What is needed in Mobile System on Chip?
• Multiple blocks with different performance requirements
• Integrated on the same die
• Different power modes – would like to run at different supplies
• Multiple VT transistors used to control leakage
• Single chip solution requires analog integration
• Need co-design of architecture, circuits and transistor technology for best solution
Variability Limits Design & Architecture
• Limited benefit using voltage scaling (DVFS)
  - Cannot overdrive much due to reliability and power restrictions
  - Dynamically lowering voltage limited to 100-200mV
  - Only lowering frequency leaves large leakage power
  - “Run to hold” beats DVFS despite overhead
• Finicky SRAM memories
  - High SRAM VMIN leaves no room for memory voltage scaling
  - Many circuit tricks to improve VMIN and noise margins
  - Design teams moved to dedicated power rail for SRAM
    - Works for CPU – difficult in GPU
    - Impacts power network integrity – more fluctuations
• Transistor variability limits chips
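A back-of-the-envelope sketch of why "run to hold" can beat frequency-only scaling when leakage is large. It assumes VDD cannot be lowered, so both options pay the same dynamic energy and differ only in how long leakage is burned. The function names and every number (switched capacitance, leakage power, wake-up overhead, workload) are hypothetical and chosen only to illustrate the trade-off.

```python
def energy_slow_clock(cycles, period_s, c_eff_f, vdd_v, leak_w):
    """Stretch the work over the whole period at fixed VDD: leakage burns the entire time."""
    dynamic = c_eff_f * vdd_v**2 * cycles
    return dynamic + leak_w * period_s


def energy_run_to_hold(cycles, f_max_hz, c_eff_f, vdd_v, leak_w, overhead_j):
    """Run at full speed, then power down: leakage burns only while running."""
    dynamic = c_eff_f * vdd_v**2 * cycles
    return dynamic + leak_w * (cycles / f_max_hz) + overhead_j


# Hypothetical workload: 1e6 cycles every 10 ms on a block that can run at 400 MHz.
CYCLES, PERIOD, F_MAX = 1e6, 10e-3, 400e6
# Hypothetical electrical parameters: 100 pF switched per cycle, 1.2 V,
# 20 mW leakage, 5 uJ overhead to enter and leave the held state.
C_EFF, VDD, LEAK, OVERHEAD = 1e-10, 1.2, 20e-3, 5e-6

e_slow = energy_slow_clock(CYCLES, PERIOD, C_EFF, VDD, LEAK)
e_hold = energy_run_to_hold(CYCLES, F_MAX, C_EFF, VDD, LEAK, OVERHEAD)
print(f"slow clock: {e_slow * 1e6:.1f} uJ, run to hold: {e_hold * 1e6:.1f} uJ")
```

With these assumed values the leakage term dominates the slow-clock case; a smaller leakage power or a larger overhead would tip the comparison the other way.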
Transistor Variation: Source of Chip Variation
• Local/Random Variation
  - Transistors next to each other vary widely
  - Small number of dopants in transistor channel
  - Random Dopant Fluctuation (RDF)
  - Apparent in threshold voltage mismatch (σVT)
  - Impacts speed, leakage, SRAM & Analog
• Global/Systematic/Manufacturing Variation
  - Shifts all the transistors similarly
  - Longer/shorter transistor lengths
  - More (or less) implant energy and dose
  - Will result in speed/power distribution
[Figure: distribution of [#Transistors] with regions labeled "Too slow", "Useable Yield", and "Too hot"]
• Industry solution: Remove RDF using Undoped Channel
• What is the right silicon roadmap going forward?
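A rough illustration of why a small number of dopants in the channel leads to large random variation. It treats the dopant count as Poisson distributed, so the relative fluctuation is 1/sqrt(N). The doping level, depletion depth, and device sizes below are hypothetical, and the box-shaped channel is a simplification.

```python
import math


def dopant_count(doping_cm3, width_nm, length_nm, depth_nm):
    """Average number of dopant atoms in a box-shaped channel region."""
    volume_cm3 = (width_nm * 1e-7) * (length_nm * 1e-7) * (depth_nm * 1e-7)
    return doping_cm3 * volume_cm3


def relative_fluctuation(mean_count):
    """Poisson statistics: standard deviation is sqrt(N), so sigma/N = 1/sqrt(N)."""
    return 1.0 / math.sqrt(mean_count)


# Hypothetical geometries (W x L in nm), doping of 3e18 cm^-3, 15 nm depth.
for w, l in [(300, 300), (100, 100), (30, 30)]:
    n = dopant_count(3e18, w, l, 15)
    print(f"{w}x{l} nm: ~{n:.0f} dopants, sigma/N = {relative_fluctuation(n):.1%}")
```

Shrinking the device shrinks N, so the same physics produces a proportionally larger spread from one transistor to its neighbor.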
Transistor Alternatives
• FinFET or TriGate
  - Promises high drive current
  - Manufacturing, cost, and IP challenge
  - Doped channel to enable multi VT
  [Images: Textbook FinFET, Intel TriGate. Source: GSS, Chipworks]
• FDSOI
  - Showing off undoped channel benefits
  - Good body effect, but lacks multi VT capability
  - Restricted supply chain
  [Image source: IMEC]
• DDC – Deeply Depleted Channel transistor
  - Straightforward insertion into Bulk Planar CMOS
  - Undoped channel to reduce random variability
  - Good body effect and multi VT transistors
  [Image source: Fujitsu]
Deeply Depleted Channel™ (DDC) Transistor
1 Undoped or very lightly doped region
• Significantly reduced transistor random variability σVT
  - Lower leakage
  - Better SRAM (IREAD, lower Vmin & Vret)
  - Tighter corners
  - Smaller area analog design
• Higher channel mobility (increased Ieff, lower DIBL)
  - Higher speed, improved voltage scaling
2 VT setting offset region
• Enables multiple threshold voltages
3 Screening region
• Strong body coefficient
  - Bias bodies to tighten manufacturing distribution
  - Body biasing to compensate for temperature and aging
[Figure: transistor cross-section with regions 1, 2, and 3 labeled (*example implementation)]
Benefits similar to FinFET in planar bulk CMOS
Bulk Planar Foundry-Compatible Process

SuVolta Flow (example)   | Foundry Standard Flow
Wells and VT             | STI
Blanket Epi              | Wells and VT
STI                      |
Poly                     | Poly
Spacers and LDDs         | Spacers and LDDs
Salicide                 | Salicide
Gate                     | Gate
Metal Layers             | Metal Layers

• No new materials / No new tools
TEM of DDC Transistor and STI
[Figure: TEM cross-section with source (S) and drain (D) labeled; marked dimension 43.1nm]
Presented at IEDM 2011
Lower Transistor Variability Reduces Leakage
[Figure: 65nm silicon SRAM VT distribution, with [#Transistors] and [Leakage Power] curves. Annotations: "High leakage tail"; "High leakage tail dominates power"; "2.7x higher power (Model using 85mV subVT slope)"; "High VT tail slows down ICs"]
• Transistor variability is reflected in threshold voltage (VT) distribution
• Leakage current is exponentially dependent on VT
• Lower VT variability (σVT) reduces number of leaky low VT devices
• Power dissipation is dominated by low VT edge of distribution
• Smaller σVT → Less leakage power for digital and memory/SRAM
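A minimal sketch of why the low-VT tail dominates leakage power. It assumes a Gaussian VT distribution and models leakage as 10^(-VT/S), reading the 85mV subVT slope on the slide as S = 85 mV per decade. The σVT values are hypothetical, and this is not a reconstruction of the model behind the 2.7x figure.

```python
import math
import random


def mean_leakage_multiplier(sigma_vt_mv, slope_mv_per_dec=85.0):
    """Average leakage of a Gaussian VT population relative to a device at the mean VT.

    Leakage is modeled as 10**(-VT/S), so it is log-normally distributed and
    its mean exceeds the nominal value by exp(0.5 * (ln(10) * sigma / S)**2).
    """
    k = math.log(10.0) * sigma_vt_mv / slope_mv_per_dec
    return math.exp(0.5 * k * k)


def monte_carlo_multiplier(sigma_vt_mv, slope_mv_per_dec=85.0, n=200_000, seed=0):
    """The same quantity estimated by sampling random VT offsets."""
    rng = random.Random(seed)
    total = sum(
        10.0 ** (-rng.gauss(0.0, sigma_vt_mv) / slope_mv_per_dec) for _ in range(n)
    )
    return total / n


for sigma in (20.0, 35.0, 50.0):  # hypothetical sigma-VT values in mV
    closed_form = mean_leakage_multiplier(sigma)
    sampled = monte_carlo_multiplier(sigma)
    print(f"sigma VT = {sigma:.0f} mV: closed form {closed_form:.2f}x, sampled {sampled:.2f}x")
```

Because the multiplier grows with the square of σVT inside an exponential, even a moderate reduction in σVT removes a large share of the total leakage.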
Lower Transistor Variability Improves Speed
[Figure: 65nm silicon measurement of ring oscillator speed distributions, labeled (A), (B), and (C)]
• Nominal (TT) ring oscillator speed expected to be 400ps (A)
  - Equivalent to having many similar critical paths in a chip
  - VT variation will randomly affect paths within the same die, limiting speed to 470ps
• Undoped channel reduces variability and increases mobility (B)
  - 25% faster mean, 30% faster tail due to tighter distribution
• To match performance, lower VDD until tails have same speed (C)
  - Large impact on power due to square dependence on voltage: P = CV²f + IV
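The power expression P = CV²f + IV can be evaluated directly to see how strongly a supply reduction pays off. The sketch below compares 1.2 V and 0.9 V at the same frequency. The switched capacitance, frequency, and leakage current are hypothetical, and leakage current is held constant for simplicity even though in silicon it also falls with VDD.

```python
def total_power(c_eff_f, vdd_v, freq_hz, leak_a):
    """P = C*V^2*f + I*V: dynamic switching power plus leakage power."""
    return c_eff_f * vdd_v**2 * freq_hz + leak_a * vdd_v


# Hypothetical block: 1 nF switched capacitance, 400 MHz, 50 mA leakage.
C_EFF, FREQ, LEAK = 1e-9, 400e6, 50e-3

p_high = total_power(C_EFF, 1.2, FREQ, LEAK)
p_low = total_power(C_EFF, 0.9, FREQ, LEAK)
dynamic_ratio = total_power(C_EFF, 0.9, FREQ, 0.0) / total_power(C_EFF, 1.2, FREQ, 0.0)

print(f"1.2 V: {p_high:.3f} W, 0.9 V: {p_low:.3f} W")
print(f"total ratio {p_low / p_high:.2f}, dynamic-only ratio {dynamic_ratio:.2f}")
```

The dynamic term alone scales with (0.9/1.2)², roughly 0.56, which is why matching speed at a lower supply translates into a large power saving.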
Sub 0.5V VMin is Possible
• SRAM memories built using 6-T SRAM cell
• Smallest transistors on every chip, worst VT mismatch
• Higher VDD is required to avoid failures
• SRAM blocks limit VDD scaling
• Vmin – lowest operating voltage limited by transistor mismatch
• Demonstrated SRAM to Vmin of 0.425V
• Standard SRAM macros
• No circuit “tricks” for low voltage operation
• Demonstrates potential for 50% voltage scaling
[Figure: (a) DDC SRAM cell, Node 2 [V] vs. Node 1 [V], axes from 0 to 1.0 V]
[Figure: Industry Norm vs. DDC; annotations "300 mV" and "Tester limit"]
Improved VT Matching Key for Low VMIN
[Figure: three plots of Cumulative Probability [σ] vs. ΔVT [V] (Pull-down, Pass-gate, Pull-up), each comparing Baseline and DDC over -0.2 to 0.2 V]
• 40-60% improved matching for SRAM transistors
Presented at IEDM 2011
Design Examples: Analog
• In analog circuits, matching is key
  - Large transistors used to improve relative variability in current mirrors, differential pairs, etc.
• Better transistor matching allows for
  - Area savings
  - Higher performance
  - Lower power
• Undoped channel improves ROUT → higher gain
• Bandgap reference circuit
  - Same accuracy achieved at half the size
• OpAmp stage
  - Matched bandwidth, gain, slew rate
  - Over 50% smaller area (84 vs 190μm²)
  - 45% better input noise (176 vs 327 μV)
[Figure: SuVolta sample layouts, Baseline vs. DDC. Bandgap reference: 450um x 420um and 450um x 210um. OpAmp stage: Baseline 10.9x17.4um, DDC 6.9x12.2um]
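The link between matching and area can be sketched with Pelgrom's law, σ(ΔVT) = A_VT / sqrt(W·L): the gate area needed for a given mismatch scales with the square of the matching coefficient. Pelgrom's law is a standard model and is not stated on the slide. The coefficients and the mismatch target below are hypothetical, not SuVolta data.

```python
import math


def sigma_vt_mv(a_vt_mv_um, width_um, length_um):
    """Pelgrom's law: sigma(delta VT) = A_VT / sqrt(W * L)."""
    return a_vt_mv_um / math.sqrt(width_um * length_um)


def area_for_target_um2(a_vt_mv_um, target_sigma_mv):
    """Gate area W*L needed to reach a target mismatch."""
    return (a_vt_mv_um / target_sigma_mv) ** 2


# Hypothetical matching coefficients: a baseline device and one with 30% better matching.
A_BASE, A_IMPROVED = 4.0, 2.8  # mV*um
TARGET = 1.0                   # mV

base_area = area_for_target_um2(A_BASE, TARGET)
improved_area = area_for_target_um2(A_IMPROVED, TARGET)
area_ratio = improved_area / base_area
print(f"baseline {base_area:.2f} um^2, improved {improved_area:.2f} um^2, ratio {area_ratio:.2f}")
```

Under this model a 30% better matching coefficient is enough to reach the same mismatch in about half the gate area.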
Better Chips with Body Biasing
[Figure: part distribution with regions "Too slow", "Useable Yield", and "Too hot"; labels "FBB" and "RBB"]
• Body Bias to fix systematic variation
  - Speed-up (forward bias - FBB) slow parts
  - Cool down (reverse bias - RBB) hot parts
  - Increase manufacturing yield
• Body bias enables multiple modes of operation
  - Active → minimize power at every performance
  - Standby → leakage reduction, power gating
• DDC transistor provides 2-4x larger body factor
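A simplified sketch of how a larger body factor helps. It assumes a linearized body effect, ΔVT = -k · V_BS, with an NMOS-style sign convention in which positive V_BS is forward bias. The real body effect is nonlinear, and the body factors and VT shifts below are hypothetical.

```python
def required_body_bias_v(vt_shift_mv, body_factor_mv_per_v):
    """Body bias needed for a desired VT shift, using delta_VT = -k * V_BS.

    Positive result = forward body bias (FBB, lowers VT, speeds up a slow part).
    Negative result = reverse body bias (RBB, raises VT, cools down a hot part).
    """
    return -vt_shift_mv / body_factor_mv_per_v


def label(v_bs):
    """Name the bias direction for a given body-to-source voltage."""
    return "FBB" if v_bs > 0 else "RBB" if v_bs < 0 else "no bias"


# Hypothetical body factors: a weak one and one 3x stronger (inside the 2-4x range above).
for k in (50.0, 150.0):
    for part, shift in (("slow part", -30.0), ("hot part", +30.0)):
        v = required_body_bias_v(shift, k)
        print(f"k = {k:.0f} mV/V, {part}: {v:+.2f} V ({label(v)})")
```

With the stronger body factor the same VT correction needs a third of the bias voltage, which leaves more of the usable bias range for temperature and aging compensation.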
Half the Power at Matched Performance
[Figure: 65nm silicon measurement with TCAD prediction; Baseline and DDC at FF, TT, and SS corners; labels "Baseline speed", "100% power", "50% power"]
• Inverter ring-oscillators (RO) fabricated at process corners
  - Baseline @ 1.2V VDD and DDC @ 0.9V VDD
• For each corner, DDC transistor RO is faster and lower power
  - Using strong body coefficient to pull in corners
• Half the power (50% less power) while matching speed
Tighter Manufacturing Corners w/ DDC
• Better process control leads to tighter corners
• Manufacturing flow further reduces layout effects
• 1 sigma tighter wafer to wafer and within wafer variation for DDC
• Less overdesign as max paths and min (hold) paths are closer
• Faster design closure
  → earlier tapeout
  → shorter TTM
[Figure: 65nm silicon measurement; labels "POR", "1σ", "2σ", "3σ", "VDD=1.2V", "VDD=0.9V"]
Voltage Scaling to 0.6V VDD
[Figure: 65nm silicon measurement, Power (W) vs. Frequency (MHz), Baseline and DDC at FF, TT, and SS corners; annotation "-83%"]
• Achieve half the speed at 1/6 the power @0.6V VDD
• Use body bias to compensate for temperature and aging
  - Critical for low VDD operation
  - Enable workable design window – avoid overdesign
This is HotChips – Go Faster!
[Figure: 65nm silicon measurement, Power (W) vs. Frequency (MHz), Baseline and DDC at FF, TT, and SS corners]
• Turbo Mode: DDC transistor achieves over 50% speedup @ 1.2V VDD
• All corners for DDC run at 580MHz vs 370MHz for baseline

DVFS    Baseline   DDC    DDC    DDC     DDC
VDD     1.2V       0.6V   0.9V   1.05V   1.2V
Speed   1          0.5    1      1.28    1.56
Power   1          0.17   0.52   1       1.51
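The normalized speed and power values on this slide can be combined into a relative energy per clock cycle (power divided by speed). That derived metric is not on the slide; it is computed here only to show how the operating points compare per unit of work.

```python
# Normalized operating points transcribed from the DVFS values on this slide:
# (label, VDD in volts, relative speed, relative power)
POINTS = [
    ("Baseline", 1.20, 1.00, 1.00),
    ("DDC", 0.60, 0.50, 0.17),
    ("DDC", 0.90, 1.00, 0.52),
    ("DDC", 1.05, 1.28, 1.00),
    ("DDC", 1.20, 1.56, 1.51),
]


def energy_per_cycle(speed, power):
    """Relative energy per clock cycle = relative power / relative frequency."""
    return power / speed


for name, vdd, speed, power in POINTS:
    print(
        f"{name:8s} {vdd:4.2f} V  speed {speed:4.2f}  power {power:4.2f}  "
        f"energy/cycle {energy_per_cycle(speed, power):4.2f}"
    )
```

By this measure the 0.6 V point does each cycle of work for roughly a third of the baseline energy, while the 1.2 V turbo point buys its speedup at close to baseline energy per cycle.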
28nm and Beyond
(silicon calibrated SPICE simulations)
• Same performance at 0.75V VDD as baseline at 0.9V VDD
  - 30% lower power
• Alternatively 25% faster at same voltage
• Even better when using body bias to pull in corners
Applying DDC to Lower Variability in Mobile SOC
• CPU: Single thread performance critical
  - Push frequency by temporarily raising voltage in turbo mode
  - DVFS with body biasing becomes DVBFS
• GPU: High number of cores using small transistors
  - Less overdesign due to lower delay variability
  - Increase parallelism, lower voltage, body bias dynamically for more pixels/Watt
• Lower frequency blocks
  - In addition to high VT transistors also run at lower voltage and optimal body bias
• Whole chip: Use body bias to adjust for manufacturing variation
  - Take advantage of improved memory and analog performance
• Lowering variability while compatible with existing bulk planar silicon IP
Conclusions
• Variability limits chips
• DDC transistor reduces random variability through its undoped channel
• DDC transistor’s strong body factor can be used to fix systematic variation and compensate for temperature variation
• DDC technology provides performance kicker from 90nm to 20nm
  - Straightforward integration into existing nodes
  - Compatible with existing bulk planar CMOS silicon IP
  - Use existing CAD flow
• DDC technology brings back low power tools
  - Large range DVFS
  - Body biasing
  - Low voltage operation
• Taking advantage of reduced variability DDC transistor in design and architecture will lead to next level in mobile SOC