### Document

```
Side-Channel Attack on OpenSSL ECDSA
Naomi Benger Joop van de Pol
Yuval Yarom
Nigel Smart
1
Outline
• Background
• ECDSA
• wNAF scalar multiplication
• Hidden Number Problem
• Attacking OpenSSL ECDSA
– Improved lattice technique
– Using information from the double and add chain
2
ECDSA
Signer has a private key 1<α<q-1 and a public key Q=[α]G
1. Compute h=Hash(m)
2. Randomly select an ephemeral key 1<k<q
3. Compute (x,y)=[k]G
4. Take r=x mod q; if r=0 repeat from 2
5. Take s=(h+r·α)/k mod q; if s=0 repeat from 2
6. (r,s) is the signature
Note that
$k = (r/s)\cdot\alpha + (h/s) \bmod q$
3
wNAF Form
To compute [d]G, first write d in wNAF form:
$d = \sum_{i=0}^{n-1} d_i 2^i$ for $d_i \in \{0, \pm 1, \pm 3, \ldots, \pm(2^w - 1)\}$
such that if $d_i \neq 0$ then $d_{i+1} = \cdots = d_{i+w+1} = 0$.
4
Scalar Multiplication with wNAF form
Precompute {±G, ±[3]G, …, ±[2^w - 1]G}
x = 0
for i = n-1 downto 0
    x = Double(x)
    if (d_i ≠ 0) then
        x = Add(x, [d_i]G)
    end
end
return x
5
Notation
$|x|_q = \min_{a \in \mathbb{Z}} |x - aq|$
Note that $x \bmod q < y$ implies
$|x - y/2|_q < y/2$
6
The Hidden Number Problem
Suppose we know numbers $t_i, u_i$ such that
$|\alpha t_i - u_i|_q < q/2^z$
We can construct a lattice
$$\begin{pmatrix}
2^z q & & & \\
& \ddots & & \\
& & 2^z q & \\
2^z t_1 & \cdots & 2^z t_d & 1
\end{pmatrix}$$
And a vector
$(2^z u_1, \ldots, 2^z u_d, 0)$
Which is very close to a lattice vector that depends on α.
8
HNP and ECDSA
Recall that $k = (r/s)\cdot\alpha + (h/s) \bmod q$
We want $|\alpha t_i - u_i|_q < q/2^z$
Each step is stated in terms of k, then in terms of $t_i$ and $u_i$:
1. k is an n-bit number.
   $((r/s)\cdot\alpha + (h/s)) \bmod q = k$
2. The l LSBs of k are a known value a.
   $((r/s)\cdot\alpha - (-(h/s))) \bmod q < q$
3. k - a has its l LSBs cleared.
   $((r/s)\cdot\alpha - (a - (h/s))) \bmod q < q$
4. $(k - a)/2^l$ is an (n-l)-bit number.
   $(((r/s)\cdot\alpha - (a - (h/s)))/2^l) \bmod q < q/2^l$
5. Centring gives $|(k - a - q/2)/2^l|_q < q/2^{l+1}$, a bound of n-(l+1) bits.
   $|((r/s)\cdot\alpha - (a - (h/s) + q/2))/2^l|_q < q/2^{l+1}$
13
HNP and ECDSA – State of the Art
• Useful information:
– l bits for l known LSBs
– Between l-1 and l bits for l known MSBs
– l/2 bits for arbitrary l consecutive bits
• Liu and Nguyen 2013 – 160-bit key, 100 signatures, 2 known bits
– Expected 200 observed signatures
14
The X86 Cache
• Memory is slower than the processor
• The cache utilises locality to bridge the gap
– Divides memory into lines
– Stores recently used lines
• Shared caches improve performance for multi-core processors
[Figure: Processor, Cache, Memory]
15
Cache Consistency
• Memory and cache can be in inconsistent states
– Rare, but possible
• Solution: Flushing the cache contents
– Ensures that the next load is served from the memory
[Figure: Processor, Cache, Memory]
16
• Exploits cache behaviour to leak information
– Shared text segments
– Shared libraries
– Memory de-duplication
– Spy can determine what victim does
– Spy can infer the data the victim operates on
17
• FLUSH memory line
• Wait a bit
• Measure the time to RELOAD the line
– slow -> no access
– fast -> access
• Repeat
[Figure: Processor, Cache, Memory]
18
Uses
• OpenSSL AES (Gullasch et al. 2011)
• GnuPG RSA (CVE-2013-4242) [Yarom & Falkner 2014]
• OpenSSL ECDSA over binary fields (CVE-2014-0076) [Yarom & Benger 2014]
• OpenSSL ECDSA over prime fields – this work.
• OpenSSL AES (cross-VM) [Irazoqui et al. 2014]
• PaaS clouds [Zhang et al. 2014]
19
Attacking OpenSSL wNAF
• Achieve sharing with the victim code
• Use FLUSH+RELOAD to recover the double and
add chain of the wNAF calculation
• Divide time into slots of 1200 cycles (about
0.4μs)
• In each slot, probe a memory line in the code
of the Double and Add functions.
20
Problem I - Speculative Execution
x = 0
for i = n-1 downto 0
    x = Double(x)
    if (d_i ≠ 0) then
        x = Add(x, [d_i]G)
    end
end
return x
Solution:
Place probe here
21
Problem II - Overlap
[Figure: four timelines, (A) to (D), each pairing Victim and Attacker activity. Legend: Attacker: Flush, Wait; Victim: Access, Something else]
if (!BN_mod_sub_quick(n0, n2, &r->X, p)) goto err;
if (!field_mul(group, n0, n1, n0, ctx)) goto err;
if (!BN_mod_sub_quick(&r->Y, n0, n3, p)) goto err;
0x59d62a: mov 0x8(%rsp),%rcx
0x59d62f: mov %rbx,%r8
0x59d632: mov 0x18(%rsp),%rdx
0x59d637: mov %rbp,%rdi
0x59d63a: mov %rcx,%rsi
0x59d63d: callq *0x38(%rsp)
0x59d641: test %eax,%eax
23
Demo
24
Sample Trace
Raw:
D||||D|D|||D||||A||||D|||D||||D|||D|||A|A|||D|||D|||
|D|||D||||D|||D|||A||||D|||D|D|||D|||D||||D||A|A|||D
|||D|D|||D|||D|||A||||D|||D|||D|||D|||D|||A|A||||D||
|D|||D||||D||A|A|||D||||D|||D|||D|||A|||D|||D|||D|D|
||D|||D|||A||||D|||D|||D|||D|D|||D|||D|||D|||A|||D|D
|||D||…
Processed:
25
Using the LSBs
Reveals 3 LSBs (100). A different trace might reveal fewer bits. How do we deal with that?
We vary the z per $(t_i, u_i)$ tuple. The lattice becomes
$$\begin{pmatrix}
2^{z_1} q & & & \\
& \ddots & & \\
& & 2^{z_d} q & \\
2^{z_1} t_1 & \cdots & 2^{z_d} t_d & 1
\end{pmatrix}$$
and the vector becomes
$(2^{z_1} u_1, \ldots, 2^{z_d} u_d, 0)$
27
Results
• For secp256k1
Expected # Sigs   d     Time (s)   Success Prob.   Time / Prob.
200               100   611.13     .035            17460
220               110   79.67      .020            3933
240               60    2.68       .005            536
260               65    2.26       .055            41
280               70    4.46       .295            15
300               75    13.54      .530            26
28
We know how to use the revealed LSBs
But these give an average of 2 bits per observed
signature.
Can we use the information about the MSBs?
30
Using the MSBs
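Assume $d_{m+l}, d_m \neq 0$
[Bit diagram: 1000…000, positions l+1 down to 0]
After adding [d_m]G, for d_m > 0 it is
[Bit diagram: 100…00 from position l+1 down to w, then w bits down to 0]
And for d_m < 0
[Bit diagram: 011…11 from position l+1 down to w, then w bits down to 0]
Either way, $k + 2^{m+w}$ is
[Bit diagram: n bits, with 100…00 between positions m+l+1 and m+w+1]
x = 0
for i = n-1 downto 0
    x = Double(x)
    if (d_i ≠ 0) then
        x = Add(x, [d_i]G)
    end
end
return x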
33
Observation
For many “standard” curves, q is close to a power
of two. That is, $q = 2^n - \varepsilon$ such that $|\varepsilon| < 2^p$ for $p \ll n$.
For example for secp256k1
q=FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
EBAAEDCE6AF48A03BBFD25E8CD0364141
Adding or subtracting q to an n bit number is
unlikely to change the MSBs
34
Using all the Information
For m > p:
1. $k + 2^{m+w}$ is an n-bit number with the pattern 100…00 between positions m+l+1 and m+w+1.
2. Shift the pattern to the top: $(k + 2^{m+w})\cdot 2^{n-m-l-1}$ has a part $a \cdot 2^n$ above position n, then 100…00 from position n down to n+w-l.
3. Clear the leading 1: $(k + 2^{m+w})\cdot 2^{n-m-l-1} - 2^{n-1}$.
4. Subtract $aq = a 2^n - a\varepsilon$:
   $(k + 2^{m+w})\cdot 2^{n-m-l-1} - 2^{n-1} - a 2^n + a\varepsilon = (k + 2^{m+w})\cdot 2^{n-m-l-1} - 2^{n-1} - aq$
   The term $a\varepsilon$ lies below position n+p-m-l-1, so the result fits below position n+w-l.
5. Centre: $(k + 2^{m+w})\cdot 2^{n-m-l-1} - 2^{n-1} - aq - 2^{n+w-l-1}$ has magnitude below $2^{n+w-l-1}$.
Hence
$|(k + 2^{m+w})\cdot 2^{n-m-l-1} - 2^{n-1} - 2^{n+w-l-1}|_q < 2^{n+w-l-1} \approx q/2^{l-w+1}$
37
Results
# perfect traces   Time (s)   Success probability
10                 2.25       0.07
11                 4.66       0.25
12                 7.68       0.38
13                 11.3       0.54
With a very high probability, observing 25
signatures yields more than 13 perfect traces.
38
Summary
• FLUSH+RELOAD provides a nearly perfect cross-core, cross-VM side channel
• No need for a fixed number of bits in HNP
• Can handle the negative digits in wNAF
• Can handle non-consecutive bits
• Curve choice allows using almost half of the
information in each perfect trace
• We can break a 256-bit curve after observing
as few as 25 signatures.
39
```
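The "HNP and ECDSA" slides turn the known LSBs of an ephemeral key into a pair (t, u) for which (α·t − u) mod q is small. The following Python sketch (3.8 or later, for `pow(x, -1, q)`) checks that algebra numerically. It is an illustration under stated assumptions, not the authors' code: the helper names `toy_sign`, `known_lsbs`, `hnp_pair` and `residual` are hypothetical, r is a random stand-in for x([k]G) mod q so that no curve arithmetic is needed, and the known bits are derived directly from k rather than from a measured double-and-add trace.

```python
import secrets

# Group order q of secp256k1, assembled from 32-bit words to avoid typos.
Q = int("FFFFFFFF" * 3 + "FFFFFFFE" + "BAAEDCE6" + "AF48A03B"
        + "BFD25E8C" + "D0364141", 16)


def toy_sign(alpha, h, k, r, q=Q):
    """Return s = (h + r*alpha)/k mod q.

    Assumption: r is an arbitrary non-zero stand-in for x([k]G) mod q;
    the identity k = (r/s)*alpha + (h/s) mod q holds for any such r.
    """
    return (h + r * alpha) * pow(k, -1, q) % q


def known_lsbs(k):
    """Return (a, l): the l low bits of k fixed by its lowest set bit."""
    tz = (k & -k).bit_length() - 1   # number of trailing zero bits of k
    return 1 << tz, tz + 1           # k = ...100 gives a = 4, l = 3


def hnp_pair(r, s, h, a, l, q=Q):
    """Return (t, u) such that (alpha*t - u) mod q equals (k - a) / 2^l."""
    s_inv = pow(s, -1, q)
    inv_2l = pow(1 << l, -1, q)      # division by 2^l is taken modulo q
    t = r * s_inv * inv_2l % q
    u = (a - h * s_inv) * inv_2l % q
    return t, u


def residual(alpha, t, u, q=Q):
    """The value that must be small for the lattice step to work."""
    return (alpha * t - u) % q


if __name__ == "__main__":
    alpha = secrets.randbelow(Q - 1) + 1     # private key
    h = secrets.randbelow(Q)                 # stand-in for Hash(m)
    r = secrets.randbelow(Q - 1) + 1         # stand-in for x([k]G) mod q
    s = 0
    while s == 0:                            # mirror "if s=0 repeat"
        k = secrets.randbelow(Q - 1) + 1     # ephemeral key
        s = toy_sign(alpha, h, k, r)
    a, l = known_lsbs(k)
    t, u = hnp_pair(r, s, h, a, l)
    v = residual(alpha, t, u)
    assert v == (k - a) >> l                 # matches (k - a) / 2^l
    assert (v << l) < Q                      # hence v < q / 2^l
    print(f"known LSBs: {l}, residual bit length: {v.bit_length()}")
```

Subtracting q/2^(l+1) from the residual would give the centred form with bound q/2^(l+1). In an attack, only (t, u) would be known for each signature, and α would be recovered from many such pairs by the lattice step.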