AES Side Channel Attacks
Biru Cui
Sam Skalicky
Outline
• AES algorithm
• Side channel attacks
• Side channel attack against AES
• Cache-collision timing attack against AES
• Countermeasures
AES Algorithm
• Key Expansion
• Initial Round
– Add Round Key – bitwise xor
• Rounds
– Sub Bytes – S-box substitution
– Shift Rows – rows shifted cyclically
– Mix Columns – mixing operation on the columns
– Add Round Key
• Final Round (no Mix Columns)
– Sub Bytes
– Shift Rows
– Add Round Key
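The round structure listed above can be sketched in a few lines of Python. This is only an illustration, assuming AES-128 (10 rounds) and a 16-byte state stored column by column; the function names `add_round_key`, `shift_rows` and `round_schedule` are chosen here for the example and do not come from any particular library. Sub Bytes and Mix Columns are named in the schedule but not implemented.

```python
def add_round_key(state, round_key):
    """Add Round Key: bitwise xor of the state with the round key."""
    return [s ^ k for s, k in zip(state, round_key)]


def shift_rows(state):
    """Shift Rows: row r of the 4x4 state is rotated left by r positions.

    Assumes column-major storage: the byte at row r, column c is state[r + 4*c].
    """
    return [state[r + 4 * ((c + r) % 4)] for c in range(4) for r in range(4)]


def round_schedule(rounds=10):
    """List the steps of each round; rounds=10 is assumed for AES-128."""
    steps = [["AddRoundKey"]]  # initial round
    for _ in range(rounds - 1):
        steps.append(["SubBytes", "ShiftRows", "MixColumns", "AddRoundKey"])
    steps.append(["SubBytes", "ShiftRows", "AddRoundKey"])  # final round, no MixColumns
    return steps


if __name__ == "__main__":
    for number, steps in enumerate(round_schedule()):
        print(number, " -> ".join(steps))
```

The missing Mix Columns step in the last entry of the schedule is what the final round attack later relies on.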
Rijndael Starting Data
Rijndael AES Steps
Rijndael Sub Bytes
Rijndael Shift Rows
Rijndael Mix Columns
Rijndael Add Round Key
AES Algorithm
• AES Lookup Table Optimizations
– Transposed State by Bertoni
• Speedup in decryption
– CAM based by Li
• Combined S-box & inverse S-box into a single table
– FPGA implementations
• Pre-computed GF ops in LUTs
Attacks on AES
• Brute force
• Related Key
• Side Channel
Side Channel Attacks
• Attacks that exploit a weakness in the implementation rather than in the algorithm
– Timing of computations
– Power Analysis
– Fault Injection
– Electromagnetic Radiation
– Acoustic Cryptanalysis
– Cache
Cache-collision timing attack against AES
• Cache collision
– Hit
– Miss
– Time
Process Operation
• Cache observation
[Figure: CFS scheduler, victim process, spy process, cache]
AES Cache Side Channel Attack
• AES-128
• Key recovery after observing ~100 encryptions
• Implemented on Linux against OpenSSL 0.9.8n
• The spy program does not require special privileges on the host machine
• The Linux kernel task scheduler is exploited
– Lets the spy observe every memory access
– CFS (Completely Fair Scheduler)
AES Cache Attack Features
• No prior information about plaintexts or ciphertexts needed
• Works against compressed tables
• 2 phase operation:
– Observation
• ~100 encryptions
• ~2-3 seconds
– Analysis
• ~3 minutes
Cache-collision timing attack against AES
• AES: operations on each byte
Cache-collision timing attack against AES
• System information
– Pentium III 1.0 GHz
• L1 cache 32K (split data/instr.)
• L2 cache 256K
– “T” lookup table size 256x256=64k
• Implication
– If the table is fully loaded in the cache, there are no cache misses. This matters for the first-round and final-round attacks, which rely on lookups made while the table is not yet in the cache.
Cache-collision timing attack against AES
• AES: the computation of every round
Actual Results, Pentium III
[Figure: timing deviation (cycles) plotted against # of cache collisions]
Cache-collision timing attack against AES
[Figure [6]: Plaintext → Key xor → Table → Key xor → Table → … → Table → Key xor]
Cache-collision timing attack against AES
[Figure [6]: Plaintext → Key xor → Table → Key xor → Table → … → Table → Key xor]
• If a plaintext byte and the index of its first-round table lookup are both known, the corresponding key byte is learned
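Why one known plaintext byte plus one known first-round lookup reveals a key byte: the first-round table index is the plaintext byte xored with the key byte, so xoring the index with the plaintext byte gives the key byte back. A minimal sketch, where `recover_key_byte` and the byte values are made up for the illustration:

```python
def recover_key_byte(plaintext_byte: int, lookup_index: int) -> int:
    """First-round index = plaintext_byte xor key_byte, so xor again to undo it."""
    return plaintext_byte ^ lookup_index


if __name__ == "__main__":
    key_byte = 0x2B                  # secret, unknown to the attacker (example value)
    plaintext_byte = 0x32            # known to the attacker (example value)
    lookup_index = plaintext_byte ^ key_byte  # what the cache observation would leak
    print(hex(recover_key_byte(plaintext_byte, lookup_index)))
```

In practice the attacker usually does not see the exact index, which is why the attacks on the following slides work statistically over many encryptions.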
Cache-collision timing attack against AES
• First Round Attack
– The spy process flushes the cache
– The lookup table is then not in the cache, so any cache collision is due only to two lookups using the same table index.
Cache-collision timing attack against AES
• First Round Attack
– A cache hit shows up as an access time below the average access time
– Compute the average encryption time for every pair $p_i$, $p_j$. If a pair has a low average time, then with high probability $p_i \oplus p_j = k_i \oplus k_j$.
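The averaging step can be tried out on simulated data. The sketch below is not the attack code from [6]; it assumes a made-up timing model (Gaussian noise around 1000 cycles, 10 cycles saved when the two first-round lookups use the same index), ignores cache-line granularity, and groups the measurements by the value of the xor of the two plaintext bytes. The name `simulate_first_round` and all constants are hypothetical.

```python
import random
from collections import defaultdict


def simulate_first_round(k_i, k_j, samples=20000, seed=1):
    """Guess k_i xor k_j as the plaintext difference with the lowest average time."""
    rng = random.Random(seed)
    total = defaultdict(float)
    count = defaultdict(int)
    for _ in range(samples):
        p_i = rng.randrange(256)
        p_j = rng.randrange(256)
        # Assumed timing model: noisy baseline, slightly faster on a collision.
        t = rng.gauss(1000.0, 5.0)
        if p_i ^ k_i == p_j ^ k_j:   # both lookups use the same table index
            t -= 10.0
        diff = p_i ^ p_j
        total[diff] += t
        count[diff] += 1
    averages = {d: total[d] / count[d] for d in total}
    return min(averages, key=averages.get)


if __name__ == "__main__":
    guess = simulate_first_round(0x3A, 0xC5)
    print(hex(guess), hex(0x3A ^ 0xC5))
```

The two lookups collide exactly when the xor of the plaintext bytes equals the xor of the key bytes, so that one difference value stands out with a lower average. Note that this yields only the xor of two key bytes, not the bytes themselves.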
Cache-collision timing attack against AES
• Final Round Attack
– The final-round lookup table $T_4$ is different from the earlier lookup tables $T_0, T_1, T_2, T_3$, so $T_4$ is not in the cache when the final round starts. Any collision is therefore due to two lookups using the same table index.
Cache-collision timing attack against AES
• Final Round Attack
– No MixColumns operations
Cache-collision timing attack against AES
• Final Round Attack
– A cache hit shows up as an access time below the average access time
– Compute the average encryption time for every pair $c_i$, $c_j$. If a pair has a low average time, then with high probability $c_i \oplus c_j = k_i^{10} \oplus k_j^{10}$.
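The final-round relation follows from the absence of MixColumns in the last round. As a short derivation, write $x_i$ for the state byte that is looked up to produce ciphertext byte $c_i$ and $S$ for the S-box; both symbols are notation introduced here for the sketch, not taken from the slides:

```latex
\begin{align*}
c_i &= S[x_i] \oplus k_i^{10}, \qquad c_j = S[x_j] \oplus k_j^{10} \\
x_i = x_j \;&\Rightarrow\; S[x_i] = S[x_j] \\
&\Rightarrow\; c_i \oplus c_j = k_i^{10} \oplus k_j^{10}
\end{align*}
```

A collision between the two final-round lookups therefore exposes the xor of two bytes of the last round key, directly from observed ciphertext bytes.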
Cache-collision timing attack against AES
• Result

Attack                     Encryptions needed   Sample type
Bernstein                  2^27.5               Plaintext/timing
Tsunoo                     2^26                 Plaintext/timing
First/Final round attack   2^15                 Plaintext/timing
Countermeasures
– AES can be performed without using lookup tables
– Give OS ability to partition cache between
processes
– Put AES table into ROM, add special instructions
– Separate AES hardware on chip (new Intel CPUs)
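To illustrate the first countermeasure, the S-box value can be computed arithmetically instead of being read from a table: take the multiplicative inverse in GF(2^8) (modulo the AES polynomial 0x11B) and apply the affine transformation. The sketch below only shows that no data-dependent table lookup is needed; it is not constant-time as written (Python branches on secret bits), and the helper names `gf_mul`, `gf_inv`, `rotl8` and `sbox` are chosen for this example.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse as a^254; 0 is mapped to 0 by convention."""
    if a == 0:
        return 0
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def rotl8(x: int, n: int) -> int:
    """Rotate a byte left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def sbox(x: int) -> int:
    """AES S-box: field inverse followed by the affine transformation."""
    b = gf_inv(x)
    return b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63


if __name__ == "__main__":
    print(hex(sbox(0x00)), hex(sbox(0x53)))
```

The trade-off is speed: the table-free computation does far more work per byte than a lookup, which is why table-based implementations were common in the first place.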
References
• [1] Rijndael flash movie: http://www.cs.bc.edu/~straubin/cs38105/blockciphers/rijndael_ingles2004.swf
• [2] G. Bertoni et al., “Efficient Software Implementation of AES on 32-Bit Platforms”
• [3] H. Li, “A New CAM Based S/S^-1-Box Look-up Table in AES”
• [4] M. McLoone et al., “Rijndael FPGA Implementations Utilising Look-Up Tables”
• [5] D. Gullasch et al., “Cache Games – Bringing Access-Based Cache Attacks on AES to Practice”
• [6] J. Bonneau et al., “Cache-Collision Timing Attacks Against AES”
• [7] D. A. Osvik et al., “Cache Attacks and Countermeasures: the Case of AES”
Backup slides
Original Mix Columns Equations
Revised Mix Columns Equations
FPGA LUT Implementation
