### Animated slides - Marcelo Kaihara

```Algorithms for public-key cryptology
Montgomery Arithmetic
EPFL-IC-IIF-LACAL
Marcelo E. Kaihara
April 27th, 2007
Motivation
RSA:
ElGamal:
eK : x
dK : y


y  xe mod m
x  yd mod m
 (y1  k mod p , y2  x  k mod p)
a
x

y

y
dK : (y1, y2 ) 
2
1 mod p
eK : (x, k)
Most of the time computing modular multiplications
Need of efficient algorithms for modular multiplication
Notation
Multiple-precision integer arithmetic
B  2b
b  32 or 64
m
depending on the processor
s 1
i
m

B
 i
i0
 ms 1  Bs 1  ms 2  Bs 2    m1  B  m0
 0 if m  0 (normalized)
Montgomery Multiplication
General overview
Ordinary Representation
u
v
Montgomery Representation
~
u
~
v
u v
~
u ~
v




z
~
z
Sequential multiplications performed in
Montgomery representation
Montgomery Multiplication
Montgomery radix R  Bs  m, gcd(R, m)  1
Ordinary Representation
( ,)
u
Montgomery Representation
( , *)
~
u  u  R mod m
Isomorphic
~
u  v  (u  v) mod m
u *~
v ~
u ~
v  R 1 mod m
v
~
v  v  R mod m
Definition
Montgomery Multiplication
Definition:
m : large odd integer
~
u ,~
v  Z / mZ , gcd(m, B)  1
~
u *~
v ~
u ~
v  R 1 mod m
s
(
usually
R

B
)
R m
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
m
~
u
~
* v
~
z
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
m
~
u
~
* v
~
z
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
m
~
u
~
* v
~
z
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
m
~
u
~
* v
~
z
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
m
~
u
~
* v
~
z
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
m
~
u
~
* v
~
z
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
m
~
u
~
* v
~
z
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
m
~
u
~
* v
~
z
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
m
~
u
~
* v
~
z
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
m
~
u
~
* v
~
z
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
m
~
u
~
* v
~
z
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
m
~
u
~
* v
~
z
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
m
~
u
~
* v
~
z
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
1
~
~
( z  ( z0  m0 mod B)  m) mod B 
1
 (~
z  (~
z0  m0  m mod B)) mod B
 (~
z ~
z0 ) mod B  0
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
m
~
u
~
* v
~
z
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
i0
~
z  m  (B  1)
m  (B  1)  m  (B  1)
~
z 
B
2  m  (B  1)

 2m
B
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
m
~
u
~
* v
~
z
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
m
~
u
~
* v
~
z
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
m
~
u
~
* v
~
z
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
i1
0~
z  2m
~
z  2  m  m  (B  1)
2  m  m  (B  1)  m  (B  1)
~
z 
B
m  (2  B  1  B  1)

 2m
B
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
m
~
u
~
* v
~
z
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
m
~
u
~
* v
~
z
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
m
~
u
~
* v
~
z
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
m
~
u
~
* v
~
z
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
m
~
u
~
* v
~
z
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
m
~
u
~
* v
~
z
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
z  0;
for (i  0;i  s - 1;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if ~
z  m then ~
z ~
z - m;
m
~
u
~
* v
~
z
How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
R  Bs
((( (~
u ~
v0  B 1 mod m  ~
u ~
v1 )  B 1 mod m  
~
u ~
v )  B 1 mod m  ~
u ~
v )  B1 mod m) mod m
s 2
s 1
~
u ~
vs 1  B1  ~
u ~
vs 2  B2    ~
u ~
v1  Bs 1  ~
u ~
v0  Bs mod m
 (~
u ~
vs 1  Bs 1  ~
u ~
vs 2  Bs 2    ~
u ~
v1  B1  ~
u ~
v0  B0 )  Bs mod m
 s 1 ~ i   s
~
 u    vi  B   B mod m  ~
u ~
v  B s mod m  ~
z
 i0

How to compute?
~
~
~
~
u * v  u  v  R 1 mod m
m
~
u
~
* v
*
qM  m
qM  m
qM  m

qM  m

m
~
u
~
v
qM  m
qM  m
qM  m
qM  m
Subtraction-less Montgomery multiplication
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
~
u ,~
v  [0,2m - 1]
R  4m
~
z  0;
for (i  0;i  s;i  )
{
~
z ~
z ~
u ~
v;
i0
~
z  2  m  (B  1)
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
if (~
z  m) ~
z ~
z - m;
2  m  (B  1)  m  (B  1)
~
z 
B
3  m  (B  1)

 4m
B
Subtraction-less Montgomery multiplication
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
R  4m
~
z  0;
for (i  0;i  s;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
~
u ,~
v  [0,2m - 1]
if (~
z  m) ~
z ~
z - m;
i1
0~
z  4m
~
z  4  m  2  m  (B  1)
4  m  2  m  (B  1)  m  (B  1)
~
z 
B
m  (4  2  B  2  B  1)

 4m
B
Subtraction-less Montgomery multiplication
~
~
~
~
u * v  u  v  R 1 mod m
Algorithm
R  4m
~
z  0;
for (i  0;i  s;i  )
{
~
z ~
z ~
u ~
v;
i
qM  (~
z0  m01) mod B;
~
z  (~
z  qM  m) div B;
}
~
u ,~
v  [0,2m - 1]
if (~
z  m) ~
z ~
z - m;
is
m
~
u
~
* v
~
z
0~
z  4m
~
vs  B / 2
4  m  (B / 2  1)  m  m  (B  1)
~
z 
B
m  ((3 / 2)  B  2)
~
z 
 2  m (B  4)
B
Conversion back and forth from ordinary
representation and Montgomery representation
~
u *~
v ~
u ~
v  R 1 mod m
Ordinary Representation
u
Montgomery Representation
u * (R 2 mod m)
~
u  u  R mod m
u  R 2  R -1 mod m
z
~
z *1
~
z  z  R mod m
(z  R)  1  R 1 mod m
Montgomery Bootstrapping
How to compute R2 mod m ?
Ordinary Representation
u
Montgomery Representation
u * (R 2 mod m)
R  2sk k  32 or 64 bits
~
2  2  2sk mod m
2 mod m

2sk mod m
~
u  u  R mod m
2sk mod m
2  2sk mod m

~
2 mod m 
s k
 2sk  2sk mod m
 R 2 mod m
Montgomery Bootstrapping
What about modular inversion?
Ordinary Representation
u
Montgomery Representation
~
u  u  R mod m
~
u 1  u1  R 1 mod m
~
u 1 * (R 3 mod m)
 u1  R 1  R 3  R -1 mod m
u1 mod m
 u1  R mod m
(R2 mod m) * (R 2 mod m)  R 2  R2  R 1 mod m
Montgomery Bootstrapping
How to compute m0-1 mod B?
m0
1
product  m0  result (mod B)
1
m01
e.g. B  24

product
result
0 0 1 1
0 0 0 1
0 0 1 1
 m0shift 
0 1 0 0 1

0 0 1 1
0 0 1 1

0 0 1 1
0 1 0 0
0 0 1 1
1 0 0 1

0 0 1 0

1 0 0 0
1 0 1 1
0 1 0 0 0 0 1
1  3  11 (mod 16)
B  24

product
result
0 0 1 1
0 0 0 1
0 0 1 1
 m0shift 
0 1 0 0 1

0 0 1 1
0 0 1 1

0 0 1 1
0 1 0 0
0 0 1 1
1 0 0 1

0 0 1 0

1 0 0 0
1 0 1 1
0 1 0 0 0 0 1
1  3  11 (mod 16)
Montgomery Squaring

u3
u2
u1
u0
u3
u2
u1
u0
u3u0 u2u0 u1u0 u0u0
u3u1 u2u1 u1u1 u0u1
u3u2 u2u2 u1u2 u0u2
u3u3 u2u3 u1u3 u0u3

w7
w6
w5
w4
w3
w2
w1
w0
Around 80% of the time required by Montgomery multiplication
RSA pseudorandom bit generator
To make computation simple x0  2
Using Montgomery arithmetic ~
x0  2
~
(not need to calculate 2 )
```