Lattice Dirac Operator for Domain

```
Exact Pseudofermion Action for Hybrid Monte Carlo Simulation of One-Flavor Domain-Wall Fermion
Yu-Chih Chen
(for the TWQCD Collaboration)
Physics Department
National Taiwan University
Collaborators: Ting-Wai Chiu
Content
• Lattice Dirac Operator for Domain-Wall Fermion (DWF)
• Two-Flavor Algorithm (TFA)
• TWQCD’s One-Flavor Algorithm (TWOFA)
• Rational Hybrid Monte Carlo (RHMC) Algorithm
• TWOFA vs. RHMC with Domain-Wall Fermion
• Concluding Remarks
Lattice Dirac Operator for Domain-Wall Fermion (DWF)
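For domain-wall fermions, the lattice Dirac operator reads, in general,
  \mathcal{D}(m) = \rho D_w + 1 + (\sigma D_w - 1) L(m),
where \rho = c\omega + d, \sigma = c\omega - d, \omega = diag(\omega_1, \omega_2, \cdots, \omega_{N_s}), and
D_w is the standard Wilson-Dirac operator with mass -m_0 (0 < m_0 < 2). Here
  L(m) = P_+ L_+(m) + P_- L_-(m),   P_\pm = (1 \pm \gamma_5)/2,
  [L_+(m)]_{s,s'} = \delta_{s',s-1}       for 1 < s \le N_s,
  [L_+(m)]_{s,s'} = -m \delta_{s',N_s}    for s = 1,
  L_-(m) = [L_+(m)]^T,
with m = r m_q and r = 1/[2 m_0 (1 - d m_0)].
L_\pm(m) are matrices in the fifth dimension and depend on the quark mass.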
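For N_s = 4, the matrices L_\pm(m) are
  L_+(m) = \begin{pmatrix} 0 & 0 & 0 & -m \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix},
  L_-(m) = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -m & 0 & 0 & 0 \end{pmatrix}.
Using \rho = c\omega + d and \sigma = c\omega - d, \mathcal{D}(m) can be written as
  \mathcal{D}(m) = D_w \{ c\omega [1 + L(m)] + d [1 - L(m)] \} + [1 - L(m)],
where D_w is a matrix in 4D, and c\omega[1 + L(m)], d[1 - L(m)], and [1 - L(m)] are matrices in the fifth dimension.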
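  \mathcal{D}(m) = D_w \{ c\omega [1 + L(m)] + d [1 - L(m)] \} + [1 - L(m)]
1. If \omega_s are the optimal weights given in Ref. [1], it gives the Optimal Domain-Wall Fermion, whose sign function is the Zolotarev optimal rational approximation:
  S_{opt}(H) = \frac{1 - \prod_{s=1}^{N_s} T_s}{1 + \prod_{s=1}^{N_s} T_s} \approx \frac{H}{\sqrt{H^2}},
  S_{opt}(H) \to \frac{H}{\sqrt{H^2}}  as N_s \to \infty,
where T_s = (1 - \omega_s H)/(1 + \omega_s H), H_w = \gamma_5 D_w, and H = c H_w (1 + d \gamma_5 H_w)^{-1}.
[1] T. W. Chiu, Phys. Rev. Lett. 90, 071601 (2003)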
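  \mathcal{D}(m) = D_w \{ c\omega [1 + L(m)] + d [1 - L(m)] \} + [1 - L(m)]
2. If \omega_s = 1 and c = d = 0.5, it gives the Domain-Wall Fermion with Shamir Kernel.
3. If \omega_s = 1, c = \alpha/2, and d = 0.5, it gives the Domain-Wall Fermion with Scaled Shamir Kernel (scale factor \alpha).
In both cases the sign function is the polar approximation:
  S_{polar}(H) = \frac{1 - \prod_{s=1}^{N_s} T}{1 + \prod_{s=1}^{N_s} T} \approx \frac{H}{\sqrt{H^2}},
  S_{polar}(H) \to \frac{H}{\sqrt{H^2}}  as N_s \to \infty,
where T = (1 - H)/(1 + H), and H = c H_w (1 + d \gamma_5 H_w)^{-1}.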
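For a physical observable O(U),
  \langle O \rangle = \frac{1}{Z} \int dU \, d\bar\psi \, d\psi \; O(U) \exp(-S_g - \bar\psi \mathcal{D}(m) \psi)
                    = \frac{1}{Z} \int dU \; O(U) \det[\mathcal{D}(m)] \exp(-S_g).
If \mathcal{D}(m) = K * D(m), where the matrix K is independent of the gauge field, then det K cancels between the numerator and Z:
  \langle O \rangle = \frac{\int dU \; O(U) \det[\mathcal{D}(m)] \exp(-S_g)}{\int dU \det[\mathcal{D}(m)] \exp(-S_g)}
                    = \frac{\int dU \; O(U) \det[D(m)] \exp(-S_g)}{\int dU \det[D(m)] \exp(-S_g)}.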
Using the redefined operator D(m),
  \langle O \rangle = \frac{1}{Z} \int dU \; O(U) \det[D(m)] \exp(-S_g)
                    = \frac{1}{Z} \int dU \, d\phi^\dagger d\phi \; O(U) \exp(-\phi^\dagger H^{-1} \phi - S_g),
where H satisfies:
1) det H = det D(m)
2) H is Hermitian
3) H is positive-definite
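For DWF, since \omega and L(m) are independent of the gauge field,
  \mathcal{D}(m) = D_w \{ c\omega [1 + L(m)] + d [1 - L(m)] \} + [1 - L(m)]
  \to D(m) = D_w + M(m) = D_w + P_+ M_+(m) + P_- M_-(m),
where
  M_\pm(m) = \omega^{-1/2} [c N_\pm(m) + d \omega^{-1}]^{-1} \omega^{-1/2},
  N_\pm(m) = [1 + L_\pm(m)] [1 - L_\pm(m)]^{-1}.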
Two-Flavor Algorithm (TFA) [2]
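For the DWF Dirac operator
  D(m) = D_w + M(m) = D_w + P_+ M_+(m) + P_- M_-(m),
we can apply the Schur decomposition with even-odd preconditioning:
  D(m) = \begin{pmatrix} 4 - m_0 + M(m) & D_w^{EO} \\ D_w^{OE} & 4 - m_0 + M(m) \end{pmatrix}
       = \begin{pmatrix} M_5(m)^{-1} & D_w^{EO} \\ D_w^{OE} & M_5(m)^{-1} \end{pmatrix}
       = \begin{pmatrix} 1 & 0 \\ D_w^{OE} M_5(m) & 1 \end{pmatrix}
         \begin{pmatrix} M_5(m)^{-1} & 0 \\ 0 & M_5(m)^{-1} C(m) \end{pmatrix}
         \begin{pmatrix} 1 & M_5(m) D_w^{EO} \\ 0 & 1 \end{pmatrix},
where
  M_5(m)^{-1} \equiv 4 - m_0 + M(m),
  C(m) = I - M_5(m) D_w^{OE} M_5(m) D_w^{EO}.
We then have
  \det D(m) = \det[M_5(m)]^{-2} \times \det[C(m)].
[2] T. W. Chiu et al. [TWQCD Collaboration], PoS LAT 2009, 034 (2009); Phys. Lett. B 717, 420 (2012).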
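The pseudofermion action for HMC simulation of 2-flavor QCD with DWF is
  S_{pf} = \phi^\dagger C^\dagger(1) \frac{1}{C(m) C^\dagger(m)} C(1) \phi,
which gives the two-flavor determinant
  \int d\phi^\dagger d\phi \exp(-S_{pf}) \propto \left[ \det \frac{C(m)}{C(1)} \right]^2.
The field \phi can be generated from the Gaussian noise field \eta:
  S_{pf} = \phi^\dagger C^\dagger(1) \frac{1}{C^\dagger(m)} \frac{1}{C(m)} C(1) \phi = \eta^\dagger \eta,
  \eta = \frac{1}{C(m)} C(1) \phi  \iff  \phi = \frac{1}{C(1)} C(m) \eta.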
TWQCD’s One Flavor Algorithm (TWOFA)
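For one-flavor domain-wall fermion in QCD, we have devised an exact
pseudofermion action for HMC simulation, without taking a square root:
  \frac{\det[D(m)]}{\det[D(1)]} = \frac{\det[D(m)]}{\det[\tilde{D}(m,1)]} \times \frac{\det[\tilde{D}(m,1)]}{\det[D(1)]},
where D(m) = D_w + M(m) = D_w + P_+ M_+(m) + P_- M_-(m).
In Dirac space,
  D(m) = \begin{pmatrix} W - m_0 + M_+(m) & \sigma \cdot t \\ -(\sigma \cdot t)^\dagger & W - m_0 + M_-(m) \end{pmatrix},
  \tilde{D}(m,1) = \begin{pmatrix} W - m_0 + M_+(m) & \sigma \cdot t \\ -(\sigma \cdot t)^\dagger & W - m_0 + M_-(1) \end{pmatrix}.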
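Applying the type I Schur decomposition
  \begin{pmatrix} A & B \\ C & D \end{pmatrix}
    = \begin{pmatrix} 1 & 0 \\ C A^{-1} & 1 \end{pmatrix}
      \begin{pmatrix} A & 0 \\ 0 & D - C A^{-1} B \end{pmatrix}
      \begin{pmatrix} 1 & A^{-1} B \\ 0 & 1 \end{pmatrix}
to \tilde{D}(m,1) and D(m), we then have
  \frac{\det \tilde{D}(m,1)}{\det D(m)} = \frac{\det[H_-(m) + \Delta_-(m)]}{\det[H_-(m)]} = \det\left[ I + \Delta_-(m) \frac{1}{H_-(m)} \right],
where
  H_-(m) = R_5 \left\{ W - m_0 + M_-(m) + (\sigma \cdot t)^\dagger \frac{1}{W - m_0 + M_+(m)} \sigma \cdot t \right\},
  \Delta_-(m) = R_5 [M_-(1) - M_-(m)],   (R_5)_{s,s'} = \delta_{s', N_s+1-s}.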
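Applying the type II Schur decomposition
  \begin{pmatrix} A & B \\ C & D \end{pmatrix}
    = \begin{pmatrix} 1 & B D^{-1} \\ 0 & 1 \end{pmatrix}
      \begin{pmatrix} A - B D^{-1} C & 0 \\ 0 & D \end{pmatrix}
      \begin{pmatrix} 1 & 0 \\ D^{-1} C & 1 \end{pmatrix}
to D(1) and \tilde{D}(m,1), we then have
  \frac{\det D(1)}{\det \tilde{D}(m,1)} = \frac{\det[H_+(1)]}{\det[H_+(1) - \Delta_+(m)]} = \det\left[ I + \Delta_+(m) \frac{1}{H_+(1) - \Delta_+(m)} \right],
where
  H_+(1) = R_5 \left\{ W - m_0 + M_+(1) + \sigma \cdot t \frac{1}{W - m_0 + M_-(1)} (\sigma \cdot t)^\dagger \right\},
  \Delta_+(m) = R_5 [M_+(1) - M_+(m)],   (R_5)_{s,s'} = \delta_{s', N_s+1-s}.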
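By using the Sherman-Morrison formula, we have found that the fifth-dimensional
matrices M_\pm can be rewritten as
  M_\pm(m) = \omega^{-1/2} A_\pm^{-1} \omega^{-1/2} + \frac{2cm}{1 + m - 2cm\lambda} \, \omega^{-1/2} R_5 v_\pm v_\pm^T \omega^{-1/2},
where \lambda is a scalar function of \omega, c and d, v_\pm are vector functions of \omega, c
and d, and here we have defined
  A_\pm = c N_\pm(0) + d \omega^{-1}.
With this form of M_\pm(m), \Delta_\pm(m) can be rewritten as
  \Delta_\pm(m) = R_5 [M_\pm(1) - M_\pm(m)] = k \, \omega^{-1/2} v_\pm v_\pm^T \omega^{-1/2},
  k = \frac{c(1 - m)}{(1 - c\lambda)[1 + m(1 - 2c\lambda)]}.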
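Next, using det(I + AB) = det(I + BA), one has
  \frac{\det \tilde{D}(m,1)}{\det D(m)} = \det\left[ I + \Delta_-(m) \frac{1}{H_-(m)} \right]
    = \det\left[ I + k \, \omega^{-1/2} v_- v_-^T \omega^{-1/2} \frac{1}{H_-(m)} \right]
    = \det\left[ I + k \, v_-^T \omega^{-1/2} \frac{1}{H_-(m)} \omega^{-1/2} v_- \right],
where the operator in the last determinant is positive-definite and Hermitian.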
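Again, with det(I + AB) = det(I + BA), one also has
  \frac{\det D(1)}{\det \tilde{D}(m,1)} = \det\left[ I + \Delta_+(m) \frac{1}{H_+(1) - \Delta_+(m)} \right]
    = \det\left[ I + k \, \omega^{-1/2} v_+ v_+^T \omega^{-1/2} \frac{1}{H_+(1) - \Delta_+(m)} \right]
    = \det\left[ I + k \, v_+^T \omega^{-1/2} \frac{1}{H_+(1) - \Delta_+(m)} \omega^{-1/2} v_+ \right],
where the operator in the last determinant is positive-definite and Hermitian.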
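With the two pseudofermion actions
  S_1 = \phi_1^\dagger \left[ I + k \, v_-^T \omega^{-1/2} \frac{1}{H_-(m)} \omega^{-1/2} v_- \right] \phi_1,
  S_2 = \phi_2^\dagger \left[ I + k \, v_+^T \omega^{-1/2} \frac{1}{H_+(1) - \Delta_+(m)} \omega^{-1/2} v_+ \right] \phi_2,
the Gaussian integrals give the one-flavor determinant:
  \int d\phi_1^\dagger d\phi_1 \exp(-S_1) \times \int d\phi_2^\dagger d\phi_2 \exp(-S_2)
    \propto \frac{\det D(m)}{\det \tilde{D}(m,1)} \times \frac{\det \tilde{D}(m,1)}{\det D(1)} = \frac{\det[D(m)]}{\det[D(1)]}.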
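Using S_1, S_2 and some algebra, the pseudofermion action of one-flavor
domain-wall fermion can be written as
  S_{pf} = \begin{pmatrix} 0 & \phi_1^\dagger \end{pmatrix}
           \left[ I - k \, v_-^T \omega^{-1/2} \frac{1}{H_T(m)} \omega^{-1/2} v_- \right]
           \begin{pmatrix} 0 \\ \phi_1 \end{pmatrix}
         + \begin{pmatrix} \phi_2^\dagger & 0 \end{pmatrix}
           \left[ I + k \, v_+^T \omega^{-1/2} \frac{1}{H_T(1) - \Delta_+(m) P_+} \omega^{-1/2} v_+ \right]
           \begin{pmatrix} \phi_2 \\ 0 \end{pmatrix},
where H_T(m) = \gamma_5 R_5 D(m), (R_5)_{s,s'} = \delta_{s', N_s+1-s},
  \Delta_\pm(m) = k \, \omega^{-1/2} v_\pm v_\pm^T \omega^{-1/2},
  k = \frac{c(1 - m)}{(1 - c\lambda)[1 + m(1 - 2c\lambda)]}.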
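The initial pseudofermion fields of each HMC trajectory are generated from
Gaussian noises \eta_1 and \eta_2 as follows.
  S_1 = \phi_1^\dagger \left[ I + k \, v_-^T \omega^{-1/2} \frac{1}{H_-(m)} \omega^{-1/2} v_- \right] \phi_1 = \eta_1^\dagger \eta_1
    \Rightarrow \phi_1 = \left[ I + k \, v_-^T \omega^{-1/2} \frac{1}{H_-(m)} \omega^{-1/2} v_- \right]^{-1/2} \eta_1,
  S_2 = \phi_2^\dagger \left[ I + k \, v_+^T \omega^{-1/2} \frac{1}{H_+(1) - \Delta_+(m)} \omega^{-1/2} v_+ \right] \phi_2 = \eta_2^\dagger \eta_2
    \Rightarrow \phi_2 = \left[ I + k \, v_+^T \omega^{-1/2} \frac{1}{H_+(1) - \Delta_+(m)} \omega^{-1/2} v_+ \right]^{-1/2} \eta_2.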
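The inverse square roots can be approximated by the Zolotarev optimal
rational approximation. Writing S_i = \phi_i^\dagger X_i \phi_i (i = 1, 2),
  \phi_1 = \frac{1}{\sqrt{X_1}} \eta_1 \simeq \sum_{l=1}^{N_p} \frac{b_l}{X_1 + d_l} \eta_1,
  \phi_2 = \frac{1}{\sqrt{X_2}} \eta_2 \simeq \sum_{l=1}^{N_p} \frac{b_l}{X_2 + d_l} \eta_2,
where \eta_1 and \eta_2 are Gaussian noises, and b_l and d_l denote the coefficients of the rational approximation.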
Rational Hybrid Monte Carlo(RHMC) Algorithm
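A widely used algorithm for one-flavor HMC simulation is the
rational hybrid Monte Carlo (RHMC) [3], which can be used for any lattice
fermion. Its pseudofermion action is
  S_{pf} = \phi^\dagger [C(1) C^\dagger(1)]^{1/4} \frac{1}{[C(m) C^\dagger(m)]^{1/2}} [C(1) C^\dagger(1)]^{1/4} \phi.
1. \det \frac{C(m)}{C(1)} = \det\left\{ \frac{1}{[C(1) C^\dagger(1)]^{1/4}} [C(m) C^\dagger(m)]^{1/2} \frac{1}{[C(1) C^\dagger(1)]^{1/4}} \right\}
2. The operator is positive-definite and Hermitian.
The field \phi is generated from the Gaussian noise field \eta:
  \phi = \frac{1}{[C(1) C^\dagger(1)]^{1/4}} [C(m) C^\dagger(m)]^{1/4} \eta.
[3] M. A. Clark and A. D. Kennedy, Phys. Rev. Lett. 98, 051601 (2007)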
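In
  S_{pf} = \phi^\dagger [C(1) C^\dagger(1)]^{1/4} \frac{1}{[C(m) C^\dagger(m)]^{1/2}} [C(1) C^\dagger(1)]^{1/4} \phi,
each fractional power of an operator X is evaluated with a rational approximation,
  X^{1/n} \simeq a_0 + \sum_{l=1}^{N_p} \frac{a_l}{X + b_l},
where N_p is the number of poles of the rational approximation.
The N_p inverse matrices 1/(X + b_l) can be obtained from the MMCG solver.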
TWOFA vs. RHMC with DWF
I. Memory Usage:
We list the memory requirements (in bytes) for the links, the momentum, and
the 5D vectors as follows:
1) N \equiv 8 * N_x^3 * N_t
2) M_U = 48 * N, link variables
3) M_P = 32 * N, momentum
4) M_V = 24 * N_s * N, 5D vector
Then the ratio of the memory usage for RHMC and TWOFA is
  \frac{M_{RHMC}}{M_{TWOFA}} = \frac{20 + 3 N_s (3 + 2 N_p)}{32 + 10.5 N_s},
where N_p is the number of poles for MMCG in the RHMC algorithm.
TWOFA vs. RHMC with DWF
II. Efficiency:
The lattice setup is
= . ,
= . ,
=  ,
= ,
= ,
HMC Steps: (Gauge, Heavy, Light) = (1, 1, 10),  =  for RHMC
We compare RHMC and TWOFA for the following cases:
1) Optimal DWF, specified by c, d, and \lambda_{max}/\lambda_{min}
2) DWF with Shamir kernel, specified by c, d, and N_s
3) DWF with scaled Shamir kernel, specified by c, d (scale factor \alpha), and N_s
TWOFA vs. RHMC with DWF: 1. Optimal Domain-Wall Fermion
[Figure: maximum forces, TWOFA vs. RHMC]
[Figure: \Delta H, TWOFA vs. RHMC; each panel quotes \langle \exp(-\Delta H) \rangle, erfc(\sqrt{\langle \Delta H \rangle}/2), and the acceptance rate]
TWOFA vs. RHMC with DWF: 1. Optimal Domain-Wall Fermion (summary)
HMC Steps: (Gauge, Heavy, Light) = (1, 1, 10)

Algorithm |     |     | Plaquette   | Force (Gauge) | Force (heavy) | Force (light)
TWOFA     | 1.3 | 0.4 | 0.58051(09) | 5.15555(34)   | 0.18971(13)   | 0.01401(29)
RHMC      | 1.3 | 0.4 | 0.58100(10) | 5.15762(36)   | 0.35334(11)   | 0.06946(44)

Algorithm | Accept   | erfc(\sqrt{\langle \Delta H \rangle}/2) | \langle \exp(-\Delta H) \rangle | Memory |
TWOFA     | 0.980(8) | 0.981(11)                               | 0.9992(16)                      | 1.00   | 1.00
RHMC      | 0.987(7) | 0.994(18)                               | 1.0003(16)                      | 6.58   | 1.21
TWOFA vs. RHMC with DWF: 2. Shamir Kernel
[Figure: maximum forces, TWOFA vs. RHMC]
[Figure: \Delta H, TWOFA vs. RHMC; each panel quotes \langle \exp(-\Delta H) \rangle, erfc(\sqrt{\langle \Delta H \rangle}/2), and the acceptance rate]
TWOFA vs. RHMC with DWF: 2. Shamir Kernel (summary)
HMC Steps: (Gauge, Heavy, Light) = (1, 1, 10)

Algorithm |     |     | Plaquette   | Force (Gauge) | Force (heavy) | Force (light)
TWOFA     | 1.8 | 0.1 | 0.59061(09) | 5.17686(34)   | 0.14663(45)   | 0.03578(18)
RHMC      | 1.8 | 0.1 | 0.59094(14) | 5.17866(35)   | 0.28522(68)   | 0.10757(06)

Algorithm | Accept   | erfc(\sqrt{\langle \Delta H \rangle}/2) | \langle \exp(-\Delta H) \rangle | Memory |
TWOFA     | 0.987(7) | 0.987(13)                               | 0.9999(16)                      | 1.00   | 1.00
RHMC      | 0.997(3) | 0.953(06)                               | 1.0074(18)                      | 6.58   | 1.00
TWOFA vs. RHMC with DWF: 3. Scaled Shamir Kernel
[Figure: maximum forces, TWOFA vs. RHMC]
[Figure: \Delta H, TWOFA vs. RHMC; each panel quotes \langle \exp(-\Delta H) \rangle, erfc(\sqrt{\langle \Delta H \rangle}/2), and the acceptance rate]
TWOFA vs. RHMC with DWF: 3. Scaled Shamir Kernel (summary)
HMC Steps: (Gauge, Heavy, Light) = (1, 1, 10)

Algorithm |     |     | Plaquette   | Force (Gauge) | Force (heavy) | Force (light)
TWOFA     | 1.8 | 0.1 | 0.59061(09) | 5.17854(34)   | 0.14646(13)   | 0.03359(13)
RHMC      | 1.8 | 0.1 | 0.59032(09) | 5.17670(32)   | 0.28559(39)   | 0.10775(06)

Algorithm | Accept   | erfc(\sqrt{\langle \Delta H \rangle}/2) | \langle \exp(-\Delta H) \rangle | Memory |
TWOFA     | 0.983(7) | 0.990(14)                               | 1.0000(15)                      | 1.00   | 1.00
RHMC      | 0.997(3) | 0.967(07)                               | 1.00038(18)                     | 6.58   | 1.17
Concluding Remarks
1. We have derived a novel pseudofermion action for HMC
simulation of one-flavor DWF, which is exact and does not require taking a
square root.
2. It can be used for any kind of DWF with any kernel, and for any
approximation (polar or Zolotarev) of the sign function.
3. The memory consumption of TWOFA is much smaller than that
of RHMC. This feature is crucial for using GPUs to simulate QCD.
4. The efficiency of TWOFA is comparable to that of RHMC.
For the cases we have studied, TWOFA outperforms RHMC.
5. TWQCD is now using TWOFA to simulate (2+1)-flavor QCD and
(2+1+1)-flavor QCD on 32^3 × 64 × 16 and 24^3 × 48 × 16 lattices.
Thank You
```
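
The memory ratio of 6.58 quoted in the comparison tables can be checked against the memory-usage formula, read here as [20 + 3 N_s (3 + 2 N_p)] / (32 + 10.5 N_s). The sketch below is only an illustration: the function name `memory_ratio` is ours, N_s = 16 is taken from the lattices listed in the concluding remarks, and N_p = 12 is an assumed pole count, chosen because it reproduces the tabulated value.

```python
def memory_ratio(n_s: int, n_p: int) -> float:
    """RHMC-to-TWOFA memory ratio, [20 + 3*N_s*(3 + 2*N_p)] / (32 + 10.5*N_s).

    n_s: extent of the fifth dimension (N_s).
    n_p: number of poles used by the multi-shift solver in RHMC (N_p).
    """
    rhmc = 20 + 3 * n_s * (3 + 2 * n_p)   # numerator: RHMC footprint
    twofa = 32 + 10.5 * n_s               # denominator: TWOFA footprint
    return rhmc / twofa


if __name__ == "__main__":
    # N_s = 16 from the concluding remarks; the N_p values are assumptions.
    for n_p in (8, 12, 16):
        print(f"N_p = {n_p:2d}: ratio = {memory_ratio(16, n_p):.2f}")
```

Because only the numerator depends on N_p, the RHMC footprint grows linearly with the number of poles while the TWOFA footprint stays fixed for a given N_s.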
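
The polar approximation of the sign function used for the Shamir-type kernels, S(H) = (1 - T^{N_s}) / (1 + T^{N_s}) with T = (1 - H)/(1 + H), can be illustrated on a single real number x standing in for an eigenvalue of H. This is a minimal sketch under that scalar assumption, not the operator-level implementation; the function names are ours. The cross-check uses the identity S(x) = tanh(N_s artanh x), valid for |x| < 1.

```python
import math


def polar_sign(x: float, n_s: int) -> float:
    """Polar approximation (1 - T**n_s) / (1 + T**n_s) with T = (1 - x)/(1 + x).

    x is a scalar stand-in for an eigenvalue of H; x = -1 is excluded.
    """
    t = (1.0 - x) / (1.0 + x)
    p = t ** n_s
    return (1.0 - p) / (1.0 + p)


def polar_sign_tanh(x: float, n_s: int) -> float:
    """Equivalent closed form tanh(n_s * artanh(x)), valid for |x| < 1."""
    return math.tanh(n_s * math.atanh(x))


if __name__ == "__main__":
    # The approximation approaches sign(x) as n_s grows, slowest near x = 0.
    for n_s in (4, 8, 16, 32):
        print(n_s, polar_sign(0.2, n_s), polar_sign(-0.2, n_s))
```

The slow convergence for small |x| is why the quality of the approximation at fixed N_s depends on the low end of the kernel spectrum.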