
QEC 2014, Zurich, Switzerland
Fault-Tolerant Quantum Computation in Multi-Qubit Block Codes
Todd A. Brun
University of Southern California
With Ching-Yi Lai, Yi-Cong Zheng, Kung-Chuan Hsu
Why large block codes?
Many schemes for fault-tolerant quantum error correction have been
proposed. All require relatively low rates of error, though the
ability to tolerate errors has slowly been improved by a long string
of theoretical developments. Most also require a large amount of
overhead—in some cases, a very large amount. A logical qubit is
encoded in hundreds or thousands of physical qubits, if not more.
It was long ago observed (by Steane, and others) that codes encoding
multiple qubits can achieve significantly higher rates for the same
level of protection from errors. But performing logical gates in such
block codes is difficult.
I will briefly sketch a scheme for computation where multi-qubit block
codes are used for storage (storage blocks). Clifford gates and
logical teleportation are done by measuring logical operators, and a
different code (the processor block) is used with a non-Clifford
transversal gate for universality. Syndromes and logical operators
are measured by Steane extraction, so this scheme uses only ancilla
preparation, transversal circuits, and single-qubit measurements.
Steane, Nature 399, 124-126 (1999).
Outline of the scheme
1. When not being processed, logical qubits are stored in [[n,k,d]]
block codes. These codes are corrected by repeatedly measuring
the stabilizer generators via Steane extraction.
2. Logical Clifford gates can be done by measuring sequences of
logical operators. This is also done by Steane extraction with a
modified ancilla state.
3. A non-Clifford gate is done by teleporting the selected logical
qubits into the processor block, and teleporting back after the
gate. Teleportation is also used between storage blocks.
4. The processor blocks use codes that allow transversal circuits for
the encoded gates. E.g., for the T gate we could use the
concatenated [[15,1,3]] shortened Reed-Muller code.
5. Logical teleportation is also done by measuring a sequence of
logical operators.
Outline of the scheme
[Diagram: storage blocks in a Memory Array, supplied with ancilla states by an Ancilla Factory]
Correcting Errors
The storage blocks should use a code that stores multiple qubits
with high distance, and a rate that is better than that achieved
by encoding the qubits separately (as in a concatenated scheme
or topological code). The logical error rate must be sufficiently
low that we expect no logical errors in any block for the entire
duration of the quantum computation.
The processor block must also have a very low rate of logical errors,
but the demands are not quite as stringent as for the storage
blocks. The rate must be low enough that the probability of an
uncorrectable error during any of the non-Clifford gates is low.
How large a computation can be done in this scheme? Clearly, for a
fixed size of code block, one cannot do computations of
unlimited size. However, it may be possible to protect
information for sufficiently long to do a nontrivially large
quantum computation. If the error rate is low enough, it may be
possible to carry out any computation we could conceivably do in
practice.
Let pS be the logical error rate of the storage block, and pP be the
logical error rate of the processor block, per “round” of
computation. One storage block encodes k logical qubits.
Suppose we wish to do a computation involving N logical qubits
with circuit depth D (measured in rounds of computation), and
using M non-Clifford gates. Then, roughly speaking, we expect
to be able to complete this computation if
(N/k) D << 1/pS and M << 1/pP.
A “round” of computation is essentially one iteration of Steane
syndrome extraction. We call 1/pS and 1/pP the code lifetimes.
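As a back-of-envelope feasibility check, reading the completion conditions as (N/k) D << 1/pS and M << 1/pP (my reading of the slide's formulas), with illustrative numbers that are assumptions, not values from the talk:

```python
# Rough feasibility check for a computation of N logical qubits, depth D
# rounds, and M non-Clifford gates. All numbers are illustrative assumptions.
N, k = 1000, 23          # logical qubits, logical qubits per storage block
D = 10**9                # circuit depth, in rounds of computation
M = 10**8                # number of non-Clifford gates
pS, pP = 1e-16, 2e-12    # assumed logical error rates per round

storage_load = (N / k) * D        # block-rounds of storage exposure
print(storage_load * pS)          # ~4.3e-6, well below 1: storage is safe
print(M * pP)                     # 2e-4, well below 1: processor is safe
```

With these (assumed) lifetimes, both products are far below 1, so the computation should complete with high probability.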
Steane Syndrome Extraction
Steane syndrome extraction can be done on any CSS code. All the X-type stabilizer generators are measured in one shot, and so are
all the Z-type stabilizer generators.
This requires two ancilla states, each of which is a copy of the same
code as the codeword. The first is prepared with all the logical
qubits in the |+> state, the second all in the |0> state. After the
circuit of transversal CNOTs, all the ancilla qubits are measured in
the Z or X basis, respectively. The syndrome bits are parities of
these measurement outcomes.
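As a minimal sketch of the last step, assume for illustration the [[7,1,3]] Steane code, whose X- and Z-type stabilizers are both described by the classical [7,4] Hamming parity checks. Each syndrome bit is then just a parity of the single-qubit measurement outcomes:

```python
import numpy as np

# Parity-check matrix of the classical [7,4] Hamming code; for the Steane
# [[7,1,3]] code it describes both the X- and Z-type stabilizer generators.
H = np.array([[0, 0, 0, 1, 1, 1, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [1, 0, 1, 0, 1, 0, 1]])

def syndrome(meas_bits):
    """Each syndrome bit is a parity of the transversal measurement
    outcomes. A valid ancilla codeword contributes trivially to every
    parity, so only the data errors show up in the syndrome."""
    return H @ np.asarray(meas_bits) % 2

print(syndrome([0, 0, 0, 0, 1, 0, 0]))  # error on qubit 5 -> [1 0 1]
```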
Measuring logical operators
A simple variant of the Steane extraction procedure will also let us
measure logical operators. Which logical operators are measured
is determined by the logical state of the ancillas.
Suppose we want to measure the logical operator Zj for the jth logical
qubit. Then we prepare the jth logical qubit of the Z ancilla state
in the state |0> instead of |+>. We determine the value of Zj by
taking a parity of qubit measurements, after first carrying out a
classical error correction step.
Similarly, suppose we want to measure the logical operator Xj for the
jth logical qubit. Then we prepare the jth logical qubit of the X
ancilla state in the state |+> instead of |0>.
Two very nice features of this way of measuring logical operators:
1. Because the ancillas are themselves error-correcting codes, we can
correct some errors in the measurement procedure. That makes
this protocol robust against noise.
2. In the course of measuring the logical operators, we also extract
all the error syndromes. So this can just be substituted for the
usual round of syndrome extraction.
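The classical-correction-then-parity step can be sketched in miniature, again assuming the [[7,1,3]] Steane code for illustration, with the logical Z taken as Z on all 7 qubits (one valid representative):

```python
import numpy as np

H = np.array([[0, 0, 0, 1, 1, 1, 1],   # [7,4] Hamming parity checks
              [0, 1, 1, 0, 0, 1, 1],
              [1, 0, 1, 0, 1, 0, 1]])

def logical_z(meas_bits):
    """Classically correct the measured bit string, then take the parity
    over the support of the logical Z (here Z on all 7 qubits)."""
    bits = np.asarray(meas_bits).copy()
    s = H @ bits % 2
    pos = 4 * s[0] + 2 * s[1] + s[2]    # columns of H encode positions 1..7
    if pos:
        bits[pos - 1] ^= 1              # flip the single corrupted bit
    return (-1) ** int(bits.sum() % 2)  # eigenvalue of the logical Z

print(logical_z([1, 1, 1, 0, 0, 0, 1]))  # odd-weight codeword + 1 error -> -1
```

This is the robustness of point 1 above in action: the single bit flip is corrected classically before the parity is taken, so it does not corrupt the logical measurement outcome.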
Measuring products of logical operators
We are not limited to measuring single X’s or Z’s. Suppose I wish
to measure ZiZj. Prepare logical qubits i and j of the Z ancilla
in the Bell state |Φ+> = (|00> + |11>)/√2. Then one can extract the product operator
ZiZj in an exactly analogous way. One can similarly measure
products of logical X operators.
What about measuring logical Y operators? Or, more generally,
arbitrary products of X’s and Z’s? This is done by preparing
both ancillas in an entangled state.
As mentioned above, this procedure has some robustness against
noise. But if that is not sufficient, one can repeat the
measurement a few times.
While measuring logical operators is a useful thing to be able to
do, we will use it as a building block for two key operations:
Clifford gates, and logical teleportation.
Clifford gates
If we have a buffer qubit in addition to the qubit(s) we
wish to act on, we can do arbitrary Clifford gates by
measuring a sequence of logical operators.
Suppose we have two qubits and a buffer qubit in the state
|ψ1> |ψ2> |0>3.
We can carry out a CNOT gate by measuring this
sequence of logical operators: X2X3, Z1Z2, and X1. This
leaves the three qubits in the state
P ( |+> ⊗ C12(|ψ1> |ψ2>) ),
where P is a (known) Pauli operator acting on the three
logical qubits. This can either be corrected, or kept
track of, along with the Pauli operators from the
syndrome extraction.
See, e.g., D. Gottesman, Caltech Ph.D. Thesis, 1997.
We can similarly do a Hadamard gate by measuring ZX and XI,
and a Phase gate by measuring XY and ZI. One can also do
SWAPs, and reset the buffer qubit in the Z or X basis.
Notice that the CNOT can be done by measuring pure X and pure Z
operators, but the Hadamard and Phase gates require us to
measure products of both.
One Clifford gate, by this method, takes 2 or 3 “rounds” of the
computation—more, if we need to repeat measurements.
Some overhead may also be necessary if we must return the
buffer qubit to its original state and location.
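The CNOT measurement sequence above can be checked directly on three unencoded qubits. The following is my own numpy sketch (not from the talk): it measures X2X3, Z1Z2, X1 on |ψ1>|ψ2>|0> and verifies that the result matches a CNOT on the data, up to some Pauli correction:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(7)
P = {'I': np.eye(2, dtype=complex),
     'X': np.array([[0, 1], [1, 0]], dtype=complex),
     'Y': np.array([[0, -1j], [1j, 0]]),
     'Z': np.diag([1.0, -1.0]).astype(complex)}

def op(label):                      # e.g. 'IXX' means X on qubits 2 and 3
    m = np.array([[1.0 + 0j]])
    for c in label:
        m = np.kron(m, P[c])
    return m

def measure(state, label):
    """Projectively measure a Pauli operator, sampling the outcome."""
    M = op(label)
    plus = (state + M @ state) / 2          # projector onto +1 eigenspace
    out = plus if rng.random() < np.vdot(plus, plus).real \
        else (state - M @ state) / 2
    return out / np.linalg.norm(out)

# random two-qubit data state, buffer qubit 3 in |0>
psi12 = rng.normal(size=4) + 1j * rng.normal(size=4)
psi12 /= np.linalg.norm(psi12)
state = np.kron(psi12, np.array([1, 0], dtype=complex))

for m in ['IXX', 'ZZI', 'XII']:             # the sequence from the slide
    state = measure(state, m)

# expected result: a fresh |+> buffer plus CNOT acting on the data
# (the data ends up on qubits 2 and 3 in this unencoded toy version)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                 [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
target = np.kron(plus, CNOT @ psi12)

# the outcome-dependent Pauli correction is unknown here, so search for it
best = max(abs(np.vdot(op(''.join(c)) @ target, state))
           for c in product('IXYZ', repeat=3))
print(round(best, 6))   # ≈ 1.0: equal up to a Pauli correction
```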
Performing non-Clifford logical gates
The above scheme allows us to perform an arbitrary Clifford
operation on the logical qubits within a single storage block.
But it does not allow us to transfer logical qubits between
blocks. We also need the ability to do at least one non-Clifford
gate to get universality.
For both of these purposes we use logical teleportation.
We can teleport a qubit to another storage block to allow
CNOTs between blocks.
For a non-Clifford gate, we teleport the qubit into a processor
block, which allows a transversal gate outside the Clifford
group. For example, the concatenated [[15,1,3]] shortened
Reed-Muller code allows a transversal T gate.
Logical teleportation
We can teleport logical qubits between code blocks by measuring
an appropriate set of logical operators fault-tolerantly. Suppose
our [[n,k,d]] code has k pairs of logical operators (X1,Z1), (X2,Z2),
…, (Xk,Zk), and we want to teleport into another code block with
a pair of logical operators (X0,Z0). We will use logical qubit 1 as
a buffer to hold half of an entangled pair, and logical qubits
2,…,k hold actual qubits to be stored. Suppose we wish to act on
logical qubit 2.
1. Measure logical operators X0X1 and Z0Z1 to prepare a logical Bell
pair.
2. Measure logical operators X1X2 and Z1Z2 to teleport.
3. Do a transversal circuit on the second codeword to correct (if
necessary) and apply the desired logical gate.
4. Measure logical operators X0X1 and Z0Z1 to teleport back.
5. Do a transversal circuit on the first codeword to correct (if
necessary).
Logical teleportation, as we see, can also be done by measuring
logical operators. But now we must measure products of
logical operators on different code blocks.
We do this in exactly the same way as measuring any other
logical operator, but now we must prepare the ancillas of two
different code blocks in an entangled state.
(We can also use this trick to directly do CNOTs between logical
qubits of two different storage blocks.)
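The teleportation sequence can likewise be checked on three unencoded qubits (my own numpy sketch, not from the talk): qubit 0 stands for the target block's logical qubit, qubit 1 for the buffer, and qubit 2 for the data to be teleported.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(11)
P = {'I': np.eye(2, dtype=complex),
     'X': np.array([[0, 1], [1, 0]], dtype=complex),
     'Y': np.array([[0, -1j], [1j, 0]]),
     'Z': np.diag([1.0, -1.0]).astype(complex)}

def op(label):
    m = np.array([[1.0 + 0j]])
    for c in label:
        m = np.kron(m, P[c])
    return m

def measure(state, label):
    """Projectively measure a Pauli operator, sampling the outcome."""
    M = op(label)
    plus = (state + M @ state) / 2
    out = plus if rng.random() < np.vdot(plus, plus).real \
        else (state - M @ state) / 2
    return out / np.linalg.norm(out)

# qubit 0: target (|0>), qubit 1: buffer (|0>), qubit 2: data |psi>
psi = rng.normal(size=2) + 1j * rng.normal(size=2)
psi /= np.linalg.norm(psi)
zero = np.array([1, 0], dtype=complex)
state = np.kron(zero, np.kron(zero, psi))

# steps 1-2 of the slide: Bell pair on (0,1), then teleport the data 2 -> 0
for m in ['XXI', 'ZZI', 'IXX', 'IZZ']:
    state = measure(state, m)

# expected: |psi> on qubit 0 and a Bell pair on (1,2), up to a Pauli
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
target = np.kron(psi, bell)
best = max(abs(np.vdot(op(''.join(c)) @ target, state))
           for c in product('IXYZ', repeat=3))
print(round(best, 6))   # ≈ 1.0: teleported up to a Pauli correction
```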
Performance of the scheme
How well does this scheme perform in practice? To see, we
must choose particular codes for the storage and processor
blocks, and do a combination of analysis and numerical
simulation. As a first step, we are trying to estimate the
code lifetimes for different choices of codes.
Such an analysis should, in general, include all the different
sources of noise in the system. These include:
• Gate errors
• Memory errors
• Measurement errors
• Ancilla preparation errors
To simplify the analysis, we first transform all these noise
sources into a single effective error process, which we can
treat as acting only between rounds of the computation.
Effective Errors
[Circuit diagram: a data qubit passing through the transversal extraction circuit, with error locations E11 through E33 marked at each gate and time step.]
The individual qubits in Steane extraction undergo transversal circuits
like the one above. We have marked the locations where errors can
occur. Crucially, we assume that these errors are uncorrelated
between qubits.
The effective error process acts before and after the circuit, which we
treat as ideal. The errors, however, are now generally correlated in
time. These correlations are potentially useful in diagnosing errors.
Ancilla preparation errors can produce correlated errors.
Gate errors...
Measurement errors...
One-time effective error process
If we ignore the time correlations, we can get an effective error
process for a single time step.
We could make use of the correlations to do soft-decision
decoding. But if the probability of an uncorrectable error at a
single time is sufficiently low, that is already enough to get a
lower bound on the code lifetime.
From the above, we see that the effective error rate is going to
be a multiple (but not necessarily a large multiple) of the
largest basic error process (e.g., CNOT errors).
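As a back-of-envelope illustration of that multiple, a union bound over the error locations a single qubit sees per round (location counts and rates below are my assumptions, not the talk's model):

```python
# Union-bound estimate of the effective per-round error rate on one qubit.
# Location counts per qubit per round are assumed for illustration; the
# point is only that p_eff is a modest multiple of the largest basic rate.
rates = {'cnot': 5e-4, 'memory': 5e-4, 'measurement': 5e-4, 'preparation': 5e-4}
locations = {'cnot': 4, 'memory': 6, 'measurement': 2, 'preparation': 2}

p_eff = sum(locations[k] * rates[k] for k in rates)
print(round(p_eff, 5))   # 0.007 with these assumed numbers
```

With 14 locations, each basic rate of 5×10^-4 gives p_eff = 0.007, the value used in the performance estimates below.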
Storage blocks
To build storage blocks that compromise between high distance,
high rate, and efficient decoding, we looked at concatenations
of pairs of codes.
Bottom level: [[23,1,7]] quantum Golay code.
Top level: [[89,23,9]], [[127,57,11]] or [[255,143,15]] quantum
BCH code.
These combinations would give us storage blocks with parameters
[[2047,23,63]], [[2921,57,77]], or [[5865,143,105]],
respectively. At least one logical qubit in each block must be
set aside to act as a buffer qubit. So these three combinations
represent a logical qubit by approximately 93, 52, or 41
physical qubits. Very good rates!
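The concatenated parameters and qubit counts above can be reproduced directly, using the standard rule that concatenating an inner [[n1,1,d1]] code with an outer [[n2,k2,d2]] code gives an [[n1 n2, k2, ≥ d1 d2]] code:

```python
inner = (23, 1, 7)                       # [[23,1,7]] quantum Golay code
outers = [(89, 23, 9), (127, 57, 11), (255, 143, 15)]   # quantum BCH codes

for n2, k2, d2 in outers:
    n, k, d = inner[0] * n2, k2, inner[2] * d2
    # one logical qubit per block is set aside as the buffer
    per_logical = round(n / (k - 1))
    print(f"[[{n},{k},{d}]]  ~{per_logical} physical qubits per logical qubit")
```

This reproduces the [[2047,23,63]], [[2921,57,77]], and [[5865,143,105]] blocks and the approximate ratios 93, 52, and 41.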
The Golay code can be decoded by the Kasami error-trapping
algorithm, and the BCH codes by the Berlekamp-Massey algorithm.
Estimated Performance
[[2047,23,63]] (blue), [[2921,57,77]] (red), and
[[5865,143,105]] (green). These curves are generated by
Monte Carlo simulations and linear extrapolation. However,
the extrapolation at low error rates can be backed up by an
upper bound calculated purely from the distances of the code.
For peff = 0.007, we get bounds on the logical error rates of
approximately 10^-16, 2.5×10^-19, and 7×10^-24.
Processor blocks
For the processor block, we looked at two and three levels of
concatenation of the [[15,1,3]] code. With hard decision
decoding the performance of this code is very poor at high
error rates, largely because it is very limited in its ability to
correct phase errors.
Using Poulin’s soft-decision decoding algorithm, the
performance improved markedly, even at effective error
rates above 0.01.
The upper bound used for the storage blocks is of no use for soft-decision decoding, and extrapolating from Monte Carlo is
unlikely to be reliable in this case. But we were able to find
rough bounds at moderately low error rates.
D. Poulin, Phys. Rev. A 74, 052333 (2006).
The [[15,1,3]] code at two (blue) and three (red) levels of
concatenation. A combination of Monte Carlo simulations at
higher errors, a rough bound at lower errors, and (not to be
relied upon) extrapolation.
At peff = 0.007 (all contributing error processes below 5×10^-4), the
block error rate is estimated to be roughly 2×10^-12. This is
certainly small enough to carry out highly nontrivial quantum
computations.
Ancilla preparation
We have only begun to explore the problem of ancilla preparation
for this approach. The first idea we are studying is a type of
ancilla distillation: prepare M imperfect copies of the ancilla,
then perform a transversal error correction step that yields a
smaller number FM of high-quality ancillas.
A very important requirement is that error correlations within each
ancilla should be very small. (Across ancillas is not a problem.)
How does this affect the resource requirements for this scheme? If
the code is [[n,k,d]], and the distillation takes Td rounds with
yield fraction F, then the physical qubits needed per logical
qubit are increased by a factor of
(n/k) (2Td / F).
If our available gates are local—as they almost certainly will be—
then ancilla distribution will also increase the demand on the
resources. That is yet a further question to explore.
Threshold theorems?
Does a scheme like this have a threshold? The short answer:
I don’t know.
• It certainly does not have a threshold with a fixed size of
storage blocks.
• To prove a threshold theorem, we would have to find a
family of sets of code blocks where we could prove that
lifetimes can be scaled up with problem size for sufficiently
low probabilities.
• This would be a very desirable result—but not for practical
purposes a necessary one. A single storage block code
could have a lifetime sufficiently long to do any
conceivable quantum computation in practice.
• We are using fault-tolerant methods to avoid propagating
errors, etc., without doing computations of arbitrary size.
Open questions and future work
The number of open questions is large, including: does this
really work? Here are some big ones:
• Can we find better codes for block storage? Perhaps quantum
LDPC codes? Can we do the decoding efficiently in real time
during a computation? It is encouraging that, without trying
very hard to find good codes, we got decent performance.
• What are the resource requirements for the construction,
verification, and distribution of the ancilla states? While
these are all stabilizer states, and can in principle be
prepared fault-tolerantly, the resources needed for them will
probably exceed those for the codewords by a factor of “a
lot.” Currently we are investigating (hopefully) efficient
distillation algorithms for these ancillas.
• If we include locality and communication costs, how much
does this performance degrade? These costs are mainly in
preparing and distributing the ancillas.
• Are there better (or additional) choices of codes for the
processing blocks? We have looked a little at 3D color codes,
but haven’t (so far) found an example that outperforms the
concatenated [[15,1,3]] code.
• Can we find families of codes where a threshold theorem can
be proven for this scheme?
Thank you for your attention!