Toric code example

In this example, we’ll use PyMatching to estimate the threshold of the toric code under an independent noise model with perfect syndrome measurements. The decoding problem for the toric code is identical for \(X\)-type and \(Z\)-type errors, so we will only simulate decoding \(Z\)-type errors using \(X\)-type stabilisers in this example.

First, we will construct a check matrix \(H_X\) corresponding to the \(X\)-type stabilisers. Each element \(H_X[i,j]\) will be 1 if the \(i\)th \(X\) stabiliser acts non-trivially on the \(j\)th qubit, and is 0 otherwise.

We will construct \(H_X\) by taking the hypergraph product of two repetition codes. The hypergraph product code construction \(HGP(H_1,H_2)\) takes as input the parity check matrices of two linear codes \(C_1:=\ker H_1\) and \(C_2:= \ker H_2\). The code \(HGP(H_1,H_2)\) is a CSS code with the check matrix for the \(X\) stabilisers given by

\[H_X=[H_1\otimes I_{n_2},I_{r_1}\otimes H_2^T]\]

and with the check matrix for the \(Z\) stabilisers given by

\[H_Z=[I_{n_1}\otimes H_2,H_1^T\otimes I_{r_2}]\]

where \(H_1\) has dimensions \(r_1\times n_1\), \(H_2\) has dimensions \(r_2\times n_2\) and \(I_l\) denotes the \(l\times l\) identity matrix.

Since we are only decoding \(Z\) errors, we only need the \(X\) stabilisers of the toric code with lattice size \(L\), and construct \(H_X\) using the check matrix of a length-\(L\) repetition code for both \(H_1\) and \(H_2\). With \(r_1=r_2=n_1=n_2=L\), \(H_X\) has \(L^2\) rows (one per vertex stabiliser) and \(2L^2\) columns (one per edge qubit):

[1]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.sparse import hstack, kron, eye, csr_matrix, block_diag


def repetition_code(n):
    """
    Parity check matrix of a repetition code with length n.
    """
    row_ind, col_ind = zip(*((i, j) for i in range(n) for j in (i, (i+1)%n)))
    data = np.ones(2*n, dtype=np.uint8)
    return csr_matrix((data, (row_ind, col_ind)))


def toric_code_x_stabilisers(L):
    """
    Sparse check matrix for the X stabilisers of a toric code with
    lattice size L, constructed as the hypergraph product of
    two repetition codes.
    """
    Hr = repetition_code(L)
    H = hstack(
            [kron(Hr, eye(Hr.shape[1])), kron(eye(Hr.shape[0]), Hr.T)],
            dtype=np.uint8
        )
    H.data = H.data % 2
    H.eliminate_zeros()
    return csr_matrix(H)
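As a quick sanity check (an addition to this example, not part of the original construction), we can verify that \(H_X\) has the expected dimensions and row weights: a toric code with lattice size \(L\) has \(L^2\) vertex stabilisers acting on \(2L^2\) edge qubits, and each stabiliser acts on exactly four qubits:

H = toric_code_x_stabilisers(3)
print(H.shape)                                # (9, 18): L^2 stabilisers, 2L^2 qubits
print(np.unique(np.asarray(H.sum(axis=1))))   # [4]: each stabiliser acts on 4 qubits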

From the Künneth theorem, the \(X\) logical operators of the toric code are given by

\[\begin{split}L_X=\left( \begin{array}{cc} \mathcal{H}^1\otimes \mathcal{H}^0 & 0 \\ 0 & \mathcal{H}^0\otimes \mathcal{H}^1 \end{array} \right)\end{split}\]

where \(\mathcal{H}^0\) and \(\mathcal{H}^1\) are the zeroth and first cohomology groups of the length-one chain complex that has the repetition code parity check matrix as its boundary operator. We can construct this matrix with the following function:

[2]:
def toric_code_x_logicals(L):
    """
    Sparse binary matrix with each row corresponding to an X logical operator
    of a toric code with lattice size L. Constructed from the
    homology groups of the repetition codes using the Kunneth
    theorem.
    """
    H1 = csr_matrix(([1], ([0],[0])), shape=(1,L), dtype=np.uint8)
    H0 = csr_matrix(np.ones((1, L), dtype=np.uint8))
    x_logicals = block_diag([kron(H1, H0), kron(H0, H1)])
    x_logicals.data = x_logicals.data % 2
    x_logicals.eliminate_zeros()
    return csr_matrix(x_logicals)
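As another quick check (again an addition to this example), the toric code has two \(X\) logical operators, one wrapping around each handle of the torus, each of weight \(L\):

logX = toric_code_x_logicals(5)
print(logX.shape)                             # (2, 50): two logicals on 2L^2 qubits
print(np.asarray(logX.sum(axis=1)).ravel())   # [5 5]: each logical has weight L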

Now that we have the \(X\) check matrix and \(X\) logicals of the toric code, we can use PyMatching to simulate its performance using the minimum-weight perfect matching decoder and an error model of our choice.

To do so, we first import the Matching class from PyMatching, and use it to construct a Matching object from the check matrix of the stabilisers:

from pymatching import Matching
matching = Matching(H)

While constructing the Matching object is efficient, it is often slower than the decoding step itself. For the best performance, it's therefore best to construct the Matching object once at the beginning of the experiment, rather than before every use of the decoder.
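As a rough illustration of this advice (a sketch using the functions defined above; the printed timings are machine-dependent), we can compare the one-off cost of construction with the cost of a single call to the decoder:

import time

H = toric_code_x_stabilisers(12)
t0 = time.perf_counter()
matching = Matching(H)                        # construct once, outside the sampling loop
t1 = time.perf_counter()
matching.decode(np.zeros(H.shape[0], dtype=np.uint8))
t2 = time.perf_counter()
print("construction: {:.2e}s, decode: {:.2e}s".format(t1 - t0, t2 - t1))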

We also choose a number of trials, num_trials. For each trial, we simulate a \(Z\) error under an independent noise model, in which each qubit independently suffers a \(Z\) error with probability \(p\):

noise = np.random.binomial(1, p, H.shape[1])

Here, noise is a binary vector and noise[i] is 1 if qubit \(i\) suffers a \(Z\) error, and 0 otherwise.

The syndrome of the \(X\) stabilisers is then calculated from the dot product (modulo 2) with the \(X\) check matrix \(H\):

syndrome = H@noise % 2

We can now use PyMatching to infer the most probable individual error given the syndrome:

correction = matching.decode(syndrome)

The total error is now given by the sum (modulo 2) of the noise and the correction:

error = (noise + correction) % 2

PyMatching is guaranteed to return a correction that maps the state back to the code space, so the total error is either a stabiliser (in which case decoding has succeeded) or a logical \(Z\) operator, which anti-commutes with at least one of the \(X\) logicals. Therefore, a logical error has occurred if the condition

np.any(error@logicals.T % 2)

is True, where logicals is the binary matrix \(L_X\) with each row corresponding to an \(X\) logical.
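Before wrapping these steps in a function, here is a single end-to-end trial (a sketch, using the functions defined above with illustrative values L=8 and p=0.1):

L, p = 8, 0.1
H = toric_code_x_stabilisers(L)
logicals = toric_code_x_logicals(L)
matching = Matching(H)
noise = np.random.binomial(1, p, H.shape[1])
syndrome = H@noise % 2
correction = matching.decode(syndrome)
error = (noise + correction) % 2
print("logical error:", bool(np.any(error@logicals.T % 2)))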

Taken together, we obtain the following function num_decoding_failures that returns the number of logical errors after num_trials Monte Carlo trials, simulating an independent error model with error probability p, with the \(X\) stabiliser check matrix H and \(X\) logical matrix logicals:

[3]:
from pymatching import Matching

def num_decoding_failures(H, logicals, p, num_trials):
    matching = Matching(H, spacelike_weights=np.log((1-p)/p))
    num_errors = 0
    for i in range(num_trials):
        noise = np.random.binomial(1, p, H.shape[1])
        syndrome = H@noise % 2
        correction = matching.decode(syndrome)
        error = (noise + correction) % 2
        if np.any(error@logicals.T % 2):
            num_errors += 1
    return num_errors

Using this function, we can now estimate the threshold of the toric code by varying the error rate \(p\), for a range of lattice sizes \(L\). Running this next cell may take a few minutes:

[4]:
%%time

num_trials = 5000
Ls = range(4,14,4)
ps = np.linspace(0.01, 0.2, 9)
np.random.seed(2)
log_errors_all_L = []
for L in Ls:
    print("Simulating L={}...".format(L))
    Hx = toric_code_x_stabilisers(L)
    logX = toric_code_x_logicals(L)
    log_errors = []
    for p in ps:
        num_errors = num_decoding_failures(Hx, logX, p, num_trials)
        log_errors.append(num_errors/num_trials)
    log_errors_all_L.append(np.array(log_errors))
Simulating L=4...
Simulating L=8...
Simulating L=12...
CPU times: user 3min 20s, sys: 267 ms, total: 3min 20s
Wall time: 3min 20s

Finally, let’s plot the results! We expect to see a threshold of around 10.3%, although a precise estimate requires using more trials, larger lattice sizes and scanning more values of \(p\):

[5]:
%matplotlib inline

plt.figure()
for L, logical_errors in zip(Ls, log_errors_all_L):
    std_err = (logical_errors*(1-logical_errors)/num_trials)**0.5
    plt.errorbar(ps, logical_errors, yerr=std_err, label="L={}".format(L))
plt.xlabel("Physical error rate")
plt.ylabel("Logical error rate")
plt.legend(loc=0);
[Figure: logical error rate vs physical error rate for L = 4, 8, 12]

Noisy syndromes

In the presence of measurement errors, each syndrome measurement is repeated \(O(L)\) times, and decoding instead takes place over a 3D matching graph with an additional time dimension (see Section IV B of this paper). The time dimension can be added to the matching graph by specifying the number of repetitions when constructing the matching object:

matching = Matching(H, repetitions=T)

where \(T\) is the number of repetitions. For decoding, the difference syndrome should be supplied as an \(r\times T\) binary numpy matrix, where \(r\) is the number of checks (rows in \(H\)). The difference syndrome in time step \(t\) is the difference (modulo 2) between the syndrome measurements in time steps \(t\) and \(t-1\); using the difference syndrome ensures that any single measurement error results in two syndrome defects, at the endpoints of a timelike edge in the matching graph. The last round of syndrome measurements should be free of measurement errors, to ensure that the overall syndrome has even parity. When qubits are measured individually at the end of a computation, the final round of syndrome measurement is indeed error-free, since the stabilisers can then be determined exactly in post-processing. However, if decoding must be completed before all qubits are measured, the overlapping recovery method should be used instead.
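To see why the difference syndrome is used (a toy illustration, not part of the original example), note that a single faulty measurement flips the recorded syndrome in one round only, and in the difference syndrome this produces exactly two defects, at the endpoints of a timelike edge:

T = 4
s = np.zeros((1, T), dtype=np.uint8)
s[0, 1] = 1                                   # one measurement error in round 1
diff = s.copy()
diff[:, 1:] = (diff[:, 1:] - diff[:, :-1]) % 2
print(diff)                                   # [[0 1 1 0]]: defects in rounds 1 and 2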

The following example demonstrates decoding in the presence of measurement errors using a phenomenological error model. In this error model, in each round of measurements each qubit suffers an error with probability \(p\), and each syndrome is measured incorrectly with probability \(q\).

[6]:
def num_decoding_failures_noisy_syndromes(H, logicals, p, q, num_trials, repetitions):
    matching = Matching(H, spacelike_weights=np.log((1-p)/p),
                repetitions=repetitions, timelike_weights=np.log((1-q)/q))
    num_stabilisers, num_qubits = H.shape
    num_errors = 0
    for i in range(num_trials):
        noise_new = (np.random.rand(num_qubits, repetitions) < p).astype(np.uint8)
        noise_cumulative = (np.cumsum(noise_new, 1) % 2).astype(np.uint8)
        noise_total = noise_cumulative[:,-1]
        syndrome = H@noise_cumulative % 2
        syndrome_error = (np.random.rand(num_stabilisers, repetitions) < q).astype(np.uint8)
        syndrome_error[:,-1] = 0 # Perfect measurements in last round to ensure even parity
        noisy_syndrome = (syndrome + syndrome_error) % 2
        # Convert to difference syndrome
        noisy_syndrome[:,1:] = (noisy_syndrome[:,1:] - noisy_syndrome[:,0:-1]) % 2
        correction = matching.decode(noisy_syndrome)
        error = (noise_total + correction) % 2
        assert not np.any(H@error % 2)
        if np.any(error@logicals.T % 2):
            num_errors += 1
    return num_errors

We’ll now simulate the performance of the decoder for a range of lattice sizes \(L\) and physical error rates \(p\) (taking \(q=p\)), and estimate the threshold. This next cell takes around 20 minutes to execute:

[7]:
%%time

num_trials = 5000
Ls = range(8,13,2)
ps = np.linspace(0.02, 0.04, 7)
log_errors_all_L = []
for L in Ls:
    print("Simulating L={}...".format(L))
    Hx = toric_code_x_stabilisers(L)
    logX = toric_code_x_logicals(L)
    log_errors = []
    for p in ps:
        num_errors = num_decoding_failures_noisy_syndromes(Hx, logX, p, p, num_trials, L)
        log_errors.append(num_errors/num_trials)
    log_errors_all_L.append(np.array(log_errors))
Simulating L=8...
Simulating L=10...
Simulating L=12...
CPU times: user 23min 10s, sys: 1.21 s, total: 23min 11s
Wall time: 23min 11s

Plotting the results, we find a threshold of around 3%, consistent with the threshold of 2.9% found in this paper:

[8]:
%matplotlib inline

plt.figure()
for L, logical_errors in zip(Ls, log_errors_all_L):
    std_err = (logical_errors*(1-logical_errors)/num_trials)**0.5
    plt.errorbar(ps, logical_errors, yerr=std_err, label="L={}".format(L))
plt.yscale("log")
plt.xlabel("Physical error rate")
plt.ylabel("Logical error rate")
plt.legend(loc=0);
[Figure: logical error rate vs physical error rate for L = 8, 10, 12, log scale on the y-axis]