# Quantum channel coding with finite resources

One of the quintessential topics in quantum information theory is the study of reliable quantum information transmission over noisy quantum channels. Let us consider a communication scheme between two parties connected by a memoryless communication channel that can be used many times in sequence. The sender first encodes a quantum state into a sequence of registers and then sends them one by one through the channel to the receiver. The receiver collects these registers and then attempts to decode the quantum state with high fidelity. A primary goal of information theory is to find fundamental limits imposed on any coding scheme that attempts to accomplish this task.

Following a tradition going back to Shannon’s groundbreaking work, this problem is usually studied asymptotically: the quantum capacity of a channel is defined as the optimal rate (in qubits per use of the channel) at which one can transmit quantum information with vanishing error as the number of sequential channel uses (the blocklength) increases to infinity. Such an asymptotic analysis has proven to be pertinent in the analysis of classical communication systems — but is it also satisfactory in the quantum setting? Achieving the quantum capacity generally requires both the receiver and sender to coherently manipulate an array of qubits that grows proportionally to the blocklength. More precisely, the sender needs to prepare states that are arbitrarily entangled over the channel inputs and the receiver needs to perform a joint quantum operation on all channel outputs. While classical computers can readily operate on very large amounts of data, at least for the near future it appears unrealistic to expect that encoding and decoding circuits can store or coherently manipulate large numbers of qubits. Thus, in recent joint work with Mario Berta and Joseph M. Renes (published as Nature Communications 7, 11419), we ask how well quantum coding schemes perform when we restrict the blocklength and thus the size of the quantum devices used for encoding the channel inputs and decoding its outputs.

Before we discuss our results we need to make the setup a bit more formal. Let us assume we are given a quantum channel $\mathcal{N} \equiv \mathcal{N}_{A \to B}$ from quantum states on a space $A$ to quantum states on a space $B$ and denote by $\mathcal{N}^{\otimes n}$ the $n$-fold parallel repetition of this channel. A code (more precisely, an entanglement transmission code) for this channel is given by a triplet $\{|M|,\mathcal{E},\mathcal{D} \}$, where $|M|$ is the local dimension of a maximally entangled state $|\phi\rangle_{MM'}$ that is to be transmitted over $\mathcal{N}^{\otimes n}$. The quantum channels $\mathcal{E} \equiv \mathcal{E}_{M' \to A^n}$ and $\mathcal{D} \equiv \mathcal{D}_{B^n \to M''}$ are encoding and decoding operations, respectively, and act on the whole block. The systems $M'$ and $M''$ are isomorphic to $M$. This is summarized in the following diagram:

With this in hand, we now say that a triplet $\{R,n,\varepsilon\}$ is achievable on the channel $\mathcal{N}$ if there exists an entanglement transmission code satisfying

$\displaystyle \frac{1}{n} \log |M| \geq R$    and    $\displaystyle F \left( \phi_{MM''}, \mathcal{D} \circ \mathcal{N}^{\otimes n} \circ \mathcal{E} ( \phi_{MM'} ) \right) \geq 1 - \varepsilon\,.$

Here, $R$ is the rate of the code, $n$ is the number of channel uses, and $\varepsilon$ is the tolerated error or infidelity, measured in terms of Uhlmann’s fidelity $F$. The non-asymptotic achievable region of $\mathcal{N}$ is then given by the union of all achievable triplets $\{R, n, \varepsilon\}$. The goal of (non-asymptotic) information theory is to find tight bounds on this achievable region, in particular to determine if certain triplets are outside the achievable region and thus forbidden. For this purpose, it is useful to define its boundary as

$\displaystyle R(n, \varepsilon) := \max \{R : (R,n,\varepsilon) \textnormal{ is achievable on } \mathcal{N} \}\,.$

This simply describes the maximal code rate that can be achieved as a function of the blocklength, $n$, and the allowed infidelity, $\varepsilon$. Alternatively, one could also look at the minimum infidelity as a function of code rate and blocklength, $\varepsilon(n, R)$, or the minimal blocklength as a function of the code rate and the infidelity, $N(R, \varepsilon)$. These three functions are obviously intimately related, and our focus on $R(n, \varepsilon)$ is motivated mostly by the fact that we can find an aesthetically pleasing approximation for it when the blocklength gets large.

As a warmup, let us recall the traditional asymptotic results first. The quantum capacity $Q$ is defined as the asymptotic limit of $R(n; \varepsilon)$ when $n$ goes to infinity and $\varepsilon$ vanishes. (It is important that the limit $n \to \infty$ is taken before the limit $\varepsilon \to 0$. Otherwise we enter the realm of zero-error coding.) The famous Lloyd-Shor-Devetak (LSD) theorem and its converse state that the capacity can be expressed in terms of the regularized coherent information:

$\displaystyle Q := \lim_{\varepsilon \to 0} \lim_{n \to \infty} R(n;\varepsilon) = \sup_{\ell \in \mathbb{N}} \frac{1}{\ell} I_c\left(\mathcal{N}^{\otimes \ell}\right)\,,$

where $I_c$ denotes the coherent information of the channel. (The exact definition of the coherent information is not so relevant here.) This result is highly unsatisfactory, not least because the regularization makes its computation intractable. (In information theory the regularization, i.e. the optimization over $\ell$, is also called a multi-letter formula.) Worse, the statement is not as strong as we would like it to be because it does not give any indication of the fundamental limits for finite $\varepsilon$ or finite $n$. For example, even sticking to the asymptotic limit $n \to \infty$ for now, we might be willing to admit a small but nonzero error in our recovery. Formally, instead of requiring that the error vanishes asymptotically, we only require that it does not exceed a certain threshold, say $\varepsilon$. Can we then achieve a higher asymptotic rate in the above sense? For classical communication channels this was ruled out by Wolfowitz, who showed the so-called strong converse. The strong converse, in the language of quantum communication, states that any code with a rate exceeding the channel capacity necessarily has vanishing fidelity when $n$ gets large. However, quite surprisingly, it is not known if the strong converse holds for general quantum channels.

So if we do not even fully understand the asymptotic limit when $n$ goes to infinity and $\varepsilon$ goes to zero, how can we hope to characterize the boundary for finite $n$ and $\varepsilon$? Well, there are two answers to this:

1. For general channels, the functions $R$, $\varepsilon$ and $N$ are optimization problems where we need to minimize or maximize over all possible codes. We can often find inner and outer bounds on the achievable region by relaxing these optimization problems. Of particular interest are relaxations that lead to semidefinite programs, which can be solved (relatively) efficiently as long as both the dimension of the channel input and output as well as the blocklength are small. Unless we have further symmetries that simplify the problem, small here really means small and we will not get far beyond $n = 3,4,5$ for qubit channels. In our work we provide such general relaxations for $R$, and before us Debbie Leung and William Matthews (IEEE Transactions on Information Theory, 61(8), 4486-4499) provided such relaxations for $\varepsilon$.
2. These bounds are very useful for numerics but they do not necessarily provide us with much intuition beyond that. What we hope for instead is some kind of analytical tradeoff between the different parameters $n$, $R$ and $\varepsilon$ that is governed by only a few parameters that can be calculated efficiently from the channel description. Obviously the foremost such parameter is the capacity, which governs the fundamental rate limit for large $n$. Hence, before trying to go any deeper, we must restrict our attention to channels for which the capacity is in fact computable. A class of such channels are the degradable channels, for which the coherent information is additive and we have $Q = I_c(\mathcal{N})$. In recent joint work with Mark M. Wilde and Andreas Winter (arXiv: 1406.2946) we found that the strong converse holds for a class of degradable channels, called generalized dephasing channels. These channels are thus particularly well-behaved and we can hope to characterize them also for finite $n$. In the following we will show this for the example of the qubit dephasing channel. (The paper similarly treats the qubit erasure channel and presents many upper and lower bounds that are valid for general channels.)

For the remainder we focus on one particular example, the qubit dephasing channel. The qubit dephasing channel is given by the map $\mathcal{Z}_{\gamma}: \rho \mapsto (1-\gamma) \rho + \gamma Z \rho Z$ where $\gamma \in [0,1]$ is a parameter and $Z$ is a Pauli operator. Dephasing channels are particularly interesting examples because dephasing noise is dominant in many physical implementations of qubits. These recent results thus allow us to fully characterize the achievable region in the limit $n \to \infty$ for such channels, and in particular ensure that $\lim_{n \to \infty} R(n;\varepsilon) = I_c(\mathcal{Z}_{\gamma})$ independent of the value of $\varepsilon \in (0, 1)$. As mentioned before, regularization is not required here since dephasing channels belong to the class of degradable channels for which the coherent information is additive.
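To build some intuition for this channel, note that conjugating by $Z$ flips the sign of the off-diagonal entries of a density matrix, so dephasing simply damps the coherences by a factor $1 - 2\gamma$ while leaving the populations untouched. A minimal sketch of this action (plain Python, density matrices as nested lists; the function name is ours, not from the paper):

```python
# Qubit dephasing channel Z_gamma: rho -> (1 - gamma) rho + gamma Z rho Z.
# Since Z = diag(1, -1), conjugation by Z flips the sign of the
# off-diagonal entries, so coherences are damped by (1 - 2*gamma).

def dephase(rho, gamma):
    """Apply the qubit dephasing channel to a 2x2 density matrix."""
    z_rho_z = [[rho[0][0], -rho[0][1]],
               [-rho[1][0], rho[1][1]]]
    return [[(1 - gamma) * rho[i][j] + gamma * z_rho_z[i][j]
             for j in range(2)] for i in range(2)]

plus = [[0.5, 0.5], [0.5, 0.5]]  # |+><+|, a maximally coherent state
print(dephase(plus, 0.1))  # off-diagonals shrink from 0.5 to 0.4
```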

Our main result for the qubit dephasing channel gives a much more precise characterization of the achievable region that gives us nontrivial bounds for finite values of $n$. It states that

$\displaystyle R(n; \varepsilon) = 1 - h(\gamma) + \sqrt{\frac{v(\gamma)}{n}} \Phi^{-1}(\varepsilon) + \frac{1}{2n} \log n + O\left(\frac{1}{n} \right)\,,$

where $\Phi$ is the cumulative normal distribution function, and $\Phi^{-1}$ its inverse. The two channel parameters are the quantum channel capacity, $1 - h(\gamma)$, and the quantum channel dispersion, $v(\gamma)$. These are defined as

$\displaystyle h(p):=-p\log p-(1-p)\log(1-p),$    and
$v(p):=p(\log p+h(p))^2+(1-p)(\log(1-p)+h(p))^2\,.$
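Both quantities are elementary to evaluate numerically. The following sketch (Python, logarithms base 2 so that rates are measured in qubits) computes the capacity and dispersion for the channel used in the plots below, $\gamma = 0.1$:

```python
import math

def h(p):
    """Binary entropy (base 2)."""
    if p == 0 or p == 1:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def v(p):
    """Dispersion of the qubit dephasing channel Z_p."""
    if p == 0 or p == 1:
        return 0.0
    return (p * (math.log2(p) + h(p)) ** 2
            + (1 - p) * (math.log2(1 - p) + h(p)) ** 2)

gamma = 0.1
print("capacity  :", 1 - h(gamma))  # 1 - h(0.1) ≈ 0.531 qubits per use
print("dispersion:", v(gamma))      # v(0.1) ≈ 0.904
```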

This result is a third order expansion of the rate $R$ as a function of $n$ for fixed $\varepsilon$. (This means that the implicit constants in the $O(\cdot)$ notation can depend on $\varepsilon$ and the channel.) The constant first order term, $1 - h(\gamma)$, is the capacity of the channel. The second order term, proportional to $\sqrt{v(\gamma)}\, \Phi^{-1}(\varepsilon)$, vanishes as $\frac{1}{\sqrt{n}}$. It is characterized by the channel’s dispersion and depends on $\varepsilon$ through the cumulative normal distribution function. Hence, in contrast to the first order term, we here see a tradeoff between all three parameters, $R$, $n$ and $\varepsilon$. The third order term $\frac{1}{2n} \log n$ yields an improved approximation for small $n$.
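Putting the pieces together, the approximation can be evaluated with nothing more than an inverse normal CDF. Here is a self-contained sketch using the standard library's `statistics.NormalDist` (the definitions of $h$ and $v$ are repeated so the snippet runs on its own; the $O(\frac{1}{n})$ remainder is dropped):

```python
import math
from statistics import NormalDist  # provides Phi^{-1} via inv_cdf

def h(p):  # binary entropy, base 2
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def v(p):  # dispersion of the qubit dephasing channel
    return (p * (math.log2(p) + h(p)) ** 2
            + (1 - p) * (math.log2(1 - p) + h(p)) ** 2)

def rate(n, eps, gamma):
    """Third-order approximation of R(n; eps), without the O(1/n) term."""
    return (1 - h(gamma)
            + math.sqrt(v(gamma) / n) * NormalDist().inv_cdf(eps)
            + math.log2(n) / (2 * n))

for n in (100, 1000, 10000):
    print(n, rate(n, 0.05, 0.1))  # climbs toward the capacity 1 - h(0.1)
```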

It is best to explain this relation with a few plots. In Figure (a) we plot the smallest achievable error $\varepsilon$ as a function of the rate $R$ for different values of $n$. For all the plots we use a qubit dephasing channel with $\gamma = 0.1$. Here we use the second order expansion without the term $\frac{1}{2n} \log n + O(\frac{1}{n})$ since it can conveniently be solved for $\varepsilon$. In the limit $n \to \infty$ we see an instantaneous transition of $\varepsilon$ from 0 to 1, the signature of a strong converse: coding below the capacity is possible with perfect fidelity whereas coding above the capacity will necessarily result in a vanishing fidelity.
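Concretely, dropping the last two terms and solving $R = 1 - h(\gamma) + \sqrt{v(\gamma)/n}\,\Phi^{-1}(\varepsilon)$ for $\varepsilon$ gives $\varepsilon \approx \Phi\big(\sqrt{n/v(\gamma)}\,(R - 1 + h(\gamma))\big)$, which is what the curves in Figure (a) display. A sketch of this inversion (Python, standard library only; the function name is ours):

```python
import math
from statistics import NormalDist

def h(p):  # binary entropy, base 2
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def v(p):  # dispersion of the qubit dephasing channel
    return (p * (math.log2(p) + h(p)) ** 2
            + (1 - p) * (math.log2(1 - p) + h(p)) ** 2)

def eps_min(n, R, gamma):
    """Second-order expansion solved for eps: Phi(sqrt(n/v) * (R - C))."""
    capacity = 1 - h(gamma)
    return NormalDist().cdf(math.sqrt(n / v(gamma)) * (R - capacity))

# Below the capacity (~0.531) the minimal error drops with n; above, it rises:
for n in (10, 100, 1000):
    print(n, eps_min(n, 0.45, 0.1), eps_min(n, 0.60, 0.1))
```

The sharpening of this transition with growing $n$ is exactly the strong-converse behavior described above.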

In Figure (b) we plot the third order approximation without the term $O(\frac{1}{n})$ for the highest achievable rate, $R(n; \varepsilon)$, as a function of $n$ for a fixed fidelity of $95\%$ (i.e. we set $\varepsilon = 0.05$). For example, this allows us to calculate how many times we need to use the channel in order to approximately achieve the quantum capacity. The third order approximation shows that we need approximately $850$ channel uses to achieve 90% of the quantum capacity. Note that a coding scheme achieving this would probably require us to coherently manipulate $850$ qubits in the decoder, which appears to be a quite challenging task. This example shows that the capacity does not suffice to characterize the ability of a quantum channel to transmit information, and further motivates the study of the achievable region for finite $n$.
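This kind of back-of-the-envelope estimate can be reproduced by scanning the approximation for the smallest admissible blocklength. The sketch below drops the $O(\frac{1}{n})$ remainder, so the crossover it finds is only an estimate and need not match the quoted figure exactly:

```python
import math
from statistics import NormalDist

def h(p):  # binary entropy, base 2
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def v(p):  # dispersion of the qubit dephasing channel
    return (p * (math.log2(p) + h(p)) ** 2
            + (1 - p) * (math.log2(1 - p) + h(p)) ** 2)

def rate(n, eps, gamma):
    """Third-order approximation of R(n; eps), without the O(1/n) term."""
    return (1 - h(gamma)
            + math.sqrt(v(gamma) / n) * NormalDist().inv_cdf(eps)
            + math.log2(n) / (2 * n))

gamma, eps = 0.1, 0.05
target = 0.9 * (1 - h(gamma))  # 90% of the quantum capacity
n_star = next(n for n in range(2, 10**5) if rate(n, eps, gamma) >= target)
print(n_star)  # smallest blocklength reaching the target, per this estimate
```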

Finally, we remark that the third order approximation is quite precise even for small $n$. To see this we compare it to exact upper and lower bounds on $R(n; \varepsilon)$ in Figure (c) and find that the remainder term $O(\frac{1}{n})$ becomes negligible already for fairly small $n \approx 100$ for the present values of $\gamma$ and $\varepsilon$.

In summary, our results show that the quantum channel capacity is insufficient to characterize achievable communication rates in the finite resource setting. We provide a remedy, showing that the capacity and the quantum channel dispersion together provide a very good characterization, in particular for the practically relevant qubit dephasing, depolarizing, and erasure channels. This is crucial for practical considerations where one would like to rely on a simple and easy-to-evaluate formula to estimate the achievable rate region. For instance, one can use the estimated non-asymptotic achievable rate region as a benchmark to see how close to optimal a concrete code performs.

We obtained very precise results for the qubit dephasing and erasure channels, namely a second order (or even third order) expansion of the rate. A natural question our work leaves open is whether such an approximation can always be found for channels that have additive coherent information. But first we would need to show that such channels satisfy the strong converse…