QCrypt in Toulouse: Key Distribution

title: "QCrypt in Toulouse: Key Distribution"
published: 2024-02-03
template: ../../templates/post.html
prev-article: ../../posts/accidentally-quadratic

These are notes for the third session of the Quantum Cryptography Workgroup in Toulouse. More info can be found here.

In the previous two sessions we have introduced the fundamental ideas underlying (quantum) cryptography and developed helpful mathematical tools. We now want to distribute actual keys. First classically, then quantumly.

Definition. A key distribution protocol is a protocol that lets Alice and Bob agree on an -bit key . It is

-correct if the protocol succeeds with probability at least , that is, if Alice and Bob have derived keys according to the protocol then .

-secure if an eavesdropper Eve is almost ignorant about the key, that is .

Such a protocol may work only under specific assumptions on the communication channels that Bob and Alice have access to. Generally, we will distinguish between the following types of communication channels.

Classical channel
Classical authenticated channel
Classical secret channel
Classical secret and authenticated channel
Quantum channel

A classical toy protocol

As a warmup, let us consider a purely classical protocol in a toy model. We assume that Alice and Bob have access to a classical authenticated channel (CAC) and want to establish a key over a classical channel to which Eve has limited access in the following sense. If Alice sends a bit over the channel,

Bob correctly receives
Eve receives with probability and with probability .

The induced channel Alice Eve is also called the binary symmetric channel in the information theory literature.

Recall the definition of an -strong seeded extractor.

Definition. A -strong seeded extractor is a function such that for any qc state with ,

Now consider the following protocol.

Protocol 1 -- Key distribution over a special channel.

Alice chooses uniformly at random and sends the bits over the channel.

Alice picks random seed and applies an -strong seeded extractor to , computing .

Alice sends over the CAC.

Bob computes .

Clearly the protocol is -correct, as there are no errors in the communication between Bob and Alice, and they compute the same function to obtain the key .

It is also not hard to see that the protocol is -secure. Recall that we introduced randomness extractors as a way to "purify" some shared -bits of entropy between Alice and Bob. Sending bits over a channel to which Eve has only limited access is simply a way of establishing these initial bits of entropy. The fact that Eve has some knowledge about the seed is unimportant since we use a strong seeded extractor.

We can determine what the constants and have to be: From Eve's perspective the cq state of the key shared by Alice and Bob is

Assuming , this state has min entropy

so we must choose . Selecting among a 2-universal family of hash functions, we can choose as large as , as seen in the last session.

Remark. Of course the same protocol works for other restrictions on the channel, for example if Eve receives all sent bits correctly but only has limited memory.

The CAC assumption

Before we introduce a key distribution algorithm that works without additional restrictions on potential eavesdroppers, we need to address the assumption that Alice and Bob have access to a classical authenticated channel.

There are generally two approaches to implementing an authenticated channel: Message Authentication Codes (MACs) and Digital Signatures. However, both approaches have drawbacks in a quantum key distribution context. MACs use private key cryptography --- so Alice and Bob already need to have a shared private key in advance! Digital Signatures, on the other hand, do not require an established private key, but make use of computational assumptions that generally no longer hold in the quantum era.

Does this mean that the CAC assumption is too strong? The answer is nuanced. Let us take a look at both approaches individually.

MACs

The idea behind message authentication codes is that based on a shared private key, Alice generates a tag for her message. With the same key, Bob can verify the validity of the tag for a given message. Formally, a message authentication code is a pair of functions

, which takes a message and a key, and returns a tag.
, which takes a message, a key, and a claimed tag for the message. It then returns whether the tag is valid for the message.

Both functions are required to run in polynomial time.

A MAC is correct if

It is secure if, informally speaking, it is impossible for an adversary to generate a new pair of a message and a corresponding valid tag, even if he has access to arbitrarily many valid (message, tag) pairs (except with negligible probability).

Example. An example of a one-time MAC^[1] can be constructed using a 2-universal family of hash functions. Given such a family ,

The key is an element .

.

.

Obviously this scheme is correct. Moreover it is secure by the 2-universality of : Suppose an adversary is given a single valid pair . Since is in particular -universal, the value of does not tell them anything about . Hence the only way to generate a new valid pair is to guess , which is successful with probability

For maximal security, one would choose to be as large as possible, i.e. . Note, however, that any -universal family of hash functions needs to have at least elements, so the key needs to have length , that is, be just as long as the message.

Under the assumption that one-way functions exist^[2], one can find secure MACs that use shorter keys or allow reusing a key arbitrarily many times.

In the context of key distribution algorithms, however, the conceptual problem that Alice and Bob need to already share a private key remains.

Digital Signatures

As opposed to MACs, digital signatures make use of public-key cryptography. The idea is that Alice generates a pair of a public and a private key. Alice keeps her private key secret and shares her public key. A digital signature scheme is then a pair of functions

, which takes a message and a private key, and outputs a signature.
, which takes a message, a public key, and a claimed signature and outputs whether the signature is valid.

Both functions are required to run in polynomial time, may be probabilistic.

The scheme is correct if

Security is defined similarly as for MACs.

Example. A common digital signature scheme^[3] is based on RSA. Here we choose two large primes and let . We then choose a positive integer and compute a such that . The public key is , the private key is .
We then let

where is an embedding of the message space into .

Of course, the above algorithm is no longer secure in a post-quantum setting. An obvious idea is to base the signature scheme on a post-quantum cryptographic algorithm like LWE instead. However, these types of algorithms are exactly what quantum key distribution is supposed to replace!

A slightly more convincing argument for digital signatures is that keys generated by a quantum key distribution protocol enjoy everlasting security, even if the authentication scheme underlying the CAC is broken afterwards. It suffices for the CAC to remain authenticated for the duration of the protocol, which is at most on the order of seconds. The signature scheme used to implement it may hence not necessarily need to be provably secure as long as it is "strong enough".

The BB-84 quantum key distribution protocol

As an example of an actual quantum key distribution protocol, we look at the well-known BB84 protocol, proposed by Bennett and Brassard in 1984.

Protocol 2 -- BB-84.

Fix a large integer and let .

Alice chooses strings and uniformly at random. She sends Bob each bit by encoding as a quantum state according to as .

Bob chooses a string uniformly at random, and measures the th received qubit in the basis corresponding to . This yields a string .

Bob tells Alice over the CAC that he has received and measured all the qubits.

Alice and Bob exchange their basis strings over the CAC.

Alice and Bob discard all bits of and respectively where , yielding strings and

Alice and Bob check for eavesdroppers:
a. Alice picks a random subset and tells Bob over the CAC.
b. Alice and Bob exchange and over the CAC. They compute the number of errors .
c. If , Alice and Bob abort the protocol. Otherwise they proceed with the strings and .

Alice and Bob perform privacy amplification: Alice picks a random seed and computes . She then sends to Bob, who computes .

Consider the following example run of the protocol.

+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
| | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 |
+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
| | | | | | | | | |
+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
| Sent | | | | | | | | |
+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
| | | | | | | | | |
+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
| Measured | | | | | | | | |
+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+
| | 0 | | 1 | | | 0 | | 1 |
+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+------------------+

It is not hard to see that the protocol is correct. Clearly, when and agree then so will and . It follows that and hence .

Security of BB-84

The security of BB-84 is intuitively clear: If Eve tries to measure Alice's qubit in a basis that does not agree with the one that Alice used for the encoding, she will introduce an error in Bob's measurement with probability . The requirement that the error rate be on a random subset of qubits of size then lower-bounds the min-entropy . A round of privacy amplification then extracts a key that Eve is ignorant about.

In this sense, the general idea mirrors our toy protocol.

Formalising this idea is more involved, in particular because Eve might apply more involved attacks like entangling her system with the sent qubits. In fact it is so difficult that it took more than 20 years to give a proof that does not only work asymptotically. We will show security asymptotically in a restricted case.

To simplify the analysis, we first adapt the protocol somewhat. We assume that Alice and Bob share an EPR pair, and Alice measures her part of the state in some choice of basis to obtain the random string . We also allow a non-zero error rate in step 7.

Protocol 3 -- Purified BB-84

Fix a large integer and let

Alice prepares EPR pairs and sends the second qubit to Bob respectively.

Alice chooses a string uniformly at random. She then measures the first qubit of the th EPR pair in the or basis depending on to obtain a random string .

Bob chooses a string uniformly at random, and measures the second qubit of the th EPR pair in the basis corresponding to . This yields a string .

Bob tells Alice over the CAC that he has measured all the qubits.

Alice and Bob exchange their basis strings over the CAC.

Alice and Bob discard all bits of and respectively where , yielding strings and

Alice and Bob check for eavesdroppers:
a. Alice picks a random subset and tells Bob over the CAC.
b. Alice and Bob exchange and over the CAC. They compute the error rate , where .
c. If is too large, Alice and Bob abort the protocol.

Alice and Bob perform privacy amplification on the remaining bits.

Note that the use of entangled states is equivalent to the original approach. Moreover, the weakened requirements on the error rate make Protocol 3 generally less secure than Protocol 2. If we can prove security for this modified protocol, then the original protocol must also be secure.

Next, we significantly increase the power of an eavesdropper. We assume that it is Eve who initially prepares a state , whose and systems consist of qubits respectively, that she distributes to Alice and Bob. Alice and Bob then proceed as usual with the measurements on their respective qubits.

If we just recover Protocol 3.

Any possible attack (excluding side-channel attacks) can be modeled as Eve simply distributing a specific . On the other hand, not every state can be achieved by only interacting with the qubits sent over the channel. Eve's power has strictly increased.

We will show that despite this additional power given to Eve, the protocol is still secure, assuming is not entangled between rounds. When there are no restrictions on the protocol is still secure, but the analysis becomes too complicated to cover here.

To show that Protocol 3 is secure, it suffices to show that it is able to detect whether is close enough to . The obvious way Alice and Bob could verify this is to add an initial step to the protocol as follows:

After receiving from Eve, Alice and Bob jointly measure their qubit pairs using the POVM . If for some more than of the pairs are found not to be in state , Alice and Bob abort the protocol.

Such a step would also make step 7 of Protocol 3 redundant. Of course this hypothetical step 0 is impossible to implement, because Alice and Bob can only perform local measurements on their respective systems. We argue that step 7 is essentially equivalent to step 0.

Lemma 1. Let be a tripartite state, where and are systems of a single qubit. Then the probability that respective measurements of systems and in the computational basis result in matching outcomes is exactly , where
Similarly, the probability that the outcomes match when measuring in the Hadamard basis is , where

*Proof.*
The probability of matching measurement outcomes in the computational basis is given by $$ P_{\text{match}} = \braket{00 | \rho_{AB} | 00} + \braket{11 | \rho_{AB} | 11} = \operatorname{Tr} ((\ket{00}\bra{00} + \ket{11}\bra{11})\rho_{AB}) $$ and we have $$ \ket{\Psi^+}\bra{\Psi^+} + \ket{\Psi^-}\bra{\Psi^-} $$ $$ = \frac{1}{2}(\ket{00}\bra{00} + \ket{00}\bra{11} + \ket{11}\bra{00} + \ket{11}\bra{11} + \ket{00}\bra{00} - \ket{00}\bra{11} - \ket{11}\bra{00} + \ket{11}\bra{11}) $$ $$ = \ket{00}\bra{00} + \ket{11}\bra{11}. $$ Completely analogous, in the Hadamard basis we have $$ P_{\text{match}} = \operatorname{Tr}((\ket{++}\bra{++} + \ket{--}\bra{--})\rho_{AB}) $$ and $$ \ket{\Psi^+}\bra{\Psi^+} + \ket{\Phi^+}\bra{\Phi^+} = \ket{++}\bra{++} + \ket{--}\bra{--}. $$

Now consider an arbitrary that we assume to be of the form . Expressing the -marginal of a single round in the Bell basis yields

The probability that the POVM in step 0 measures is , while by Lemma 1 the probability of succeeding in step 7 of Protocol 3 is given by

In other words , so if the probability of success in step 7 is close to one, say , then the overlap with is correspondingly large, . Of course the converse holds trivially.

However, this does not yet prove that step 0 and step 7 are essentially equivalent. This is because step 7 only tests a subset of the qubits. Intuitively however, since is chosen randomly, this testing step should upper bound the number of errors on the remaining qubits as well. We can make this precise through a Hoeffding-type inequality.

Lemma 2 (Tomamichel & Leverrier). Let and consider binary random variables . Let be a uniformly random subset of of size . Then for any , it holds that

We will let denote the indicator random variable such that if in step 7 of Protocol 3. We then let , and where is the largest tolerated error rate in step 7c. In the asymptotic limit, we may assume that the chosen in step 7a has size exactly , hence Lemma 2 applies. We thus get the bound

which vanishes as .

It follows that step 7 implies that the error rate on the whole qubit string is sufficiently low, which by our previous observations implies that the overlap of with is sufficiently high. This completes the asymptotic security proof of Protocol 3 in the restricted threat model. Since Protocol 3 is less secure than Protocol 2, we can conclude the security of BB-84.

Information Reconciliation

Until now, we assumed that all communication is error-free. Of course this is not the case, in particular not in the quantum case. Do our protocols break in the presence of errors?

The answer is no, because we can perform error correction. In fact, since we distribute classical bitstrings, we may view the quantum protocol as a lossy classical channel and simply do classical error correction afterwards. There are many performant classical error correction codes.

The general idea is that Alice sends a message to Bob through a lossy channel, who receives it as , where is a string of errors. They then exchange some additional information over the channel that allows Bob to make an estimate of based on .

Definition. An information reconciliation protocol for , lets Bob recover an estimate of given .

It is -correct if .

It leaks bits if the total length of the messages exchanged over the channel is .

In the context of key distribution it is important to limit the number of bits a protocol leaks. This is because any leaked bits reduce the min-entropy of the key. Concretely, we have

so if we leak bits, we can still be sure to have lost at most bits of min-entropy.

An important class of information reconciliation protocols are built on linear codes.

Definition. An -ary linear code is a dimensional subspace of .

A linear code is also induced by a parity-check matrix by letting . The syndrome of a message is then given by .

Protocol 4 -- Information Reconciliation through Linear Codes.

Let be a parity-check matrix. Assume Alice has sent a string that Bob received as .

Alice sends over the public channel.

Bob computes and the syndrome of .

Based on the syndrome and the parity-check matrix , Bob determines an estimate of .

Bob computes .

If induces an binary linear code, then Protocol 4 leaks bits, where can be chosen as low as . Of course, the larger our code and hence the smaller the syndrome, the harder the error-estimation becomes.

The correctness completely depends on the chosen code.

Example.

Consider the binary linear code given by the parity-check matrix

where Bob's error-estimation is based on the following table.

+----------+--------------+
| Syndrome |Error Estimate|
+----------+--------------+
|00 |000 |
+----------+--------------+
|01 |001 |
+----------+--------------+
|10 |100 |
+----------+--------------+
|11 |010 |
+----------+--------------+

Then Bob can correct any error of at most one bit-flip. Consequently, the code leaks bits and is -correct for
where is the probability of a single bit-flip occurring (assuming uncorrelated errors).

There is a theoretical minimum to the number of bits leaked for optimal information reconciliation. Suppose and are sampled bitwise from some joint probability distribution . Then the theoretical minimum number of bits that have to be leaked is

The efficiency of information reconciliation protocols is usually given in terms of a constant which says that the protocol leaks approximately

bits.

Examples of codes that are widely in use are LDPC codes, for which is between and .

A one-time MAC is only secure if the key does not get reused between rounds. ↩︎
Which is generally believed, even when taking quantum algorithms into account. ↩︎
The actual signature scheme involves an additional hashing of the message. As described here, it is vulnerable to Existential Forgery. ↩︎