They should have sent a complexity theorist

My O(1) readers are probably restlessly wondering where I’ve been, how I survived Israel’s freakishly sweaty summer, and what’s up in general.
Well, the truth is, I did manage to sweat under the Mediterranean sun, but most of the summer I spent in the United States. The official reason, given to the consulate and on my visa documents, was to do a project on “jump-starting a recent protocol on infinite randomness, using quantum-mechanical experiments”. The visa was approved, but I doubt that the people who stamped my passport understood what it was all about; hell, even I didn’t really know what I was going to do. I therefore dedicate this post to the men and women of the not-so-lubricated bureaucratic machinery that made my trip possible, in hopes that it will teach them all they ever wanted to know about unbounded randomness expansion using untrusted quantum devices, but were too afraid to ask. (Further dedication goes to Scott, who kindly took me under his wing and oversaw the project).
(The following post isn’t really technical, but feel free to skip over parts you don’t understand; there’s zounds more text that doesn’t require any fancy quantum mechanical stuff).

Randomness is important in life. For example, the optimal strategy in rock-paper-scissors is to pick each option with probability 1/3rd. This might seem easy, but it isn’t: humans are quite bad at imitating random sequences or won’t do so even if they know it’s best for them (best for them in theory; but then, what else is there?). It’d be much better if we had an infinite sequence of random bits that we could use whenever we wanted to. How do we go about getting such a sequence?
Ask any physicist, and she’ll tell you, “why it’s easy! Use quantum mechanics!” And indeed, quantum mechanics seems to be the place in nature where randomness comes not from a lack of prior information for us humans (i.e, a coin flip is random because we don’t know its precise position or the precise flipping force or the precise air pressure and currents), but is inherent in the physical reality itself. For most part, most reasonable “hidden variable” theories – theories in which the randomness observed in experiments stems from quantities that *are* deterministic, but we just don’t know them – have been ruled out.
So, the easiest way to get random bits using quantum mechanics is to take a quantum state that is in superposition – say a photon with both horizontal polarization (represented as the qubit $|0 \rangle$) and vertical polarization (represented as the qubit $|1 \rangle$) – and just measure which one it is. Thus, the overall state of the photon is the superposition $\frac{1}{\sqrt{2}}(|0 \rangle +|1 \rangle)$, and measuring its polarization yields $|0 \rangle$ with probability 0.5, and $|1 \rangle$ with probability 0.5.
So far so good. In an ideal world, we would be done here. We’d build a nice small black box with a large red button on top. Every time we press it, the box would create a superposed photon, measure it, and output the result. Infinite bits at the press of a button.
But alas, we do not live in an ideal world, and most of us, while avid rock-paper-scissors players, do not have the necessary equipment or training to build quantum mechanical boxes. Of course, in this capitalistic global entrepreneurship enterprise world we live in, this isn’t much of a problem – we can always count on the market adjusting to the needs of the people, and companies selling quantum mechanical random number generators will sprout up like mushrooms after the rain. Hey, they already have.
The problem with these companies is that you can never quite be sure that they are honest. How do you know that they aren’t selling you only a pseudorandom number generator, which uses a deterministic algorithm and a small random seed? There are statistical tests you can run on the output, but we don’t know yet if it’s possible to discern between a pseudorandom output or a truly random output in reasonable time. If they are cheating you in this way, then your entire “random” sequence is vulnerable.
Further, even if the company you bought your box from did give you truly random bits, how do you know that they were created on the spot? Perhaps the company generated a gigantic random string back in their HQ, and just put it in your box. Every time you press the big red button, you get a new bit out of that string. The output is indeed random, but it wouldn’t be secure – the company could sell information about your random bits to the highest bidder, and you would face a gratuitous defeat in the rock-paper-scissors nationals.
These two problems apply to any true random number generators, but if you are using quantum ones there is yet another issue: even if the company did generate the bits on the fly, they could still get information on your bits by quantum entanglement. In a jiffy, instead of creating a single photon in the state $\frac{1}{\sqrt{2}}(|0 \rangle +|1 \rangle)$, they’d create two photons in the entangled state $\frac{1}{\sqrt{2}}(|00 \rangle +|11 \rangle)$: a superposition of “both photons have horizontal polarization” and “both photons have vertical polarization”. One photon they’d put in your box, the other they’d keep for themselves back in their HQ. The rules of quantum mechanics then say that when you press the big red button and the box measures the state – say you got a $|1 \rangle$ – then when the company measures their photon, they also get a $|1 \rangle$. They always get what you got, and again, your information is not secure. This does not involve any communication between the box and the company – it would work even if you put it in an underground vault in Andromeda – its just a peculiar property of quantum states.
So right now things are looking pretty grim: there’s this wonderful startup idea – to produce random bits using quantum mechanical phenomena – but buyers don’t have a guarantee that the companies aren’t cheating them. And we all know where that leads to.
But not all is lost! For if we are allowed to tweak the way our quantum mechanical boxes operate, we can build a statistical test that has nothing to do with ordinary randomness / pseudorandomness tests, and that test guarantees honesty. Boxes which pass the test must produce random bits; they produce these bits on the fly; and they can only have a tiny amount of entanglement with the company HQ, giving the company almost no information about your random sequence. A magical cure for all our maladies!
To see how it works, we’ll look at the famous CHSH game. In this game, Alice and Bob are two players who play cooperatively in the following very realistic scenario: they are both put into separate rooms, and each one is presented with a bit: Alice is shown X, and Bob is shown Y. Based on that bit, they have to output a bit themselves: Alice outputs A, and Bob outputs B. They win the game if

$A \oplus B = X \wedge Y.$

They are allowed to decide on a strategy beforehand, but once they are put into the rooms, they cannot communicate.
Suppose that X and Y are chosen uniformly at random, that is, they each have a 0.5 probability of being 0, and 0.5 probability of being 1. What is the optimal strategy for Alice and Bob – the one which gives them the highest chances of winning?
Here is one strategy: Alice and Bob ignore the inputs X and Y completely, and always both output 0. So $A \oplus B = 0$, and this is equal to $X \wedge Y$ for 75% of the cases: whenever either X or Y is 0. The success rate for this strategy is 0.75.
It can be shown that there is no better deterministic strategy (you can do the truth table, if you want). But then there is also no better probabilistic strategy, since it would just be a convex combination of deterministic ones. So the best Alice and Bob can do, if they are not allowed to communicate, is to win 0.75 of the time.
Well, classically, that is true, but it is not true if they are allowed to share quantum resources. Specifically, if they each have one photon of the entangled state $\frac{1}{\sqrt{2}}(|00 \rangle +|11 \rangle)$, then once they are shown the bits, they can measure their photons from a set of agreed upon non-trivial measurement, and output whatever their measurements give. Their outputs will be completely random (individually), but correlated with each other. If they choose the right measurements, they can boost up their win rate to 0.84 (!). This is the best, known strategy that does not involve communication between the players. (For a billion details, see here)
But wait, there’s more! The CHSH game is robust, in the sense that if Alice and Bob have a success rate very close to 0.84, then with high probability they are using a strategy that is not very different than the known optimal one. This means that the bits they output are very close to random (what does it mean “very different strategy”? There is a formal definition which we won’t go into here, but as an example, the strategy “0.01 of the time output 0 and 0.99 of the time use the optimal strategy” is very close to optimal; so is “measure in a slightly different way than the optimal, so the correlations are changed just a bit”).
We now have a test for our random number generator box. Instead of having one big red button which measures a $\frac{1}{\sqrt{2}}(|0 \rangle +|1 \rangle)$ photon, we’ll ask the manufacturer to give us two boxes. These boxes will act as Alice and Bob: they will accept as input two random bits, and output bits of their own. We can play the CHSH game many many times, and measure their success rate: if its very close to 0.84, then they cannot have coordinated their outputs in advance; they cannot have used a deterministic algorithm; and the company cannot have a lot of information about the output (ok, we haven’t shown this here, but in a jiffy: in order to have information about it, the company needs to be entangled with the two boxes; but almost all the entanglement is “used up” in order to boost the win rate from 0.75 to 0.84, so “nothing is left” for the company to know).
This is what is commonly called (well, common among the relevant community) as “certified randomness”- the CHSH game can be used to test the randomness of our devices (in fact, there is a broad class of “XOR games” that can be used – games which are similar to the CHSH, but may involve different requirements or more players).
We would really like to say that we are done here, but the keen eyed among you must have already noticed the bug in the plan. We have a pair of boxes that, when given two random bits, output two correlated bits. We need random numbers just to test if a random number generator works. What’s worse, we put in two, and get back less than two. We are actually consuming bits in the process! Alas, the market for quantum random number generators is much more blooming than the one for quantum random number extinguishers.
But not all is lost! If we are allowed to tweak the inputs to the pair of boxes, we can create a test that uses less random bits than it puts out. The main concept is as follows: we still let the boxes play a lot of CHSH games, only now, instead of having totally random X and Y (requiring 2 random bits per game), we alter the input a bit: Most of the time, we’ll have X = 0 and Y = 0. This is like a “dud” game, and if the boxes anticipate this, they can potentially output 0,0, as described before. However, for a very small percentage of randomly selected inputs, X and Y are selected at random, as usual; these are called “real” games. On these games, if the boxes are to perform with high win rate, they have to be honest – they have to actually play the winning CHSH strategy. The point is that the real games are chosen at random, and the boxes have no idea which ones they are. If they play assuming the X = 0, Y =0 dud games, they run the risk of falling on real games and winning with only 0.75 probability. The trick of these tests, then, is to find out how to distribute the real games among the duds, how strict to be when deciding if the boxes pass the tests, etc. This type of test is called a randomness expansion protocol, in that it takes requires a certain number of bits (for choosing which games are duds and which are real, and also for the inputs of the real game), but outputs more than was used. Both polynomial and exponential expansions have been developed, and more recently, even infinite expansion! The latter is generally done by back-feeding the output as input for the boxes, but the details are a bit complicated, especially the whole “proving that it works against entangled opponents” thing. It means that you can start with a finite string of randomness (say, one you obtained from a trusted source), and expand it into one as long as you wish! There will be errors, but they grow exponentially smaller the more initial bits you use.
Personally, I think this whole thing is really cool. If you trust your quantum box, then generating an infinite random string is as easy as $|1 \rangle$ $|2 \rangle$ $|3 \rangle$. But even if you don’t, you can still obtain an infinite random string. It requires a bit more pyrotechnics, and it requires you to somehow obtain a small number of random bits elsewhere, but it’s possible. And actually, despite the fact that we called our boxes quantum, they don’t necessarily have to be. All they have to do is win the CHSH game with probability close to 0.84. Quantum mechanics isn’t the final say in physics; maybe we’ll find better theories which supersede it. Any mechanism which wins better than the classical 0.75 can be used in this type of protocol.
And that’s pretty much the gist of “a recent protocol on infinite randomness, using quantum-mechanical experiments”: a method to use untrusted quantum boxes in order to take a small number of random bits, and turn it into an unbounded number of (nearly) random bits. That’s it.
Where does your humble servant come into play in this whole ordeal? A very tiny part, and that’s the “jump starting” in the clause “jump starting a recent protocol on infinite randomness”. An initial random string is needed in order to run the protocol, and the question is: how large is that string, as a function of the errors you are willing to tolerate? (there are plenty of places where errors accumulate, but I skipped the discussion of errors and robustness because it really only complicates matters, and it’s complicated enough as it is).
So that’s what I set to find out. I basically gave bounds for various expressions described in the protocols which relate to the number of random bits outputted. The answer? Well it depends on the error, of course. But, let’s say, the bound I got is on the order of O(1,000,000) for reasonable error. For the rest, you’ll have to read my paper, I guess.