[MUSIC PLAYING]

PING YEH: Hello. I'm Ping Yeh of Google's Quantum AI team, and I'm going to talk about the statistical significance of the quantum supremacy experiment with our Sycamore processor.

First, a quick reminder on statistical significance. You start with a null hypothesis, H0, which says there is nothing interesting. You have a statistic F and a probability distribution function of F given H0. Then you measure F in your data; say you come up with a value F hat. The tail probability beyond F hat gives you the p-value, and if the p-value is smaller than a predefined significance level alpha, we say the result is statistically significant and we reject H0. For major scientific claims, we usually set alpha to the so-called 5-sigma level of a Gaussian, which corresponds to this value.

So which null hypothesis are we rejecting for the supremacy claim? Before getting to that, I have a null hypothesis for my own talk; hopefully you can help me reject it at the end.

For quantum supremacy, the statistic F is the fidelity of the Sycamore processor running a circuit. The first null hypothesis is that F is consistent with zero, meaning the processor has lost quantum coherence. The second is that F is not zero but is low enough that classical simulation is easy, meaning no supremacy. We want to reject both of these, the no-quantum and the no-supremacy hypotheses, following the p-value procedure above. Notice that rejecting the second hypothesis automatically rejects the first, so we set H0 to be the second one, with a threshold fidelity of 0.1%. That threshold comes from a complexity analysis of classical simulations: it is the fidelity at which the simulation is already hard enough. We want to reject it, that is, to show we are significantly above it.

The TL;DR is that with 53 qubits, 20 cycles of circuit, and 3 million samples per circuit over 10 different random circuits, we come up with an F hat of this value, which corresponds to a p-value of about 6.4 sigma in Gaussian terms. That is above 5 sigma, so that's good. There is also a systematic uncertainty on the value of F hat, which we estimated to be 4 times 10 to the minus 5. The p-value computed with that distribution of F hat is estimated to be 2 times 10 to the minus 10, which corresponds to about 6.2 sigma. So again, both null hypotheses are rejected.

If you are interested in how we came up with those numbers and this distribution function, let's continue. There are a few ingredients in getting this p-value: first, the distribution function under H0; second, the estimation of F hat; and third, the distribution around F hat. Let's get those. The dataset I used for the analysis can be downloaded at bit.ly/quantum supremacy dataset.

Google's quantum supremacy experiment is based on random circuit sampling. This is an illustration of a random circuit. At the end of the circuit we have a wave function, psi, which is a linear combination of the 2^n computational basis states. You sample bitstrings from this final state many times, and the probability of sampling a particular bitstring x is just the amplitude squared, |<x|psi>|^2.
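As an illustration of that sampling step (not the experiment's actual analysis code), here is a minimal NumPy sketch; the qubit count and sample sizes are placeholders chosen so the state vector fits in memory.

```python
import numpy as np

rng = np.random.default_rng(0)
n_qubits = 12                       # illustrative; the real experiment used 53
D = 2 ** n_qubits                   # Hilbert-space dimension

# A random complex vector stands in for the output state of a random circuit.
amps = rng.normal(size=D) + 1j * rng.normal(size=D)
amps /= np.linalg.norm(amps)
probs = np.abs(amps) ** 2           # p(x) = |<x|psi>|^2
probs /= probs.sum()                # guard against floating-point drift

# Perfect sampler: bitstring x is drawn with probability p(x).
ideal = rng.choice(D, size=100_000, p=probs)
# Decoherent sampler: bitstrings drawn uniformly at random.
uniform = rng.integers(0, D, size=100_000)

# "Scaled probability" of each sampled bitstring: D * p(x).
print((D * probs[ideal]).mean())    # ~2 for the ideal sampler
print((D * probs[uniform]).mean())  # ~1 for the uniform sampler
```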
That much is standard quantum mechanics. For a random circuit, those probabilities follow a distribution, the so-called Porter-Thomas distribution. Here I'm using a variable called the scaled probability: the dimension of the Hilbert space times the probability itself. In that variable, the distribution becomes a simple exponential, independent of the number of qubits, which makes it easier to analyze.

Now we do sampling. Typically we sample a few million bitstrings for each random circuit, and for 53 qubits that's much, much smaller than the Hilbert-space dimension, so it's a tiny sampling. There are two sampling strategies we are interested in. The first is uniform random sampling: each qubit gives you a 0 or 1 with a 50/50 chance. Then the scaled probabilities of the bitstrings you sample are distributed according to the population distribution, which is the Porter-Thomas exponential itself. This is what a decoherent quantum computer would give you. With a perfect quantum computer, the bitstrings with higher probability are sampled more often, so the distribution becomes x times the exponential. I call these two distributions P1 and P2, and they look like this. The average value of x is 1 under P1 and 2 under P2, which comes in very handy when we want to estimate fidelity.

Our error model is a linear combination of the perfect density matrix and the fully mixed state, rho = F |psi><psi| + (1 - F) I/D. The corresponding distribution of the scaled probability is the same linear combination of the two distributions, P(x) = F x e^(-x) + (1 - F) e^(-x). If we calculate the mean value of x under this distribution, it turns out to be very simple: F + 1. That means the measured mean of x, minus 1, is a fidelity estimator. This is our linear cross-entropy fidelity formula.

Now we want to see how this x distribution looks in data. We took data from an elided circuit with 53 qubits, 20 cycles, and 3 million measurements. Here "elided circuit" means we remove some of the two-qubit gates from the circuit to make the classical computation feasible. We estimate the fidelity to be 0.18%. Next, we want to see whether the measured distribution looks like the theoretical prediction, so we overlay the two. Just by eyeballing it, they look similar. To quantify how similar they are, we use the Kolmogorov-Smirnov test, which gives a p-value you can interpret as the probability that the data was drawn from the given distribution function. A p-value close to 1, in this instance 0.98, means we have high confidence the data comes from that distribution. If we instead set the theoretical distribution to, for example, zero fidelity, the p-value drops very low. So we have confidence that the model PDF is a very good description of the data.

Next we estimate the statistical uncertainty on the estimated fidelity. Because the estimator is a mean, the standard error on the mean is the standard approach. We estimate it from the data, and we can also estimate it from the theoretical PDF, and the two are in excellent agreement.
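To make the estimator and these checks concrete, here is a minimal Python sketch assuming synthetic data drawn from the model distribution above; the fidelity value and sample count are illustrative placeholders, not the experiment's numbers.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Draw synthetic scaled probabilities from the model
# P(x) = F * x * exp(-x) + (1 - F) * exp(-x).
# F_TRUE and N are illustrative placeholders.
F_TRUE, N = 0.002, 3_000_000
signal = rng.random(N) < F_TRUE
x = np.where(signal,
             rng.gamma(2.0, size=N),        # x * exp(-x) component (P2)
             rng.exponential(size=N))       # exp(-x) component (P1)

# Linear cross-entropy fidelity: mean of x under the model is F + 1.
F_hat = x.mean() - 1.0

# Statistical uncertainty: standard error on the mean.
sigma_stat = x.std(ddof=1) / np.sqrt(N)
print(F_hat, sigma_stat)

# Kolmogorov-Smirnov test against the model CDF,
# CDF(x) = 1 - exp(-x) * (1 + F*x), at F_hat and at zero fidelity.
def model_cdf(t, F):
    return 1.0 - np.exp(-t) * (1.0 + F * t)

print(stats.kstest(x, lambda t: model_cdf(t, F_hat)).pvalue)  # should be high
print(stats.kstest(x, lambda t: model_cdf(t, 0.0)).pvalue)    # noticeably lower
```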
That agreement means the theoretical prediction can be used for our null-hypothesis distribution. Furthermore, we verify the statistical uncertainty by bootstrapping: by the central limit theorem, the distribution of the mean should approach a Gaussian as the number of samples goes to infinity. We performed 10,000 bootstraps, each bootstrap sample containing 3 million samples, and plotted the fidelity from each one. The histogram indeed looks like a very good fit to a Gaussian, and the width, the standard deviation, is very close to the estimated one. So now we know the null-hypothesis PDF: a Gaussian centered at the threshold fidelity of 0.1%, with the theoretically predicted standard deviation.

We have more than one random circuit, we have 10 of them, so we can combine them. We used two different ways of combining, and they give basically identical results. With the combined sample of 30 million samples, we can again test the agreement between theory and data: the Kolmogorov-Smirnov p-value is still reasonable, 66%, but if you ask for the p-value at the threshold fidelity it becomes very low, and at zero fidelity it is even lower. This gives us more confidence that the combination procedure makes sense.

Next we need to go into the supremacy region, where classically computing those probabilities is not possible, but where we nonetheless need to estimate the significance of the full circuit. So we go to lower numbers of qubits, from 12 to 38, and check the ratio between the full-circuit fidelity and the elided-circuit fidelity, with the elided circuits constructed in the same way as for 53 qubits. The ratio of the two fidelities is about 97%. That is the factor we apply to the combined elided-circuit fidelity to get our estimate of the full-circuit fidelity, which is this value.

Then comes the systematic uncertainty. There are many, many sources of uncertainty, and they are captured in one big number: the drift, that is, how the fidelity drifts with time after a calibration. We took data for 17 hours on the same random circuit and found that the fidelity drops, not too much, but visibly. Within this range a linear fit seems to work, so we use one. The supremacy data was taken in the first hour, so we take the variance of the residuals in the first hour, plus the variance of the intercept, as the variance of the fidelity itself, and we treat that as the systematic uncertainty. We take that ratio, multiply it by the estimated fidelity, and that is our final estimate of the systematic uncertainty, which gives us that number.

Coming back to the ingredients for calculating the p-value, we now have all of them, so it's straightforward to plug them in. For the tail probability of F hat, we get a p-value of this number, which corresponds to about 6.4 sigma in Gaussian terms. But that uses only the central value of F hat, and we do have a systematic uncertainty on it. How do we deal with that? One thing we can try is, for example, to subtract 5 sigma from F hat and see what the p-value is there. Of course that p-value is higher, because you're integrating more tail area.
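Here is a hedged sketch of that tail-probability step. Only the 0.1% threshold and the 4 x 10^-5 systematic uncertainty are stated in the talk; the F hat value and H0 width below are assumed, illustrative placeholders (the actual values appear only on the slides).

```python
from scipy.stats import norm

# Placeholder numbers -- only F0 and SIGMA_SYS are stated in the talk.
F0        = 1e-3     # H0 threshold fidelity (0.1%)
SIGMA_H0  = 2e-4     # assumed width of the H0 Gaussian (illustrative)
F_HAT     = 2.3e-3   # assumed full-circuit fidelity estimate (illustrative)
SIGMA_SYS = 4e-5     # systematic uncertainty from the drift analysis

# p-value: upper-tail probability of F_HAT under the H0 Gaussian.
p = norm.sf(F_HAT, loc=F0, scale=SIGMA_H0)
print(p, norm.isf(p))   # p-value and its Gaussian-sigma equivalent

# One option for folding in the systematic: shift F_HAT down by 5 sigma.
p_shifted = norm.sf(F_HAT - 5 * SIGMA_SYS, loc=F0, scale=SIGMA_H0)
print(p_shifted)        # higher, since more tail area is integrated
```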
But there are infinitely many possible choices of the value at which to check the p-value, so what do we do? It turns out we can take the expectation value of the p-value by integrating the p-value against the Gaussian distribution around F hat. After doing that, we get a p-value of 2 times 10 to the minus 10, which corresponds to about 6.2 sigma in Gaussian terms. That's our final p-value for the whole quantum supremacy experiment.

In conclusion, both null hypotheses are rejected with more than 5 sigma of statistical significance, along with several cross-checks that give us confidence in those numbers. The dataset is available here. And coming back to the null hypothesis of my talk: please help me reject it by leaving comments below. Thank you very much.

[MUSIC PLAYING]
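To close with a concrete version of that last step, here is a hedged sketch of the expectation-of-p-value integral. As in the previous sketch, only the 0.1% threshold and the 4 x 10^-5 systematic come from the talk; the other numbers are illustrative placeholders.

```python
from scipy.integrate import quad
from scipy.stats import norm

F0, SIGMA_H0 = 1e-3, 2e-4        # H0 Gaussian (width is an assumed placeholder)
F_HAT, SIGMA_SYS = 2.3e-3, 4e-5  # fidelity estimate (placeholder) and systematic

def p_value(f):
    """Upper-tail probability of an observed fidelity f under H0."""
    return norm.sf(f, loc=F0, scale=SIGMA_H0)

# Expectation of the p-value over the systematic Gaussian around F_HAT.
expected_p, _ = quad(
    lambda f: p_value(f) * norm.pdf(f, loc=F_HAT, scale=SIGMA_SYS),
    F_HAT - 8 * SIGMA_SYS, F_HAT + 8 * SIGMA_SYS,
)
print(expected_p, norm.isf(expected_p))  # final p-value and sigma equivalent
```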