字幕表 動画を再生する 英語字幕をプリント We're going to talk about going from Galois fields to Reed-Solomon codes. We must be mad, Sean. Really, I mean, so many of you have said: "Just do Reed-Solomon, y' know. You've done Hamming codes. They [Reed-Solomon] can't be that much more complex?" Oh yes they can! The stuff we have done earlier - and we've done a fair bit on Hamming codes, which if you remember are basically going to correct a ... put right a single error. That all happened in the 1960s. And in two of the videos out there, which we'll put the links out for you, I did a thing called Multi-Dimensional Error Correction where I had two bits of information (which was a San Francisco state of weather). So there was two bits for the state of the weather, like I think it was '11' for "sunny", or something like that. But in order to protect those bits against a 1-bit damage, I had to add no fewer than 3 extra bits. I had to make it into a 5-bit code. But it was OK, because checking up whether it had been damaged was easy. It was a simple parity check. It was effectively saying : "Look, here's the 3 parity-check bits at the end". I've made the rule that overall it must be even parity. I know that one of them says '1' Another one says '0' but this [third] bit I don't know. But it's got to be even parity overall so if I've only got a single '1'and a '0', I need another '1' to make that grouping be even parity. Now that's the easy error correction. How did it advance from that? What changed? Well, we're talking about 10 years hence. I mean Richard Hamming was very definitely late '50s / early '60s. The next phase of this or, this "big jump forward" thst we would want to do, was by Reed and Solomon [and] was 10 years later. It was late '60s / early '70s. O!h and big surprise: Were Reed and Solomon at Bell Labs ?! No! For a change. Everybody else seemed to be at Bell Labs [but] Reed and Solomon were, I think, at MIT's Lincoln Laboratory. But there was a realization that if you wanted more powerful error correction there were several things you could do but, the more they looked into it, the more they found themselves having to learn about Pure Mathematics concepts called Galois Fields. And this is finite field arithmetic where you've got to be able to find a multiplicative and an additive inverse. We've tried to prepare you for this by doing ISBN book codes, which is a very simple manifestation of those two things. We thought in our early things [i.e. videos] where a code word became a string of bits. And some of those are information bits. And interleaved with them, or parked at the right-hand end very often, were parity check bits. What we're now going to try and do is instead of just having one great long string of bits, linearly, we're going to try and make it be almost two-dimensional. Think of every one of these positions in your code-word now as not being a single bit but a column of bits. Let's say it's an 8 bit column. If you're doing something like Reed-Solomon correction in a CD context, y'know, how big is the bucket size of these columns? I think it's something like 40-odd bits but then even that couldn't cope. They had, like, two layers of Reed-Solomon encoding - one backing up the other. So, if you like, the filling up of these things, instead of linearly going like that - and then at the very far end you put a few parity check bits - what we now do is we declare that every one of these positions isn't a *bit* position it's a *symbol* position but a symbol can be multiple bits, OK? For the sake of argument let's say it's an 8-bit symbol, a byte. The way they get filled up is the bit stream comes in and it fills up a column of 8, and then it fills up a next column 8 and a next column of 8. So it's almost like we've got a 2-dimensional array of bits - and symbols in that direction- but every symbol is composed of 'n' bits, as it goes along. What's the advantage of doing that? Well, you can see one advantage, when you think about it, straight away. Hamming codes, for example, the old way, tended to presume you've got the occasional error, now and then, wide apart. What this is anticipating - if you can fill up symbol positions - is you might get 'burst errors'. Yeah, you might get - here we go - bits coming off a CD, trying to play your music. It's encoded music. Now, if they're filling up a bucket in a column, in some sense, and then moving on to the next one there is just a chance that a localized scratch will get all its 'bit-clobbering' over and done with, within two symbols shall we say. So, we know that we can devise codes that will can detect and potentially correct a certain number of errors but if we can make it so that those errors are not [only] bit errors within the symbol but just the symbol itself ... something's wrong with it! Right? You then might stand a chance if you've got, again, parity-check *symbols* at the far right-hand end - not [just] parity-check bits, if they've got enough information in them, you might be able to say: "Something went wrong. I got a burst error there, as a scratch on that CD. Can I put it right?" Yes, but it's not going to be simple-minded parity checking; it's gonna be serious hardcore stuff. Because those check symbols at the end will normally be arranged so that if the information is correct and nothing's got damaged there'll be zeros. If something gets damaged, the first thing you know about it is that the parity-check symbols at the end are nonzero. You're getting [perhaps] 3, 5, 12, 15. What does that mean? well the answer is using lots and lots of detective work - by the way those symbols that you put on the far right-hand end - revel in the name of "syndrome", which i think is a wonderful word. And my first thought was what on earth are pure mathematicians, or communications engineers, doing with syndromes? They're medical aren' they?! Well, I looked up in the dictionary and as far as I can make out it all makes sense. If you have a certain syndrome it means you are exhibiting symptoms of an underlying problem So the grouping of symptoms that's caused by an underlying problem is often called a syndrome. [There's] something like the Guillain-Barre syndrome isn't there? I don't know what it is but I'm no doubt you'll all put me right on that. But you can see it makes sense. The signal that something has gone wrong is you get all of these information bytes or even bigger columns, multiple bytes, whatever. But right at the end is now a "checksum from hell". You have got a syndrome a set of remainders if you like, that are not zeros. Given only that information, how can you work backwards and find out which of these columns got hit and where in the column it got hit? And the answer is: By using Galois Field theory over finite fields and doing lots and lots of long divisions and additions. So, the bottom line is that for this work and particularly if you're using, as we are of course now, powers of two, Galois said: "I can liberate you from it having to be primes all the time, I can do powers of primes. And for all of you future computer scientists that I'm not even aware of, you'll love this because your beloved 2 comes into the real world because where you say '4' and I say: Don't think of it as 2 x 2 Think of it as 2 ^ 2." And so he said: "Yeah, I can do powers of any prime, including two, but what he didn't say is: Here are the rules if you you light on 2 and say you want to use my methods, here's rule number 1. Everything must be done modulo 2 Not modulo your bigger number - like 256 - that's different. Modulo 2. So, for addition what do we know about addition modulo 2, from Bletchley Park Sean? What do the mathematical logicians call 'addition modulo 2' ? >> Sean: This is exclusive OR >> DFB: It's Exclusive OR! So, no sweat for computer scientists. All our additions of these numbers, represented in bytes or whatever, are going to be done with Exclusive OR. Worse still - as we found out in the ISBN previous example - you've got to be able to find multiplicative inverses. And that's going to lead us into doing long divisions modulo 2 in a Galois Finite Field. Now that is a bit hair-raising but not too terrifying. But if you're prepared in the end to do all of that work then fine. You can use your 'syndrome' and analyze it to tell you where the errors went. That's a lot of equation solving. For those of you into these things it's like solving lots of simultaneous equations saying [for example]: "The only logical solution to this damage is it absolutely must be column 3 and column 13. That's where the damage has come. Does that syndrome always work back to getting your precisely one answer? Yes - it's just magic the way that this error correction works but it is complex because the brute force way to do it is to get a set of simultaneous equations and - some of you will know this - get lots of simultaneous equations to solve you use 'matrix inversion'. That is computationally a very heavy undertaking. And so you're looking all the time for simplifications because, if you don't, you're sitting there, like Sean and I were in the early '80s, thinking: " ... this error correction [for CDs] this is in real time! How is this thing solving sets of matrix equations in real time to make sure I can listen to this CD without hearing the scratches? Where's our supercomputer free with every CD - a Cray XMP [in those days] - to solve your syndrome equations? Nope! what's the answer? You shake head and think: it's those hardware types! It's custom hardware! Yes, custom hardware can fairly easily attain supercomputer capabilities so long as it's in a tightly defined field. And it turns out, thank heavens, that error correction in Reed-Solomon schemes lends itself very nicely indeed to things that computer engineers love like "shift registers" [and pipelining] and all sorts of hardware 'specials'.