I think, as far as I know, it was Brian Kernighan and Dennis Ritchie who first
introduced it to me. I don't know if it goes back earlier than that, but certainly in
the C book - there it is: 'printf ("hello world\n")', you know, and the use of '\n'
to denote a new line at the end of it, and all that. It's now really become a
part of Comp Sci legend. The first thing you do when you show that you've
mastered [or just begun] a new language be it Python or whatever, you know, "Oh yes! here's
how to do 'Hello World'". Of course, "Hello World" is a character-based challenge.
And from what we now know about characters - in modern computers at least -
being stored in addressable bytes - does it sort of follow then, that "Hello
World" would be somewhat easier [at low level] on a byte-based machine? Oh yes! it would be a lot
easier on a byte-based machine. But there's other things as well. So as, perhaps, an
illustration of just how horrible it could be - and given that we have done
some stuff on EDSAC already - let's go and do that. If you haven't seen the
other EDSAC stuff I think you'll be able to follow what I'm doing anyway.
And you could always go back later and pick up some more background about EDSAC.
But when we were on this EDSAC simulator, the last time, we actually did
run the program that Martin Campbell-Kelly supplies with it. And he got fed up of
doing "Hello World". He said: "I'll just do a brief version that says 'HI' ". We did that.
Thanks to a combined programming effort now, by those in this room, I have here
the new version "HelloWorld_SR_DFB.txt". And there it is. It's quite a lot
longer, of course, than the previous one was.
>> Sean: So, is each of those lines using a word, then?
>> DFB: Yes. EDSAC was designed around the most minimalist set of things.
It was basically ... the story was ... if it's possible to do [it] with what we've got
already, then don't start inventing new flavours of
instructions. So, all you've got here is - this is the stuff of course for setting
up where the load point is, and where the relative offsets of these
addresses are, relative to 64. The '@' symbol at the end [of an instruction] signals to David
Wheeler's Initial Orders that what comes here is a relative address. So what it's
saying is: letter O - not a zero - "output the character which you can find in the
memory location 16 further on than 64". So, all these offsets: 16, 17, 18, 19, 20 are
all relative to 64. So in actual fact, then, it turns out that address 80 holds
the very first thing you want to output. And of course 16 on from 64 .... well if 64
is here this is where the actual data starts. The 'ZF' and the things like that
correspond to what are nowadays called assembler directives. It's not always the
case that these things go one-for-one into occupying a word. Some
of them are messages to the assembler. All the stuff up here is basically
saying: "I want you to remember 64 and start locating everything relative to that".
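The base-plus-offset arithmetic described here can be sketched in a few lines of Python. This is only an illustration: the load point 64, the offsets 16 to 20 and the alternative base 256 are figures quoted in the talk, while the function name `absolute_address` is made up for the sketch and is not anything Initial Orders actually contains.

```python
# Sketch of base-relative addressing as described in the talk.
# `absolute_address` is a hypothetical helper, not part of Initial Orders.

LOAD_POINT = 64  # the load point set up by the directives at the top


def absolute_address(offset: int, base: int = LOAD_POINT) -> int:
    """Resolve an '@'-style relative address against the load point."""
    return base + offset


# "Output the character 16 further on than 64" refers to address 80.
first_data_word = absolute_address(16)

# Moving the whole program to a different base only changes `base`;
# the offsets written in the program stay the same.
relocated = [absolute_address(off, base=256) for off in range(16, 21)]

print(first_data_word)  # 80
print(relocated)        # [272, 273, 274, 275, 276]
```

The point of the design is that the programmer writes offsets once and the loader adds the base, so moving the program means changing one number rather than every address.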
>> Sean: Because if we look specifically at the line numbers on the left there, that
wouldn't be the place you're trying to get to, right?
>> DFB: No, this stuff up here is
what would probably be done in modern assemblers by saying something like "ORG = 64".
[where ORG = "origin"] In other words that isn't a program instruction. It's telling you, the
assembler: "Please start me [loading] at 64". And it's for your own [assembler] internal knowledge. It's not
to be translated into a program instruction. So the ZF says "Stop" - stop
execution. But in the meantime what we're expecting is the thing that is 16 on
from 64 will actually get us to here for *F. What does *F do? * is a short
code for saying "Put yourself in letter shift". Veterans of 5-hole paper tape will
know - you've got to make sure that you're in letter shift to print meaningful
messages. The other possible shift is figure shift and all hell breaks loose
if you start forgetting to shift out [of that]. It's just like the shift key on a typewriter,
that's where it comes from historically.
>> Sean: Can you use that as a very, very simplistic code?!
>> DFB: [laughing] Yes! Possibly! Anyway, so turn into letter shift and, look, this
makes sense now! Can you see HF in one [single-length] word? F means: "This
is a single-length word". Yeah, 18 bits. Now, for those
who've got the EDSAC tutorial: the op-code field is occupied by an H,
but the O command will output these [bits in the op-code field] as if they were characters - and they are meant to be characters.
They've got to be in the op-code field, and the O command says:
"Look in the op-code field". Regard it not as a Baudot character - remember, Maurice
Wilkes had invented EDSAC code, subtly different, but never mind. And so you
end up coming to here and saying: "Oh! it's a letter H [that] I am to output" when this O
instruction, with a relative address offset on it, is obeyed. And you go all the way. Look
here H-E-L-L-O. What's the exclamation mark? Look it up in the EDSAC tutorial, as I
had to do. That's the marker you put in if you want to force an explicit space
between HELLO and WORLD. Which we did. And we finally ... what are @F and
&F after the 'D' of "HELLO WORLD"? Well, let's take a guess. We're trying to
be neat and tidy - make it look good - that's the code for "give me a carriage
return; give me a line feed". And then we say "end of the whole thing; end execution".
And this is a marker also to Initial Orders:
"You can stop relocating this program for me. I'm done."
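To make the shape of that program concrete, here is a small Python model of it. It is a sketch under stated assumptions, not the real tape: the symbols stand in for EDSAC's 5-bit character codes (which are not reproduced here), and the layout (a stop at the load point, fourteen output orders, a final stop, then data from offset 16) is an assumption chosen to be consistent with the offsets quoted in the talk.

```python
# Illustrative model only. The memory layout is an assumption; the symbol
# meanings ('*' letter shift, '!' space, '@' carriage return, '&' line feed)
# are the ones given in the talk.

LOAD_POINT = 64
DATA_OFFSET = 16

DATA = ["*", "H", "E", "L", "L", "O", "!", "W", "O", "R", "L", "D", "@", "&"]

# What each non-letter symbol does on the teleprinter in this model.
TELEPRINTER = {"*": "", "!": " ", "@": "\r", "&": "\n"}


def build_memory():
    memory = {LOAD_POINT: ("Z", 0)}  # the initial stop at the top
    # One output order per character: there is no index register to loop with.
    for i in range(len(DATA)):
        memory[LOAD_POINT + 1 + i] = ("O", DATA_OFFSET + i)
    memory[LOAD_POINT + 1 + len(DATA)] = ("Z", 0)  # end execution
    # Each data word carries its character in the op-code position.
    for i, symbol in enumerate(DATA):
        memory[LOAD_POINT + DATA_OFFSET + i] = (symbol, 0)
    return memory


def run(memory, start):
    printed = []
    pc = start
    while True:
        op, operand = memory[pc]
        if op == "Z":  # stop
            break
        if op == "O":  # output the op-code field of the addressed word
            symbol, _ = memory[LOAD_POINT + operand]
            printed.append(TELEPRINTER.get(symbol, symbol))
        pc += 1
    return "".join(printed)


memory = build_memory()
# Start just past the initial stop, as if the operator had resumed.
printed = run(memory, LOAD_POINT + 1)
print(repr(printed))
```

Note how the program needs one output order for every single character; nothing in this model can step a pointer along the string.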
OK, so that - since it's on top now - Oh! - fingers crossed Sean - what do we do? We do
Start, don't we? We noticed that, way back up at the top [of the program], we put in a Stop, just to make
sure. Because [puts on 'ironic' voice] with our incredible knowledge of EDSAC binary, Sean and I
can see, straight away, [looks at oscilloscope display] that that, of course, is HELLO WORLD. I mean,
we're not kidding. David Wheeler would know that it said HELLO WORLD. I'll tell you
something else, Sean. After only half a day's familiarity with this,
John von Neumann would know that that was HELLO WORLD! He'd find it so
comfortable to remember the details of the binary. Y'know, I'm sure he would!
I really do. So, here we go then. Let's do a Single
EP, a single instruction, Single Shot, it's sometimes called nowadays.
Right! There we are! It's still blinking. We turned into letter shift with that click,
next click 'H'. Oh! isn't this wonderful Sean?! Aren't we demon programmers?! E-L-L-
O-space. Yes! W-O-R-L-D- carriage return - line feed. So, that was pretty
painful! Although the T64K gives you relocatability -
[e.g.] you could change that to be T256K, say, if you wanted to - [i.e.] shove the
whole thing up memory and then maybe turn it into a subroutine? You want to
push it somewhere else in memory. So, the bulk relocation, against the base address,
is taken care of by Initial Orders, but you've still got to get the offsets
right. And it's painful! It's utterly, utterly painful. We're now
gonna jump forward [in time] into safe byte-addressed territory, for handling
characters, and [focus on] the 32-bit ARM chip, which we use for
teaching assembler programming here [at Univ of Nottingham] to our first years [undergrads]. Yeah, it is a 32-bit word,
broken up into four bytes, 8-bit bytes, which of course use ASCII not IBM EBCDIC.
Fine, so down at the assembler level for the ARM, then, what does
the byte addressability give us and what other things have happened between the
EDSAC era and this era, where we're talking late 80s, 90s - this sort of
thing. What else has happened to make this [ARM assembler] thing so much more compact, so much
easier to understand and so much more flexible? Well, let's go through here, step
by step. Comments: anything after a semicolon is a comment. I've put a
comment up at the top saying to put out the "Hello World", we've used the so-called
software interrupts - the system calls - as provided by the
University of Manchester's KoMoDo ARM development environment, which is what we
use. So when we get to actually printing the character out, don't get worried by
SWI, it means 'software interrupt', to ask the [KoMoDo] operating system to print something for
me, or something like that. So let's start up here. Programs on the ARM will
cheerfully expect - if you don't tell them otherwise - that they will start executing
at line 1 of your program, and go madly on. I put this data for "Hello World"
up at the top of the listing. Not at the bottom as I could have done. But the rule
then is: if I declare "Hello World" here, as being a piece of text - and this DEFB
here means ' ... just define a bunch of bytes', and you put them in " quotes like you
would in C. And even - taking over some of its story from C - it even allows you
to ask for a newline to be put in there with \n. And the only difference
is whereas C implicitly plugs its strings with a null character at the end,
ARM doesn't do that for you. You must explicitly put in a null character at
the end of your string - if that is your stop indicator. But in order to stop the
ARM chip executing "Hello World" as if it was bit-patterns for instructions - which you
don't want - you want to jump past it, I've put in here, look, an unconditional branch
to [the label] 'main'. Branch to 'main'. Aw! now this is wonderful! You don't have to say branch
to an absolute address and be like David Wheeler and John von Neumann and have
them all in your head, you just say: "Let's label it 'main'" and this thing called 'an
assembler' will work out what 'main' means in terms of the address you want to jump
to. Isn't that wonderful! [In fantasy] von Neumann stares at you and says: "That's for the weak-brained
who can't keep track of their addresses!" Y' know! Anyway, so, we branch to 'main'
and the first thing it says, very self-evidently, really is:
"Get me the start address of the text string and put that start address into
register 1 [r1]". Next thing we notice - as long promised:
modern CPUs [like ARM] have [typically] 15 or 16 general-purpose registers to make life
bearable. EDSAC didn't - it only had the accumulator! And if you wanted other
storage places, you had to start parking it in memory, in all sorts of horrible
ways. So, that helps us straight away: r1 is going to be our so-called index
register; it's going to start off by pointing at the address of 'H'. Now I don't
know what the byte address of 'H' is. It might even be relatively zero here. It's
the first thing that happens in this program. But whatever is the actual
byte address of 'H' is now in register 1. Here is the crux of the whole thing:
LDRB [B=byte] - "load into a register the byte specified as follows". Here I say r0;
that's the register I want to load it into. But where does it come from?
In square brackets [r1]. That says look in r1 and you will find an address of the
start of that string. I don't want you to load the address into r0, I want
you to load the character that is at that address into r0. It's [called]
"indirection" and that is indicated by that square bracket, [i.e.] not putting the address
that's in r1 into r0; I'm following the pointer from r1, saying: "Oh! that's the
letter 'H' at the moment" and that's what I put into r0. And here's the other
cute thing at the end - wouldn't those pioneers have given the world for this -
is to say: " ... and when you've done that, please, for next time around the loop
increment that r1 address by one". So, if it was pointing at 18, shall we say to
start with, it's 19 now, for next time around the loop. So you keep on going
around that loop. And here's the thing where you check whether you've hit the null
character: "Compare the contents of register 0 - which would be a character
contents - against literally 0", which is what the null character is. Now, is the
answer "yes" or "no"? Is it equal, or not equal, to 0? And here's another lovely
thing about the ARM chip that Steve and I love
dearly. This is the 32-bit ARM chip - I think in the 64-bit one they've [decided] it's not
so important to do it nowadays. They have a thing in the 32-bit one called
'conditional execution', which can often save you using a branch instruction; branches
are relatively expensive in pipeline terms. So here we've got SWINE - which is
wonderful! Software interrupt 0 says " ... punch out this character for me on the
display, on the screen". But NE says: " ... but do that only if the last thing you did
didn't yield 'equal' [so it's 'not equal']". Well, we're checking for the null character. So, as
long as it wasn't the null character it'll say: "No - I'm not equal to the null
character". And you print it out and out it comes, character by character. After
that, of course, you loop back to go around and print another character,
remembering that the #1 has incremented your address pointer along
that string. So you keep on going round here you don't have to remember what [the]
address 'loop' is. You don't know! [But] the assembler knows [and] it fixes it up for you.
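The loop just described can be modelled in Python to show what each step does to r0 and r1. This is a sketch, not the ARM listing itself: the register names and the mnemonics in the comments are the ones mentioned in the talk, while the helper `print_string`, the choice of address 18 for 'H', and the assumption that the branch back to 'loop' is taken only on 'not equal' are illustrative.

```python
# Python model of the byte-at-a-time output loop described in the talk.
# `print_string` is a hypothetical helper; address 18 is an arbitrary choice.


def print_string(memory: bytearray, start: int) -> str:
    screen = []
    r1 = start                   # r1 holds the address of the first byte
    while True:
        r0 = memory[r1]          # LDRB r0, [r1], #1: load the byte r1 points at ...
        r1 += 1                  # ... then bump the pointer for next time round
        equal = (r0 == 0)        # CMP r0, #0: was that the null character?
        if not equal:            # SWINE 0: output, but only if 'not equal'
            screen.append(chr(r0))
        if equal:                # branch back to 'loop' assumed taken only on 'not equal'
            break
    return "".join(screen)       # SWI 2 would stop execution here


# Eighteen filler bytes, then the text with an explicit null terminator,
# since the assembler (unlike C) does not add one for you.
memory = bytearray(18) + b"Hello World\n" + b"\x00"
shown = print_string(memory, 18)
print(repr(shown))
```

The same loop works for a string of any length, because the only thing that changes from one pass to the next is the pointer in r1.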
And then, right at the very end, the way to say "Stop execution - I've done it":
SWI flavour 2, on this emulated environment, says "Stop it completely". The development
of that from EDSAC? You think "Oh! my golly, I am so pleased I've got that!" And Martin [Campbell-Kelly],
the inventor of the EDSAC simulator here, I emailed him the other day and he came
back to me and said: "Yes, the need for an index register was realized so quickly
that that's why my [EDSAC] emulator is [only] early '49 to late-ish 1950, because in late 1950
David Wheeler and everybody said 'My golly, we need an index register!' And they built
one in". So, in a way then, this is what is happening. It's that the pioneers were
using their early machinery to lead the way into saying: "What extra
facilities do we need to make life tolerable for us?" Now, there is the
hardware facility of having the index registers and they've just become
standard kit; afterwards, every other machine has index registers. But also what interests me is
the role of a proper assembler. Initial Orders II is not a full-blown assembler.
It helps you a little bit by turning decimal addresses into binary but you
have to remember that that letter A - that you put in the leading five bits -
could be the character 'A', but if you're regarding this as an instruction,
that's an ADD instruction. But then Initial Orders II is relocating;
doing a bit of binary translation; it's a single-pass process;
it's wonderful! The problem with an assembler is that it has to be a two-pass
process. The trouble always is that if you jump back to labels you
have already seen, you will know already what address that will be at. But it's when you
jump forward. How do I know where the heck that label down there is gonna be [in address terms]?
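The forward-reference problem is easy to see in a toy example. The sketch below is a generic two-pass label resolver, not a model of any real EDSAC or ARM assembler: the one-word-per-line format, the `label:` syntax and the `B target` branch are all invented for illustration.

```python
# Toy two-pass label resolution. The source format here is invented:
# one instruction per line, one word each, labels written as "name:".


def assemble(lines, origin=0):
    # Pass 1: count through the program, noting the address of each label.
    symbols = {}
    body = []
    address = origin
    for line in lines:
        text = line.strip()
        if text.endswith(":"):
            symbols[text[:-1]] = address
        else:
            body.append(text)
            address += 1

    # Pass 2: every label is now known, so forward branches can be fixed up.
    resolved = []
    for address, text in enumerate(body, start=origin):
        op, _, operand = text.partition(" ")
        if op == "B":
            text = f"B {symbols[operand]}"
        resolved.append((address, text))
    return resolved


program = [
    "B main",   # forward reference: 'main' has not been seen yet
    "DATA",
    "DATA",
    "main:",
    "LOAD",
    "loop:",
    "PRINT",
    "B loop",   # backward reference: already known by this point
    "STOP",
]

result = assemble(program, origin=64)
for address, text in result:
    print(address, text)
```

A single pass could resolve `B loop` on the spot, but `B main` has to wait until the counting is finished, which is why the second pass exists.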
I don't even want to calculate it! I want the assembler to say: "Oh! I'm on location 186
now - how handy!" But then it can't fix up the addresses till it knows and has
counted its way through the program. So then it says: "Right,
I will now output you a definitive thing - that you can put in through David
Wheeler's Initial Orders II - because I've made it so much easier; because I've
allowed labels". One doesn't think of labels as being a structuring convention
and yet at this low level they are, in a way. Because this [label] is saying 'loop' - it
starts here. Another label. Oh! it ends here. Please calculate the addresses of
what's happening there and fix it up for me. And so you might say: "Well, all right
but didn't everybody say 'we must have assemblers it's the modern way to do
things' ?" There were very mixed views about this. And I don't think EDSAC got
an assembler until EDSAC 2 - when another friend of mine, David Hartley
did, I think, a macro-assembler for EDSAC 2 - not EDSAC 1. Because there's a
story here related to von Neumann as well. I don't know whether it was EDVAC or
his version of EDVAC that he had in his basement (called Johnniac ?!). Apparently he
really berated a grad student who wrote an assembler. [Invented quote] "Assemblers are for the
weak-brained who cannot work out their own addresses! You do realize that in
running this assembler of yours - punching out a paper tape - I'm behind you in the
queue. I don't get my turn next! You come to me and say: 'Ah but this is ready to
load now, in the second phase, as absolute binary'. You're wasting time! If you're so weak-brained
you can't program in absolute ..." [I'm putting words in his mouth !! ]. But this was
essentially it. He, no doubt, had dreams in Absolute Binary. There was no problem
with John von Neumann about coping as close to binary as possible. He could
keep it all in his head and he would, I think, have found Initial Orders on EDSAC
about, yes, nice and helpful. Single pass. Not slowing down things a lot. But an assembler!
You're wasting time on this machine! By doing assemblers. I mean it really really
brings it home to those of us who always joked about, y'know: "Real Programmers
use Assembler". The answer, certainly from John von Neumann - possibly even from David
Wheeler - but he wouldn't have been as extreme as that - is: "Real Programmers use
Absolute Binary!"