  • [MUSIC PLAYING]

  • SPEAKER 1: All right.

  • So this is CS50.

  • And this is the start of week 3.

  • And you'll recall that last time, we introduced a number of new ideas

  • that we'll now start taking for granted as being already under our belt.

  • In particular, we looked last time at debugging more generally.

  • And recall that debugging is the process of removing bugs from your code,

  • removing mistakes from your code.

  • And we proposed these three CS50-specific tools.

  • In particular, help50, which is a command line tool.

  • Something that you can run at your keyboard

  • and actually run it before your own command.

  • So that if you don't understand some error

  • message from a program you're compiling or a command you're running,

  • you can actually see some helpful output from CS50 staff.

  • eprintf, which, by contrast, is something you can put into your code,

  • like printf, and print a message to yourself.

  • And nicely enough, it's going to actually include the line number

  • and the file name from which it came.

  • So you can identify where some output is coming.

  • And then lastly, debug50, which is the most sophisticated of these programs.

  • And debug50, when you run it at the command line,

  • followed by the name of your program in the usual way,

  • it's actually going to open up what's called a debugger.

  • A graphical user interface that CS50 staff put together

  • that will actually let you step through your code,

  • see variables, and generally understand it better.

  • So we also looked at this in the context of Ovaltine

  • in particular, and Little Ralphy.

  • And we looked at cryptography, the art of concealing information.

  • And indeed, among the challenges in CS50,

  • ultimately if you choose to accept them, are

  • things like implementing Caesar cipher and Vigenere

  • cipher, cracking passwords.

  • So indeed, those problems lie ahead and draw inspiration from this domain.

  • And then, we focused more technically on what a string is.

  • So Zamyla, a name we keep taking for granted, is something a computer can express,

  • Z-A-M-Y-L-A, as a string or a phrase in a program.

  • But it turns out that underneath the hood,

  • strings are a little more mundane.

  • They really are just this grid of information.

  • And we might draw Zamyla's characters boxed in like this.

  • But if you think of what's inside of your computer itself,

  • you have a whole bunch of memory, or RAM.

  • And inside of that RAM, you have Z-A-M-Y-L-A,

  • but then this special character.

  • What is this \0 used for?

  • Why is there this seemingly strange character at the end of Zamyla's name?

  • Anyone?

  • Yeah.

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: Yeah, to denote the end of the string.

  • Because after all, if we were to have someone else's name here,

  • if the next string I got is Maria, for instance.

  • M-A-R-I-A, we need to be able to know where Zamyla's name ends

  • and Maria's name starts in memory.

  • Because at the end of the day, C, and programming in particular,

  • isn't really all that sophisticated.

  • At the end of the day, the only place you have to store things

  • is your computer's memory or RAM.

  • And each of these boxes represents a byte of memory inside of your computer.

  • And so if all this is-- so far as the computer cares--

  • is a sequence of six alphabetical letters,

  • the computer needs to be told or given a hint, as with this \0,

  • which just represents 8 zero bits, as to where Zamyla's name ends and something

  • else begins.

  • And we use this as an opportunity to introduce ultimately

  • command line arguments.

  • So in particular, we've been writing programs thus far

  • with just int main(void).

  • And void, we realized, signifies that a program takes no command line inputs.

  • It can still call functions, like get_string and get_int and get_float.

  • But when you run the program, you can't put any additional words at the prompt

  • after it if you've declared main as taking void as input.

  • Nothing as input.

  • But we introduced instead this, int main(int argc, string argv[]), which

  • means main can alternatively take two inputs from the so-called command line

  • when you run it.

  • It can take argc and argv. And what was argc?

  • What was argc?

  • argc.

  • Yeah, argument.

  • Yeah.

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: Yeah, the number of command line arguments.

  • argc, for argument count.

  • So how many words did you type at the prompt?

  • And argv means argument vector, which is

  • another way of saying it's an array, a contiguous block of memory

  • inside of which are all of the words themselves

  • that you might have typed at the prompt.

  • So this time, we're going to build on that.

  • We're going to focus today mostly on ideas.

  • Mostly on algorithms.

  • Mostly on problem-solving and less on the lower-level implementation

  • details of code.

  • Though, we'll look at a couple of examples.

  • So this more generally is what's called an array.

  • If you think of your computer's memory as just a block of memory,

  • block of bytes.

  • And maybe it's byte 0, then byte 1.

  • Then, byte 2, just because you can number them, certainly.

  • If you think of them that way, this is kind of an arbitrary generic data

  • structure, if you will.

  • A generic way of just storing things underneath the hood.

  • Now, what do I mean by that?

  • And why is this useful?

  • Well, consider something like this.

  • Consider having 8 doors, behind which are some numbers.

  • And the goal at hand is to solve a problem.

  • Find a number behind these doors.

  • Now, unfortunately, you don't know in advance

  • what numbers are behind those doors and where they are.

  • Indeed, that's why the doors are closed.

  • But you know that, say, the number 50 is somewhere behind those doors.

  • So what do we do to actually find this?

  • Well, let's go ahead and do this.

  • Let me go ahead and summon up a brave volunteer if you would.

  • There's a CS50 stress ball in it for you.

  • OK, come on up.

  • You want to come on up over here?

  • So we have not 8, but rather 7 doors here.

  • Even easier.

  • So 7/8 the size.

  • And what's your name?

  • AUDIENCE: Derek [INAUDIBLE].

  • SPEAKER 1: Derek?

  • OK.

  • David, come on up.

  • Nice to meet you.

  • All right.

  • So here we have 7 doors.

  • And the goal at hand for you is to simply find the number 50.

  • How do we find it?

  • What are you going to do?

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: It's behind one of these doors, I think.

  • AUDIENCE: All right.

  • I'll just try a door?

  • SPEAKER 1: Try a door.

  • Not the door.

  • So 16 turns up.

  • OK.

  • Stress ball gets one step away.

  • Try another.

  • 42.

  • Answer to life, the universe, and everything, but not this question.

  • 50.

  • OK, very well done.

  • So a round of applause if we could for Derek here.

  • Very nicely done.

  • OK.

  • And let's pause for just a moment.

  • What was the algorithm?

  • What was the set of instructions that you used to find the number 50?

  • AUDIENCE: Just try and see if I could find any sort of arrangement.

  • But otherwise, try random doors.

  • SPEAKER 1: OK.

  • So try random doors.

  • And indeed, given no additional information from me,

  • that's the best you can do.

  • Just try random doors.

  • Or, if it's a little easier, just go left to right or right to left.

  • But random is perfectly equivalent to all of those algorithms

  • because you don't know anything necessarily a priori.

  • So how about for two stress balls?

  • Find the number 50 this time, but the additional ingredient

  • is that we have taken the time in advance

  • to sort the numbers behind these doors.

  • Does that change your thinking in any way?

  • AUDIENCE: Well, I don't know whether you're

  • sorting increasing left or right.

  • SPEAKER 1: Good.

  • So good challenge.

  • He doesn't know if we've sorted from left to right or right to left.

  • Left to right.

  • So generally, a computer scientist when he or she says "sorted"

  • would say left to right.

  • But reasonable question.

  • AUDIENCE: Left to right increasing or decreasing?

  • SPEAKER 1: Left to right increasing.

  • Yes.

  • It would be left to right in any case.

  • 16.

  • Not 50.

  • So what are you going to do?

  • AUDIENCE: Well, let's see.

  • I know these have to be less, so I can remove them.

  • SPEAKER 1: Good.

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: 42 is still not 50.

  • And to the right.

  • Very well done for Derek.

  • OK, come on.

  • OK.

  • They're less impressed that time.

  • So what did you do, though, that second time that was actually more effective?

  • I mean, frankly, at first glance it looks like this approach was just as

  • fast or just as slow as your first approach

  • because it still took you three steps.

  • AUDIENCE: So what I did was since you told me that it was sorted left

  • to right increasing, I decided to look at the middle number first

  • and see whether it was greater than or less than--

  • SPEAKER 1: Good

  • AUDIENCE: --the target number.

  • SPEAKER 1: Good.

  • AUDIENCE: And since it was less than, I knew that all of these three values--

  • SPEAKER 1: Wonderful.

  • And indeed, are smaller by design.

  • And so you can throw half of that problem away, much like in week 0

  • when we tore the phone book in half.

  • You can throw half of that phone book away as well.

  • So thank you very much for finding us the number 50.

  • So this is something that we did see, indeed, in week 0.

  • And it's what we would generally call binary search.

  • Or more generally, just a divide and conquer algorithm.

  • But today, what we start to do is to assess

  • just how much better this actually is.

  • And how we can go about finding things more programmatically.

  • So what Derek could have tried is what someone would generally

  • call linear search.

  • And as the name suggests, linear, meaning line,

  • you can just kind of solve a problem from left to right.

  • And indeed, while Derek did take a random approach,

  • it would have been functionally equivalent

  • for him simply to go left to right or right to left and find the number 50.

  • It's indeed there.

  • So let's introduce some pseudocode for this.

  • And there is no one right way to implement pseudocode.

  • Here is one of the most succinct ways I could come up

  • with for expressing this kind of left to right approach.

  • Or equivalently, right to left.

  • For each element in the array: if element you're looking for, return true.

  • And so this is a very common paradigm in programming.

  • Just say something generally, like for each element in an array.

  • And the implication is that you would generally go left to right,

  • but it's equivalent to go right to left.

  • And if we indeed see at some location in that array

  • that array of boxes, the element we're looking for, we can just return true.

  • And yet, here I have return false.

  • But the indentation, so to speak, is very different.

  • Why have I aligned return false with the for loop

  • here all the way on the left as opposed to it being aligned in the middle

  • or one line deeper?

  • Yeah.

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: Yeah.

  • In order to denote that all of the indented elements

  • are within the for loop.

  • So it doesn't strictly matter how much you indent.

  • And indeed, in some languages, C among them,

  • it doesn't matter that you indent at all.

  • But for the sake of good style and readability,

  • we've indented this to convey to the human

  • especially that this "if" is only applicable when you're inside

  • of the for loop.

  • This return is only applicable when you're inside of this "if" condition.

  • And this return false really only happens

  • as the very last step in this program if nothing else actually returns a value.

  • So only if return true happens does this program say,

  • yes, I found the element you're looking for.

  • And you can think of this, therefore, as the sort of default case.

  • Now, just to toss it out there, why would this be wrong?

  • Let me go ahead and temporarily change this program

  • to be a more familiar if/else block.

  • So why is this seemingly logically fine?

  • It's a valid puzzle piece if you will, a la scratch.

  • But this program is now incorrect.

  • Why?

  • Yeah.

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: Once you hit it, it's automatically

  • going to come out of the for loop when?

  • Exactly.

  • When will this come out of the for loop?

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: Good.

  • So it's going to return false if the first element isn't

  • what you're looking for.

  • In so far as for each element in an array

  • kind of implies that you should be looking from left to right

  • or right to left, the first question you're

  • asking yourself when looking in that first element

  • is, is this the element you're looking for?

  • If it is, great.

  • You're going to return true.

  • And your answer is correct.

  • But if it's not the element you're looking

  • for, you're executing else return false immediately saying no, the element

  • I'm looking for is not here.

  • But when really the only question you've asked is, is the element

  • I'm looking for the very first in the array?

  • And so, indeed, this would just be logically incorrect

  • because you might as well not look at any of the array

  • if you're only looking at that very first element.

  • So not what we intended.

  • So indeed, that original pseudocode is just fine.

  • So alternatively, there's something called binary search, which

  • Derek actually used intuitively for his second approach,

  • where he was told in advance that the array is sorted numerically from left

  • to right.

  • This gives him more information, much like I had in week 0

  • when I searched a phone book that I knew was alphabetical.

  • And you can write this algorithm in any number of ways, too.

  • I kind of borrowed the spirit of our Mike Smith algorithm from week 0

  • and I proposed this.

  • Look at the middle of the array.

  • If element you're looking for, return true.

  • We just get lucky if it's smack dab in the middle of the array.

  • Else if the element is to the left, because the element

  • you're seeing in the middle is too big, then search the left half of the array.

  • Else if the element is to the right, as it was in Derek's case,

  • search the right half of the array, else return false.

  • But there is no go-to instruction here.

  • And there's no line numbers.

  • Recall from week 0 when we searched for Mike Smith,

  • I deliberately numbered all of the lines.

  • And then I said, go back to step 2 or 3 I think it was.

  • Here, I seem to be cutting a corner.

  • I'm just kind of waving my hand and saying,

  • search the left half of the list.

  • Or, search the right half of the list.

  • But what algorithm should we use in either of these cases?

  • How do you search the left half of a list?

  • How do you search the right half of a list?

  • How do I do this?

  • Well, where do I have an algorithm for searching?

  • Well, we saw one a moment ago.

  • I've already got in my toolkit, so to speak, linear search.

  • So I could certainly use that.

  • But sort of cyclically, or sort of confusingly,

  • I also have this other seemingly fancier, faster algorithm

  • called binary search.

  • And yet, I'm in the middle of defining binary search with my pseudocode.

  • And yet, here I am sort of trying to define myself with myself.

  • You can't get away with this in English.

  • If you're asked to define an English word

  • and you use the word in the definition, usually a teacher or someone

  • in a conversation will sort of verbally slap you on the wrist

  • because you're not answering the question.

  • And yet somehow in programming, you can get away with this.

  • Somehow, you can get away with defining a problem with a solution like this,

  • and then using that same solution inside of your solution

  • because something is changing.

  • This line here, search the left half of the array,

  • and this line here, search right half of the array,

  • is not equivalent to asking the same question again.

  • Is this element in the array?

  • What changes when I either search the left half or the right half that

  • makes this acceptable?

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: What's that?

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: Yes.

  • That's the key ingredient.

  • In so far as search left half and search right half imply

  • search in the same way, but search a smaller problem,

  • this problem will eventually stop.

  • You're not going to get stuck in an infinite loop so to speak.

  • Because eventually-- you can't halve something forever.

  • You're eventually going to end up with one or zero elements.

  • At which point, the answer's going to be no.

  • Return false, it's not here per the else condition.

  • But these sort of reflexive calls or inductive,

  • or more properly here, recursive calls-- the new buzzword

  • for today-- means for a function, or in this case pseudocode, to call itself.

  • To use its own definition again and again

  • and again, which is bad instinctively, unless you keep doing it

  • on smaller and smaller problems.

  • And so this is a general principle, actually,

  • that we're going to be able to leverage ultimately, not only for searching but

  • also, as we shall see, for sorting.

  • So let's go ahead and let me share actually

  • how this example went last year.

  • We're fortunate to have captured some of these memories on film.

  • And I thought it would be fun to share this,

  • when one of your predecessors in the class came up,

  • was asked the exact same question, find us the number 50.

  • But it didn't quite go as well as we hoped.

  • Let's take a look at 2014.

  • [AUDIO PLAYBACK]

  • -All I want you to do now is to find for me and for us,

  • really, the number 50 one step at a time.

  • -Number 50?

  • -The number 50.

  • And you can reveal what's behind each of these doors

  • simply by touching it with a finger.

  • Damn it.

  • [APPLAUSE]

  • [END PLAYBACK]

  • SPEAKER 1: OK.

  • So that didn't really demonstrate anything that year.

  • So we tried again, this time saying, all right, this time the array is sorted.

  • What are you going to do this time if you want to find the same number 50,

  • but the array is now sorted?

  • I thought I'd share this.

  • [AUDIO PLAYBACK]

  • -The goal now is to also find us the number 50, but do it algorithmically.

  • And tell us how you're going about it.

  • And if you find it, you keep the movie.

  • If you don't find it, you give it back.

  • -Man. [INAUDIBLE].

  • OK.

  • So I'm going to check the ends first to determine if there's-- oh.

  • [APPLAUSE]

  • [END PLAYBACK]

  • SPEAKER 1: OK, now I feel a little bad because clearly we

  • used to give away movies and now we give away stress balls.

  • But no one really has DVD players anymore, so it's OK probably.

  • And then, let me share one other clip back from our SD days.

  • Many years ago when Shawn, who has since graduated, came up.

  • Was asked the same question.

  • So Derek, you are in this wonderful history of students

  • who have tried this demonstration.

  • This was, as you can see, before we had touchscreen technology.

  • So we had sheets of paper up on the screen.

  • But the idea was the same.

  • And Shawn is perhaps one of our favorite memories in so far

  • as he, too, was asked to solve the same problem in his way.

  • [AUDIO PLAYBACK]

  • -OK, so your task here, Shawn, is the following.

  • I have hidden behind these doors the number 7.

  • But tucked away in some of these doors as well are other non-negative numbers.

  • And your goal is to think of this top row of numbers

  • as just an array, or just a sequence of pieces of paper

  • with numbers behind them.

  • And your goal is only using the top array here, find me the number 7.

  • And we are then going to critique how you go about doing it.

  • Find us the number 7, please.

  • No.

  • 5, 19, 13.

  • It's not a trick question.

  • 1.

  • At this point, your score is not very good, so you might as well keep going.

  • 3.

  • Go on.

  • -[INAUDIBLE].

  • -Frankly, I can't help but wonder what you're even thinking about.

  • Only the top row.

  • So you got 3 left.

  • So find me 7.

  • -[INAUDIBLE].

  • -17.

  • -[INAUDIBLE].

  • -7.

  • [END PLAYBACK]

  • SPEAKER 1: All right.

  • So we don't ask people to search for 7 anymore.

  • But this invites the question, how are we

  • allowed to have that assumption, right?

  • I've been assuming in week 0 that the phone book was alphabetized.

  • And therefore, I can find Mike Smith really fast in logarithmic time.

  • And we were just assuming a moment ago that we

  • could find the number 50 super-fast because of divide and conquer.

  • Again, but only if that array were sorted.

  • Indeed, Derek technically just got lucky in so far

  • as he found 50 the first time in 3 steps.

  • But it could have been as many as 7 steps because it was, indeed,

  • a random algorithm.

  • Or even if he had used linear search.

  • So suppose we want to actually sort something.

  • So these are still used on campus sometimes.

  • So these are blue books for exam period.

  • And suppose that we're at the end of a semester and a bunch of students

  • have written their names on these things.

  • And so A's and B's and C's and D's.

  • And suppose that they're handed in at somewhat random times.

  • There's always that kid who hands in his or her exam at an hour

  • into the 3-hour exam, and then most of them

  • come in around like 2 hours 45 minutes.

  • And so therefore, they're all in this pretty random arbitrary

  • order like this.

  • Actually, if I just put them down like that, they're not really random at all.

  • And so they're just in some random order.

  • And the goal at hand, ultimately, is for the head TF, for instance,

  • or the professor to actually sort all of these blue books

  • and come up with an alphabetical order so you can make sure that everyone

  • has actually submitted on time.

  • So how do we do this?

  • What is the algorithm with which to do this?

  • Because indeed, if he or she, the professor,

  • later wants to find a certain name of the alphabet, like Smith,

  • it'd be nice if they don't have to sift through all 26, or all 100, or all 500

  • blue books.

  • They can just jump roughly to the middle,

  • and then divide and conquer and find Smith there after.

  • Can we get one volunteer to come up and propose how to sort?

  • You want to come on up?

  • Come on up.

  • What's your name?

  • AUDIENCE: Allison.

  • SPEAKER 1: Allison.

  • All right, come on up.

  • So if you've ever sorted something before,

  • now's your chance to show that off, Allison.

  • All right.

  • So here are a whole bunch of blue books.

  • And I've just arbitrarily written not names, but first letters

  • of the alphabet on them.

  • So go ahead and sort them.

  • AUDIENCE: So I start with a letter.

  • And if this is before that, I just put it on top.

  • SPEAKER 1: OK.

  • So that's X and L. Gotcha.

  • AUDIENCE: And because Z is after that--

  • SPEAKER 1: There's Z.

  • AUDIENCE: --I put it below.

  • And D is here.

  • SPEAKER 1: OK.

  • So you seem to be taking them one at a time and just dealing with the problem

  • then and now?

  • AUDIENCE: Yes.

  • SPEAKER 1: OK.

  • So that's J. What happens next?

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: OK.

  • That's K. K, L, good.

  • B, that's easy.

  • On top.

  • A, even easier.

  • On top.

  • AUDIENCE: Am I going to finish this entire thing?

  • SPEAKER 1: Yes.

  • We need 13 more, give or take.

  • M and--

  • AUDIENCE: N.

  • SPEAKER 1: N, O. I didn't do a very good job there.

  • U, V. OK, definitely not good.

  • Y. That's good.

  • It goes toward the end.

  • R, a little harder.

  • But this is actually a good situation for us

  • to be in because-- you keep going, I'll keep talking.

  • So even as Allison is sorting this, she's

  • dealing with each of these blue books one at a time.

  • But notice what's happening to her right hand.

  • There literally is more and more work, it seems, happening in her right hand

  • because as she encounters R or now W, not only does she need more thought,

  • she also needs to do more flipping to actually find

  • the right spot for W. This is H. She's doing the same thing again.

  • So it feels like the algorithm has started to slow down.

  • At first, it was pretty easy to just make a short list.

  • But now that the list is getting longer, it

  • feels like she's doing a lot of work, or finding a location is

  • getting harder and harder.

  • Or at least taking longer and longer.

  • Maybe the alphabet itself is not that much more difficult.

  • Now, we have I. I can only stall so much longer.

  • G.

  • AUDIENCE: Sorry.

  • SPEAKER 1: No, that's fine.

  • E, good.

  • OK.

  • So it's getting a little easier because we're at the beginning.

  • Sorted.

  • So what were you thinking?

  • What was that algorithm?

  • AUDIENCE: It was just take a letter, take another letter,

  • and then test to see if the letter is before or after.

  • SPEAKER 1: OK, good.

  • AUDIENCE: And then put it in the correct place on either side.

  • SPEAKER 1: Perfect.

  • So a perfect way of thinking about this as well.

  • So thank you so much.

  • AUDIENCE: Thank you.

  • SPEAKER 1: Very nicely done.

  • And suppose the problem's a little harder.

  • I'll do this one myself here.

  • But suppose we have some playing cards.

  • And these are the type that probably everyone here can see.

  • So here is a standard deck of cards, including

  • the jokers, which we won't need.

  • How do you go about sorting things like this?

  • I don't know if-- actually, I didn't think in advance.

  • I should have shuffled these.

  • So if you want to shuffle these cards.

  • This is not going to end well.

  • OK.

  • So suppose we have all of these cards nicely shuffled here.

  • And I have before me a big problem, a big pile of cards.

  • And the goal, for whatever reason, you wanted to start the game

  • or you like to be orderly, is to sort the cards.

  • What's your algorithm going to be for sorting these cards?

  • What might you do?

  • Yeah.

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: Do them in piles based on the number.

  • OK, so here's the number 2.

  • So you want to make a pile for all the 2's?

  • OK, so we can do that.

  • So here's a pile for all the 4's.

  • And let me make myself some more room.

  • So we can put 2's here.

  • 4's here.

  • Here's a 3.

  • So I'm going to put this in between the 2 and the 4.

  • Here's a 5.

  • I'm going to put that over here.

  • And if I jump around, here's a Jack.

  • So this is going to go way over there.

  • 7, 8.

  • 6 can go over here.

  • Here's another 4, so I can put it here.

  • And I can bucketize all of the cards.

  • So this is good intuition because it makes a pretty big problem, 52 cards,

  • a little more tenable.

  • You can just now flip through all of the cards one at a time,

  • or whatever you encounter first.

  • And just drop it in the correct bucket.

  • But that's, of course, not going to get you to the end.

  • Let's assume we fast forward to the ending.

  • And now I have these 13 piles from Ace all the way up to King.

  • What do I then do?

  • What's step 2?

  • AUDIENCE: Stack them.

  • SPEAKER 1: So I want to stack them.

  • So I can take all the 2's, all the 3's, all the 4's.

  • And I didn't specify earlier, one of the things

  • that complicates this problem with cards is that you have different suits.

  • And so if you really want to be nit-picky,

  • maybe you want to keep all the suits the same,

  • which is like a nested sorting problem of some sort.

  • But this general idea of bucketizing in this way is good intuition,

  • but it clearly takes some time.

  • I mean, enough time that I'm not going to sort of drag this out

  • by sorting all of those cards myself.

  • And therein lies the point, though.

  • Even though in week 0 with Mike Smith, we had that sorted phone book.

  • And even though a moment ago, Derek, albeit with a few numbers,

  • had them already presorted, sorting takes time.

  • And even when we were just sorting the blue books alphabetically,

  • even though it felt pretty fast, pretty obvious, it starts to drag eventually.

  • And god forbid there were more than 26 books.

  • Maybe there were 260 books, or 2,600 books, or 2 million books.

  • Or web pages.

  • Or database records.

  • Any number of problems in this world now require sorting sometimes.

  • Whether it's on Google, or Facebook, or your own computer.

  • It feels like we need a good algorithm for actually doing that sorting.

  • So how can we go about doing that, not just with blue books, not

  • just with cards, but more generally?

  • Well, let's try to frame the problem first with a simpler world.

  • In fact, we can generalize blue books and playing cards

  • as really just numbers.

  • Or some piece of data that there is a defined order or sequence to them.

  • And suppose I have these numbers here.

  • How do I actually go about sorting this many numbers?

  • Well, for this-- and this will be our last chance

  • for audience participation here.

  • I need eight brave volunteers to actually come on up.

  • All right, so 1, 2.

  • This is what everyone's been waiting for.

  • 1, 2, 3, 4, 5, 6, 7, and 8.

  • Thank you.

  • Come on up.

  • And I will meet you over here.

  • All right.

  • Come on over here.

  • If you want to grab two music stands.

  • OK.

  • So let's go ahead and turn these around like this if you could,

  • as though the music's going to face the audience.

  • OK.

  • All right.

  • And let me go ahead and give you sheets of paper, and then we'll resume here.

  • All right.

  • So here is our eighth music stand.

  • And what we have here is 8 pieces of paper.

  • And each of you guys is going to take on the role of a certain number right now.

  • So I'm going to go ahead and give each of you a number in sorted order

  • initially, but that would be too easy.

  • So if you don't mind, let's go ahead and arrange yourselves behind these music

  • stands in the same order that you see on the screen.

  • So on the left end here near me should be the number 4.

  • Then, 2.

  • Then, 6.

  • Then, 8.

  • Good job.

  • Then, 1.

  • Then, 3.

  • Then, 7.

  • Then, 5.

  • All right.

  • And you can put the numbers on the music stand

  • there, but just move them with you as you go.

  • So now the question is, here is a seemingly random list

  • in just some order I came up with somewhat arbitrarily.

  • And I want now to sort my list.

  • How do I go about doing it?

  • What would you propose?

  • The goal is to get from 1 to 8.

  • What are our possible approaches here?

  • Oh, you want-- sure, OK.

  • We'll just do it over here.

  • AUDIENCE: Take out the smallest number.

  • SPEAKER 1: OK.

  • AUDIENCE: Take it, shift everyone else down.

  • Have the smallest number in the first place.

  • SPEAKER 1: OK, good.

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: So let's go ahead and do that.

  • For those tuning in at home, what's your name again?

  • AUDIENCE: Allison.

  • SPEAKER 1: Allison.

  • And?

  • AUDIENCE: Alex.

  • SPEAKER 1: Alex.

  • AUDIENCE: Adam.

  • SPEAKER 1: Adam.

  • AUDIENCE: Brandon.

  • SPEAKER 1: Brandon.

  • AUDIENCE: Joseph.

  • SPEAKER 1: Joseph.

  • AUDIENCE: Vicky.

  • SPEAKER 1: Vicky.

  • AUDIENCE: [? Arpin. ?]

  • SPEAKER 1: [? Arpin. ?]

  • AUDIENCE: Derek.

  • SPEAKER 1: Derek.

  • OK.

  • So now, since that was a lot to absorb all at once.

  • Number 1 is now the first smallest element

  • according to the proposed algorithm.

  • So what are we going to do exactly with number 1?

  • AUDIENCE: He's going to [INAUDIBLE].

  • We're [INAUDIBLE].

  • SPEAKER 1: OK, good.

  • So pop back if you would.

  • You guys are going to shift three places this way.

  • And we're going to insert number 1 at the beginning of the list.

  • All right, so what's the next step just to be clear?

  • AUDIENCE: Number 2 is going to pop up.

  • SPEAKER 1: Good.

  • AUDIENCE: I'm going to shift down.

  • Number 1 is going to stay in place.

  • SPEAKER 1: Good.

  • So we find the next smallest element, which happens to be 2.

  • We insert him at the beginning of the list here.

  • And now, we move on to the number 3 problem. Excellent.

  • So pop back.

  • Shift.

  • And insert.

  • Good.

  • 4, we got lucky.

  • So we don't have to do any extra work here.

  • So that's a freebie.

  • Now we look for 5.

  • Oh, there's 5 at the end.

  • Good.

  • And now 6.

  • We got lucky.

  • 7 and 8, we need to fix that.

  • OK, good.

  • So I like this.

  • It felt pretty fast.

  • Though, to be fair, the list is pretty short.

  • But we were doing kind of a lot of work.

  • In fact, it was perfect that number 1 was all the way over here,

  • because we had to shift 3 humans.

  • And can you guys reset for just a moment?

  • Let me propose an alternative approach.

  • So if number 1 is in the middle here and really belongs all the way there

  • on the left, we went to great lengths, it

  • seems, to shift all of these volunteers to the left, which is a lot of steps.

  • Because at the end of the day, let's assume for today's purposes,

  • that a computer can only do one thing at once.

  • So even though all of you humans moved pretty quickly,

  • that was really like one of you moved.

  • Then, the next of you moved.

  • Then, the next of you moved.

  • Then, the next.

  • So that was like 4 total steps just to put number 1 on the end there.

  • Let me propose this.

  • 1, can you pop out again?

  • Now, this list, so far as I'm concerned as the computer,

  • is perfectly random from the get-go.

  • So why do I even care that you guys move?

  • I haven't even looked at the rest of these elements

  • yet or done anything with them.

  • Why don't I just evict number 4 where number 1 belongs?

  • If you want to go over here.

  • And if the list is randomly ordered to begin with,

  • well, then fine.

  • 4, you're going to go over here.

  • So I've made the problem no worse, but I have made it slightly better

  • by putting 1 on the end.

  • Now I'm going to go ahead and select the next smallest element again.

  • 2, I got lucky.

  • Now, I'm going to look for the next smallest element.

  • 3 indeed is the smallest.

  • So let me do the same thing.

  • 3, if you could pop back.

  • 6, you're out of there.

  • And notice that we're doing less work.

  • Now, the humans are moving a little farther physically, but that's fine.

  • The computer can just move values around in memory pretty fast.

  • But I've moved fewer of you.

  • So this is a slight optimization.

  • It turns out this isn't fundamentally better as we'll see.

  • But it's this instinct that you should appreciate

  • of trying to do the least amount of work possible.

  • Because otherwise, it's going to add up.

  • Let's try another one.

  • Well, let me clarify one thing then.

  • So what am I doing on each iteration of this algorithm?

  • As you suggested, I'm selecting the smallest element.

  • But the only catch is, if I'm now looking for the next smallest

  • element, that's going to be 4.

  • But I'm only sure that it's 4 once I get to the end

  • and realize, yep, that was the smallest element and then I act.

  • So once we fix that problem-- if you guys want to switch--

  • I start again over here.

  • And I have to find the next smallest element.

  • And let's see, 8 is pretty big.

  • 6 is smaller.

  • Good.

  • 7, not smaller.

  • But only once I get to the end do I realize, oh, here is the smallest.

  • So in short, to find the smallest element

  • as you proposed again and again, I have to keep going through the whole list

  • because it might be all the way at the end.

  • We, humans, have the advantage of just kind

  • of looking, sort of taking a step back and saying,

  • OK, this is obviously unsorted.

  • I know where everything is.

  • A computer can only look at one number at a time,

  • one element of an array at a time.

  • So let's try a fundamentally different approach.

  • If you guys could reset to your original 8 locations.

  • Let me propose that we just looked at an algorithm

  • that we'll call selection sort.

  • Selection sort, insofar as you iteratively, again and again, select

  • the next smallest element.

  • Let's now do an approach that we'll call bubble sort, which

  • has sort of the visual effect of numbers bubbling up over time as follows.

  • You know what?

  • I'm just going to look, not at the whole list

  • because it felt like that was creating a lot of work for me.

  • I'm going to look at 2 numbers at a time, 4 and 2.

  • Are these in order or out of order?

  • Obviously.

  • So they're out of order.

  • So you know what?

  • I don't care about the rest of the problem.

  • Let me just swap you two.

  • All right.

  • Now, let me take one step.

  • Look at 4 and 6.

  • In order or out of order?

  • In order.

  • So I'm going to leave it be.

  • 6 and 8?

  • In order.

  • 8 and 1?

  • Not in order.

  • So let me swap that again.

  • 8 and 3?

  • Out of order.

  • Let me swap that again.

  • 8 and 7?

  • Out of order.

  • Let me swap that again.

  • 8 and 5?

  • Out of order.

  • Let me swap that again.

  • And now, what has happened?

  • Is the list sorted because I fixed all the pairwise mistakes?

  • So obviously not.

  • But is it better, the list?

  • Why is it better?

  • Derek.

  • AUDIENCE: Because now you have one less element to worry about.

  • SPEAKER 1: Yeah, exactly.

  • 8 bubbled all the way up to the top, if you will.

  • So now he is effectively done.

  • And frankly, every other number that's smaller

  • bubbled one step closer to its position.

  • So we've taken one big pass through this to solve at least one of the problems.

  • And now maximally, 7 problems remain.

  • So let me go back to the end here.

  • And let me look at pairwise.

  • 2 and 4?

  • You're good.

  • 4 and 6?

  • You're good.

  • 6 and 1?

  • Let's fix that.

  • 6 and 3?

  • Let's fix that.

  • 6 and 7?

  • We're good.

  • 7 and 5?

  • Let's fix that.

  • And now I can ignore number 8 because I know the end of my list is sorted.

  • So now, I've improved it slightly.

  • 7 has bubbled up to where he needs to be.

  • I just have 6 problems left to solve at most.

  • So now I look again.

  • 2 and 4?

  • You're good.

  • 1 and 4?

  • Let's swap that.

  • And notice, 1 is moving closer to the left of the list.

  • 4 and 3?

  • Let's swap.

  • 4 and 6?

  • You're good.

  • 6 and 5?

  • You're not.

  • But now I fixed 6.

  • And I can stop here, leaving me with just five potential problems.

  • And then you can see where this is going.

  • 2 and 1?

  • Let's swap.

  • 2 and 3?

  • You're good.

  • 3 and 4?

  • You're good.

  • We already solved these problems.

  • So let me go back.

  • And now, 1 and 2?

  • You're good.

  • 2 and 3?

  • You're good.

  • 3 and 4, 4 and 5, 5 and 6, 6 and 7, 7 and 8.

  • What do I know now as the computer?

  • What's that?

  • AUDIENCE: It's sorted.

  • SPEAKER 1: It's sorted.

  • Why?

  • AUDIENCE: Because everything's in order

  • SPEAKER 1: Everything is in order.

  • But what does that mean?

  • We humans can see that, but the computer can only look at individual values

  • at a time.

  • It can't take a step back and say, yep, sorted.

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: Good.

  • At a lower level implementation detail, I did no swaps.

  • There were no two numbers that were out of order.

  • So it would just be foolish of me and wasteful

  • for me to bother doing another pass.

  • Because assuming the numbers are not changing on their own,

  • I'm just going to do the same thing again and again otherwise.

  • So I can just stop.

  • And indeed, the bubble sort allows us to stop like this.

  • Let's do one last approach.

  • If you could, let me toss the starting points on the board again.

  • Let's reset one last time to this.

  • And let me propose-- let me draw inspiration, actually,

  • from the approach you took with the blue books, which was

  • a little different from how we began.

  • Which was recall that you took the problem on your left hand

  • and you just dealt with it one at a time.

  • You didn't go sifting through looking for A.

  • You didn't go sifting through looking for B.

  • You didn't go sifting through looking for C, which would really

  • be the analog of what you proposed a moment ago.

  • Instead, you just dealt with each problem as it came.

  • And this is our third and final algorithm

  • for sorting called insertion sort.

  • Take each problem one at a time and just deal with it then and there.

  • Don't postpone.

  • So here's 4.

  • You know what?

  • Done.

  • I now claim that this list, which we'll visually make a little separate,

  • is sorted.

  • So I'm going to mentally divide my list into a left half and a right half.

  • And now I just claim trivially-- I mean, almost silly-- this is sorted.

  • Because it is.

  • But now the right half of my list remains.

  • So let's go ahead and do this.

  • So number 2.

  • OK, where does this belong?

  • So if you want to go ahead and pick up the number.

  • The stand can stay.

  • Number 2 goes where?

  • Here's my left list.

  • And we'll make a little more space here just to be clear.

  • It goes over to this side.

  • So what has to happen?

  • So this time, we do need to shift you over.

  • So let's take 2 out.

  • Let's shift 4 over.

  • And now we won't move the stand.

  • We'll just assume now that my left half is of size 2

  • and the right half is of size 6.

  • But now, the left half of the list is still sorted,

  • but it costs me a little bit of work.

  • I had to move 4 over.

  • All right, let me deal with 6.

  • Where does 6 go?

  • It's already in order.

  • I don't need to shift anyone.

  • Where does 8 go?

  • Phew, it doesn't need to go anywhere.

  • It's already in order.

  • Uh-oh, 1.

  • Where does it go?

  • Now we incur a good amount of cost.

  • So let's do this.

  • Where do you go?

  • One shift, two shifts, three shifts, four shifts.

  • Feels-- yep-- as bad as our original approach.

  • But notice I only have 4 more problems left.

  • I don't have to keep going back through the list.

  • So where does 3 go?

  • All right, let's pop you out.

  • Shift you in there.

  • 7.

  • We got a little lucky.

  • Let's pop you out and insert you right where you belong.

  • And then 5, you belong a few shifts over.

  • So let's shift three of you.

  • And then, done.

  • All right.

  • So let's give all of you a stress ball.

  • And maybe a round of applause.

  • And let's see if we can't tease apart what just happened here.

  • All right.

  • And then you can exit stage left-- right if you'd like.

  • All right, thank you.

  • Derek, you're really racking them up here.

  • All right.

  • AUDIENCE: I'll [INAUDIBLE].

  • SPEAKER 1: All right.

  • Thank you.

  • All right.

  • Sure.

  • So what was the point of this exercise?

  • And how do we actually now get to a decision point where one of these

  • is the way to go?

  • So in bubble sort, recall that the key was to do something

  • again and again until we did no swaps.

  • At which point we could conclude that the list was sorted.

  • So let's do exactly that, repeat until no swaps.

  • And then, let's give ourselves a counter,

  • just to make things a little more methodical,

  • and say for i from 0 to n minus 1-- rather n minus 2.

  • Why?

  • Well, the end of the list recall is it n minus 1.

  • Because if you have a list of size n, the beginning is at 0.

  • The end is at n minus 1.

  • And so one shy of the end is going to be n minus 2.

  • So we want to do this up through the second to last element.

  • If the i-th and the (i+1)-th elements are out of order--

  • so if left hand and right hand are out of order-- swap them.

  • And repeat this again and again and again.

  • And thanks to that outer loop, we're going

  • to do it again and again and again until we have no such swaps.

  • And now to be clear, why this n minus 2?

  • We want to make sure as we're sort of walking down our row of volunteers

  • that my left hand does not actually point at the last human on stage

  • because then, where would my right hand point?

  • There's no person to beyond that point.

  • So this then is bubble sort.

  • Now, let's propose pseudocode for our other algorithm, selection sort,

  • wherein we walked through the list iteratively

  • trying to select on each pass the smallest element

  • and putting it into its place.

  • So for i from 0 to n minus 1, we're going

  • to do this from beginning through the end,

  • find the smallest element between the i-th and the (n-1)-th elements.

  • Swap the smallest with that i-th element.

  • In other words, walk through the list.

  • Find the smallest, put it at the front.

  • Walk through the list, swap the next smallest into the second position.

  • Walk through the list, do the same, do the same, do the same,

  • each time grabbing the then smallest element.

  • And then lastly, we had insertion sort, whereby we iterated through the list,

  • but we pretty much dealt with every element as we encountered it.

  • So let me propose this pseudocode here.

  • For i from 1 to n minus 1.

  • So let's just assume that the very beginning of the list

  • is sort of trivially sorted, no matter what number he or she is.

  • Because if that left side of the list is of size 1, it is, indeed, sorted.

  • And indeed, let's call the 0-th element through the (i-1)-th element,

  • wherever we are, the sorted side.

  • So essentially, call the left the sorted side.

  • And you can think of the right-hand side as the unsorted side.

  • Remove the i-th element, whatever element you're dealing with,

  • and insert it into the sorted side in order.

  • In other words, just walk through the list, one at a time,

  • plucking out elements, and then forcibly insert them

  • into the sorted side of the list, making room,

  • as our humans did, for that list by shuffling everyone over.

  • This then was bubble sort, selection sort, and insertion sort.

  • But which to choose?

  • So let's introduce now an answer to why.

  • Not just what these things are, but why you might use one over the other.

  • So the running time of an algorithm might

  • be the number of seconds an algorithm takes to run,

  • or the number of minutes, or the number of steps, or the number of comparisons.

  • It doesn't really matter what your unit of measure

  • is, so long as you're consistent, whatever feels the most

  • expensive or useful to express.

  • So let's consider an example, like bubble sort.

  • If we want to consider the efficiency of bubble sort

  • and make a claim as to whether we should or shouldn't

  • use it because it's good or bad, let's consider it a little formulaically.

  • So a little mathy, but pretty generically.

  • So if there are n elements to store, and a computer scientist,

  • recall from week 0, will generally just call n the size of his or her problem.

  • n is the number of numbers to sort, or humans to sort.

  • How many steps is it going to take to sort that many people?

  • Well, when I had 8 music stands and 8 people here before,

  • I considered a pair, a pair, a pair, a pair, a pair.

  • And so if there were n humans up here.

  • Or let's speak concretely, 8 people initially.

  • How many pairs of people did I consider in my first pass?

  • Say again.

  • AUDIENCE: 7.

  • SPEAKER 1: 7.

  • Because if you pair the first and the second,

  • then the second and the third, then the third and the fourth,

  • that gives you n minus 1 generally speaking.

  • You can't make 8 pairs out of 8 people.

  • You can make 7 pairs if you keep walking through the list.

  • We're not doing permutations.

  • We're just doing adjacent neighbors.

  • OK.

  • So that might generically be n minus 1 total steps

  • for the first pass of my algorithm.

  • But the upside of that was that our eighth volunteer, number 8,

  • bubbled his way all the way up to the end.

  • And so with bubble sort, I didn't need to consider him again.

  • The biggest number bubbled all the way up.

  • And so that was done.

  • So how many pairs did I need to consider the second time through bubble sort?

  • 6.

  • Or more generically, n minus 2.

  • And then dot, dot, dot, until there was just one pair of two people left.

  • So I'll just do plus dot, dot, dot plus 1.

  • Now, you might recall from high school especially,

  • at least my math or physics textbooks always

  • had a little cheat sheet for like what these kinds of recurrences or formulas

  • add up to.

  • Does anyone recall what this one adds up to in a math book?

  • Or just mathematically?

  • So if we actually do this out, it's the same thing, beautifully,

  • as n times n minus 1 over 2.

  • And that, of course if you multiply things out,

  • is just the same thing as n squared minus n over 2.

  • And that, of course, feels like it would just multiply out to this.

  • So in other words, if asked, what is the efficiency of bubble sort,

  • or what is the running time of bubble sort to sort n elements,

  • you might sort of impressively say, well, it's

  • n squared divided by 2 minus n over 2.

  • But what does that actually mean?

  • Well, it turns out that when talking about the efficiency of algorithms,

  • you should really generally care about the component that

  • has the biggest order of magnitude.

  • The number that contributes the most to the total cost.

  • And by that I mean this, which is obviously bigger, n squared or n?

  • Assuming positive values of n.

  • So n squared, right?

  • Especially as n gets bigger, n squared is going to get even bigger than n.

  • So you know what?

  • I'm just going to kind of propose, let's just ignore the minus n over 2

  • because as n gets really big, the dominating factor really

  • is going to be n squared.

  • And for that matter, the division by 2, like who cares about the over 2?

  • n squared is kind of the essence of this formula right now.

  • So what does that mean?

  • Well, let's just do a concrete example to convince you of the fact

  • that we can cut a little bit of mathematical corner here

  • and only care about the big number for the following example.

  • This is proof by example, which is not a proof of anything.

  • It's just really to paint a picture of why we do this intuitively.

  • So suppose that there are a million volunteers on stage or a million

  • numbers that we want to sort.

  • Much bigger than 8.

  • Let's plug this in.

  • So if the total cost of bubble sort is n squared over 2 minus n over 2,

  • let's plug that in.

  • So that's a million squared divided by 2 minus a million divided by 2.

  • So if we multiply that out, that is 500 billion minus 500,000.

  • Big numbers.

  • But when you multiply that out, I mean-- my god, 499,999,500,000.

  • I mean, to my own eyes, that's pretty darn close to 500 billion

  • in the first place.

  • Why don't we just call it 500 billion?

  • So n squared, in other words.

  • So again, this is not a formal proof of anything.

  • But this is why, especially as n gets bigger,

  • that lower-ordered term, the minus n over 2,

  • just matters less and less and less in absolute form.

  • And so we'll generally say that something like bubble sort,

  • its running time is on the order of n squared.

  • Big O is actually formal computer science notation for on the order of.

  • And it has a formal mathematical definition.

  • But for us, we'll consider it really to be an upper bound on the running

  • time of this algorithm.

  • So an upper bound on how long this algorithm might take given n steps.

  • So big O itself is a formal notation.

  • We'll see it in a number of different contexts.

  • Depending on the algorithms we talk about in class,

  • we might say that the running time of an algorithm

  • is on the order of n squared, which is pretty bad.

  • Pretty slow. 500 billion steps sounds like a lot.

  • Well, maybe it's a little better on the order of n times log n.

  • And more on that some other time-- later.

  • But on the order of n-- that's pretty good.

  • If there's n elements, it only takes you roughly n steps.

  • Maybe it's even better big O of log n.

  • Or really best would be big O of 1.

  • It just takes 1 step, or 10 steps, or 100 steps,

  • but a constant number of steps is what big O of 1 represents.

  • And there's any number of other formulas we

  • could use or equations to express the running time of algorithms.

  • But many of the algorithms we'll look at in CS50 certainly

  • fall into these categories.

  • So let's take an example.

  • What algorithm has the running time of big O of n squared?

  • Well, bubble sort for one thing.

  • And it turns out selection sort.

  • And it turns out insertion sort.

  • All three of those algorithms actually have the same running time,

  • even though they're fundamentally different in execution.

  • But they do share a commonality.

  • All of them are comparison-based.

  • In each case did we actually compare values,

  • either adjacent or we would look at a number there, compare it to a number

  • here, and then decide whether or not to do a switch.

  • So they were all comparison-based.

  • And all of them, frankly, involved quite a bit of movement or walking,

  • either on my part or the volunteers' part.

  • For instance, in bubble sort's case, I had

  • to walk through the whole list, sorting elements pairwise like this.

  • But that wasn't enough.

  • I then had to do it again and walk across the stage.

  • And then I had to do it again and walk across the stage.

  • And this is what n squared feels like.

  • And even though the problem was getting a little smaller

  • as the big elements bubbled up, mathematically, even that algorithm,

  • it's kind of essentially like doing n things, looking at n people,

  • n times, iterating n total times, which is roughly n times n, or n squared.

  • It's just going to add up a lot.

  • So what about selection sort?

  • What did I do in selection sort's case?

  • Here, it's even more obvious, perhaps.

  • To find the smallest element in a list, I

  • have to look logically at every element before I can confidently conclude,

  • yes, I have found the smallest element.

  • Once you find it, you can plop it at the beginning.

  • And that's easy.

  • But then you have to find the second smallest element.

  • And even though we humans could kind of eyeball things and realize,

  • oh, 2 is the next smallest.

  • You don't know that as the computer unless you look at every element.

  • Ah, I found 2.

  • And then you could put 2 where it belongs.

  • So again, there's this walking back and forth.

  • And if you're going to walk as many as n steps and have to do this n times--

  • one for every element-- that, too, is n times n.

  • And insertion sort, same damn problem, even though it's

  • sort of logically different, the algorithm.

  • Insertion sort, beautifully I only move one step at a time

  • and I never turn back.

  • Because I just deal with, as you did for the blue books, each problem one

  • at a time, never again backtracking.

  • But where was all the work happening?

  • In the shifting of humans.

  • Now, technically you guys helped out by doing the shifting yourselves.

  • But if I'm the computer, it actually would

  • have been like me tiptoeing back, shifting, shifting, shifting,

  • inserting, shifting, shifting, shifting.

  • So again, in the worst case, I might have

  • had to shift everyone all the way over to make room for someone.

  • Because in the worst case, what's the worst possible input

  • you might get to a sorting algorithm?

  • What input would create the most work for you?

  • AUDIENCE: Backwards.

  • SPEAKER 1: Backwards.

  • If the list is perfectly backwards, every darn element

  • is out of place, which means you have to traipse back and forth

  • and back and forth to insert or to select or to swap all of those elements

  • that many times.

  • So in all of these cases, they are not numerically identical.

  • Some of those algorithms might take fewer steps than others in totality,

  • but they're all on the order of n squared for the reason

  • that they all involve as many as n comparisons, as many as n times over.

  • All right.

  • What about an algorithm that's big O of n?

  • An algorithm that only takes n steps?

  • We've seen one.

  • Derek showed us one.

  • What's an algorithm that takes n steps?

  • Derek, hopefully.

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: A random search?

  • Kind of, sort of.

  • It's a little harder to put a number to because it

  • depends what you mean by "random."

  • Because if you have random with repetition such

  • that you're checking foolishly the same elements again and again,

  • it can actually add up to more than n steps.

  • So not random, but-- yeah.

  • AUDIENCE: Binary.

  • SPEAKER 1: Binary is even better.

  • Hold that answer.

  • You're about to be correct.

  • Linear search, right?

  • If you just blindly go from left to right

  • or right to left looking for some element,

  • that is on the order of n steps.

  • Because you are just doing foolishly unnecessary

  • work if you go beyond the end of the list

  • or beyond the beginning of the list.

  • You only need n steps for linear search.

  • But what if you want to instead find an algorithm that's logarithmic?

  • What algorithm that we used might be logarithmic in nature?

  • Binary.

  • So binary search, as Derek proposed, once we knew the list was sorted

  • was an alternative to that.

  • And constant time.

  • What about this?

  • What's an algorithm, maybe independent of today,

  • that allows you constant time, running time?

  • It could be anything.

  • How about hello world?

  • The [? say ?] block, or printf like that.

  • If you can just say a whole phrase at once.

  • Or maybe just adding two numbers together is like one step.

  • No matter how big the numbers are, it still just takes one step

  • to add them all together.

  • It depends on some of the lower-level implementation details.

  • And even printf or the [? say ?] block has multiple characters.

  • So we'll actually come back to this in the future

  • as to what it really means to be a step.

  • But anything that takes a constant number of steps.

  • Checking an "if" condition, returning a value.

  • Statements, generally, might be broadly speaking just constant time,

  • but we'll see contradictions to that before long.

  • So let's take a look at two other symbols.

  • This one being a capital omega symbol.

  • And this omega symbol is the opposite, really, of big O. Big O

  • is an upper bound on the running time.

  • In the worst case, how many steps might it

  • take to find an element in a list of n elements?

  • n steps.

  • So big O of n.

  • Bubble sort, selection sort, insertion sort?

  • They're on the order of n squared.

  • But what about the lower bound?

  • Especially when you get lucky in the best case so to speak,

  • how few steps might you solve problems with?

  • So for instance, if we consider the same formulas,

  • but this time with capital omega, what's an algorithm

  • that no matter what takes n squared steps,

  • even if, for instance, the list is sorted?

  • So suppose our 8 volunteers here were already 1, 2, 3, 4, 5, 6, 7, 8.

  • Which of those algorithms would still have required n squared steps?

  • So not linear search or binary search because those are much lower.

  • But insertion, bubble sort, selection sort.

  • Well, what about selection sort?

  • Even if the list is already sorted, how do I find the smallest element?

  • I'm lucky and it's right here, but I don't know that yet, right?

  • To know that, I have to confidently walk the whole list and realize,

  • well, that was a waste of time.

  • 1 is indeed the smallest element.

  • It's sorted.

  • How do I find the next smallest element?

  • Start at the left.

  • But you can ignore number 1.

  • So it's a minor savings.

  • OK, 2.

  • Feels pretty small, but got to check.

  • Dammit, that was the smallest.

  • And so selection sort, even in the best case when the list is already sorted,

  • is going to take at least on the order of n squared

  • steps because of its naivete.

  • Because that underlying design of just looking

  • for the smallest element with no optimizations means

  • the algorithm's always going to take that many steps.

  • But it doesn't have to be that way.

  • What about bubble sort?

  • Suppose that all 8 humans are sorted, 1 all the way through 8.

  • And here I am comparing the first two.

  • 1 and 2?

  • Not out of order.

  • 2 and 3, 3 and 4, 4 and 5, 5 and 6, 6 and 7, 7 and 8.

  • OK, everything looks good.

  • And in particular, what did I not do while walking from left to right?

  • I didn't do any swaps.

  • So it would be foolish of me algorithmically to do any of that

  • again because the answer is not going to change.

  • And so done.

  • It took me n steps to sort n elements with bubble sort

  • if the list is already sorted.

  • So it would be omega of n because in the best case, it takes you n steps.

  • Could a sorting algorithm be constant time

  • or logarithmic, log n or order of 1?

  • Or omega of log n and omega of 1, rather?

  • Can you sort n elements in logarithmic time or constant time?

  • What's the intuition there?

  • These are kind of like now the fundamental meaning of computing

  • and what you can actually do.

  • What does it mean for me to ask, can you sort n elements

  • in log n time or constant time?

  • Well, that effectively is like saying, can you

  • sort n elements without even looking at some of those elements?

  • Because both log n and 1-- or mathematically,

  • and if you don't recall, take it on faith today-- are less than n

  • in general.

  • So can you possibly, intuitively, in the real world,

  • sort n elements in some random order in fewer than n steps?

  • Yeah.

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: Good question.

  • Could you make the whole list a big integer?

  • You could, that and other tricks.

  • But to do that, you're going to need to touch or somehow manipulate

  • all of those values at least once to kind of do that, I would think.

  • And so you've already spent n steps.

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: OK.

  • So if you have a like guessing algorithm-- yes, this is sorted.

  • That's constant time.

  • You're just not always correct.

  • So I should clarify, can you sort correctly n elements

  • without necessarily looking at all n-- the short answer is no.

  • There is like a fundamental limit of computation.

  • And also, just human intuition here.

  • You can't possibly claim to me and convince me, in a court of law even,

  • that these n elements are sorted if you can't even

  • testify to having looked at all n of those elements.

  • It's just not possible given the computational model

  • we've been discussing where a computer can only look at one element at a time.

  • So sorting is out of the question.

  • But an algorithm that takes omega of one step?

  • For instance in the best case, when might an algorithm need only one step?

  • How about something like linear search or even binary search?

  • The running time of those algorithms was big O of n and big O of log n

  • respectively.

  • Linear search, binary search.

  • What's the lower bound on those two algorithms' running times?

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: Yeah, it's omega of 1.

  • But why?

  • Derek wasn't so lucky.

  • But AJ, last year, touched the number 50-- voila, 1 step.

  • And that was actually, while sort of bad pedagogy for us,

  • was a perfect manifestation of yes, in the best

  • case the lower bound on your running time might indeed just be one step.

  • Or two steps, but some constant number of steps.

  • So we have a range of these options.

  • Now, sometimes these ranges coincide.

  • If your big O notation, so to speak, and your omega notation, so to speak,

  • are one and the same value, you can claim that the algorithm

  • has a running time in capital Theta.

  • So theta just means when big O and omega are the same,

  • you can instead use this formula and sort of convey two thoughts at once.
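
In notation, that combined claim reads as follows (using selection sort, from earlier, as the example):

```latex
% When the upper and lower bounds coincide, they combine into one claim:
T(n) = O(f(n)) \ \text{and}\ T(n) = \Omega(f(n))
\quad\Longrightarrow\quad
T(n) = \Theta(f(n)).
% For example, selection sort takes O(n^2) steps in the worst case
% and \Omega(n^2) steps in the best case, so T(n) = \Theta(n^2).
```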

  • But for now, the takeaway is that as best we've

  • seen for something like sorting, the best we've done so far, at least

  • in the worst case, is big O of n squared.

  • And this doesn't feel very good, especially since sorting and searching

  • is all around us.

  • Anyone with an iPhone or an Android.

  • I mean, there's so much data that we're carrying around in our pockets

  • these days, let alone in the cloud and on our laptops and desktops,

  • that it's just generally good practice to keep sorted.

  • In your phone book, it would be pretty tedious

  • if all of your friends and family were in random order in the contacts list.

  • And you had to like scroll through the whole darn list, big O of n,

  • to find someone.

  • No, instead we have a search box, of course.

  • And most of us probably wouldn't blindly scroll through the whole list.

  • But even that search box, if you start typing in D-E-R to look up Derek,

  • or S-M-I-T-H to find Smith.

  • Well, how is iOS?

  • How is Android finding Mike Smith or finding Derek?

  • Those pieces of software themselves have algorithms that underneath the hood

  • are hopefully leveraging smarter algorithms than we've even seen here.

  • Because especially as you have more and more friends, more and more family

  • members, even big O of n squared starts to feel kind of slow

  • as we can see here.

  • Let me go ahead and open up an example.

  • So let's take a look at a demonstration.

  • Here is a visualization that someone on the internet kindly put together.

  • And it focuses on sorting numbers.

  • But rather than just show the numbers textually as 1, 2, 3, 4, 5, 6, 7, 8,

  • it instead shows them as bars.

  • So the shorter the bar, the smaller the number.

  • The taller the bar, the larger the number.

  • And let me go ahead and increase the speed of this,

  • just so it doesn't get too tedious.

  • And let's take a quick look at bubble sort.

  • Now, notice what's happening.

  • The two red bars adjacently are the ones that are

  • being compared and potentially swapped.

  • And you'll notice that just like in our human demo, what is happening over

  • there on the right-hand side?

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: Yeah.

  • So the biggest elements seem to be bubbling up, if you will, to the top.

  • Even though there's still a lot of work to be done to the left of those,

  • we indeed have the ability here to see the list getting sorted progressively.

  • But you know, I'm already getting bored.

  • This algorithm doesn't feel very fast.

  • It's very pretty.

  • It's very correct, it would seem, but it's a little slow.

  • Let's try another one to see if it's any better than that.

  • That's the spoiler.

  • That's what it should look like.

  • Let's try our friend selection sort, which is already sorted.

  • Let me stop this.

  • Let me click on Randomize Array.

  • And now click Selection Sort.

  • And what's happening here?

  • This is like me walking across the audience again and again

  • and again and again.

  • You don't see any of this bubbling effect,

  • but you do see the smallest element being selected again

  • and again and again.

  • So we're kind of cleaning up that mess one step at a time.

  • It's still going to take some steps, but you can perhaps

  • see it a little better and a little faster here.

  • And spoiler alert, when we skip forward to the end, that's what it looks like.

  • If we now randomize it one last time and do insertion sort,

  • this one kind of feels and looks different.

  • It's kind of mostly fixing things, but then it

  • kind of goes in and fills in the gaps.

  • Because indeed, it's moving forward one at a time, grabbing the element,

  • and then shifting the element as needed to make room for whatever number

  • it's encountered.

  • So that's why sometimes it only takes a few steps

  • because you find where it belongs.

  • That time it took a while to find the space that it belongs.

  • But generally, the list is getting more and more sorted.

  • And once we do get to the very end, this one

  • we can actually feel being a little faster.

  • Which isn't necessarily a testament to it being fundamentally better.

  • All of these remain on the order of n squared steps.

  • And now indeed, it's starting to drag, kind of like our blue books.

  • And now we're actually complete.

  • But what if we were to compare these things?

  • Bubble sort on the left, selection sort in the middle,

  • and then a third algorithm that we've not seen this time called merge sort.

  • Insertion sort is roughly the same as these others.

  • And this visualization only lets us try three at a time.

  • To start this, I have to click on them.

  • So there's a little bit of a fudge factor

  • here where I have to click on all three programs quickly in succession

  • to see them.

  • But let's see if we can't do fundamentally better.

  • Merge sort is our new friend on the right.

  • Oh my god.

  • And these are different visualizations, but they're the same algorithms.

  • And it's not that they're just poorly implemented.

  • It's arguably that they're poorly designed.

  • They are all correct, but merge sort clearly

  • seems to have some upsides over selection sort.

  • And if you'll take it on faith for now, just because we can only fit three

  • on the screen, insertion sort, too.

  • You see the bigger elements bubbling up to the top.

  • You see us selecting.

  • Again, it turns out this person implemented selection sort

  • by selecting the biggest element.

  • But that's fundamentally equivalent to selecting the smallest element.

  • And that's because you can see them all stacking here in this way.

  • But my god, I wonder, can we start this again?

  • Let's try.

  • Let's see if we can lap it.

  • Come on.

  • Come on.

  • Come on.

  • Come on.

  • Oh, so close.

  • All right.

  • Had I not talked as much.

  • So merge sort.

  • Where is this magical algorithm?

  • What are we actually doing wasting time on these other things?

  • Well, let's consider exactly that, but we need sort of a secret ingredient.

  • And let me revisit Mike Smith for just a moment.

  • So you'll recall, this was the pseudocode

  • that we had from Mike Smith in week 0.

  • And we decided when Mike is earlier in the book

  • or when Mike is later in the book, to just go to an earlier step.

  • Thereby, inducing some kind of loop.

  • But suppose we changed that language to be this instead.

  • Don't go to a specific line number.

  • Don't go to some earlier step.

  • But just more generally, say, search for Mike

  • in the left half of the book or search for Mike in the right half of the book.

  • This is borrowing an idea that I proposed a moment ago when

  • we were searching for something else.

  • And what word did we ascribe to this kind

  • of approach of this cyclical use of your own definition?

  • Yeah.

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: Recursion.

  • So recursion is actually this secret ingredient

  • that we can leverage, both intuitively and in code,

  • to implement this new and improved fourth algorithm, merge sort.

  • And indeed, we'll use this as demonstrative of a class of algorithms

  • that doesn't behave in quite the same way fundamentally as selection

  • sort, insertion sort, and bubble sort.

  • Indeed, you can do better than big O of n squared.

  • You can do much better, as you felt a moment ago with visualization.

  • And merge sort is one such algorithm with which we can do that.

  • And amazingly, the pseudocode is this simple.

  • When you're given n elements, if n is less than 2, just return.

  • If you have fewer than 2 elements, the list is of size 1, which is sorted,

  • or size 0, which is also meaninglessly sorted.

  • So you're done.

  • So that's kind of a base case, like a default scenario.

  • Let's just make sure that no matter what, we eventually

  • quit, even if the list is super-small.

  • But then, the magic is hidden in those last three lines.

  • Else, sort the left half of the elements,

  • then sort the right half of the elements,

  • which is completely punting so to speak.

  • Just kind of recursively calling yourself.

  • But the key detail here is that then you have to merge the sorted halves.

  • You're not just looking for Mike to the left and looking for Mike to the right.

  • You're doing something on the left, something on the right,

  • sorting each half, but then you have to kind of weave things together.

  • And we'll see how expensive that actually is.

  • So let's take an example.

  • And I mocked this up with some very simple animation so we

  • could walk through it step-by-step.

  • But consider the following.

  • So here is that same list from before where

  • we had a whole bunch of volunteers.

  • But rather than call up humans, we'll just do this visually.

  • Let me consider the fact that this is really just an array.

  • So I'm just going to box everything like this.

  • And this is actually key that we have random access to these elements.

  • We can jump right to the elements we want

  • and code ultimately, too, if we cared.

  • And now, let's focus on the left-hand side.

  • So I've wrapped 4, 8, 6, 2 in a solid line

  • and left everything else dash because I want to focus

  • our attention on sorting the left half.

  • So again, to rewind this is the algorithm.

  • And the goal at hand is to walk through visually

  • an example whereby we sort 8 elements.

  • And we do that by sorting the left half, sorting the right half,

  • and then somehow merging these things together.

  • So how do we do this?

  • Well, let's consider the left half.

  • All right.

  • How do I sort a list of 4 elements?

  • What does it mean to sort the left half?

  • Well, what algorithms do I have in my toolkit for sorting elements?

  • Yeah, selection sort, insertion sort, bubble sort, but-- merge sort.

  • And that's where this gets weird quickly.

  • You want to use your same algorithm to sort the left-hand side of this list.

  • And yet, we haven't even finished defining this algorithm,

  • but that's going to be OK, as we'll see.

  • So let's sort the left half.

  • How do you sort the left half of a list?

  • Well, you sort a list of size 4.

  • What does that mean?

  • How do you sort a list of size 4?

  • Sort the left half of that.

  • So let's focus on 4 and 8.

  • How do you sort a list of size 2?

  • Merge sort it.

  • So how do you sort a list of size 2?

  • Sort the left half.

  • OK, beautifully.

  • Done.

  • Done.

  • Sort the right half.

  • Nice, done.

  • I mean, I'm doing real work, or claiming to.

  • But now I have to do something more.

  • Now I have to merge the two halves.

  • So I'm taking a step back.

  • I have two lists now, one of size 1, one of size 1.

  • I have to merge these two things together.

  • Which comes first, 4 or 8?

  • 4.

  • So let's create some extra space for it.

  • Let's put 4 here and let's put 8 here.

  • So here's a key detail with merge sort, kind

  • of cheating, at least at the moment, I'm using some extra space.

  • So I hope you won't mind.

  • To achieve this, I'm using more memory, more RAM,

  • that I wasn't allowing myself for selection sort, bubble

  • sort, or insertion sort.

  • OK.

  • So now it's going to be easy to get lost quickly.

  • So what have I just done?

  • I have sorted a list of size 2 by sorting the left half, done.

  • Right half, done.

  • Merging it together.

  • So now I have a sorted list of size 2.

  • Rewind the story.

  • What step comes next?

  • Sort the right half.

  • That was of size 2.

  • So if you kind of mentally buffer all of the to-dos that we're doing,

  • and then pluck them off as we go, you'll see where the story goes.

  • Now I want to sort the right half of the original list.

  • Sorry, the right half of the left half of the original list.

  • How do I do that?

  • Sort the left half, done.

  • Sort the right half, done.

  • Merge them together.

  • What comes first, 6 or 2?

  • OK.

  • So let's give ourselves a little more memory for that

  • so we have a place to put it.

  • 2.

  • And then 6.

  • All right.

  • So it doesn't look like all that much progress, frankly.

  • My left half of the list is now 4, 8, 2, 6, which is not really sorted.

  • But we'll get there.

  • What step comes next?

  • I've just sorted the right half of the left half of my original list.

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: You're missing a step.

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: Merge, that third step.

  • Once you've sorted the left half of something,

  • then you've sorted the right half of something.

  • Now you have to merge them or weave them together somehow.

  • So the left half of the list in the story at the moment is 4 and 8.

  • The right half is 2 and 6.

  • So here's where mentally you can kind of bring out

  • your fingers to think through this sort of conceptually, or in pseudocode.

  • Here's the left list.

  • Here's the right list, or the right half.

  • So how do I merge these two together?

  • Well, it turns out-- and you can see it now.

  • Before it was idiotic that we just were merging lists of size 1.

  • It's pretty easy.

  • Here now, we have something substantive to talk about.

  • So I have one list here, one list here.

  • Which element from these two lists obviously comes first?

  • So 2.

  • And so what I'm going to do is then 2 is going to have to come down here.

  • And then I can advance my right hand so to speak.

  • And now, compare 4 and 6.

  • What's going to come next?

  • 4.

  • Now I can advance my left hand.

  • I compare 8 and 6.

  • Which comes next?

  • 6.

  • Now, my right hand, its job is done.

  • I have only one more to do.

  • And this.

  • So what's key in the merging is that you essentially

  • have this left hand and this right hand concept.

  • Each of them kind of steps forward over time, but you never go back.

  • You never double back.

  • So it's this sort of linear relationship with your left and right hand

  • as you're merging the lists together.

  • So pictorially, let's see that.

  • We compare 4 and 2.

  • 2 goes first.

  • Then, 4.

  • Then, 6.

  • Then, 8.

  • Where am I in the story?

  • I mean, we've been droning on for some time now.

  • I have just successfully sorted what?

  • The left half of the original list, which means the next step is?

  • AUDIENCE: Sort the right.

  • SPEAKER 1: Sort the right half of the original list.

  • So I can box that in to draw attention to it.

  • How do I sort a list of size 4?

  • Well, that's really just 2 lists of size 2.

  • Let's sort the left half.

  • How do I sort a list of size 2?

  • Well, that's really just a list of-- 2 lists of size 1.

  • Done.

  • 1 is sorted.

  • Done.

  • 7 is sorted.

  • Now I have to merge them together.

  • 1 comes first, then 7.

  • OK, now sort the right half of the right half of the original list.

  • OK.

  • Sort the left half.

  • Done.

  • Right half.

  • Done.

  • Merge them together into some extra space.

  • 3, and then 5.

  • Now where am I in the story?

  • Now I have sorted the left half of the original right half

  • and the right half of the original right half,

  • so now I have to merge them together.

  • So 1, 3, 5, 7.

  • Almost there.

  • Now we're sort of rewinding in time.

  • Now we're at the very first chapter in the story--

  • sort left half, sort right half, merge.

  • I have that one final merge step.

  • So what comes first from the left half and the right half?

  • 1, 2-- even if you've lost track of the story,

  • the answer's going to be 1, 2, 3, 4, 5, 6, 7, 8.

  • But 1, 2, 3.

  • But notice what's happening here in effect.

  • Whatever code is implementing this, it's as though my right and left

  • fingers are just moving from left to right, plucking off the element that

  • is indeed then the smallest.

  • That was a heck of a lot of work.

  • And frankly, that story seemed to take longer to tell than any of the others.

  • But what happened?

  • And in fact, I've left deliberately, with this little animation

  • of sorts, on the screen the remnants of where we came.

  • And it looks like I really cheated, though.

  • I seemed to first say, hey, I need some extra RAM.

  • And I took it.

  • And then I didn't mention I need more RAM.

  • I didn't mention I needed even more RAM.

  • So this seems to have taken like four times as much total memory as any

  • of our previous algorithms, but that's not actually the case.

  • I left this here pictorially so we could see the history.

  • Technically, you do need more RAM to implement this algorithm, merge sort.

  • But we could have cut corners.

  • Instead of merging down here, we could have just

  • merged back into our original chunk.

  • And so we could have just bounced between two separate arrays,

  • but this is just a little more clear in terms of remnants.

  • So how many steps did this take?

  • Well, how many times did I divide my original list in half,

  • in half, in half effectively?

  • I had 8 elements here.

  • And that was really like, if you divide it up differently,

  • two bigger halves and two bigger halves.

  • But if you divide it again, that's like two really big halves,

  • and then you get one big list.

  • So it would seem that starting here, I divided it 1, 2, 3 times.

  • I did some splitting.

  • Left half, right half, left half, right half.

  • And any time you split something in half, in half, in half,

  • in half, what's the running time of anything involving like halving again

  • and again and again?

  • It's like binary search.

  • Log n.

  • Log n.

  • So any time you see in computer science more

  • generally, certainly CS50 in upcoming weeks, this division

  • and conquering again and again and again where

  • you're dividing something in half, odds are logarithmic running time

  • is somehow at play.

  • But this algorithm, merge sort, surely cannot run in big O of log n time.

  • Because again, you can't sort n elements in less than linear time

  • because you'd be guessing that everything is sorted.

  • So log n is not the final answer here, but there

  • is something logarithmic happening.

  • However, even though we did a left half/right half thing,

  • a left half/right half thing, a left half/right half

  • thing sort of a total of three times, from here to here to here

  • to here, at which point we were done.

  • On every row that's on the screen if you will,

  • what did I do with my left and right hand?

  • I sort of always did n steps.

  • You can really see it up here.

  • How did I merge this last thing?

  • I had to touch all 8 elements by walking through the list.

  • How did I merge these two left halves?

  • Well, I had to do this.

  • And then I had to do this.

  • So if you kind of consider the remnants of my finger touching the screen,

  • every time I did a divide and conquer, I had to touch every element in order

  • to merge them together.

  • So I did log n things.

  • And every time I did that, I incurred n steps of merging.

  • So log n things times n is going to give me big O of n log n.

  • So this was among the options on our sort of menu of possible running times,

  • at least that we'll focus on right now.

  • This is bigger than log n, of course, because you're multiplying it by n.

  • But it's smaller than n squared.

  • Because if log n is less than n, then surely n times log n is less than n times n.

  • And so the running time here of merge sort

  • is indeed going to be big O of n log n.

  • Any questions on this here?

  • If I may, let me point out one other approach

  • for seeing this same thing as follows.

  • You can actually glean this kind of detail

  • more formally, especially if you're the mathy type, from the pseudocode itself.

  • We didn't have to walk through this whole verbal exercise or visualization

  • thereof.

  • What if we just go back to basics and just analyze our own pseudocode,

  • or our own C code eventually, line by line?

  • So here's the original algorithm.

  • How many steps does this take to just check if n is less than 2?

  • Return.

  • We're done.

  • Big O of what?

  • There's just one step.

  • Or maybe, it's two, like the if and then the return.

  • But it's constant.

  • It's one or two.

  • We can debate that, but it's constant.

  • It has nothing to do with the size of n.

  • You're just making a finite number of decisions there, 1.

  • So you know what?

  • The running time, which I'm going to formulaically say is t of n.

  • So the running time when your input is of size n,

  • is just going to be like on the order of constant time, big O of 1,

  • whenever n is less than 2.

  • This is not a very bold claim.

  • It's sort of like I'm plucking off the easiest part of the question,

  • but it's at least correct.

  • If you have a small list, it's constant time.

  • The harder question is when we analyze the rest of the algorithm.

  • How many steps does it take to sort the left half of elements?

  • How many steps does it take to sort the right half of elements?

  • And how many steps does it take to sort-- rather,

  • to merge the sorted halves?

  • So we have three questions to answer.

  • And let me propose this.

  • You can kind of punt.

  • And this is the beauty of recursion.

  • You can sort of answer the question with the question.

  • The running time to sort n elements is really the running time

  • to sort n over 2 elements, the left half, plus the running time to sort n

  • over 2 elements, the right half, plus the merging step,

  • which I claimed per my sort of left/right finger intuition

  • is big O of n steps linear.

  • And now this holds if n is greater than or equal to 2.

  • Now, unfortunately, this is like a recurrence if you will, mathematically.

  • This really is a cyclical definition.

  • We really need to be able to generalize this and explain

  • it a little more formulaically.

  • But indeed, take a guess.

  • If you have your old physics textbook or your math textbook,

  • what does this recurrence actually equal if you multiply it all out

  • and prove it mathematically?

  • What's the running time total of something

  • where you do some function of n over 2 plus some function of n over 2

  • plus some number of linear steps where each of those inner running times

  • is recursively defined?

  • And just to be clear, the reason this is like non-obvious

  • and should be kind of over most heads, at least if you're

  • less familiar with this world, is that I'm saying that the running time, t, is

  • a function of the running time, t.

  • Like again, that's defining a word with the word itself.

  • But it turns out there are ways to prove this inductively, mathematically.

  • And indeed, if you actually multiply this all out in your textbook,

  • you will get something on the order of n times log n.
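
For the curious, unrolling the recurrence (assuming n is a power of 2, and writing the linear merge cost as cn) goes roughly like this:

```latex
T(n) = 2\,T(n/2) + cn
     = 4\,T(n/4) + 2cn
     = 8\,T(n/8) + 3cn
     \;\;\vdots
     = 2^{k}\,T(n/2^{k}) + k\,cn
     = n\,T(1) + cn\log_2 n  \qquad (\text{at } k = \log_2 n)
     = O(n \log n).
```

Each level of halving contributes another cn of merging work, and there are log n levels, which is exactly the n times log n intuition from the visual walkthrough.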

  • So where has that left us?

  • We've introduced a whole number of algorithms here, the last of which,

  • merge sort, gives us this ability to actually solve problems

  • much faster than any of the others.

  • But in each case, we are able to apply some general formulaic analysis,

  • asymptotic notation so to speak-- big O, big Omega, and big Theta, that

  • describes in general terms how much time these kinds of things take.

  • So let's now translate one of today's key ingredients to actual C code

  • so that we can see how we can leverage this programmatically.

  • So here in CS50 IDE, I've got an example, sigma 0 and sigma 1.

  • Both of which at the end of the day do the same thing,

  • but they do it differently.

  • And in sigma 0 here, we have the following code.

  • A sigma function that takes as input an integer m as its sole argument

  • and it has a return value of type int.

  • So it's going to return a number.

  • Main takes no command line arguments.

  • So no argv, no argc this time.

  • But inside of main is code that's going to pester the user again and again

  • until he or she gives us a positive integer as per this condition

  • down here.

  • And then, notice down here I'm going to call a function sigma,

  • passing in whatever the user's input was.

  • 10, for instance.

  • Or 11.

  • Or some other value.

  • Storing the answer in answer, and then just printing it out.

  • And what's nice about the fact that frankly you

  • can't see anything else on the screen is that for all intents and purposes,

  • sigma, the function at the moment, is a black box.

  • It is a function whose purpose in life is to add all of the numbers from 1 up

  • until the user's input and return it.

  • And we don't know yet or need to care how it's actually implemented.

  • If we do take a look inside that black box down lower in the file,

  • you'll see this code here.

  • Sigma takes an input, m, as an int.

  • Returns an int.

  • And what does it do inside?

  • Well, I've got what I'll call a counter-variable.

  • Called it sum.

  • Initialized it to 0.

  • And then using a for loop, iterating from 1 up to m,

  • because I want to sum all the numbers from 1 up through the value

  • the human has typed in.

  • I just want to keep adding to the sum that value.

  • So it's not just sum plus plus.

  • If I did sum plus plus as we keep seeing syntactically in examples,

  • that would just add 1.

  • I want to add in whatever i is.

  • Whether it's 1, or 2, or 3, as I'm counting all the way

  • up through the user's input.

  • And then I return sum.

  • But the neat thing about sigma is that it

  • affords an opportunity for recursion.

  • This isn't necessarily a better approach or more correct,

  • but we could view the world a little differently.

  • Sigma1.c is identical code in main, but the implementation of sigma

  • is fundamentally different.

  • And here's where it's kind of mind-blowing, at least at first glance.

  • You can implement a function, like sigma,

  • in such a way that the function calls itself.

  • Much like the running time for merge sort is defined in terms of itself.

  • Much like searching or sorting a list, it

  • can be recursively defined as searching or sorting a smaller list,

  • but with the same steps.

  • We implement this code in C as follows.

  • On this line here, if m is less than or equal to 0,

  • it's really just a sanity check.

  • If the human was messing with me and somehow

  • was able to inject a negative value or a 0 value into this function,

  • I don't want to get stuck in some random, like infinite loop.

  • I want to catch that.

  • So I'm just going to say less than or equal to 0 just to be super-defensive

  • when programming, and return 0.

  • It's not correct to say that, but I just want--

  • and I'll document this in my comments, for instance.

  • Return 0 if the user does not cooperate.

  • I need that base case so that stuff stops eventually.

  • And then, here's kind of the amazing thing.

  • In my "else" condition, is one line of code

  • that implements the entirety of this function.

  • The result of asking me the question, what is sigma of m,

  • is really the same thing as saying m plus sigma of m minus 1.

  • In other words, if the goal is to count up all of the numbers between 1 and m,

  • logically that's the same as counting up all the numbers from m to 1.

  • And so if I view it that way, if you give me m,

  • I am going to sort of on my cheat sheet write down the number m,

  • and then say plus whatever the summation is from m minus 1

  • all the way down to 1.

  • And you know what?

  • I have a function, a piece of code, that gives me exactly that answer if I just

  • pass-in that slightly smaller value.

  • So on first glance, it might look like sigma calls sigma.

  • That can't possibly be a good thing.

  • But it doesn't do it forever.

  • Sigma might call sigma might call sigma might call sigma.

  • But at what point in the story does sigma stop calling sigma?

  • AUDIENCE: [INAUDIBLE].

  • SPEAKER 1: When it's counted.

  • When m, the input, is what?

  • Negative.

  • Or we shouldn't hit negative ever.

  • We should first hit?

  • AUDIENCE: 0.

  • SPEAKER 1: 0.

  • So once m minus 1 gets so small, 1 minus 1, that it equals 0, this line of code

  • is going to kick in.

  • And it's going to say return 0.

  • Thereby, short circuiting this seemingly infinite loop,

  • which does, in fact, terminate eventually once the input

  • gets small enough.

  • Now, probably you shouldn't bother implementing code like this.

  • If anything, we'll talk about this in future weeks.

  • Calling function after function after function after function

  • can actually take up a lot of memory.

  • It can actually cause your computer to crash in some cases.

  • And we'll talk about when and why.

  • So this isn't necessarily the best technical decision,

  • but it's illustrative of the capability we have in programming to map intuition

  • from our binary search or our merge sort into actual code.

  • And indeed, there's going to be data structures that we soon

  • see far fancier than arrays.

  • We're going to see things like linked lists, and trees,

  • and tries, and hash tables.

  • And other data structures still that are going

  • to allow us opportunities to implement algorithms on top of those data

  • structures so to speak, to answer questions

  • using an ingredient like this one here.

  • But of course, we're left then fortunately with a better algorithm

  • than something like bubble sort.

  • And indeed, this is fairly well-known now it would seem.

  • And I thought we'd conclude with a look at this interview that

  • was conducted with a certain senator some years ago who was running

  • for the presidency and was asked by Google's own Eric Schmidt,

  • the CEO at the time, a set of interview questions.

  • One of which we thought was an appropriate note to end on here.

  • [AUDIO PLAYBACK]

  • [APPLAUSE]

  • -Now, Senator.

  • You're here at Google.

  • And I like to think of the presidency as a job interview.

  • Now, it's hard to get a job as president.

  • You're going [INAUDIBLE] now.

  • It's also hard to get a job at Google.

  • We have questions.

  • And we ask our candidates questions.

  • And this one is from Larry Schwimmer.

  • [LAUGHTER]

  • You guys think I'm kidding?

  • It's right here.

  • What is the most efficient way to sort a million 32-bit integers?

  • [LAUGHTER]

  • -Well--

  • -I'm sorry.

  • Maybe--

  • -No, no, no, no.

  • I think the bubble sort would be the wrong way to go.

  • -Come on.

  • Who told him this?

  • OK.

  • I didn't see computer science in your background.

  • -We've got our spies in there.

  • [END PLAYBACK]

  • SPEAKER 1: All right, so that's it for week 3.

  • We will see you next time.

  • SPEAKER 2: So you want to know the full story?

  • SPEAKER 3: As far back as you can remember.

  • SPEAKER 2: All right.

  • Well, let me tell you.

  • The first time I met David Malan, I knew he was nothing but trouble.

  • SPEAKER 4: There's a David J. Malan here for you.

  • SPEAKER 5: Who?

  • SPEAKER 4: He says he's in your Computer Science 50 class.

  • Wants to talk about his grade.

  • SPEAKER 5: Are you kidding me?

  • I mean, who does he think he is, Bill Gates?

  • All right.

  • Send him in.

[MUSIC PLAYING]


CS50 2016 - Week 3 - Algorithms
