Placeholder Image

字幕表 動画を再生する

  • The following content is provided under a Creative

  • Commons license.

  • Your support will help MIT OpenCourseWare

  • continue to offer high quality educational resources for free.

  • To make a donation or to view additional materials

  • from hundreds of MIT courses, visit MIT OpenCourseWare

  • at ocw.mit.edu.

  • JOHN GUTTAG: All right, welcome to the 60002,

  • or if you were in 600, the second half of 600.

  • I'm John Guttag.

  • Let me start with a few administrative things.

  • What's the workload?

  • There are problem sets.

  • They'll all be programming problems

  • much in the style of 60001.

  • And the goal-- really twofold.

  • 60001 problem sets were mostly about you

  • learning to be a programmer.

  • A lot of that carries over.

  • No one learns to be a programmer in half a semester.

  • So a lot of it is to improve your skills,

  • but also there's a lot more, I would say,

  • conceptual, algorithmic material in 60002,

  • and the problem sets are designed

  • to help cement that as well as just

  • to give you programming experience.

  • Finger exercises, small things.

  • If they're taking you more than 15 minutes, let us know.

  • They really shouldn't, and they're generally

  • designed to help you learn a single concept, usually

  • a programming concept.

  • Reading assignments in the textbooks,

  • I've already posted the first reading assignment,

  • and essentially they should provide you a very different

  • take on the same material we're covering

  • in lectures and recitations.

  • We've tried to choose different examples for lectures

  • and from the textbooks for the most part,

  • so you get to see things in two slightly different ways.

  • There'll be a final exam based upon all of the above.

  • All right, prerequisites-- experience

  • writing object-oriented programs in Python, preferably

  • Python 3.5.

  • Familiarity with concepts of computational complexity.

  • You'll see even in today's lecture,

  • we'll be assuming that.

  • Familiarity with some simple algorithms.

  • If you took 60001 or you took the 60001 advanced

  • standing exam, you'll be fine.

  • Odds are you'll be fine anyway, but that's

  • the safest way to do it.

  • So the programming assignments are

  • going to be a bit easier, at least that's

  • what students have reported in the past,

  • because they'll be more focused on the problem to be solved

  • than on the actual programming.

  • The lecture content, more abstract.

  • The lectures will be--

  • and maybe I'm speaking euphemistically--

  • a bit faster paced.

  • So hang on to your seats.

  • And the course is really less about programming

  • and more about dipping your toe into the exotic world of data

  • science.

  • We do want you to hone your programming skills.

  • There'll be a few additional bits of Python.

  • Today, for example, we'll talk about lambda abstraction.

  • Inevitably, some comments about software engineering,

  • how to structure your code, more emphasis in using packages.

  • Hopefully it will go a little bit smoother

  • than in the last problem set in 60001.

  • And finally, it's the old joke about programming

  • that somebody walks up to a taxi driver in New York City

  • and says, "I'm lost.

  • How do I get to Carnegie Hall?"

  • The taxi driver turns to the person

  • and says, "practice, practice, practice."

  • And that's really the only way to learn to program

  • is practice, practice, practice.

  • The main topic of the course is what I think

  • of as computational models.

  • How do we use computation to understand

  • the world in which we live?

  • What is a model?

  • To me I think of it as an experimental device

  • that can help us to either understand something that

  • has happened, to sort of build a model that explains phenomena

  • we see every day, or a model that

  • will allow us to predict the future, something

  • that hasn't happened.

  • So you can think of, for example, a climate change

  • model.

  • We can build models that sort of explain how the climate has

  • changed over the millennia, and then we

  • can build probably a slightly different model

  • that might predict what it will be like in the future.

  • So essentially what's happening is

  • science is moving out of the wet lab and into the computer.

  • Increasingly, I'm sure you all see this--

  • those of you who are science majors--

  • an increasing reliance on computation rather than

  • traditional experimentation.

  • As we'll talk about, traditional experimentation

  • is and will remain important, but now it

  • has to really be supplemented by computation.

  • We'll talk about three kinds of models--

  • optimization models, statistical models, and simulation models.

  • So let's talk first about optimization models.

  • An optimization model is a very simple thing.

  • We start with an objective function that's either

  • to be maximized or minimized.

  • So for, example, if I'm going from New York to Boston,

  • I might want to find a route by car or plane

  • or train that minimizes the total travel time.

  • So my objective function would be

  • the number of minutes spent in transit getting from a to b.

  • We then often have to layer on top of that objective function

  • a set of constraints, sometimes empty, that we have to obey.

  • So maybe the fastest way to get from New York to Boston

  • is to take a plane, but I only have $100 to spend.

  • So that option is off the table.

  • So I have the constraints there on the amount

  • of money I can spend.

  • Or maybe I have to be in Boston before 5:00 PM

  • and while the bus would get me there for $15,

  • it won't get me there before 5:00.

  • And so maybe what I'm left with is driving,

  • something like that.

  • So objective function, something you're either

  • minimizing or maximizing, and a set of constraints

  • that eliminate some solutions.

  • And as we'll see, there's an asymmetry here.

  • We handle these two things differently.

  • We use these things all the time.

  • I commute to work using Waze, which essentially is solving--

  • not very well, I believe-- an optimization problem

  • to minimize my time from home to here.

  • When you travel, maybe you log into various advisory programs

  • that try and optimize things for you.

  • They're all over the place.

  • Today you really can't avoid using optimization algorithm

  • as you get through life.

  • Pretty abstract.

  • Let's talk about a specific optimization problem

  • called the knapsack problem.

  • The first time I talked about the knapsack problem

  • I neglected to show a picture of a knapsack,

  • and I was 10 minutes into it before I

  • realized most of the class had no idea what a knapsack was.

  • It's what we old people used to call a backpack,

  • and they used to look more like that than they look today.

  • So the knapsack problem involves--

  • usually it's told in terms of a burglar who breaks into a house

  • and wants to steal a bunch of stuff

  • but has a knapsack that will only

  • hold a finite amount of stuff that he or she wishes to steal.

  • And so the burglar has to solve the optimization problem

  • of stealing the stuff with the most value while obeying

  • the constraint that it all has to fit in the knapsack.

  • So we have an objective function.

  • I'll get the most for this when I fence it.

  • And a constraint, it has to fit in my backpack.

  • And you can guess which of these might be

  • the most valuable items here.

  • So here is in words, written words what I just said orally.

  • There's more stuff than you can carry,

  • and you have to choose which stuff to take

  • and which to leave behind.

  • I should point out that there are two variants of it.

  • There's the 0/1 knapsack problem and the continuous.

  • The 0/1 would be illustrated by something like this.

  • So the 0/1 knapsack problem means you either take

  • the object or you don't.

  • I take that whole gold bar or I take none of it.

  • The continuous or so-called fractional knapsack problem

  • says I can take pieces of it.

  • So maybe if I take in my gold bar

  • and shaved it into gold dust, I then can say,

  • well, the whole thing won't fit in,

  • but I can fit in a path, part of it.

  • The continuous knapsack problem is really boring.

  • It's easy to solve.

  • How do you think you would solve the continuous problem?

  • Suppose you had over here a pile of gold and a pile of silver

  • and a pile of raisins, and you wanted to maximize your value.

  • Well, you'd fill up your knapsack with gold

  • until you either ran out of gold or ran out of space.

  • If you haven't run out of space, you'll

  • now put silver in until you run out of space.

  • If you still haven't run out of space,

  • well, then you'll take as many raisins as you can fit in.

  • But you can solve it with what's called a greedy algorithm,

  • and we'll talk much more about this as we go forward.

  • Where you take the best thing first as long as

  • you can and then you move on to the next thing.

  • As we'll see, the 0/1 knapsack problem

  • is much more complicated because once you make a decision,

  • it will affect the future decisions.

  • Let's look at an example, and I should probably warn you,

  • if you're hungry, this is not going to be a fun lecture.

  • So here is my least favorite because I always

  • want to eat more than I'm supposed to eat.

  • So the point is typically knapsack problems

  • are not physical knapsacks but some conceptual idea.

  • So let's say that I'm allowed 1,500 calories of food,

  • and these are my options.

  • I have to go about deciding, looking at this food--

  • and it's interesting, again, there's things showing up

  • on your screen that are not showing up on my screen,

  • but they're harmless, things like how my mouse works.

  • Anyway, so I'm trying to take some fraction of this food,

  • and it can't add up to more than 1,500 calories.

  • The problem might be that once I take something that's

  • 1,485 calories, I can't take anything

  • else, or maybe 1,200 calories and everything else is

  • more than 300.

  • So once I take one thing, it constrains possible solutions.

  • A greedy algorithm, as we'll see,

  • is not guaranteed to give me the best answer.

  • Let's look at a formalization of it.

  • So each item is represented by a pair, the value of the item

  • and the weight of the item.

  • And let's assume the knapsack can accommodate items

  • with the total weight of no more than w.

  • I apologize for the short variable names,

  • but they're easier to fit on a slide.

  • Finally, we're going to have a vector l

  • of length n representing the set of available items.

  • This is assuming we have n items to choose from.

  • So each element of the vector represents an item.

  • So those are the items we have.

  • And then another vector v is going

  • to indicate whether or not an item was taken.

  • So essentially I'm going to use a binary number

  • to represent the set of items I choose to take.

  • For item three say, if bit three is zero

  • I'm not taking the item.

  • If bit three is one, then I am taking the item.

  • So it just shows I can now very nicely

  • represent what I've done by a single vector of zeros

  • and ones.

  • Let me pause for a second.

  • Does anyone have any questions about this setup?

  • It's important to get this setup because what

  • we're going to see now depends upon that setting in your head.

  • So I've kind of used mathematics to describe the backpack

  • problem.

  • And that's typically the way we deal with these optimization

  • problems.

  • We start with some informal description,

  • and then we translate them into a mathematical representation.

  • So here it is.

  • We're going to try and find a vector

  • v that maximizes the sum of V sub i times I sub i.