  • [VIDEO PLAYBACK]

  • [MUSIC PLAYING]

  • [END PLAYBACK]

  • DAVID J. MALAN: All right.

  • This is CS50, and this is Yale University.

  • Welcome to week seven.

  • So this class marks a transition between the first part of the course, where

  • we have been learning about the magic wonder world of C--

  • an extremely powerful low-level programming language that

  • has allowed us to solve many problems, indeed--

  • to a second part of the course, where we move

  • towards more high-level programming languages, such as Python.

  • In fact, this transition had already begun last week

  • when we learned about HTML, CSS, and the like.

  • And we left off, really, by commenting on the fact

  • that this new programming language, Python, can be conveniently used

  • to write something like the back end of google.com or facebook.com

  • or any web service, really, that accepts some parameters,

  • parses them, possibly looks up some data in a database,

  • possibly stores some data in a database, and gets back

  • to the user with dynamic output.

  • So before we get to that end and to see how Python can be used to indeed write

  • the back end of a web server, it is instructive to see

  • how Python can be used as a tool, really, to do data analysis,

  • much like a data scientist will do.

  • And this is what we are going to see today,

  • diving into the magic world of machine learning.

  • But first of all, what is machine learning, really?

  • So this is a funny sequence of vignettes that I

  • found online on pythonprogramming.net.

  • And they represent the stereotypical life of programmers/researchers

  • in machine learning.

  • So let's see on the top left here.

  • Well, society thinks that they are creating hordes of robots,

  • possibly with the idea to conquer the world.

  • Right?

  • Friends think that they're hanging out with robots, really.

  • Now, parents-- their parents think that programmers in machine learning

  • spend most of their time in data centers with no windows, apparently.

  • What about other programmers?

  • Well, they might think that programmers do fancy mathematics.

  • And what about themselves?

  • Well, they typically think that they're involved with some fancy visualization,

  • data analysis.

  • But at the end of the day, what they really do-- and here we are--

  • is using Python, and not just using Python to implement some algorithm,

  • but really using Python to import some algorithm as we are seeing here.

  • So we do not know Python yet.

  • But this line of code looks extremely readable, doesn't it?

  • There is English there.

  • It says "from sklearn"-- we don't know what this is yet-- "import"--

  • I already mentioned svm, support vector machine.

  • It's a function to run a machine-learning algorithm.

  • So we don't know Python yet, but we're already

  • able to decipher, more or less, what is going on.

  • And indeed, sklearn, as we will see, is a so-called module

  • in Python-- a library, if you wish, in C-- from which we

  • are importing an algorithm, a function.

  • Now, this line exemplifies a characteristic feature of Python--

  • namely, its readability.

  • And it is often the case that Python code is referred to

  • as being similar to pseudocode, precisely for the fact

  • that it allows us to express very powerful ideas with a couple of lines

  • that are extremely readable.

  • Now, this very characteristic of Python, together

  • with our familiarity, at this point in the course, with C,

  • is what will allow us today to quickly see

  • how Python can be used as a data processing tool

  • to implement and run machine learning algorithms.

  • Well, so back to the question-- what is machine learning, really?

  • So as the previous sequence of vignettes was suggesting,

  • in the popular culture, at least, machine learning

  • is often associated with the world of AI, artificial intelligence.

  • Typically, we think of machines in terms of robots.

  • And literally, there are countless science fiction movies

  • about this theme, typically representing some robots turning evil

  • against humanity or something along these lines.

  • Indeed, this is a modern interpretation of an old fear

  • dating back, possibly, to Frankenstein in the 1800s and beyond.

  • But really, if we think of the way machine

  • learning is having an impact on our lives on a day-to-day basis,

  • it's not necessarily related to robots, per se.

  • It has more to do with the following set of applications

  • that we indeed use on a daily basis.

  • So let's just think of search engines, for instance,

  • that allow us to look for whatever we might feel like in the world wide web,

  • and in a matter of literally milliseconds to get back

  • an ordered list of results based on the query that we enter.

  • Think about image recognition, the possibility

  • to catalog, to search for an image based on its subject.

  • Think about speech recognition software.

  • These days, we can all talk to our phone and ask, what's the weather like

  • or show me the movies around me.

  • Or finally, to complete the four sets of applications

  • I mentioned, the world of natural language processing,

  • the amazing ability to translate a document of text from one language

  • to another in real time or the ability to, say, infer

  • the meaning, the semantics of a document with no human intervention.

  • So these are just a few of the applications that are indeed backed up

  • by machine learning algorithms.

  • And here by the word machine, we really mean an algorithm running, most likely,

  • on the cloud in a data center.

  • And today, we use these applications on a daily basis.

  • And so have you ever wondered what they're all about?

  • How can we get started thinking about designing one of these applications?

  • And this is what we're going to learn today.

  • So in particular, we will be focusing on two applications-- image recognition

  • and natural language processing.

  • But before that, let's go back to week zero.

  • So we have seen this diagram early on in the course.

  • It represents a general-purpose algorithm.

  • The black box is an algorithm that takes some input from the left-hand side.

  • It processes it and delivers an output to the user.

  • Now the class of applications that we're going to see today

  • very much fits into this framework.

  • Just think about image recognition, for instance.

  • Well, we might want to have an algorithm that

  • takes as inputs images-- say here an image of a horse, image of a car--

  • and is capable of realizing or recognizing that there is, indeed,

  • a horse or a car in that image and get back to us

  • with strings, such as "this is a horse."

  • Or, well, in the world of natural language processing,

  • think about an algorithm where we could do something like the following--

  • say I want to pass as an input to this algorithm one of my favorite novels,

  • 1984 by George Orwell.

  • So this is just a part of it-- "BIG BROTHER is watching you."

  • But say that we really want the entire book to be fed as an input

  • to this algorithm.

  • We want the algorithm to be able to recognize the semantics of the book.

  • So we want the algorithm, with no human intervention,

  • to be able to get back to us and tell us, look, Patrick,

  • this book is about politics.

  • It's about propaganda.

  • It's about privacy.

  • So in both of these two applications-- image recognition and natural language

  • processing-- it is already clear that what the machine learning

  • algorithm is doing is trying to infer some hidden structure in the input,

  • right?

  • And if we phrase the problem like that, well, we have seen something like that

  • earlier on in the course-- in fact, many examples of a scenario

  • where we are given an input with some hidden structure,

  • and we want to design-- you have done that in problem set four--

  • an algorithm that can decipher, that can crack the input and give us an output.

  • So this is Whodunit.

  • And per the specification, we were given an image with some red noise.

  • And you were told to design an algorithm to get back with the hidden message.

  • So in this respect, you might think applications such as image recognition

  • seem to share similar behavior, don't they?

  • But there is a key difference.

  • Can anyone spot the difference?

  • We're getting started.

  • So one difference in the picture is that instead of just one image,

  • there are two images.

  • And indeed, that is somehow to the point in the sense

  • that with problem set four, we were given an image with a hidden message,

  • indeed.

  • But we were told the structure of the hidden message.

  • In other words, we were told that the image had a lot of red noise,

  • and we were told that in order to decipher, to really find

  • the hidden message, we should lower the intensity of the red pixels

  • and possibly raise the intensity of the other pixels.

  • But in machine learning applications, this

  • is not typically what we have in mind.

  • So we want an algorithm that can work with any sort of image

  • we may want to feed it with.

  • And this is what is bringing us to one of the key differences

  • in the class of application that we are going to see,

  • namely the fact that it is the algorithm itself

  • that is trying to find out the hidden structure in the input.

  • And the way it can do this is by having access to what is called training data.

  • In other words, even if we want to design an algorithm that

  • is capable of telling us, Patrick, look, this is a horse, in order to work,

  • this algorithm needs to have access to a lot of images of horses.

  • And based on these images, it is able to figure out the hidden structure

  • so that once you feed, again, an image of a horse, it can say,

  • this is a horse.

  • And this is what we are going to see today,

  • starting precisely from the example of image classification.

  • So what is the example we're going to play with?

  • It's the following-- suppose that we are given a data set of handwritten digits.

  • So there are 10 digits from zero to nine.

  • And for each of these digits, we are given

  • a collection of handwritten images representing that digit.

  • So for instance, for digit zero, we are given many of these images.

  • And each of them shows a different way of writing the digit zero.

  • So what we set as our goal for today is to actually design, in this case,

  • an algorithm that can take an image of digit zero

  • or an image of digit six or any other digit

  • and, indeed, get back to us with the string "0" or "6" or so on.

  • And the way you can do that, as we will see,

  • is by having access to the so-called training

  • data that is inferring the [INAUDIBLE] structure behind the images

  • we want to use.

  • But before we get to talk about images, let us just abstract a little bit.

  • And let us think that we live in a one-dimensional world, a line land,

  • if you wish, where the only thing that we can see is a straight line.

  • Say we go to school, and the first thing that we are told on day one of school

  • is that, OK, I see a point here.

  • The teacher is telling us, well, look, Patrick.

  • This is number zero.

  • Then we go again, day two of school.

  • We see another point.

  • The teacher is telling us, this represents a zero.

  • Day three, we see another point.

  • It's the only thing that we can see in a one-dimensional world.

  • And the teacher is telling us, well, this is a six.

  • And so on.

  • So day four, we see another point.

  • This represents number zero.

  • And so on.

  • So this is representing number six.

  • This will represent the number six.

  • So in other words, we have been exposed to a labeled training

  • set of data-- points and the associated label corresponding

  • to the digits they represent.

  • So say now that we are presented, the next day of school,

  • with a so-called test point over here.

  • And we are asked, what should this number be?

  • What should this point represent?

  • Which digit?

  • Anyone?

  • Number six, indeed.

  • And congratulations because this is the first example.

  • This is the first machine learning algorithm

  • that we are going to discuss today, easy as it sounds.

  • So there is a lot going on.

  • First of all, just appreciate the fact that I haven't

  • told you anything about the structure.

  • I just presented to you a set of points.

  • I didn't tell you anything about the game.

  • It's not like in problem set four, where we

  • were told about the specifics of the image to crack.

  • In this case, I was just presenting you with points.

  • And then at a certain moment, I was asking you, OK, what is this point?

  • And as you have guessed, well, this should represent the number six.

  • Why?

  • Well, because somehow, we have figured out

  • that the points representing number zero are grouping in one side of the line.

  • Points representing number six are grouping in another side of the line.

  • And so in our head, what we are doing, we're

  • looking at the closest point among the ones

  • that we have been exposed to previously, so the point with the minimal distance,

  • so the so-called nearest neighbor, and just

  • label the new point that we're given-- the test

  • point-- with the label of the closest point we've been exposed to.

  • So this algorithm is called nearest neighbor classifier.

  • And indeed, this is what we're going to use to classify handwritten digits.
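
The one-dimensional version of this rule can be sketched in a few lines of Python. This is only a minimal illustration: the positions on the line and the function name `nearest_neighbor_1d` are made up here, since the lecture shows the points only as a picture.

```python
# Hypothetical 1-D training set: (position on the line, label).
# The positions are invented for illustration.
training = [(1.0, "0"), (1.5, "0"), (2.2, "0"), (6.0, "6"), (6.8, "6"), (7.5, "6")]

def nearest_neighbor_1d(x, training):
    """Return the label of the training point closest to x."""
    # In one dimension the distance is just the absolute difference.
    closest_position, closest_label = min(training, key=lambda pair: abs(pair[0] - x))
    return closest_label

print(nearest_neighbor_1d(7.0, training))  # prints 6
print(nearest_neighbor_1d(1.2, training))  # prints 0
```

Nothing about zeros or sixes is coded into the function itself; the answer comes entirely from the labeled points it is given.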

  • But before we get to that, let's assume that we get promoted.

  • And so on the second year of school, we now see the world in two dimensions,

  • much like a flat land.

  • And we are exposed to a set of points-- again, a labeled set of points.

  • So this point represents number zero.

  • This is the number six, number six, number zero, number six,

  • number zero, six again.

  • And finally, we're presented with a test point.

  • So any guess on what this test point should represent?

  • It's the number zero.

  • And indeed, same reasoning, just done in two dimensions-- nearest neighbor

  • classifier.

  • So in a way, this is very much intuitive.

  • And what we are left with is the question of, OK,

  • can we map an image into a point in a space?

  • Because if we were able to do that, if we can take an image of a digit

  • and just interpret that image as a point in a space, then

  • we could repeat exactly the same procedure I told you.

  • In this case, there are no points.

  • There are images in this abstract space.

  • But we have access to a labeled training set.

  • So we know what these digits represent.

  • And then when we are asked, OK, what is this new digit?

  • Well, we apply the same logic, right?

  • And this is what we are going to see.

  • Indeed, the world can be seen as having more than one

  • or two or three dimensions.

  • And much of machine learning is really about interpreting

  • the data we are given as points in some sort of high-dimensional world.

  • And in making this jump, we might feel a little bit like the character

  • in one of my favorite novels, namely, Flatland by Edwin Abbott Abbott.

  • This is the [INAUDIBLE] of the first edition in 1884.

  • The plot of Flatland is the following-- the story

  • describes a two-dimensional world occupied by geometric figures.

  • The narrator is a square named A Square who

  • guides the reader through some of the implications of life in two dimensions.

  • On New Year's Eve, A Square dreams about a visit to a one-dimensional world,

  • Lineland, inhabited by lustrous points, in which

  • he attempts to convince the realm's monarch of a second dimension,

  • but he's unable to do so.

  • Following his vision, A Square is himself visited

  • by a three-dimensional sphere named A Sphere,

  • which he cannot comprehend until he sees Spaceland, a three-dimensional world.

  • So many movies have been produced on this story line.

  • Let's see a trailer of one of those.

  • [VIDEO PLAYBACK]

  • -Imagine a vast plane, a world of only two dimensions

  • on which triangles, squares, pentagons, and other figures live

  • and move freely about.

  • -Configuration makes the man.

  • -Get to your squarical!

  • Now!

  • -You're only a square.

  • -Thanks, brother.

  • -They know nothing of our three-dimensional world.

  • -Such a notion is, of course, absurd and, furthermore, illegal!

  • -But that's all about to change.

  • -Where did you come from?

  • -I come from space, the third dimension.

  • -No, no, no, no, no, no, no, no!

  • No!

  • You're not serious!

  • -Based on the beloved novel by Edwin A. Abbott.

  • Tonight, our world faces a grave threat.

  • [END PLAYBACK]

  • DAVID J. MALAN: Right.

  • And then it goes on.

  • I love it.

  • It's one of my favorites.

  • You should watch the movie.

  • And so indeed, we are ready to go beyond Lineland, Flatland, Spaceland.

  • So let's do it.

  • So here I just represented what Lineland looks like.

  • Just one line, right?

  • The only thing that we can see in one dimension are really points.

  • And here I wrote two points for us.

  • What we have here is the coordinate system,

  • just to be able to measure distances, so to speak.

  • So in this case, this point is at location one.

  • This is what this one represents.

  • And this point is at location four.

  • So this is the picture in one dimension, Lineland.

  • Flatland we have seen-- two-dimensional world.

  • Indeed, we can visualize points here.

  • In this case, each point is represented by two coordinates.

  • We have the horizontal coordinate and the vertical coordinate.

  • So coordinate 1, horizontal, and 2, vertical.

  • And so on for the 4, 4.

  • Now we go to Spaceland-- indeed, a three-dimensional world.

  • And even here, it's pretty easy to visualize what points are, really.

  • There are three coordinates.

  • And so we can simply refer to a point with the associated coordinates.

  • So can we go beyond?

  • Indeed.

  • Indeed we can.

  • It's not that easy to draw points or the like in higher dimensions.

  • But we can indeed think of a point, say, in four dimensions

  • by being referred to by a set of coordinates-- say, 0, 1, 4, 5.

  • Isn't it?

  • So OK, this reference, the axes there don't mean much.

  • We're no longer in three dimensions.

  • Still, they just represent sort of an abstract space.

  • But indeed, I cannot draw four axes here, but we can definitely think of a point indexed

  • by four coordinates as a point living in a four-dimensional world.

  • And so on.

  • If we want a point in a five-dimensional world, well, here we are.

  • And so we can go back to the idea, can we

  • map an image from this data set we have access

  • to to a point in a higher-dimensional space?

  • And as we have seen in problem set four, images

  • are just a collection of pixels-- in this case,

  • for the smile, just a collection of the zeroes and ones,

  • zeroes being associated to the color white, ones to the color black.

  • And even in the data set that we are playing with,

  • each image in this data set is simply an eight

  • by eight array of pixels, whereby each pixel in this case

  • is a number between 0 and 16.

  • So in some sense, we have eight by eight-- so it's 64.

  • So it's really a point in a 64-dimensional space, isn't it?

  • So we can really use this idea and interpret images

  • as points in this 64-dimensional space.
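
One way to picture this mapping, as a sketch only: read the rows of the eight by eight grid one after another, which gives a list of 64 numbers. The pixel values below are invented for illustration and do not come from the actual data set.

```python
# A hypothetical 8x8 grayscale image: every pixel is an integer from 0 to 16.
image = [[(row * col) % 17 for col in range(8)] for row in range(8)]

# Reading the rows one after another turns the grid into a single list
# of 64 numbers, i.e. the coordinates of a point in 64 dimensions.
point = [pixel for row in image for pixel in row]

print(len(point))  # prints 64
```

Any fixed reading order would do, as long as the same order is used for every image, so that coordinate number k always refers to the same pixel.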

  • And now we can indeed run the nearest neighbor

  • classifier we have seen previously.

  • Namely, we are in this 64-dimensional space.

  • We see labeled images that are presented to us.

  • We are exposed to it, if you wish.

  • This represents a six, a six.

  • This represents a zero.

  • And so on, until we are presented with a test point again.

  • Fine.

  • We know how to deal with it.

  • We just simply assign to this test point the label of the training point

  • that is closest to it.

  • So the only thing that is missing here is a notion of distance, isn't it?

  • So let's see how we can go about thinking about this.

  • So in a one-dimensional world, it's pretty easy to compute distances.

  • Indeed, the distance between these two points is simply what?

  • This line, right?

  • So it's simply 4 minus 1, 3.

  • In a two-dimensional world, a little bit more

  • complicated, but we are still able to do so.

  • If this is the distance we want to have access to,

  • well, we have the old Pythagoras's theorem.

  • So we first compute this distance-- so the difference

  • between the horizontal coordinates of the two points, which is 4 minus 1,

  • is a 3.

  • Then we square it.

  • And we add the vertical distance between the two points, which is 4 minus 2-- 2.

  • And then we square it.

  • And then we take the square root.

  • Isn't it?

  • So in this case, well, it will be the square root of 13.

  • So even in a three-dimensional world, it's the same idea.

  • In order to get this distance between the two points,

  • we can simply work coordinate-wise.

  • So we can take the distance between the first coordinate, which is a 3,

  • square it, plus the distance between the second coordinate, which is a 2,

  • square it.

  • Plus the distance between the third coordinate, which is a 3,

  • and square it again.

  • And we can take the square root of this.

  • So we indeed have a formula to compute distances.

  • And this formula doesn't simply hold in one-, two-, or three-dimensional space.

  • It holds in as many dimensions as we may want to work in.

  • And so we have all the ingredients to run the nearest neighbor

  • classifier at this point.
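
The coordinate-wise recipe above can be written as one short function that works for any number of dimensions. This is a minimal sketch; the name `euclidean_distance` is chosen here for illustration and is not taken from the lecture's code.

```python
import math

def euclidean_distance(p, q):
    """Distance between two points with the same number of coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same dimension")
    # Square the difference in every coordinate, add them up, take the root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

print(euclidean_distance([1], [4]))        # Lineland: prints 3.0
print(euclidean_distance([1, 2], [4, 4]))  # Flatland: the square root of 13
```

The same call works unchanged on two lists of 64 pixel values, which is all the nearest neighbor classifier needs.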

  • So just to give you an idea what these distances look

  • like, say we want the distance between these two images,

  • thought of as points in this 64-dimensional space.

  • Well, in this case, if we apply this formula coordinate-wise,

  • we get a distance of 31.98.

  • Say now we want to consider the distance between an image representing

  • digit zero and an image representing digit six.

  • Well, we do the math, and we find out 45.97, which is, indeed, bigger

  • than what we had previously.

  • Previously, we had 31 as the distance between two images representing

  • the same digit.

  • So it should be smaller than a distance between two images representing

  • different digits.

  • So at this point, we are ready to see some Python code.

  • So we are going to actually implement the nearest neighbor

  • classifier just described to you in abstract terms

  • by using this data set.

  • Again, for each digit, we have access to a collection of different images

  • representing the digits.

  • So let's get to see some Python.

  • First of all, let me show you what Python is.

  • We can go to the CS50 IDE and simply fire up the Python interpreter

  • by writing, python.

  • And here we are.

  • We're inside the interpreter, so to speak.

  • At this point, we can run Python code.

  • So let's see.

  • We can write, x = 3.

  • y = 5.

  • x + y equals-- guess what?

  • 8.

  • Amazing, isn't it?

  • Coming from the world of C, this is totally amazing.

  • Many things happening here.

  • First, there is no need of declaring variables whatsoever.

  • I didn't write int x = 3.

  • I simply wrote x = 3.

  • And in fact, we might write something like x = 'a'.

  • y = 'b'.

  • And guess what-- x + y is equal to the string 'ab'.

  • So in Python, there is no difference between single quotes

  • and double quotes, unlike in C. And indeed, we do not

  • need to define what a variable is.
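
For reference, the interpreter session described above looks roughly like this when written as a small script. It is a sketch of the same idea, with `print` calls added so the results show up when run as a file.

```python
x = 3          # no type declaration: x now refers to an integer
y = 5
print(x + y)   # prints 8

x = 'a'        # the same name can later refer to a string
y = "b"        # single and double quotes both create strings
print(x + y)   # prints ab
```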

  • Another key difference with respect to C is the fact that it's immediate.

  • There is no clang or make or the like.

  • There is no need for us to call the compiler.

  • I was simply running the code, and the interpreter

  • was in fact interpreting or running the code line by line.

  • So indeed, there is a compiler behind the scenes.

  • But we do not need to get involved with this.

  • This is one of the beauties of Python.

  • And in fact, coming from the world of C, we can read off Python code fairly easily

  • at this point.

  • Now, the syntax is a little bit different.

  • So let's see what a for loop will be.

  • for i in 3, 5, 7, print i.

  • And ta-da, this is a for loop.

  • So a few syntax differences, right?

  • First of all, it is more like a for each loop,

  • where we loop through each element in this array, if you wish.

  • And then there is a colon that we don't even see.

  • More interestingly, there are no brackets, no curly brackets.

  • So how is Python knowing that the block of code

  • we want to execute inside the for loop is the line "print i"?

  • Well, it goes by indentation.

  • So if I were to repeat the line of code that I just

  • presented to you but without the indentation here,

  • you will see an error.

  • So in Python, there is no need of curly brackets.

  • It's simply a matter of the indentation you use.

  • And it is good style to use indentation even in C code,

  • so why the curly brackets after all?
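
Written out, the loop just described might look as follows. This sketch assumes Python 3, where `print` is a function; the extra `print("done")` line is added here only to show where the loop body ends.

```python
for i in [3, 5, 7]:
    print(i)   # indented, so it belongs to the loop body
print("done")  # not indented, so it runs once, after the loop
```

Removing the four spaces in front of `print(i)` makes Python reject the code with an indentation error, which is the error mentioned above.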

  • So OK, I could carry on like this and show you some more Python

  • code within the CS50 IDE.

  • But for the sake of exposition, let me actually

  • close this and go to the following way of presenting code to you.

  • So these are called markdown type notebooks.

  • So the idea is that while the code is indeed run line

  • by line as Python is doing, this way of presenting the material

  • is allowing me to group lines of code together.

  • So here it is, what I was writing earlier

  • in the CS50 IDE, the same line of code, same line of code again

  • when it comes to manipulating strings, the for loop

  • that I presented to you, and so on.

  • So as we see again, we don't know the syntax yet,

  • but we can read off what is happening coming from C.

  • This is an if statement, for instance.

  • Again, there are syntax differences.

  • Indeed, there are no brackets here.

  • There is the colon and there is the indentation.

  • But we can really decipher what is going on.
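
The if statement from the notebook is not reproduced in the transcript, so the following is only an illustrative stand-in with made-up values. It shows the same ingredients: no parentheses around the condition, a colon, and indentation instead of curly brackets.

```python
x = 3
if x > 2:
    message = "x is greater than 2"
else:
    message = "x is at most 2"
print(message)  # prints x is greater than 2
```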

  • So this is one of the beauties of Python, and we are going to rely on this beauty now

  • and today to actually parse almost line by line some blocks of code that

  • will allow us to quickly jump into the action

  • and see some meaningful, cool output from machine learning algorithms.

  • So let's do that.

  • We go to the world of supervised learning, namely image recognition.

  • But before we get to that, what is supervised learning?

  • So the classes of applications in machine learning typically

  • fit into either supervised learning or unsupervised learning.

  • Today we are going to see both.

  • So the example of image recognition fits into the category

  • of supervised learning, as we have access to a labeled,

  • as I mentioned and stressed earlier, data set.

  • So we are presented with a set of points or images

  • with a label associated to it.

  • As we have this label, we're doing supervised learning.

  • So let's start with points, and let's see how Python

  • can be used in Flatland, if you wish.

  • So Python has many built-in data types and functions.

  • Indeed, we are going to see much more of this next week.

  • But when it comes to data processing, data science type of application,

  • really typically, people rely on external data structures and functions.

  • And so this is what we are going to do today.

  • And this is the line of code that I'm writing here.

  • I'm importing two modules, they're called-- library,

  • if you wish, in the world of C. The first module is numpy.

  • It's one of the main modules for scientific computing.

  • The second module is matplotlib.

  • It will allow us to easily plot graphs.

  • And the third line of code, don't pay attention to it.

  • It is simply required there in this notebook style

  • to tell the notebook just print whatever graph we

  • are producing in line with the output.

  • So one of the cool things about Python is

  • that it is an extremely popular language.

  • And being popular, in computer science, is really important,

  • one reason being that there is a lot of code

  • out there that we can simply go and import.

  • If you wish, numpy and matplotlib are along these lines.

  • Someone else has written libraries, modules of code

  • that we can easily import.

  • So indeed, we need to install these modules before we can import them.

  • But for the sake of today, just ignore the fact that we need to install them.

  • So let's just assume that we can easily import these modules with a line of code

  • that I am presenting to you.

  • OK, let us create a training data set in Flatland.

  • So here is the code.

  • The first line creates a numpy array, meaning an array

  • within this numpy scientific module.

  • So it's very simple.

  • We're creating an array of six points.

  • Each point is a two-dimensional point, so it is indexed by two coordinates.

  • And so in the second line, instead, we are

  • creating, in this case, a built-in Python list, as we will see.

  • But it's simply an array of strings, if you wish.

  • And each element in the array Y_train is simply a color, name of a color.

  • We have three red strings and three blue ones.

  • So shortly, we will plot this collection of points.

  • But before we get to do that, let me just

  • show you some cool features of the Python syntax.

  • So X_train is an array of six points, each of them being

  • a two-dimensional vector, if you wish.

  • But you can also interpret this array as being a two-dimensional array.

  • And so we can use this syntax that here I present to you.

  • So the print function is simply printing whatever is inside it.

  • And so with this line of code, X_train[5, 0], what we are doing,

  • we are taking the element at index 5 in the array of points.

  • So Python is a zero-indexed language, much like C. So we start counting from zero.

  • And so in this case, the point at index 5 is the last one in this array--

  • so, namely, 7, 6.

  • And so we take that point, the last one in that array,

  • and we can print the zeroth index coordinate.

  • So if we do that, the interpreter is outputting

  • 7, which is indeed the horizontal coordinate, if you wish,

  • of the last point in the collection.

  • Now, we can do the same while changing from the horizontal

  • coordinate to the vertical one.

  • And so we get 6, which is here.
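
The lookup can be tried with plain nested lists as well, as a rough sketch. Only the horizontal coordinates 1, 2, 3, 5.5 and the last point (7, 6) are mentioned in the lecture; the remaining numbers below are assumptions made for illustration.

```python
# Six 2-D points stored as a list of [horizontal, vertical] pairs.
# Values not mentioned in the lecture are assumed.
X_train = [[1, 1], [2, 2.5], [3, 1.2], [5.5, 6.3], [6, 9], [7, 6]]

last_point = X_train[5]   # index 5 is the sixth (and last) point
print(last_point[0])      # horizontal coordinate: prints 7
print(X_train[5][1])      # vertical coordinate: prints 6
# With a numpy array the same lookups are written X_train[5, 0] and X_train[5, 1].
```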

  • So one other key feature of Python-- we are

  • going to get to see a little bit of it today--

  • is the so-called slicing syntax.

  • So slicing syntax is a convenient way to extract a collection of elements

  • from an array, as in this case.

  • So in this case, what we are doing, we are taking-- the first line of code

  • is saying, OK, Python.

  • Just look at all the points in the array.

  • Take the horizontal coordinate, the zeroth coordinate, and just print

  • all the elements in the horizontal coordinate.

  • Indeed, if you were to compare, we have a 1, a 2, 3, 5.5.

  • These are all the first, the horizontal, components of each point.

  • And so with the second component.

  • So this is one of the uses of this slicing syntax.
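
With a numpy array, taking one coordinate of every point is typically written `X_train[:, 0]`. As a hedged, plain-Python counterpart that needs no extra module, the same columns can be pulled out with list comprehensions; the point values are again partly assumed.

```python
X_train = [[1, 1], [2, 2.5], [3, 1.2], [5.5, 6.3], [6, 9], [7, 6]]  # partly assumed values

# Plain-list counterpart of the numpy slice X_train[:, 0]
horizontal = [point[0] for point in X_train]
# Plain-list counterpart of the numpy slice X_train[:, 1]
vertical = [point[1] for point in X_train]

print(horizontal)    # prints [1, 2, 3, 5.5, 6, 7]
print(X_train[:3])   # ordinary list slicing: the first three points
```

The two resulting lists are exactly what a scatter plot needs: one sequence of horizontal coordinates and one of vertical coordinates.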

  • And what we can use this syntax for is to conveniently plot this point.

  • So now we are using-- OK, plt.figure is simply saying,

  • Python, just expect that I'm going to print a figure.

  • And the last line, as well, the .show, is simply saying, just output the plot.

  • So the magic, really, everything is happening in the line in between,

  • the scatter function that is contained, again,

  • in the matplotlib module we are importing.

  • So this scatter function takes two arrays, an array of horizontal

  • coordinates, if you wish, that we have access to through the slicing syntax,

  • and an array of vertical coordinates.

  • And then we have s equals 170-- simply the size, if you wish, of this point.

  • And the color yellow.

  • This is one of the beauties of Python, again.

  • We are using the slicing syntax again from the label arrays.

  • Recall that Y_train was an array of colors.

  • So every time we are plotting a point, we

  • are also taking the corresponding color from the array Y_train.

  • So if we do that, just appreciate that.

  • With essentially one line of code, we have a plot here.

  • And there are indeed six points.

  • Three of them are red.

  • Three of them are blue.

  • So this represents our so-called labeled training set.

  • Instead of having digits zero and six, now we have color red and blue.

  • The idea is the same.

  • So what we can do, we can now add a so-called test point,

  • say at location 3 and 4.

  • So why don't we print it?

  • We use the same line of code as before to print.

  • We add this new line with the new test point.

  • And we have it plot with the color green.

  • And this is the output.
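
  • A sketch of the plotting steps, under some assumptions: the coordinates and the order of the colors in Y_train are illustrative, and the figure is written to an in-memory buffer so that the code also runs without a display (in an interactive session, plt.show() would take its place).

```python
import io

import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

# Assumed training set: three red points and three blue points.
X_train = np.array([[1, 1], [2, 2.5], [3, 1.2], [5.5, 6.3], [6, 9], [7, 6]])
Y_train = ["red", "red", "red", "blue", "blue", "blue"]
X_test = np.array([3, 4])  # the test point at location 3, 4

fig = plt.figure()
# One color per point: the i-th point is drawn with the i-th label.
train_plot = plt.scatter(X_train[:, 0], X_train[:, 1], s=170, color=Y_train)
# The test point is drawn in green.
test_plot = plt.scatter(X_test[0], X_test[1], s=170, color="green")

buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```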

  • Flatland.

  • So now what we want to do, we want to run this nearest neighbor classifier.

  • And we know why, right?

  • We simply look at the point that is closer to the green point here.

  • And we associate to the green point either the color red or blue,

  • depending on whatever color the nearest neighbor has-- in this case,

  • red.

  • So in order to do that, we need to define a notion of distance in that.

  • Well, we know what the distance should look like.

  • We have the mathematical formula.

  • Let's write it down in Python.

  • And so you get to see how we can indeed define functions in Python.

  • So here, define-- and again, this is resembling pseudocode,

  • right? def stands for defining.

  • Just appreciate this.

  • So define a function that we call dist that takes two points

  • and returns the following.

  • So let me just parse for you precisely what we are doing here.

  • So the line of code that we want to understand

  • is the following, where we take two points.

  • These could be, in this case, two-dimensional points, each of those.

  • But later on, we are going to see 64-dimensional points.

  • And return with the Euclidean of [INAUDIBLE] distance.

  • So let's parse this.

  • Let's assume that we have two points.

  • y is the test point 3, 4, and x is one of the training points.

  • And we want to see what this line above is doing with respect to these inputs.

  • So first of all, we can take the difference.

  • And the way Python is thinking about this

  • is taking the difference coordinate-wise.

  • This is, again, one of the magic properties of the Python language--

  • namely, we can act, we can apply a function to the entire vector.

  • It's called vectorization.

  • So whenever there are vectors-- in this case, two-dimensional points

  • or the like or arrays-- we can apply a function to each element of the array.

  • So this is one case, if we take the difference,

  • Python will automatically take the difference of each coordinate at once.

  • So the difference between 1 and 3 is indeed minus 2.

  • And the difference between 1 and 4 is indeed minus 3.

  • So now we can take the power, the second-- we can square it.

  • And again, what Python is doing with this code

  • is simply thinking coordinate-wise.

  • So it's taking minus 2, and taking the square of it.

  • That's 4.

  • Taking minus 3, and taking the square of it. That's 9.

  • And so on.

  • Now we can use a function that comes with the numpy module simply summing

  • the two coordinates.

  • So 4 plus 9 is indeed 13.

  • And then we can take the square root of it.

  • So this is what this single line of code is doing.

  • Just appreciate the beauty behind that.

  • And indeed, we can define a function that simply returns

  • the distance between any two points.
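
  • The walkthrough above can be sketched step by step. The function name dist and the points 1, 1 and 3, 4 come from the lecture; the intermediate variable names are only for illustration.

```python
import numpy as np

def dist(x, y):
    # Euclidean distance: coordinate-wise difference, square, sum, square root.
    return np.sqrt(np.sum((x - y) ** 2))

x = np.array([1, 1])  # one of the training points
y = np.array([3, 4])  # the test point

difference = x - y               # [-2 -3], taken coordinate-wise
squared = difference ** 2        # [4 9]
total = np.sum(squared)          # 13
print(difference, squared, total)
print(dist(x, y))                # square root of 13, roughly 3.6

# The same function works unchanged for 64-dimensional points.
print(dist(np.zeros(64), np.ones(64)))  # square root of 64, which is 8.0
```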

  • And let us just compute for each point in the training set,

  • we compute the distance with respect to the single point, the test point.

  • So what we are doing here, we are computing.

  • There are six points in the train set, three red and three blue.

  • So we are taking the distance from the green point

  • to any other of these six points in the training set.

  • And so this is what this block of code is doing.

  • The length function is simply returning the amount

  • of points in the X_train array.

  • Then we initialize an array of zeros with six zeros.

  • And then this is simply a for loop that is computing the distance

  • using the dist function we defined between all these six pairs of points.

  • And this is the output that we print with the function print.

  • So this is an array where each element represents the distance with respect

  • to the test point.

  • Then what we can do to just complete our classifier

  • is just choose the point that has the closest distance.

  • In this case, it will be the second one, as we can see.

  • The distance is 1.8 here.

  • And we can simply print the color associated with that point.

  • And indeed, if we go back to the picture here,

  • we see that point number two here is the closest to the green dot.

  • And so here it is.
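
  • Putting the pieces together, a nearest neighbor classifier might look like the sketch below. The coordinates and the order of the labels are assumptions chosen to agree with the lecture (the second point is the closest, at a distance of about 1.8); the variable names are illustrative.

```python
import numpy as np

def dist(x, y):
    # Euclidean distance between two points of any dimension.
    return np.sqrt(np.sum((x - y) ** 2))

# Assumed labeled training set and test point.
X_train = np.array([[1, 1], [2, 2.5], [3, 1.2], [5.5, 6.3], [6, 9], [7, 6]])
Y_train = ["red", "red", "red", "blue", "blue", "blue"]
X_test = np.array([3, 4])

num = len(X_train)        # number of training points: 6
distance = np.zeros(num)  # one slot per training point
for i in range(num):
    distance[i] = dist(X_train[i], X_test)
print(distance)

min_index = np.argmin(distance)  # index of the nearest neighbor
print(Y_train[min_index])        # color assigned to the test point
```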

  • So just appreciate the beauty.

  • With literally three, four lines of code,

  • we can run a machine learning algorithm.

  • In fact, with using some more Pythonic, as it is called, syntax,

  • we can even get down to far fewer lines of code.

  • So OK, this is the case for points.

  • But let's go back to the image classification example.

  • Let us see how we can exactly take the precise lines of code

  • that I showed to you and apply it to images in the digit data set.

  • So this is what we are doing.

  • We are importing from the module sklearn--

  • it's a common module in machine learning for Python.

  • We are importing a data set that is the digit data set.

  • We call it digits.

  • So the digit data set contains 1,797 images in that.

  • And there are multiple structures within this database.

  • In particular, there are two arrays-- an array of images, which is called

  • digits.images, an array of labels, which is called digits.target.

  • So let us see.

  • Let us print the first element in the array digits.images.

  • So this is what it contains.

  • It is an eight by eight collection of pixels that indeed represents

  • the number zero, as we can see by actually

  • plotting with that line of code that I showed to you,

  • quickly mapping each pixel from 0 to 16 included to an intensity of black.

  • So indeed, we can realize that, well, this is a zero.

  • There are a few zeroes in the middle.

  • And indeed, if we plot it with this line of code, we indeed get number zero.

  • So this is just the first element indexed by zero in the data set.

  • And we can indeed also plot the true label

  • that comes with digits.target.

  • And it is a zero.
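
  • A minimal sketch of loading and inspecting the digit data set, assuming scikit-learn is installed. load_digits, images, and target are part of sklearn.datasets; the variable name first_image is illustrative.

```python
from sklearn import datasets

digits = datasets.load_digits()

print(digits.images.shape)  # (1797, 8, 8): 1,797 images of eight by eight pixels

first_image = digits.images[0]
print(first_image)          # pixel intensities between 0 and 16
print(digits.target[0])     # true label of the first image: 0
```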

  • So this is the data set.

  • And what we want to do is to run the nearest neighbor classifier

  • to this data.

  • So in particular, what we are doing-- so let's see.

  • This data set, again, something like this.

  • What we are doing, we are saying, OK.

  • Let us consider just a subset of the database.

  • So let us take 10 images.

  • And let us consider these 10 images as being the training data

  • that we have access to.

  • Potentially, we could have access to the entire database.

  • But somehow, we want to split.

  • And this is a common feature in machine learning,

  • to split the original data set into multiple subsets

  • in order to test the performance of the algorithm you are implementing.

  • So this is what we are doing.

  • By selecting 10 images, we essentially have this picture,

  • where each point in this 64-dimensional space represents an image of a digit.

  • And the yellow label here represents the true label

  • we have access to in this training set.

  • So now what we can do, we can say, OK.

  • Let us take a test point.

  • So say that-- take a test point here.

  • And let us assume that we only have access to the image of this test

  • point, which is indeed a three.

  • And we want to test the performance of our machine learning algorithm

  • to classify this point.

  • So indeed, this is the line of code that allows us to only take

  • 10 images out of the original data set.

  • I'm using, again, the slicing syntax, but in this case,

  • a little bit differently in the sense that we are selecting elements

  • from zero included to 10 excluded.

  • This is per the specification of the slicing syntax.

  • The rightmost element is excluded.

  • So indeed, if we use this syntax, we can extract 10 images from it,

  • precisely like in the picture there.
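
  • The "zero included, 10 excluded" rule can be checked on a small stand-in array. The array images here is hypothetical and simply holds the numbers 0 to 19.

```python
import numpy as np

images = np.arange(20)  # stand-in for a data set: the indices 0, 1, ..., 19
train = images[0:10]    # from 0 included to 10 excluded

print(train)            # the ten values 0 through 9
print(len(train))       # 10
print(10 in train)      # False: the right end of the slice is excluded
```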

  • And then we can create a test image.

  • We can choose a random number here in the remaining part of the dataset,

  • say the image corresponding to 345.

  • Indeed, we can plot it with the same line of code I presented to you.

  • It is indeed a three.

  • But this is easy to the human eyes.

  • So we want to see how good of a performance

  • we can get by applying the nearest neighbor classifier.

  • And now the lines of code that I'm going to present to you now

  • are precisely the same lines of code that I

  • presented to you earlier in Flatland.

  • So let's see.

  • This is all together.

  • And indeed, we get that the classifier is returning number three.

  • So the classifier, what it's doing, again,

  • is computing all the distances between this test point

  • and all the other points in the training sets.

  • And it's choosing the point that has the closest distance, the nearest neighbor.

  • And it is assigning the same label of this point-- and in this case,

  • indeed, correct.
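
  • The same steps applied to the digit data set can be sketched as follows. This assumes scikit-learn is installed and uses digits.data, the flattened 64-number version of each image; the variable names are illustrative, and the printed values are not asserted here.

```python
import numpy as np
from sklearn import datasets

digits = datasets.load_digits()

def dist(x, y):
    # Same Euclidean distance as in Flatland, now on 64-dimensional points.
    return np.sqrt(np.sum((x - y) ** 2))

X_train = digits.data[0:10]    # first 10 images, each flattened to 64 numbers
Y_train = digits.target[0:10]  # their true labels
X_test = digits.data[345]      # the test image chosen in the lecture

num = len(X_train)
distance = np.zeros(num)
for i in range(num):
    distance[i] = dist(X_train[i], X_test)

prediction = Y_train[np.argmin(distance)]
print(prediction)          # label of the nearest neighbor
print(digits.target[345])  # true label, for comparison
```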

  • So it should come as a surprise that, indeed, such a simple algorithm--

  • two, three lines of code in Python to implement

  • it-- allows us to get such a good result. But how good of a result

  • is this, really?

  • Well, we can test it.

  • And indeed, we can plot the true solution.

  • We do have access to the true label, which is indeed a three.

  • So let us test how well we are doing with 100 test images.

  • So what we are doing, instead of just testing

  • the performance of our algorithm with a single image,

  • let us consider a set of 100 images.

  • And let us count how many mistakes the algorithm we just implemented gets.

  • So if we run this code-- I won't parse it for you.

  • It's simply, again, taking into account that starting from a number of errors

  • equals 0.

  • And then there is a count here.

  • It's adding plus 1 every time that the algorithm is outputting something

  • that is different from the truth.

  • So if we run this algorithm over a set of 100 test images,

  • we get that we commit 37 errors.

  • So we get 63 correct answers out of 100, which is pretty good, really,

  • for such a simple algorithm, isn't it?

  • But indeed, much like the way humans learn,

  • machine learning algorithms also

  • get better with the amount of training data they have access to.

  • So in this case, we have just chosen a subset

  • of the original data base of 10 images.

  • So what we might try to do is to take a training set which is much bigger

  • and see how well the algorithm is doing with that.

  • So we can do that.

  • We indeed enlarge the training set.

  • Before, it was from 0 to 10 excluded.

  • Now it is from 0 to 1,000 excluded.

  • So it has 1,000 images.

  • We can run exactly the same code as before over 100 test images.

  • And this time, look-- only three mistakes.

  • It's rather surprising, isn't it?

  • I mean, such a simple algorithm.

  • I described it to you starting from Lineland.

  • And basically, the idea was there.

  • Now it was a matter of coding it up.

  • There was a notion of a point in a higher-dimensional space.

  • There was a notion of a distance.

  • But once we figured that out and we code it up in Python,

  • Python doesn't care about the dimension, as we saw.

  • The same distance function that works with two-dimensional points

  • equally works with 64-dimensional points and higher.

  • And so this is what we achieve-- 97% of correctness

  • with, really, five lines of code.
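
  • The error count for the two training sizes can be sketched as below. Which 100 images serve as the test set is an assumption here (the last 100 in the data set), and count_errors is a hypothetical helper, so the counts printed may differ from the 37 and 3 quoted above.

```python
import numpy as np
from sklearn import datasets

digits = datasets.load_digits()

def dist(x, y):
    return np.sqrt(np.sum((x - y) ** 2))

def count_errors(n_train, test_indices):
    # Train on the first n_train images, then classify every test image.
    X_train = digits.data[0:n_train]
    Y_train = digits.target[0:n_train]
    errors = 0
    for j in test_indices:
        distances = np.array([dist(x, digits.data[j]) for x in X_train])
        prediction = Y_train[np.argmin(distances)]
        if prediction != digits.target[j]:
            errors += 1  # add one for every wrong answer
    return errors

test_indices = range(1697, 1797)  # assumed test set: the last 100 images
errors_small = count_errors(10, test_indices)
errors_large = count_errors(1000, test_indices)
print(errors_small, errors_large)
```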

  • So the question is, what if we try the very same algorithm

  • I just presented to you in a data base that looks like more

  • of what we will like to try it with?

  • So this is a popular database.

  • It's called CIFAR-10.

  • It is, again, same idea-- it is a labeled data base

  • that contains, in this case, really, tens of thousands or more

  • of images with a label.

  • So we indeed have 10 labels, as before.

  • But now, instead of the labels being numbers from 0 to 9,

  • the labels are something like airplanes, automobiles, birds, dogs, and so on.

  • So this is just one of the data sets that you can find.

  • And indeed, there are websites such as kaggle.com

  • that host sort of competitions where machine learning researchers

  • and programmers try out their algorithm, and there are challenges going on.

  • So this data set was popular a couple of years ago for one of these challenges.

  • A typical challenge could last anything in between two, three months

  • to even longer-- a year.

  • So it turns out that if we run the nearest neighbor

  • classifier to this new set of images, the performance is 30%.

  • So it is still much better than random guessing.

  • After all, there are 10 categories.

  • So you might suspect that just by random guessing, you get 10% correct.

  • Indeed, you get 30%.

  • But it's not what we would like, is it?

  • And in fact, there are more advanced algorithms that

  • do something a little bit different.

  • But let us see first what could be an issue with the algorithm

  • that we just ran?

  • So this is a training set for the category zero in the previous data set.

  • And this is a few elements, a few images,

  • from the category horse in the new data set.

  • So one difference that pops to the eye immediately

  • is that these are color pictures.

  • And indeed, they are.

  • So, in fact, instead of being eight by eight pixels, they are 32 by 32.

  • So still rather small pictures, but now each pixel

  • is indeed a triple-- RGB, as we saw-- where each coordinate contains

  • a number from 0 to 255.

  • So in some sense, we are in a higher dimensional space.

  • It's not just 64.
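
  • To make the jump in dimension concrete, here is a small sketch that flattens an eight by eight grayscale image and a 32 by 32 RGB image into single vectors. The arrays are placeholders filled with zeros, not real pictures.

```python
import numpy as np

digit_image = np.zeros((8, 8))                       # one number per pixel
color_image = np.zeros((32, 32, 3), dtype=np.uint8)  # an RGB triple per pixel

digit_vector = digit_image.reshape(-1)
color_vector = color_image.reshape(-1)

print(digit_vector.shape)  # (64,)
print(color_vector.shape)  # (3072,), that is 32 * 32 * 3
```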

  • But there is another key difference, isn't it?

  • Anyone?

  • Please.

  • AUDIENCE: The image can be rotated.

  • So you can have a horse that's facing one way or [INAUDIBLE].

  • DAVID J. MALAN: Great.

  • This is definitely one of the issues here, is viewpoint variation.

  • So we indeed have a sequence of pictures representing horses.

  • But the horses are taken from different angles, poses, right?

  • And in fact, there are all sort of issues here.

  • There are viewpoint variations, illumination conditions,

  • scale variation, deformation, occlusions, and the like.

  • And this is what is making image recognition a really tough challenge

  • for machine learning algorithms.

  • Now, what more sophisticated algorithms do

  • is not just interpreting images as collections of pixels, per se.

  • But they work on a higher level somehow.

  • So they group pixels together.

  • Instead of looking at one pixel at the time, what is happening,

  • they sort of try to extrapolate, to abstract

  • some higher-level feature of the code.

  • And this is an example.

  • So the digit zero can indeed be represented

  • as having the following four features, which

  • is four arches and the possible angles.

  • And indeed, we can go higher in the hierarchy.

  • So the state-of-the-art algorithm for this class of application, and not only,

  • is called deep learning.

  • They do work precisely like I just described.

  • So instead of working with the pixels themselves,

  • as at the bottom of this image, they try to group pixels together.

  • And they try to find out patterns if you group a few pixels together.

  • And the first layer of patterns they can extrapolate are edges, for instance.

  • And then from there, you can go another step up.

  • So by grouping edges together, you can come up

  • with objects such as eyes or noses or mouth and the like.

  • And then grouping this set of objects again,

  • you can get something like a face.

  • So this is indeed the logic.

  • Deep learning is really a game changer.

  • The idea behind this type of technology is not new, in fact.

  • It relies on so-called neural networks that were invented

  • in the '80s or probably even earlier.

  • But it's only recently-- literally four years ago,

  • five-- that researchers have been able to use this technology

  • to really achieve amazing results.

  • And now this technology is everywhere, not just for image recognition.

  • Google's search engine uses deep learning.

  • Just name one-- Facebook.

  • The tagging feature for pictures in Facebook is based on deep learning.

  • Speech recognition software-- Siri, Cortana, OK Google--

  • that one doesn't have a fancy name yet-- they are all based on deep learning.

  • And this is really a game changer.

  • It has brought a 10% or more increase in the performance

  • of [INAUDIBLE] algorithm in a matter of literally one jump.

  • Something like that was unseen in the field.

  • And indeed, deep learning is beyond the scope of this class.

  • In fact, you really need a lot of computational power

  • in order to run deep learning algorithms.

  • Not just that, you need tons of data out there.

  • And this is why in the '80s, they couldn't

  • run-- the theory was there, but first of all,

  • we didn't have access to the amount of data

  • we do have access to today, thanks to the world wide web.

  • And back then, we didn't have the processing capabilities

  • that we have now.

  • So while deep learning algorithms are beyond what we can actually

  • run in the CS50 IDE, we can indeed use some tool sets

  • that are out there that are based on deep learning.

  • So this is one example-- TensorFlow by Google.

  • So what you can do, you can actually download within Python,

  • for instance, a trained algorithm for you.

  • So without us trying to train an algorithm ourselves,

  • as we have been doing with the nearest neighbor classifier,

  • we can download a file of 100 megabytes or more for doing image recognition.

  • Another example is the DeepDream generator.

  • So it turns out, as I mentioned, this type of algorithm,

  • they are able to figure out patterns in the input that we feed them with.

  • So what we can do with the DeepDream generator,

  • we can go online, upload whatever picture

  • we might like-- a picture of ourself-- then upload a picture of one

  • of our favorite paintings.

  • And the algorithm is capable of recognizing the painting

  • style behind that painting and apply it to the original image we uploaded.

  • And this is why it's called DeepDream, because apparently dreams work

  • by mixing together images.

  • So if you apply this type of technology to that database

  • I showed to you earlier, we get an amazing performance

  • of 95%, which is indeed close to what the human eye can

  • achieve, in this case.

  • So the question is, is 95% enough?

  • Well, it really depends on the application.

  • And just imagine an example-- self-driving car.

  • As you know, it's a hot area.

  • There are a lot of players out there.

  • Tesla is one of the first players who has brought to the market,

  • really, autopilot-like features.

  • So the autopilot feature by Tesla is allowing the car

  • to shift lanes on the highway automatically,

  • to speed up or slow down based on the traffic, and so much more.

  • Now, this is just an assistant, and the company is making it clear,

  • so someone should always be in control of the car.

  • But it is indeed providing a lot of help.

  • And it turns out that the autopilot feature in cars like Tesla

  • do rely on a variety of technologies, such as GPS,

  • ultrasonic sensors, radars, and so on.

  • But they also do rely on forward-facing cameras

  • with image recognition software.

  • And indeed, they use these so-called deep learning technologies these days.

  • What is the problem?

  • Well, let us see a video.

  • [VIDEO PLAYBACK]

  • [MUSIC PLAYING]

  • [END PLAYBACK]

  • DAVID J. MALAN: So indeed, the reason investigation is going on--

  • and let me actually close this.

  • And as we saw, a driver was on a highway in Florida.

  • He was using the autopilot feature, apparently

  • relying almost exclusively on it, when a tractor trailer drove perpendicular

  • to it.

  • And if you read, from tesla.com, a statement that the company has released

  • after the accident, which happened a few months ago,

  • I read-- "Neither Autopilot nor the driver

  • noticed the white side of the tractor trailer against a brightly lit sky,

  • so the brake was not applied."

  • So it is indeed an issue with image recognition.

  • Apparently, the color of the trailer was whitish, white.

  • And so against a brightly lit sky, the algorithm,

  • although it performs something like 95% of the time correctly,

  • had some challenges.

  • So these are a few of the challenges.

  • And, in fact, the [INAUDIBLE] will be much interested in this respect.

  • Applying this type of technology for self-driving cars

  • will bring a lot of interesting questions in all fields--

  • not just computer science, of course, but politics with policies, ethics,

  • philosophy, and the like.

  • All right.

  • That was it for image recognition.

  • So let's now just move to the next application-- text clustering.

  • So we are going to go a little bit faster about that.

  • The application we have in mind is the following.

  • Say that I want to design an algorithm that takes as an input

  • the following list of movies-- in fact, not just the movie title,

  • but the IMDB synopsis for the movies.

  • So the movies are Robin Hood, The Matrix, The King's Speech, Aladdin,

  • and so on.

  • And I want to design an algorithm that, just solely based on these inputs,

  • is capable to cluster, as it is called, to group

  • these movies into two categories.

  • So if we stare at the list of movies, it might be evident,

  • if you like, that the clustering that we expect is something like that,

  • where we have A Beautiful Mind, The Matrix, The King's

  • Speech in one group and Robin Hood, Aladdin, Finding Nemo in another group.

  • And indeed, if I were to ask you, just group this list of movies

  • into two groups, most likely, this would have been the answer.

  • But your answer would have been based on something

  • different than what the machine will do, as we see, most likely.

  • In fact, you might say, OK, these are the two categories.

  • Because we know from before that, in a way, Robin Hood, Aladdin,

  • Finding Nemo are really more Disney-like movies, right?

  • They're for kids, if you wish, whereas the other are more action-type movies.

  • But again, this way of categorizing, clustering movies

  • is based on some sort of human learning, whereas the machine has only access

  • to the synopsis of the movies.

  • So let's see what will happen if we indeed try to run an algorithm

  • to do this clustering.

  • So as before, we try to abstract.

  • In this case, the set of applications we are discussing now

  • is called unsupervised learning because contrary

  • to what happened before, where we were presented a list of data,

  • a list of points, a list of images with a label associated to it, this time,

  • we are simply presented with a set of points.

  • And here it is in Lineland.

  • This is what we will see-- just a set of seven points.

  • And so if you are asked, OK, just split this set of points into two groups,

  • well, we have an easy way of doing that.

  • And once again, I haven't told you anything about the structure

  • to be inferred.

  • I was just presenting to you with a set of data points and asking to you,

  • just split this group of points into k equals 2 categories, groups.

  • So as before, this is indeed a well-known machine learning algorithm.

  • It's called K-means.

  • K there represents the number of clusters

  • we want the algorithm to divide the original data into.

  • And as before, from one dimension, we can go to two dimensions.

  • And if I ask you, OK, split this group of points into two groups, easy.

  • That's what K-means is doing.
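
  • For the curious, a from-scratch sketch of the K-means idea in numpy. The lecture does not show the algorithm's internals, so this is only one common formulation (Lloyd's iteration); the seven Lineland points and the naive choice of starting centers are assumptions made for illustration.

```python
import numpy as np

def kmeans(points, k, n_iter=100):
    points = np.asarray(points, dtype=float)
    centers = points[:k].copy()  # naive start: the first k points (assumption)
    for _ in range(n_iter):
        # Squared distance from every point to every center, shape (n, k).
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)  # each point joins its nearest center
        new_centers = centers.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:    # keep the old center if a group is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break                   # centers stopped moving
        centers = new_centers
    return labels, centers

# Seven hypothetical points in Lineland, stored as one-dimensional vectors.
line_points = np.array([[1.0], [1.5], [2.0], [2.5], [8.0], [8.5], [9.0]])
labels, centers = kmeans(line_points, k=2)
print(labels)           # [0 0 0 0 1 1 1]
print(centers.ravel())  # [1.75 8.5]
```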

  • What about text now?

  • So before, we somehow had an easy way of mapping

  • an image, a collection of pixels, into a point in a higher-dimensional space.

  • How can we go about thinking-- doing something like that with text?

  • It's not that easy, is it?

  • Because if we were able to do, if we were

  • able to interpret each of these synopses as a point in a space, then

  • most likely, the picture will look like that.

  • And if I ask you to divide this group of movies into two categories,

  • that will be your answer.

  • So indeed, let's see how we can map some text input into a collection,

  • into a point in a higher-dimensional space.

  • So in order to do so, let me just step back for a moment

  • and describe the following, easier type of example, where as an input,

  • we have four strings.

  • So let's read.

  • First string is, "I love CS50.

  • Staff is awesome, awesome, awesome."

  • String A. String B-- "I have a dog and a cat."

  • String C-- "Best of CS50?

  • Staff.

  • And cakes.

  • OK.

  • CS50 staff."

  • String D-- "My dog keeps chasing my cat.

  • Dogs."

  • OK.

  • Say these are the four strings, and we ask, OK,

  • let's split it into two groups.

  • Most likely, what you will guess is that one cluster, one group

  • should include string A and string C. And the other cluster

  • should include string B and string D, based on the semantics, on the meaning.

  • So how can we do this?

  • And indeed, if this is a representation in high-dimensional space,

  • this is what we will get.

  • So the missing step is really this mapping, right?

  • Mapping each of these four strings into a point in a higher-dimensional space.

  • And this is what we can do with the following interpretation.

  • So here, we have the four strings.

  • The first thing that we can do is look at the vocabulary used in these four

  • strings-- namely, extract the words that are used in each of the strings.

  • So if we do not consider so-called stop words-- namely,

  • words such as "I," "is," "a," "and," and the like,

  • which do not provide much meaning, this is the dictionary that we will extract.

  • There is the word "awesome," "best," "cakes," "cats," and so on.

  • And now if we look at each of the strings,

  • we can indeed map each string into a numerical point

  • by using the so-called bag-of-words interpretation, by which each string is

  • simply represented by the word count.

  • So let's see, for instance, the first string.

  • The word "awesome" is used three times.

  • And that's why we have a three there.

  • The word "CS50" in lowercase is used once.

  • The word "love" is used once, again.

  • And "staff" is used once, again.

  • Again, we do not consider the stop words such as "I" and "is."

  • So on-- so the second string can also be represented

  • by a numerical vector as this.

  • And indeed, there are 12 words in this dictionary.

  • So we can indeed think of each of these strings

  • as being a point in a 12-dimensional space, isn't it?
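
  • A plain-Python sketch of the bag-of-words mapping for the four strings. The stop word list and the tokenizing rule below are assumptions made for this example; a library would ship its own, longer list.

```python
import re
from collections import Counter

strings = [
    "I love CS50. Staff is awesome, awesome, awesome.",
    "I have a dog and a cat.",
    "Best of CS50? Staff. And cakes. OK. CS50 staff.",
    "My dog keeps chasing my cat. Dogs.",
]

# Hypothetical stop word list, just large enough for these four strings.
STOP_WORDS = {"i", "is", "a", "and", "of", "my", "have"}

def tokenize(text):
    # Lowercase, keep runs of letters and digits, drop the stop words.
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]

counts = [Counter(tokenize(s)) for s in strings]
vocabulary = sorted(set().union(*counts))  # the dictionary of distinct words
matrix = [[c[word] for word in vocabulary] for c in counts]

print(vocabulary)  # 12 words, starting with 'awesome', 'best', 'cakes'
print(matrix[0])   # first string: 'awesome' three times, 'cs50', 'love', 'staff' once
```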

  • So not so simple.

  • This is a first great step.

  • But we should also normalize by the length of the string, if you wish,

  • just because if a string is very long, then

  • it's more likely that simply the rough counts will be higher.

  • What we can do easily is just divide each numerical vector

  • by the total amount of words in that string.

  • So we get a so-called frequency matrix.

  • So here it is.

  • We have a way to map a string into a point in a high-dimensional space,

  • a 12-dimensional space.
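
  • The normalization step can be sketched in numpy. The count matrix is written out by hand from the four example strings, and dividing by the number of counted (non-stop) words in each string is one reasonable reading of "total amount of words"; both are assumptions of this sketch.

```python
import numpy as np

vocabulary = ["awesome", "best", "cakes", "cat", "chasing", "cs50",
              "dog", "dogs", "keeps", "love", "ok", "staff"]

# Word counts per string (rows A to D), stop words already removed.
counts = np.array([
    [3, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1],  # A
    [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 1, 0, 0, 2, 0, 0, 0, 0, 1, 2],  # C
    [0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0],  # D
], dtype=float)

# Divide each row by its total so that long strings do not dominate.
frequencies = counts / counts.sum(axis=1, keepdims=True)

print(frequencies[0])           # 'awesome' makes up half of string A
print(frequencies.sum(axis=1))  # every row now sums to 1
```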

  • And what we can do, we can apply this algorithm K-means.

  • So let me just show you quickly how this can be done with Python

  • in the realm of unsupervised learning.

  • So now we will move much faster than before.

  • In this case, we're importing the same modules as before,

  • numpy and matplotlib, and creating-- in this case, in the world of Flatland--

  • we're creating an array of seven points, which I here

  • plot with the same exact lines of code as before.

  • But in this case, instead of us implementing the machine

  • learning algorithm from scratch, we can do what stereotypically, at least,

  • most machine learning programmers or researchers will do,

  • which is importing the K-means algorithms from an external module.

  • So if we imported this algorithm, the details of how the algorithm works

  • are beyond the scope of this class.

  • But it wouldn't be that difficult, in fact.

  • But we can reasonably run this algorithm and say, OK,

  • algorithm, cluster this group of points into two groups. k equals 2.

  • If we do that-- I won't parse line-by-line what is happening-- simply

  • running the algorithm with k equals 2, this would be the output.

  • So indeed, the algorithm is capable of figuring out the two groups of points

  • based on their distance.

  • So we can also run the algorithm with k equals 3.

  • After all, we are deciding the number of groups to be created.

  • So if we run with k equals 3, the same lines of code

  • as before-- I'm changing k from 2 to 3-- we get three groups.

  • And so on.

  • With k equals 7, there are seven points.

  • And here it is.

  • So the crosses that are there present in the plot

  • are simply, if you wish, the center of mass,

  • center of gravity of the group-- simply the middle of the group.

  • So this is what we can do easily with points in the world of Flatland.
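
  • A sketch of running K-means from an external module, assuming scikit-learn is installed. The seven points are hypothetical, and n_init and random_state are set only to make the run repeatable. The attribute cluster_centers_ holds the centers that the crosses in the plot mark.

```python
import numpy as np
from sklearn.cluster import KMeans

# Seven hypothetical points in Flatland: four near one corner, three near another.
X = np.array([[1, 1], [1.5, 2], [2, 1.2], [1.2, 1.8],
              [8, 8], [8.5, 9], [9, 8.2]])

for k in (2, 3, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    print(k, model.labels_)        # group assigned to each point
    print(model.cluster_centers_)  # center of mass of each group
```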

  • And we can, in fact, move, very much like we have done now,

  • to the world of documents.

  • And so this is the collection of strings I presented to you.

  • This is the bag-of-words matrix that we can easily

  • construct using some function from the external module sklearn.

  • Again, I won't spend much detail on parsing this code.

  • I want you to appreciate the fact that really, in a few lines of code,

  • we can get something like this running in Python.

  • So we can have a look at the dictionary, is what I presented to you

  • earlier-- "awesome," "best," "cakes," and the like.

  • We can get to the frequency matrix, as before.

  • And we can indeed run K-means with k equals 2.

  • And if we do that, we indeed have the output that we expect,

  • meaning the algorithm is capable of figuring out

  • that the two clusters should be divided by the words "dog,"

  • "cat," "keeps," and the words "awesome," "staff", and "CS50."

  • So this is per the simple example of strings.
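
  • The whole pipeline for the four strings might be sketched like this with scikit-learn. Using TfidfVectorizer with use_idf=False and norm="l1" to obtain word frequencies is a choice made for this sketch and not necessarily what the lecture's notebook does; the library's built-in English stop word list replaces a hand-written one.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

strings = [
    "I love CS50. Staff is awesome, awesome, awesome.",  # A
    "I have a dog and a cat.",                           # B
    "Best of CS50? Staff. And cakes. OK. CS50 staff.",   # C
    "My dog keeps chasing my cat. Dogs.",                # D
]

# Word counts divided by the row total give a frequency matrix.
vectorizer = TfidfVectorizer(stop_words="english", use_idf=False, norm="l1")
frequencies = vectorizer.fit_transform(strings)
print(sorted(vectorizer.vocabulary_))  # the extracted dictionary

model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(frequencies)
print(labels)  # A and C share one label, B and D the other
```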

  • We can go to the more interesting example

  • of movies with the IMDB synopsis.

  • Run precisely the same line of code.

  • Now, the input for the algorithm is the following list

  • of movies with their title and the synopsis from IMDB.

  • And we can easily import this Google spreadsheet

  • into Python using the pandas module.

  • Again, I won't spend much time on it.

  • I want to get to the punch line.

  • This is the line of code to import this data set.

  • Here is, indeed, if we printed these frames [INAUDIBLE] Python,

  • it's the same table as in the Google spreadsheet.
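
As an illustration of loading a table of titles and synopses with pandas, here is a small sketch. It does not use the lecture's spreadsheet: the CSV text is an in-memory placeholder, the column names `title` and `synopsis` are assumptions, and the synopsis text is invented filler rather than the real IMDB text.

```python
import io

import pandas as pd

# Placeholder CSV standing in for the exported spreadsheet (hypothetical rows).
csv_text = """title,synopsis
Frozen,"A princess searches for her sister, the queen."
The Matrix,"A hacker learns the world is a simulation."
"""

# read_csv accepts a path, a URL, or any file-like object such as StringIO.
movies = pd.read_csv(io.StringIO(csv_text))

print(movies)
```

Because `read_csv` also accepts a URL, a spreadsheet that is published as CSV could be read the same way by passing its address instead of the `StringIO` object.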

  • And from this point on, we can take precisely the same code

  • that we applied earlier in the easier example and apply it in this case.

  • And if we do so, just by creating the frequency matrix,

  • running K-means with k equals 2, we get the following output.

  • So indeed, the algorithm is capable of figuring out that one class of movies

  • should include movies such as The King's Speech, Frozen, Aladdin,

  • Cinderella, Robin Hood, and the like.

  • So The King's Speech, we wouldn't really expect that, right?

  • Frozen, Aladdin, Cinderella, Robin Hood-- OK, kids' movies.

  • But The King's Speech?

  • Well, let's hear the other cluster.

  • The other cluster would be Mad Max, The Matrix, No Country For Old

  • Men, and the like.

  • So the way the algorithm is thinking about it when we map it

  • to this higher-dimensional space is by grouping movies together

  • based on the word counts, as we see.

  • And so the reason why it's taking The King's Speech

  • in the same group as Frozen, Aladdin, Robin Hood and so on

  • is because of words such as "king," "prince," "duke."

  • So this is the machine learning behind-- this is what the machine is doing.

  • Again, it might sound a little bit counterintuitive,

  • coming from a human learning point of view.

  • But the machine has only access to those inputs that I showed to you earlier.
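
To make the "king," "prince," "duke" effect concrete, here is a toy sketch. The three word lists are hypothetical and are not real synopses; they only assume that a royal drama and a fairy tale share some vocabulary while an action film shares none. Distance is measured between rows of the word-count matrix.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical word lists standing in for synopses (not real IMDB text).
synopses = {
    "royal drama": "king prince duke speech",
    "fairy tale": "king prince princess castle",
    "action film": "desert chase gun car",
}
names = list(synopses)
counts = CountVectorizer().fit_transform(list(synopses.values())).toarray()


def distance(a, b):
    """Euclidean distance between the word-count vectors of two entries."""
    return float(np.linalg.norm(counts[names.index(a)] - counts[names.index(b)]))


royal_to_fairy = distance("royal drama", "fairy tale")
royal_to_action = distance("royal drama", "action film")

print(royal_to_fairy, royal_to_action)
```

The shared words pull the royal drama closer to the fairy tale than to the action film, which is all the algorithm has to go on when it forms its groups.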

  • So let's just wrap up.

  • This was Python.

  • We have seen, indeed, two important applications in the real world--

  • so the image recognition application and the text clustering application.

  • These are just two applications out of countless applications that

  • are changing our lives on a daily basis.

  • And in fact-- well, at this point, just to mention something more in this.

  • This should ring a bell at this point in the class.

  • It is, indeed, the pyramid in Mario.

  • And indeed, we can run a machine learning algorithm

  • to play video games such as Mario.

  • The way we do that?

  • Well, we can have as a training set, we can have human players play Mario.

  • And we can have an algorithm watching them and learning from them.

  • Or we can have an algorithm watching another algorithm playing.

  • And this is what indeed does happen in the case of Go.

  • So Go is a popular chess-board-like game.

  • It's much like chess.

  • But now the number of combinations is astonishingly large.

  • It's more than the number of atoms in the universe.

  • So playing this game has always been considered sort of hard.

  • And it has been believed for a long time to be

  • outside of the reach of modern applications, modern machine

  • learning algorithms.

  • This was the picture up to a few months ago,

  • in March 2016, when an algorithm made by a researcher at Google

  • played against one of the world champions at this game.

  • And here it is, the world champion, world master of Go, Lee Sedol.

  • So before a series of five games against the machine,

  • Lee released a statement claiming that he would have expected

  • to win something like four to one.

  • Indeed, four to one was the final outcome,

  • but it was the other way around.

  • So the machine won four games to one.

  • And what is really amazing-- this is perceived, again, as a game-changer.

  • Deep learning algorithms are behind this technology.

  • So much attention has been drawn to the game

  • not just because the machine won, in fact, four out of five, but because

  • during the game, really, new plays were made by the algorithm.

  • And the algorithm was trained not just by observing human masters

  • playing the game Go.

  • But it was also trained by looking at itself-- at the algorithm itself--

  • playing against another algorithm at Go.

  • So in having access to this set of training data,

  • the algorithm simply came up with new, astonishing moves

  • that not even commentators of the game were able to comment on.

  • And so this is where we leave off, by watching

  • some of the reactions of these commentators

  • while trying to interpret the moves made by the machine.

  • CS50 will be back next week with more on Python and on web servers.

  • [APPLAUSE]

  • [VIDEO PLAYBACK]

  • -And this is what [INAUDIBLE] from the Google team was talking about,

  • is this kind of evaluation, value--

  • -That's a very surprising move.

  • -I thought it was a mistake.

  • -Well, I thought it was a click miss.

  • But--

  • -If it were online Go, we'd call it a clicko.

  • -Yeah, it's a very strange-- something like this would be a more normal move.

  • -OK, you're going to have to-- so do you have to think about this?

  • -This would be kind of a normal move.

  • And locally, white would answer here.

  • -Sure.

  • -But--

  • [END PLAYBACK]

  • DAVID J. MALAN: Thanks again.

  • [VIDEO PLAYBACK]

  • -All those freshmen were, like, stoked for the first lecture, you know?

  • Like cake, candy, swag, candy.

  • They, like, got their own DJ.

  • -But the first lecture didn't go as planned.

  • -Malan, he had that phone book ready to, like, tear its heart out.

  • -Well, you might first open the phone book roughly to the middle, look down,

  • and--

  • -And then the dude just let it drop.

  • [MUSIC PLAYING]

  • -Hmm.

  • -Mind if I bum one of those?

  • -Oh, sure.

  • -Dude, there was something in those pages.

  • This was like nothing I'd ever seen before on the YouTubes.

  • -Was there any mention of Rosebud?

  • -Rosebud?

  • Is that, like, a programming language?

  • -[EXHALES]

  • [END PLAYBACK]


CS50 2016 - Week 7 - Machine Learning

  • Published by 林宗炫 on January 14, 2021