  • (ding)

  • - Hello, and welcome to a Coding Challenge,

  • Quick, Draw edition.

  • Now, I have been talking about doing this

  • for a very long time, and I'm excited

  • to finally try this on my channel.

  • One of my favorite data sets that is out there

  • in the world is the quick draw dataset.

  • Now, here's the reason, one of the reasons

  • why I'm interested in this is not just this dataset

  • of 50 million drawings, which is interesting

  • and fun to play with on its own,

  • but there is something called Sketch RNN,

  • which was developed by a set of researchers

  • at Google, Google Brain,

  • and you can see some of them here who wrote this paper,

  • and explained how Sketch RNN is a neural network,

  • a recurrent neural network that learned about how

  • to draw various things from the quick draw dataset

  • and then can try and imagine and create new drawings

  • based on how it learned and can even interact

  • and draw with you.

  • So many possibilities.

  • So, this is where I'm going with this.

  • I am going to make...

  • Sketch RNN has recently been added to the ML5 library,

  • (ding)

  • and I'm going to show you an example,

  • and I'm going to build that with Sketch RNN ML5,

  • but I feel like before we start making

  • the artificially intelligent system that generates

  • the drawings, let's look at the actual data itself

  • that it was trained on.

  • So first, where did that data come from?

  • So, and apologies if I get anything wrong,

  • please let me know in the comments,

  • 'cause this is not my project, I am just inspired

  • and enthused by it, so the quick draw project

  • is a project, the AI experiment,

  • made by friends from Google,

  • and it is a game that you can play

  • where you say draw a pencil in under 20 seconds,

  • okay here we go, (vocalizing),

  • - [Robot] I see marker or lipstick.

  • - No. - Or crayon.

  • - No, no that's really like a pencil.

  • If I put an eraser here. - I see rocket.

  • - No, rocket, I'm the worst.

  • Aah.

  • - [Robot] I'm not sure what that is.

  • - Yeah, I don't know what that is either.

  • (ticking)

  • Time is runnin' out.

  • - [Robot] Sorry, I couldn't guess it.

  • - All right, let's try a basketball.

  • - [Robot] I see nose or moon or blueberry

  • or baseball or bracelet.

  • (laughs)

  • - [Robot] Oh, I know, it's basketball

  • (ding)

  • - All right, I win, okay, so you get the idea.

  • I could be stuck here for quite a while.

  • Now, what you might not realize is, when you are playing this game,

  • your doodles are being collected,

  • and over 15 million players have contributed

  • millions of drawings playing Quick, Draw,

  • oh and I've used this before, right,

  • I made an example with a neural network

  • that tried to recognize your drawings.

  • This has been done on my channel before,

  • but what I haven't actually looked at,

  • what I looked at before was I looked

  • at all the drawings as pixels.

  • What's actually, what's interesting about the data,

  • is that the data which you find here,

  • information about it on GitHub, is not pixels,

  • it's actually the pixel paths of the people making

  • the drawings with timing information.

  • So you could load that data and replay any drawing back,

  • and each data, each drawing, has the word

  • that was associated with it,

  • the country where the person is from who drew,

  • at least the IP address presumably,

  • and then whether it was recognized

  • and then the actual drawing itself.
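To make the record layout just described concrete, here is a minimal sketch of one drawing as a JavaScript object. The field names (`word`, `countrycode`, `recognized`, `drawing`) are my reading of the dataset's GitHub documentation and should be treated as assumptions to check against a real line of the file; the values and the `summarizeRecord` helper are invented for illustration.

```javascript
// One drawing record, with made-up values. Field names are assumptions
// based on the dataset documentation; verify them against real data.
const sampleRecord = {
  word: "rainbow",      // the word the player was asked to draw
  countrycode: "US",    // where the player was located
  recognized: true,     // whether the game guessed the word
  drawing: [
    // one stroke: [x positions, y positions]
    [[0, 50, 100], [40, 0, 40]],
  ],
};

// Hypothetical helper: build a one-line description of a record.
function summarizeRecord(record) {
  const strokeCount = record.drawing.length;
  const status = record.recognized ? "recognized" : "not recognized";
  return `${record.word} from ${record.countrycode}: ${strokeCount} stroke(s), ${status}`;
}

console.log(summarizeRecord(sampleRecord));
```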

  • So, what I want to do, and you can see here that the format

  • of the data is a whole bunch of XY positions,

  • XY, XY, XY, with timing, what time was I at the first point,

  • the second point, the third point.

  • Then, I might have lifted up my pen, moved and started doing

  • another one, so it's a bunch of strokes.

  • So this is, it's a little tricky 'cause I can't use

  • the word stroke as a variable name in P5,

  • 'cause stroke is a function that actually sets

  • the pen color, but the idea is that if I do this,

  • it's sampling a bunch of my points,

  • as I drew along that path,

  • each one of these is an XY point associated

  • with a given time, and then there is an array

  • with all of the Xs, all of the corresponding Ys

  • and the corresponding times.
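The stroke layout described above (one array of Xs, one of Ys, one of times) can be sketched as follows. The numbers are invented, the time unit is assumed to be milliseconds since the first point, and `strokeToPoints` and `drawingToPoints` are hypothetical helper names, not part of p5 or the dataset tooling.

```javascript
// One stroke in the raw format: three parallel arrays.
// These values are invented for illustration.
const sampleStroke = [
  [10, 20, 35], // all of the x positions
  [50, 48, 60], // the corresponding y positions
  [0, 16, 33],  // the corresponding times (assumed: ms since first point)
];

// Hypothetical helper: zip the parallel arrays into {x, y, t} points.
function strokeToPoints(strokeData) {
  const [xs, ys, ts] = strokeData;
  return xs.map((x, i) => ({
    x,
    y: ys[i],
    // The simplified files carry no timing, so t may be undefined.
    t: ts ? ts[i] : undefined,
  }));
}

// A drawing is an array of strokes; each pen lift starts a new stroke.
function drawingToPoints(drawing) {
  return drawing.map(strokeToPoints);
}

console.log(strokeToPoints(sampleStroke));
```

Converting to point objects first keeps the replay loop simple: draw a line between consecutive points within a stroke, and never between the last point of one stroke and the first point of the next.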

  • Now, what I'm actually going to use in this video

  • is, there are a bunch of different versions of the data,

  • I'm going to use a simplified version of it

  • because these are huge data files,

  • but I encourage you as an exercise to try to do

  • what I'm going to do but with the non simplified version,

  • maybe with the timing aspect of it,

  • but the simplified drawing files are

  • the same exact thing, the same exact thing,

  • but no timing information,

  • and also they have been sub sampled,

  • meaning in theory, as the person is drawing,

  • as the user is drawing, a lot of points are being captured,

  • but maybe you don't need that level of detail,

  • and that's often referred to as pixel factor

  • or scale factor, I believe, or epsilon value, I guess.

  • You can say simplify all strokes using

  • the Ramer-Douglas-Peucker algorithm,

  • I don't know if I pronounced that correctly.

  • With epsilon value of two.
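For readers curious what that simplification step does, here is a minimal sketch of the Ramer-Douglas-Peucker algorithm. It is a textbook version written for this transcript, not the code used to prepare the dataset; `simplifyRdp` and `perpendicularDistance` are hypothetical names, and points are assumed to be `{x, y}` objects.

```javascript
// Perpendicular distance from point p to the line through a and b.
function perpendicularDistance(p, a, b) {
  const dx = b.x - a.x;
  const dy = b.y - a.y;
  const length = Math.hypot(dx, dy);
  // If a and b coincide, fall back to plain point-to-point distance.
  if (length === 0) return Math.hypot(p.x - a.x, p.y - a.y);
  return Math.abs(dy * p.x - dx * p.y + b.x * a.y - b.y * a.x) / length;
}

// Keep only the points that stray more than epsilon from the
// straight line between the endpoints, recursing on each half.
function simplifyRdp(points, epsilon) {
  if (points.length < 3) return points.slice();
  const first = points[0];
  const last = points[points.length - 1];

  // Find the interior point farthest from the first-to-last line.
  let maxDistance = 0;
  let farthestIndex = 0;
  for (let i = 1; i < points.length - 1; i++) {
    const distance = perpendicularDistance(points[i], first, last);
    if (distance > maxDistance) {
      maxDistance = distance;
      farthestIndex = i;
    }
  }

  // Everything is close enough to the line: keep just the endpoints.
  if (maxDistance <= epsilon) return [first, last];

  // Otherwise split at the farthest point and simplify both halves.
  const left = simplifyRdp(points.slice(0, farthestIndex + 1), epsilon);
  const right = simplifyRdp(points.slice(farthestIndex), epsilon);
  return left.slice(0, -1).concat(right); // drop the duplicated split point
}

const wobbly = [
  { x: 0, y: 0 }, { x: 1, y: 0.5 }, { x: 2, y: 0 }, { x: 3, y: 5 }, { x: 4, y: 0 },
];
console.log(simplifyRdp(wobbly, 2));
```

A larger epsilon throws away more points; with an epsilon of two, small wobbles disappear while sharp corners survive.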

  • So, these are available as something called ndjson.

  • Now, if you've watched my videos before,

  • you're probably familiar with json,

  • JavaScript object notation, that is a format

  • where you can store data

  • that's in JavaScript object notation.

  • I have some videos about what is json.

  • Ndjson is a funny thing, ha ha, so hilarious,

  • it's the most, the funniest version of json, no,

  • and it is actually a set of multiple json elements,

  • each on a different line in a file,

  • so it makes sense that each drawing is

  • its own sort of json object on a different line in a file.
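Since each line of an ndjson file is a complete JSON value on its own, parsing it can be sketched in a few lines of plain JavaScript. `parseNdjson` is a hypothetical helper and the two sample lines are invented; a real file from the dataset has more fields per line.

```javascript
// Hypothetical helper: turn ndjson text into an array of objects.
function parseNdjson(text) {
  return text
    .split(/\r?\n/)                        // one JSON value per line
    .filter((line) => line.trim() !== "")  // ignore blank lines
    .map((line) => JSON.parse(line));
}

// Two invented drawings, one per line.
const sampleText =
  '{"word":"rainbow","drawing":[[[0,10],[5,5]]]}\n' +
  '{"word":"rainbow","drawing":[[[3,9],[7,2]]]}\n';

const parsedDrawings = parseNdjson(sampleText);
console.log(parsedDrawings.length);
```

Note that the file as a whole is not valid JSON, so passing all of it to `JSON.parse` at once would fail; it has to be split into lines first.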

  • So, let's go grab one of these files,

  • so getting the data, we can actually go

  • to the public data sets.

  • Oops, no, I'm sorry, I just want to go to the list

  • the files in the cloud console, which is right here.

  • I'm going to say I agree, and I don't want email updates,

  • but I accept, okay.

  • Accept!

  • So I'm going to go to full.

  • I realize you can't see anything here,

  • so let's try to make this bigger.

  • Let me dismiss this right now, and come on.

  • I guess I'll make this smaller, and I'll just zoom in.

  • So these are the different formats,

  • there's actually all the data in binary,

  • there's this numpy bitmap, which is useful for other kinds

  • of machine learning, different things you might want to try.

  • The raw data, but let's look at the simplified data,

  • and let's pick, oh, I don't know,

  • which model should I pick?

  • There's so many, banana, bandage, baseball, basketball,

  • bat, beach, bear, beard, I guess I should do beard.

  • Right?

  • That's kind of lame though.

  • Birthday cake, is there a unicorn?

  • Maybe there's a unicorn.

  • No, was there a rainbow?

  • Yes, there's a rainbow, (ding), all right,

  • so we'll use the rainbow.

  • So I am going to, oops, download this file.

  • So here's the thing.

  • This is a very large file.

  • I had a reason why I was doing this challenge also.

  • This is a 43 megabyte file.

  • Now I could just use some code in my client side JavaScript

  • to load that file and put it on the web,

  • and at some point, I might show you some techniques

  • for doing that, stay tuned in the future,

  • but I think this is a good case where my video series,

  • sort of module for my programming from A to Z class,

  • or the program with text class, building an API

  • with Node and Express, this is a case where I've got this,

  • what if I wanted to have every drawing,

  • some of the just millions of them.

  • I don't want to load hundreds of megabytes

  • and gigabytes of files in my client side JavaScript.

  • I could write a little Node program whose sole purpose

  • is to hold on to all that data,

  • and my client side JavaScript could just request it.

  • So this could be because what I want to do is create an API

  • out in the world for people to get drawing information,

  • but this isn't data that I own in a way

  • that I would necessarily do that.

  • We'd have to look at the licensing

  • to see if that's even something reasonable to do.

  • Where is that eraser?

  • But, what I can do,

  • is on my computer here, right, the idea here is oh,

  • I'm going to make a server, and the server is going to hold all

  • of the drawings, and then my P5 sketch can just say,

  • hey, can make a request, like a get request,

  • please, could I have a rainbow?

  • And then the server's going to send back just a single drawing.

  • It's not going to send back hundreds of megabytes of data,

  • it's storing all the data, but it's going to send back

  • just one piece.
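The "send back just one piece" idea can be sketched as an Express-style route handler. This is one possible shape under stated assumptions, not the code from the video: `randomDrawing` and `rainbowHandler` are hypothetical names, and the `/rainbow` route and port 3000 in the commented wiring are placeholders.

```javascript
// Hypothetical helper: pick one drawing from the in-memory list.
// The random source is injectable so the choice can be made predictable.
function randomDrawing(drawings, random = Math.random) {
  const index = Math.floor(random() * drawings.length);
  return drawings[index];
}

// Build a handler with the (request, response) shape Express expects.
// It replies with a single drawing, never the whole data set.
function rainbowHandler(drawings, random = Math.random) {
  return (request, response) => {
    response.send(randomDrawing(drawings, random));
  };
}

// Wiring it up needs `npm install express`; route and port are assumptions:
// const express = require("express");
// const app = express();
// app.get("/rainbow", rainbowHandler(drawings));
// app.listen(3000);
```

The server keeps every drawing in memory, but each GET request costs the client only the size of one drawing.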

  • The interesting thing is this server

  • can easily just also run on the laptop.

  • And I could connect to it, so there's a variety of ways

  • you could deploy this and use this,

  • but I'm going to do it all from this laptop.

  • All right, so, to run a server with Node and Express,

  • you could go back and watch some of these videos

  • where I step through this in more detail,

  • I'm just going to start it in the directory in my console,

  • then I'm going to say npm init,

  • and I'm going to call this codingtrain_quickdraw_example,

  • and it's version 0.0.1, it is an example that I am making

  • on the Coding Train, and you know, whatever,

  • I'm going to skip through a lot of this stuff.

  • Yes.

  • Okay, so now, if I go to my code,

  • you can actually see I have this package.json file.

  • The package.json file has all that information

  • that I just entered.

  • This is the configuration file for my project.

  • Notice this, we're central manager of this project now.

  • So, I need a couple Node packages

  • to be able to make this work.

  • I need to use express, express is what I'm going to use

  • to handle that get request, this http get request.

  • So I'm going to say npm install express,

  • and then I also need something to load that ndjson file.

  • So ndjson Node, let's just,

  • I've actually used this before, but let's look.

  • So this is a Node package for loading an ndjson file,

  • so I'm going to say npm install ndjson.

  • Great, there we go, and now,

  • I meant to show you what does that ndj,

  • oh I got to grab that file now,

  • so I also need, I'm just going to change,

  • rename this to rainbow.ndjson, I'm going to drag it here

  • into my project, so now this is a huge file,

  • and so you can see that Visual Studio Code

  • is freaking out, it's like I don't want to deal

  • with this file because it's too big,

  • but you can see that what this is

  • is every single drawing on one line,

  • so it's like this is my database, essentially,

  • database of rainbow drawings.

  • I have a database of rainbow drawings, what could be better?
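Because every drawing sits on its own line, a file this large can be read one line at a time instead of all at once. The video installs the ndjson package for this; the sketch below swaps in Node's built-in `readline` module so it runs without installing anything. `loadDrawings` is a hypothetical name.

```javascript
const fs = require("fs");
const readline = require("readline");

// Read an ndjson file line by line and collect the parsed drawings.
async function loadDrawings(filePath) {
  const lines = readline.createInterface({
    input: fs.createReadStream(filePath),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });

  const drawings = [];
  for await (const line of lines) {
    if (line.trim() === "") continue; // skip blank lines
    drawings.push(JSON.parse(line));
  }
  return drawings;
}

// Example usage (assumes rainbow.ndjson sits next to this script):
// loadDrawings("rainbow.ndjson").then((drawings) => {
//   console.log(`Loaded ${drawings.length} drawings`);
// });
```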

  • Okay, so what was I doing?

  • Back to the code in the server.