  • [WHISTLE]

  • Hello.

  • And welcome to another video using Posenet and ML5.js.

  • But in this video, what I'm going

  • to do is take the output of the Posenet pre-trained model,

  • and feed that into an ML5 neural network to train

  • a pose classifier to recognize when

  • I'm making certain motions like a Y, an M, a C, and an A.

  • Before I begin coding, let me quickly mention

  • something I added between the last video and now.

  • I'm mirroring the image so that when I raise my left hand,

  • it's mirrored to me what I'm seeing on the screen in front

  • of me over there.

  • This is important for interactivity.

  • It makes it feel much more intuitive and natural to see

  • yourself mirrored.

  • You might recall that ML5 has

  • a specific function called flipImage that will do it for you.

  • But I actually found, because I'm

  • drawing all this other stuff, that it's easier for me

  • to just write the code for it myself,

  • which involves a translate and a scale.

  • In other words, typically if I'm drawing an image, it's at 0, 0,

  • I'm drawing it right here, and the image gets painted across

  • the canvas.

  • But if I call scale(-1, 1),

  • it sets the x-axis going in the other direction.

  • So positive pixels go this way.

  • And if I translate over to here and put 0, 0 here and draw

  • the image this way, it will appear reversed-- inverted,

  • flipped--

  • to the viewer.

  • So that's what's happening in these three steps right here.
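
The three steps could look roughly like the sketch below. It is not the exact code from the video: `drawMirrored` and `mirroredX` are hypothetical helpers, the p5 instance is passed in as `p` so the function can be tried outside the web editor (in a global-mode sketch the same calls are plain `translate`, `scale`, and `image`), and the `push`/`pop` pair is an added assumption to keep the flip from leaking into later drawing.

```javascript
// Hypothetical helper: draws `video` flipped horizontally.
// `p` is a p5 instance (instance mode); `video` is anything p5 can draw
// with image(), such as the capture returned by createCapture(VIDEO).
function drawMirrored(p, video) {
  p.push();                     // save the current transform (assumption)
  p.translate(video.width, 0);  // step 1: move the origin to the right edge
  p.scale(-1, 1);               // step 2: make positive x run to the left
  p.image(video, 0, 0);         // step 3: draw from the new origin
  p.pop();                      // restore the transform
}

// Where a point with local coordinate x lands on screen after the
// translate + scale above.
function mirroredX(x, width) {
  return width - x;
}
```

Anything drawn between the `scale` call and the `pop` call, such as the skeleton, is mirrored along with the image.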

  • The two videos that I'm assuming are prerequisites

  • here are the previous one, where I covered

  • all of the code for this particular Posenet example

  • that you're seeing running right here in the web editor,

  • as well as this "Train Your Own Neural Network" set

  • of videos that covered the basics of how

  • the ML5 neural network function works to train a model

  • to play musical notes based on where the user clicks

  • their mouse in a canvas.

  • To get started, I could really begin with either one

  • of these sketches.

  • For example, I could go and get my Posenet code

  • and bring it into this particular sketch.

  • Or I could take the neural network code from this sketch

  • and bring it into the Posenet one.

  • I think I want to continue working from the Posenet sketch

  • itself.

  • And the first thing that I want to do

  • is create an object to store the neural network.

  • So I'm going to call that brain.

  • And then after I initialize the Posenet model,

  • I'll say brain is a new ML5 neural network.

  • And you might recall that anytime

  • you create a neural network, you can

  • specify a set of options for how you

  • configure that neural network.

  • All of the options for how to configure

  • an ML5 neural network, you can find on the documentation

  • page for the reference.

  • I'm just starting with these four basic properties-- inputs,

  • outputs, task, and debug.

  • So let's come over here to the whiteboard.

  • And let's diagram out what's going on.

  • Now remember, we're starting with the Posenet machine

  • learning model.

  • We're sending an image into that model as the input.

  • The Posenet model then takes that image

  • and does Pose estimation, making a guess

  • as to where all the key points are on the human body

  • that it sees.

  • And all of those points come in the form of xy pairs,

  • coordinates.

  • Here's my elbow.

  • Here's my shoulder.

  • Here's my ear.

  • It doesn't have an ear--

  • whatever-- nose, there's 17 of them.

  • All of this data is what I want to send

  • in as the input to my ML5 neural network.

  • The ML5 neural network will take all these xy pairs

  • and classify them into a given pose that has a label.

  • It's a dab pose, or a Saturday Night Fever pose.

  • I don't know what kind of poses I'm going to make.

  • I'll do YMCA.

  • Why not?

  • This now tells me how I want to configure my neural network.

  • I want to send it 17 pairs of numbers.

  • That's 34 inputs.

  • And I want it to classify those 34 numbers

  • into one of four labels.

  • That is four outputs.

  • 34 inputs, four outputs, the task is classification.

  • And I do want to see debugging as I'm training the model.

  • And I have to give those options to the ML5 neural network

  • itself.
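
As a sketch, the configuration worked out on the whiteboard might be written like this. The four property names are the ones listed above; `KEYPOINT_COUNT` is a name introduced here for illustration, and the `ml5.neuralNetwork(options)` call is left as a comment because it needs the ml5 library loaded in the page.

```javascript
// 17 keypoints from PoseNet, each contributing an x and a y value.
const KEYPOINT_COUNT = 17;

const options = {
  inputs: KEYPOINT_COUNT * 2, // 34 numbers per pose
  outputs: 4,                 // one per label: Y, M, C, A
  task: 'classification',
  debug: true                 // show debugging output while training
};

// In the sketch, after the PoseNet model is initialized:
// brain = ml5.neuralNetwork(options);
```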

  • This is where things get kind of complicated because I

  • need to call brain.addData.

  • That's the way I add training data to my neural network.

  • So somewhere I have to have some kind of interaction.

  • Maybe I press a key.

  • I'll press the key Y and then it will wait a little bit.

  • And it'll know after five seconds,

  • for when I come over here, to start collecting pose data

  • for a certain amount of time.

  • Then it will stop.

  • And then I'll come back over here, and press a button,

  • and do something else.

  • So this requires a lot of thoughtfulness

  • in terms of how I might build the interaction around this.

  • I'm just going to try to do it in a simple way

  • that I can get it to work right here right now in this room.

  • For a much nicer example around interaction and collecting pose

  • data, you can take a look at Google Creative Lab's Teachable

  • Machine.

  • So I've made video tutorials about training image models

  • and sound models that can actually be imported into ML5.

  • At this moment, you cannot import the pose model into ML5.

  • That's something that we're working on.

  • And I'm hoping that this video tutorial

  • will lead the way to that.

  • But essentially, what I'm building

  • is a pose teachable machine.

  • I just won't do as thoughtful of an interaction as here

  • in the actual Teachable Machine project.

  • You can see here in Teachable Machine, for example,

  • there's a button that I can press.

  • And it's going to give me a 10-second countdown.

  • And then when I come over here, after 10 seconds

  • it's going to start collecting my poses.

  • So this is a much nicer example.

  • I encourage you to look at it for inspiration.

  • Of course, that was terrible training data.

  • But now I'm going to go back to my code

  • and try to implement my own version of this.

  • To keep track of the flow of the sketch,

  • let me add a variable called state.

  • And I'll just initialize it to "waiting".

  • And then I will add the keyPressed function.

  • And when I press the key, I want to say state equals "collecting".

  • Only, I don't want to start collecting immediately.

  • I want to wait a little bit because it's

  • going to take me some time to walk over there and get

  • into my pose.

  • So I'll use setTimeout for a delay.

  • setTimeout is a built-in function in JavaScript,

  • not part of P5, that will execute a function

  • after a certain amount of time.

  • And maybe I want to execute this function after a certain amount

  • of milliseconds.

  • So I can put a little function inside here.

  • I could use the arrow syntax.

  • There's a variety of ways I could approach this.

  • Let's just say 10 seconds later.

  • Right?

  • So when I press the key, 10 seconds later,

  • set the state equal to collecting.

  • I'll also have a variable called targetLabel.

  • And I'll set the target label equal to the key that

  • was pressed.

  • All right, so I have this function going.

  • When I press the key, whatever key I press

  • is the target label.

  • I want to see that in the console.

  • And then 10 seconds later, I want

  • to see it say that it's starting to collect.

  • Let me make it one second later so I

  • don't have to wait as long.

  • All right, and I'm going to press the Y key.

  • Y, collecting-- perfect.

  • So this is the right idea.
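
A minimal sketch of this first version of the key handler. In p5, `keyPressed()` takes no arguments and reads the global `key`; here the key and the delay are passed in as parameters (an assumption made so the function can be run on its own), and the default of one second matches the shortened delay used for testing above.

```javascript
let state = 'waiting';
let targetLabel;

// Record the pressed key as the label, then switch to 'collecting'
// after a delay so there is time to walk into position.
function keyPressed(key, delayMs = 1000) {
  targetLabel = key;
  console.log(targetLabel);
  setTimeout(() => {
    console.log('collecting');
    state = 'collecting';
  }, delayMs);
}
```

Calling `keyPressed('y')` logs `y` immediately and `collecting` about a second later.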

  • Once that state switches to collecting,

  • I want to call the ML5 neural network addData function.

  • Where I want to call the addData function

  • is right here when I have a pose.

  • So when I have a pose, I want to say brain.addData, and then

  • the inputs and the targets.

  • The inputs are all of the xy locations of the pose itself.

  • There's 34 of them.

  • I mean, I have kind of an issue where

  • the camera can't see my legs.

  • So I probably should ignore some of them.

  • But I'm just not going to worry about that.

  • I could also consider using the confidence scores.

  • Like maybe the confidence score, the neural network

  • could learn when it's a low confidence score

  • to kind of ignore that point.

  • But I'll ask you to try all that stuff if you're making

  • your own version of this.

  • I'm just going to use these 17 xy pairs.

  • So I need them to be in a plain old array.

  • And if you recall, they're not in a plain old array.

  • They're in this pose.keypoints array,

  • where each element has a position object with x and y.

  • So I need to flatten the data.

  • Whatever format the data is in, I

  • want to just put it into a plain array.

  • So I'm going to grab this loop.

  • I'm going to create an empty array called inputs.

  • And I'll just say inputs.push(x), inputs.push(y).

  • So this is me going through the entire pose,

  • getting all the xy's, putting them

  • in an array, which is the input to the neural network.
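
The flattening step might be sketched as follows. `flattenPose` is a hypothetical helper name, and `examplePose` is made-up data with only two keypoints, just to show the `keypoints[i].position.x` shape described above.

```javascript
// Turn a PoseNet pose into a flat array: [x0, y0, x1, y1, ...].
function flattenPose(pose) {
  const inputs = [];
  for (let i = 0; i < pose.keypoints.length; i++) {
    const x = pose.keypoints[i].position.x;
    const y = pose.keypoints[i].position.y;
    inputs.push(x);
    inputs.push(y);
  }
  return inputs;
}

// Made-up pose with two keypoints, only to illustrate the data shape.
const examplePose = {
  keypoints: [
    { part: 'nose', score: 0.99, position: { x: 320, y: 120 } },
    { part: 'leftEye', score: 0.98, position: { x: 335, y: 105 } }
  ]
};

console.log(flattenPose(examplePose)); // [ 320, 120, 335, 105 ]
```

With all 17 keypoints, the result has 34 numbers, matching the number of inputs the network was configured with.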

  • And what's the target?

  • It also wants an array.

  • But in this case, it's one thing, just the label.

  • So I can take the target label, put it in an array.

  • And that's what I'm giving the addData function.
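
Putting the inputs and the target together, the collection step could look like the sketch below. `addSample` is a hypothetical helper; `brain` is expected to be the ml5 neural network, but any object with an `addData(inputs, target)` method works for trying it out.

```javascript
// Add one training example, but only while the sketch is collecting.
// Returns true if a sample was added.
function addSample(brain, state, inputs, targetLabel) {
  if (state !== 'collecting') {
    return false;
  }
  const target = [targetLabel]; // the target is also an array, holding one label
  brain.addData(inputs, target);
  return true;
}
```

In the sketch, this would run each time a new pose arrives, with `inputs` being the flattened array of x and y values.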

  • You might recall in my previous neural network examples,

  • I was making objects that I passed in with named inputs

  • and outputs.

  • So this is just showing you that you can do it either way.

  • If I want to have names for all the inputs and outputs,

  • I can build an object with properties.

  • If I just want a big array of numbers,

  • I can just make it a plain array.
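
To make the two shapes concrete, here is a small data-only sketch. The property names and values are illustrative, not taken from the earlier videos.

```javascript
// Named form: one object for the inputs and one for the outputs.
const namedInputs = { x: 120, y: 80 };
const namedTarget = { label: 'C' };

// Plain-array form, as used for the pose data: values only, no names.
const arrayInputs = [120, 80];
const arrayTarget = ['C'];

// Either pair is handed over the same way: brain.addData(inputs, target).
// Both forms carry the same values.
console.log(Object.values(namedInputs)); // [ 120, 80 ]
```

With 34 pose coordinates, naming every input would be tedious, which is why the plain array is the more convenient form here.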

  • But there's a new problem.

  • The new problem is once I start collecting the data,

  • I'm going to strike the pose.

  • And maybe I'll collect the pose for a little while.

  • I've got to stop collecting the pose.

  • So let's go back up to where I started collecting the pose.

  • I'm going to do something awful.

  • This is so painful.

  • I don't want to do it.

  • Let's just do it and then we'll revisit it later.

  • We will.

  • [MUSIC PLAYING]

  • I'm going to call setTimeout again right inside here.

  • Because a second later or 10 seconds later,

  • I want to stop collecting.

  • This might be some of the worst code I've ever written.

  • It's really awful to look at.

  • It's what's informally known as callback hell.

  • And there's a variety of ways I could approach this differently

  • by using promises, and async, and await.

  • But in this case, really all I want to do

  • is set the state to collecting in 10 seconds.

  • Then 10 seconds later, set it back to waiting.

  • And I think this will work for me.
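
Spelled out, the nested version might look like this. As before, the key and the delay are parameters rather than p5 globals, which is an assumption made so the function can run on its own.

```javascript
let state = 'waiting';
let targetLabel;

function keyPressed(key, delayMs = 10000) {
  targetLabel = key;
  console.log(targetLabel);
  // First timer: wait, then start collecting.
  setTimeout(() => {
    console.log('collecting');
    state = 'collecting';
    // Second timer, nested inside the first: wait again, then stop.
    setTimeout(() => {
      console.log('not collecting');
      state = 'waiting';
    }, delayMs);
  }, delayMs);
}
```

Each extra step in the sequence would add another level of nesting, which is what makes this style hard to read.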

  • Let's give it a try.

  • I need to first press Y. One one-thousand, two one-thousand, three one-thousand,

  • four one-thousand, five one-thousand-- collecting.

  • 10 seconds later it should say not collecting.

  • All right.

  • OK, that worked.
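
For comparison, the promise-based alternative mentioned earlier could be sketched like this. It is not code from the video: `delay` is a hypothetical helper that wraps `setTimeout` in a promise, and the key and delay are again passed in as parameters.

```javascript
// Hypothetical helper: a promise that resolves after `ms` milliseconds.
function delay(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

let state = 'waiting';
let targetLabel;

async function keyPressed(key, delayMs = 10000) {
  targetLabel = key;
  console.log(targetLabel);

  await delay(delayMs); // time to walk over and get into the pose
  console.log('collecting');
  state = 'collecting';

  await delay(delayMs); // keep collecting for a while
  console.log('not collecting');
  state = 'waiting';
}
```

The steps now read top to bottom in the order they happen, with no nesting.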

  • What I'm doing here, quite poorly I might add,

  • is implementing a state machine.

  • So it might be nice for me, in a separate video,

  • if I can ever get around to making it,

  • to talk about a more proper way of implementing a state machine.

  • But this works.

  • I set this state variable to collecting.

  • 10 seconds later, set it back to