What's going on?
-
Everybody, welcome to part six of the ML and deep learning with the Halite III challenge tutorials. In this video, we're just gonna continue building on the last one, where I believe I left off fixing a slight bug where we weren't able to train for multiple epochs.
-
So let me quickly bring up TensorBoard: tensorboard...
-
...dash dash logdir equals... whoops, logdir equals logs... doing it.
-
tensorboard --logdir=logs/.
-
And for some reason I get a bunch of this.
-
I think it's because we're kind of overriding the steps with my crappy way of doing things, but we'll get through it.
-
So then, um, I found that the best way to get there is with localhost:6006.
-
If I use, like, 127.0.0.1, that doesn't work.
-
And if I use the machine name:6006, that doesn't work either.
-
So anyways, this is what's working.
-
So let me just smooth these quite a bit.
-
Um, so the red line is three whole epochs.
-
Uh, the orange and blue is just one epoch.
-
I'm gonna switch these to relative so you can kind of see those compared. Um, but really, after three epochs... really?
-
I mean, maybe it slightly comes up, or at least it levels out in-sample.
-
But then out-of-sample, not too too impressive.
-
I have to switch it to relative to show you the epoch val_acc, again because of the way we were doing things.
-
So again, the orange line is actually only one epoch.
-
The red line is three epochs and the blue line is one epoch.
-
The only difference is the Orange line was the time when we actually had to load the data and prepare it.
-
Whereas these two down here, we didn't.
-
So, as you can see preparing the data, it takes, like, three times as long.
-
So anyway, accuracy is somewhere between 24 and a half and 25%.
-
That's pretty good, considering, you know, random choice would be 20%.
-
And in this case, you know, we've trained a model off, you know, using a threshold.
-
Sure, but we're training off of models that are moving randomly.
-
So the fact that we got 4 to 5% over random is actually pretty impressive.
-
That tells me our threshold probably worked, but now it's time to find out.
-
So now we begin the massively iterative process where we've trained a model.
-
Now we want to run a bunch of games with that model and some threshold.
-
And then the hope is that our average or mean kind of halite collected has gone up a little bit.
-
Any little bit will help, especially these earlier games.
-
So, uh, so what I'd like to do is, First of all, let's go ahead and just copy training data.
-
In fact, I figured that'll take too long.
-
But I'd like to do let me just delete this.
-
We'll just call this training data and I'm gonna call it underscore one just for the first set and then we'll make a new folder.
-
Call it training data.
-
Now that's empty.
-
And then we can run those stats that we did, where we do the histogram, but also just, like, the mean and all that; we can run some stats and see: has the model actually improved at all?
-
We also could do the model versus the random and see which one wins more often or something like that.
-
But anyways, what we shall do now is implement the AI rather than random in sentdebot.
-
So I'm gonna copy-paste sentdebot.py, and now that's just sentdebot.
-
And then it'll be dash... no, it'll be dash-ml.
-
Okay, let's pull that up.
-
And with that, I'll just close everything else.
-
Cool.
-
All right.
-
So we have to make a few imports also, because of the way, like, Halite uses standard error and outputs and stuff for, you know, the game and Python to talk to each other.
-
So we have to make sure we silence all that stuff.
-
So I'm gonna make a few imports here: import os, and import sys.
-
Then: stderr = sys.stderr.
-
And then we're gonna say sys.stderr = open...
-
...os.devnull, with the intention to write.
-
And then we're going to import tensorflow as tf.
-
I hate that sublime.
-
Does that to me?
-
Space should not.
-
Where is it?
-
Oh, it's cause you hit enter for the new line.
-
I don't know how you get around that actually.
-
Anyway: sys.stderr = stderr.
-
So the issue is when you first import TensorFlow, it's gonna tell you things.
-
Like what backend...
-
...what backend you're using.
-
That's what Keras does. But it's gonna tell you things about, like, your GPU and stuff like that.
-
And we need to silence that.
-
So this is how we're doing it.
-
Basically, this is to silence all that output, because if you have it, the game is gonna error out.
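A minimal sketch of the silencing trick being described, using only the standard library (the noisy TensorFlow import would go where the comment sits):

```python
import os
import sys

# Save the real stderr, then point stderr at the null device
# so anything written to it in the meantime is swallowed.
stderr = sys.stderr
sys.stderr = open(os.devnull, "w")

# The noisy import would happen here, e.g.:
# import tensorflow as tf
print("this startup chatter vanishes", file=sys.stderr)

# Restore the real stderr so later errors are visible again.
sys.stderr.close()
sys.stderr = stderr
```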
-
So now, ah, sentdebot-ml can remain the name.
-
That's totally fine.
-
Uh, next thing we need to do is continue to silence any of the tensorflow outputs.
-
So, um, I guess we can still write this.
-
And I thought about just copying and pasting it.
-
This is just kind of cookie-cutter code that was used last year to silence stuff from Halite. Anyway: os.environ, and then what we want to set in here is TF_CPP_MIN_LOG_LEVEL, and we're gonna set that equal to "3", as a string.
-
And then now we want to set the gpu_options; that's gonna be equal to tf.GPUOptions, all caps GPU, capital O Options.
-
And in this will be per_process_gpu_memory_fraction, and we set that to 0.5. This is a small model.
-
We don't need much memory.
-
Why do we want to do this?
-
Well, by default, TensorFlow is gonna allocate as much as it can and then keep stuff just floating in memory to attempt to make things as fast as possible for you.
-
The problem is, this is going to exhaust a lot of memory.
-
The next thing is, I'd like to run three or four simultaneous instances.
-
So you have to keep in mind that at any given time, up to, like, say, 16 AIs could be trying to run and make predictions.
-
So it's really important that we set this number very low now, depending on how many you can run at any given time.
-
Given your certain CPU and all that, you can adjust this fraction, but I've actually found it runs perfectly fine, and actually it seems to run faster, on a low GPU memory fraction.
-
And I'm unsure why, but it seems much faster, like 10 times faster.
-
It's very strange.
-
Anyway, I think that must be some sort of bug or something.
-
That is very unintended, but definitely something I would look into in the future with a larger model.
-
So anyways, now we're gonna set our session: sess = tf.Session.
-
And then we're doing this so that when we load our model, it's gonna follow this session for us and use the GPU options that we set.
-
So: config = tf.ConfigProto, and then gpu_options=gpu_options.
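Pulled together, the quieting-and-capping setup looks roughly like this under the TF 1.x API the video uses; the model filename is a placeholder, since the real name isn't shown here:

```python
import os

# Hide TensorFlow's C++ startup logging (3 = errors only).
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

import tensorflow as tf

# Cap how much GPU memory this process grabs, so several bot
# instances can make predictions at the same time.
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.5)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

# Load the trained Keras model ("phase1.model" is a placeholder name).
model = tf.keras.models.load_model("models/phase1.model")
```

This is environment configuration, so it only does anything useful on a machine with TensorFlow 1.x and a GPU installed.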
-
Great.
-
Now we're ready to load in our model: model = tf.keras...
-
...models.load_model.
-
And then we give the model name.
-
I forgot the name. And the other thing we want is random_chance.
-
And I'm gonna set this with secrets.choice.
-
And then in here, I'm gonna say [0.15, 0.25, 0.35]. So this will be the random chance that the AI will choose a random move over the model's choice.
-
So we still wanna have some exploration, because the model isn't random anymore; we can't just train models and then stop.
-
So we want to still have some random kind of permutations, too...
-
...for the model to learn from new moves over time.
-
So, um, anyways, you know, 0.15 would be a 15% chance on every move, 0.25 would be a fourth of all the moves, and 0.35 means 35% of the time it'll be a random move.
-
And then every time we initialize this, some bots will be 35% random, some will be 25, some will be 15.
-
You can play with those numbers as you want.
-
We just know that we want some random exploration involved here.
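In plain Python, the exploration setup being described looks like this (`secrets` is in the standard library; the per-turn roll uses the `int(1 / random_chance)` trick from later in the video, which is only approximate because of the truncation):

```python
import secrets

# Each bot instance picks its exploration rate once at startup,
# so some games are ~15% random, some ~25%, some ~35%.
random_chance = secrets.choice([0.15, 0.25, 0.35])

def go_random(chance):
    # Roll a 1-in-int(1/chance) die; True means "make a random move".
    # Note int() truncates: e.g. 0.35 -> range(2) -> actually 50%.
    return secrets.choice(range(int(1 / chance))) == 1
```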
-
The next thing I'd like to do is go into models and let's pick.
-
Oh. Oh, no.
-
We've made a horrible mistake.
-
That's a huge bummer to see.
-
We can train models pretty quickly.
-
That stinks.
-
Uh, luckily, I think that model should be fine, but let me fix that.
-
In the training code, we must have forgotten... fit, uh, shoot.
-
Where is sentde-train?
-
Um. Mmm.
-
We've been... I guess, it's at the model save.
-
Yeah.
-
Dang, what an idiot.
-
So, this one, it will be the three-epoch model, which is fine.
-
I'm gonna, uh I'm not sure I would have used that one.
-
The other thing is, it might be wise to save, like, the score of the model.
-
So maybe, like, run the model, get that score right before the save.
-
Something like that.
-
Uh, whatever.
-
Uh, I'm gonna come into here, and I'm gonna rename this one, uh, models.
-
Hopefully, Uh, man, I just feel pretty bad because I wonder.
-
I hope that's a good model.
-
Probably.
-
What i'll do is I'll finish the code here, make sure it works, and then I'm probably gonna retrain that model, cause I don't have faith that I didn't accidentally run train and then I stopped it or whatever.
-
After a quick, short, uh, segment.
-
So, anyway, I'm gonna call this for now.
-
Phase one anyway.
-
Come back up to sentde-train... er, or not sentde-train; close that.
-
Yeah, there.
-
If we run into trouble, come back to this script.
-
Come to the top in the model we wanna load.
-
sentdebot-ml.
-
Where?
-
Here.
-
Phase one.
-
All right.
-
Save the progress so far.
-
Okay. Now, what we want to do is... I think I actually just want to leave the threshold the same for now, because I want to see if the mean has gone up.
-
Like, I just want to see if it's actually gotten better or not.
-
So I'm actually gonna leave the threshold the same, with the hope that it actually does improve. Same thing with total turns.
-
I'm gonna leave everything the same, make the same amount of ships, all that.
-
And instead, what I want to do is I want to come down to where we actually make our, uh, our choice.
-
So: choice = secrets.choice(range(len(direction_order))).
-
So what we're gonna say is: if secrets.choice(range(int(1 / random_chance))) == 1, then we want to say choice = secrets.choice...
-
Um, well, in fact, basically this I think that's what I want.
-
So basically, you make that range out of one over, you know, the 15% or whatever, and the question is: if that number is equal to one, then we'll make a random movement.
-
Otherwise, what do we want to say? So, else, we're gonna say model_out = model.predict. And then we want to predict... predict always takes a list, even if it's one thing we want to predict.
-
We need to predict the numpy array conversion of surroundings, with a .reshape(-1, ...).
-
And then it's -1, by the size, by the size, by three channels.
-
So in this case, we can say len(surroundings), comma, len(surroundings), comma, three.
-
Let me zoom out a little bit so we can see the full line: model_out = model.predict(np.array(surroundings).reshape(-1, 33, 33, 3)).
-
Now this returns because you pass a list, it also returns a list.
-
So we want the zeroth element.
-
So that's the actual prediction.
-
Now, even though we were using scalars, it is still gonna output to us a one-hot-length vector, a one-hot array.
-
So we want to say prediction = np.argmax...
-
...of model_out. Now, logging.info...
-
We want to save this.
-
We're gonna use an f-string: the prediction.
-
And that is gonna be direction_order[prediction], and then choice... choice = prediction.
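Here's the full per-turn choice as a sketch. `direction_order`, `surroundings`, and the loaded `model` come from the bot, so stub values and a tiny stub model stand in here just to make the logic runnable:

```python
import secrets
import numpy as np

random_chance = 0.25
direction_order = ["N", "S", "E", "W", "O"]  # stub: real bot defines this

class StubModel:
    # Stand-in for the loaded Keras model: predict() takes a batch
    # and returns one row of class scores per sample.
    def predict(self, x):
        return np.tile([0.1, 0.6, 0.1, 0.1, 0.1], (len(x), 1))

model = StubModel()
# 33x33 grid with 3 channels, all zeros, as a stand-in observation.
surroundings = np.zeros((33, 33, 3)).tolist()

if secrets.choice(range(int(1 / random_chance))) == 1:
    # Exploration: pick a random direction.
    choice = secrets.choice(range(len(direction_order)))
else:
    # Exploitation: predict() wants a batch, so reshape the single
    # observation to (-1, 33, 33, 3) and take the zeroth result.
    model_out = model.predict(
        np.array(surroundings).reshape(-1, len(surroundings), len(surroundings), 3)
    )[0]
    choice = int(np.argmax(model_out))

print(direction_order[choice])
```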
-
So then we leave all of this other stuff.
-
And so as we build the data, the data is built with either our random movement or the model's.
-
Um... just because my brain is going slow today.
-
I don't know about you guys.
-
Uh, let's import... whoops, import secrets, paste this, uh: for i in range...
-
Let's say 100.
-
I'm gonna tab this over.
-
I'm gonna print... or actually, if that equals one, print.
-
Yes, I apologize for this.
-
I gotta I gotta make sure I'm doing this right.
-
So we're just gonna generate over this 100 times.
-
If that is equal to one... and then let's say random_chance equals... it should be about 15 of those.
-
Uh, let's do five... whoops, 0.05. Should be about five.
-
Obviously, we'd have to run just a ton, and it's random. Five.
-
And then this should be about a 10% chance. About 10 should be the average.
-
The one we see seems a little low, but it's probably correct. Say 0.3.
-
So this should be an average of about 30. Looks to be about right.
-
Okay.
-
Just want to make sure that logic was sound.
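The sanity check just performed, sketched as a counter instead of prints:

```python
import secrets

random_chance = 0.25  # try 0.05, 0.15, 0.3, ...

# Over many trials, the hit rate should land near random_chance
# (only near: int() truncates 1 / random_chance).
hits = 0
for i in range(10000):
    if secrets.choice(range(int(1 / random_chance))) == 1:
        hits += 1

print(hits / 10000)  # roughly 0.25 for random_chance = 0.25
```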
-
So in this case, the question is, if this is equal to one great, that's our 35% chance or whatever.
-
Um, okay, then we make the random move; otherwise we let the model make the choice, run the game.
-
Everything else stays the same.
-
We're not changing any of that.
-
So hot diggity.
-
Let's run it.
-
So it happens.
-
We need to change run_game, our Python version.
-
No longer is it the one that we have; it's sentdebot dash ml now.
-
sentdebot-ml. And, uh, right before I make the mistake, let's just highlight one of these.
-
Replace all.
-
Great.
-
Um, are we ready?
-
Well, I think we are ready.
-
I also think Oh, actually, as we run, the game will see it.
-
So this won't obfuscate...
-
...what comes out.
-
So hey, let's run it.
-
I forget if python straight up runs what we want, so: python run_game.py.
-
Let's see what happens.
-
So on initial, uh, shoot.
-
I know one thing That's probably cause a problem, I think.
-
Let me just bring this along.
-
What did that say?
-
Communication failed.
-
Uh, okay, let's read one of these, okay, that Okay?
-
There's nothing in there.
-
And then what we'll do is we'll go into the replays.
-
We can all check in a second.
-
Surely we're not timing out there right now.
-
This is gonna be a...
-
If it doesn't give me any error, this is gonna be a huge pain in my tuchus.
-
Okay.
-
Okay, so we are timing out, okay?
-
I feel a sneeze coming on; apologies.
-
Fought it.
-
Okay, so now, uh, no-timeout.
-
So: --no-timeout.
-
Now, here's a little trick.
-
Um, what I have found is the first time the model is loaded in the first time.