  • What's going on?

  • Everybody, welcome to part six of the ML and deep learning with the Halite III challenge tutorials. In this video, we're just gonna continue building on the last one, where I believe I left off fixing a slight bug that kept us from training for multiple epochs.

  • So let me quickly bring up TensorBoard.

  • `--logdir` equals... whoops, `--logdir` equals `logs`. Redoing it.

  • `tensorboard --logdir=logs/`.

  • And for some reason I get a bunch of this.

  • I think it's because we're really we're kind of overriding the steps with my crappy way of doing things, but we'll get through it.

  • So then, um, I found that the best way to get there is with `localhost:6006`.

  • If I use, like, `127.0.0.1`, that doesn't work.

  • And if I use the machine name, colon 6006, that doesn't work either.

  • So anyways, this is what's working.

  • So let me just smooth these quite a bit.

  • Um, so the red line is three whole epochs.

  • Uh, the orange and blue are just one epoch.

  • I'm gonna switch these to relative so you can kind of see those compared. Um, but really, after three epochs, really?

  • I mean, maybe it slightly comes up, or at least it levels out in the in-sample.

  • But then out-of-sample, not too, too impressive.

  • I have to switch it to relative to show you the epoch val acc, again because of the way we're doing things.

  • So again, the orange line is actually only one epoch.

  • The red line is three epochs and the blue line is one epoch.

  • The only difference is the Orange line was the time when we actually had to load the data and prepare it.

  • Whereas these two down here, we didn't.

  • So, as you can see, preparing the data takes, like, three times as long.

  • So anyway, accuracy is somewhere between 24 and a half and 25%.

  • That's pretty good, considering, you know, random choice would be 20%.

  • And in this case, you know, we've trained a model using, you know, a threshold.

  • Sure, but we're training off of models that are moving randomly.

  • So the fact that we got 4 to 5% over random is actually pretty impressive.

  • That tells me our threshold probably worked, but now it's time to find out.

  • So now we begin the massively iterative process where we've trained a model.

  • Now we want to run a bunch of games with that model and some threshold.

  • And then the hope is that our average, or mean, halite collected has gone up a little bit.

  • Any little bit will help, especially these earlier games.

  • So, uh, what I'd like to do is, first of all, let's go ahead and just copy training data.

  • In fact, I figure that'll take too long.

  • What I'd like to do... let me just delete this.

  • We'll just take this training data and I'm gonna add underscore one to the name, just for the first set, and then we'll make a new folder.

  • Call it training data.

  • Now that's empty.

  • And then we can run those stats that we did, where we do the histogram but also just, like, the mean and all that. We can run some stats and see: has the model actually improved at all?

  • We also could do the model versus the random and see which one wins more often or something like that.
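
The comparison described above (did the mean halite collected go up between the old and new sets of games?) could be sketched roughly like this. The numbers are made up purely for illustration, and the variable names are hypothetical; real values would come from the games saved in each training-data folder.

```python
from statistics import mean

# Made-up halite totals, purely for illustration; real values would come
# from the saved games in each training-data folder.
baseline_halite = [4100, 4350, 3900, 4600, 4200]
new_model_halite = [4300, 4500, 4050, 4700, 4400]

baseline_mean = mean(baseline_halite)
new_mean = mean(new_model_halite)
improved = new_mean > baseline_mean

print(f"baseline {baseline_mean:.0f}, new {new_mean:.0f}, improved: {improved}")
```

With only a handful of games the difference could easily be noise, so in practice this check would be run over a large batch of games.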

  • But anyways, what we shall do now is implement the AI rather than random in sentdebot.

  • So I'm gonna copy and paste sentdebot.py, and now that's just sentdebot.

  • And then it'll be dash... and then it'll be dash ML.

  • Okay, let's pull that up.

  • And I'm gonna just close everything else.

  • Cool.

  • All right.

  • So we have to make a few imports, also because of the way, like, Halite uses standard error, outputs and stuff for, you know, the game and Python to talk to each other.

  • So we have to make sure we silence all that stuff.

  • So I'm gonna make a few imports here: `import os`, and `import sys`.

  • We'll say `stderr = sys.stderr`.

  • And then we'll say `sys.stderr = open(os.devnull, 'w')`, with the intention to write.

  • And then we're going to `import tensorflow as tf`.

  • I hate that Sublime does that to me.

  • That space should not... where is it?

  • Oh, it's because you hit enter for the new line.

  • I don't know how you get around that, actually.

  • Anyway, `sys.stderr = stderr`.

  • So the issue is, when you first import TensorFlow, it's gonna tell you things.

  • Like what back end... no, not the back end that you're using, that's what Keras does. But it's gonna tell you things about, like, your GPU and stuff like that.

  • And we need to silence that.

  • So this is how we're doing it.

  • Basically, we silence all that output; if you have it, the game is gonna error out.
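
The save, redirect, restore pattern described above can be sketched without TensorFlow at all. In this minimal sketch, `noisy_import` is a hypothetical stand-in for the noisy import, and `call_quietly` is a helper name made up for illustration; neither comes from the video's code.

```python
import os
import sys


def noisy_import():
    """Hypothetical stand-in for an import that prints a banner to stderr."""
    print("pretend device/GPU banner", file=sys.stderr)
    return "loaded"


def call_quietly(func):
    """Run func with sys.stderr pointed at os.devnull, then restore it."""
    saved_stderr = sys.stderr
    with open(os.devnull, "w") as devnull:
        sys.stderr = devnull
        try:
            return func()
        finally:
            # Always put the real stderr back, even if func raises.
            sys.stderr = saved_stderr


result = call_quietly(noisy_import)
print(result)
```

Swapping `sys.stderr` only affects output written through Python. Messages that native code writes straight to the underlying stream are not caught this way, which is presumably why the log-level environment variable is also set.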

  • So now, ah, sentdebot-ml can remain the name.

  • That's totally fine.

  • Uh, next thing we need to do is continue to silence any of the tensorflow outputs.

  • So, um, I guess we can still write this.

  • And I thought about just copying and pasting this.

  • This is just kind of cookie-cutter code that was used last year to silence stuff for Halite. Anyway, `os.environ`, uh, and then what we want to say in here is `TF_CPP_MIN_LOG_LEVEL`, and we're gonna set that equal to `'3'`, as a string.

  • And then now we want to set the GPU options: `gpu_options` is gonna be equal to `tf.GPUOptions`, all caps GPU, capital O.

  • The option on this will be `per_process_gpu_memory_fraction`, and we set that to 0.5. This is a small model.

  • We don't need much memory.

  • Why do we want to do this?

  • Well, by default, TensorFlow is gonna allocate as much as it can and then keep stuff just floating in memory to attempt to make things as fast as possible for you.

  • The problem is, this is going to exhaust a lot of power.

  • Now, the next thing is, I'd like to run three or four simultaneous instances.

  • So you have to keep in mind that at any given time, up to, like, say, 16 AIs could be trying to run and make predictions.

  • So it's really important that we set this number very low. Now, depending on how many you can run at any given time,

  • given your certain CPU and all that, you can adjust this fraction, but I've actually found it runs perfectly fine, and actually it seems to run faster, on a low GPU memory fraction.

  • And I'm unsure why, but it seems much faster, like 10 times faster.

  • It's very strange.

  • Anyway, I think that must be some sort of bug or something.

  • That is very unintended, but definitely something I would look into in the future with a larger model.
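
To make the "many bots sharing one GPU" arithmetic concrete, here is a rough sketch. The helper `max_memory_fraction` and the four-games-of-four-players split are assumptions for illustration; the video only says that up to about 16 AIs may be running at once.

```python
def max_memory_fraction(num_processes, reserve=0.0):
    """Largest equal per-process share of GPU memory after an optional reserve."""
    if num_processes < 1:
        raise ValueError("num_processes must be at least 1")
    if not 0.0 <= reserve < 1.0:
        raise ValueError("reserve must be in [0, 1)")
    return (1.0 - reserve) / num_processes


games_at_once = 4   # assumption: four simultaneous games
bots_per_game = 4   # assumption: four-player games

print(max_memory_fraction(games_at_once * bots_per_game))
```

This gives an upper bound on an equal split; the fraction actually passed to the GPU options can be anything at or below it, as long as the model still fits.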

  • So anyways, now we're gonna set our session: `sess = tf.Session`.

  • And we're doing this so that when we load our model, it's gonna follow this session for us and use the GPU options that we set.

  • So `config=tf.ConfigProto`, and then `gpu_options=gpu_options`.

  • Great.

  • Now we're ready to load in our model: `model = tf.keras.models.load_model`.

  • And then we give the model name.

  • I forgot the name. And the other thing we want is `random_chance`.

  • And I'm gonna call `secrets.choice`.

  • And then in here, I'm gonna say `[0.15, 0.25, 0.35]`. So this will be a random chance that the AI will choose a random move over the model's choice.

  • So we still wanna have some exploration. Like, we can't just train models and then... because the model isn't random anymore.

  • So we want to still have some random kind of permutations too, for the model to learn from new moves over time.

  • So, um, anyways, you know, 0.15 would be a 15% chance on every move, 0.25 would be a fourth of all the moves, and with 0.35, 35% of the time it will be a random move.

  • And then every time we initialize this, some bots will be 35% random, some will be 25, some will be 15.

  • You can play with those numbers as you want.

  • We just know that we want some random exploration involved here.
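
A minimal sketch of picking the exploration rate once per bot, as described above. The constant name `EXPLORATION_RATES` is made up for illustration; `random_chance` and the three values are the ones mentioned in the video.

```python
import secrets

EXPLORATION_RATES = [0.15, 0.25, 0.35]  # the three values from the video

# Picked once at start-up, so each bot instance keeps one rate for the whole game.
random_chance = secrets.choice(EXPLORATION_RATES)

print(f"this bot explores with random_chance={random_chance}")
```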

  • The next thing I'd like to do is go into models and let's pick...

  • Oh. Oh no.

  • We've made a horrible mistake.

  • That's a huge bummer to see.

  • We can train models pretty quickly.

  • That stinks.

  • Uh, luckily, I think that model should be fine, but let me fix that.

  • In the training code, we must have forgotten the `f`. Uh, shoot.

  • Where is the training script?

  • Um. Mmm.

  • I guess that's the model save.

  • Yeah.

  • Dang, what an idiot.

  • So, this one will be the three-epoch model, which is fine.

  • I'm gonna, uh... I'm not sure I would have used that one.

  • The other thing that might be wise is to save, like, the score of the model.

  • So maybe, like, run a model score right before the save.

  • Something like that.

  • Uh, whatever.

  • Uh, I'm gonna come into here, and I'm gonna rename this one, uh, in models.

  • Hopefully... uh, man, I just feel pretty bad because I wonder...

  • I hope that's a good model.

  • Probably.

  • What I'll do is I'll finish the code here, make sure it works, and then I'm probably gonna retrain that model, because I don't have faith that I didn't accidentally run train and then stop it or whatever.

  • After a quick, short, uh, segment.

  • So, anyway, I'm gonna call this, for now, phase one.

  • Anyway, come back up to the training script... or not the training script. Close that.

  • Yeah, there.

  • If we run into trouble, we'll come back to this script.

  • Come to the top, and the model we wanna load...

  • sentdebot-ml. Where?

  • Here.

  • Phase one.

  • All right.

  • Save the progress so far.

  • Okay, now what we want to do is... I think I actually just want to leave the threshold the same for now, because I want to see if the mean has gone up.

  • Like, I just want to see if it's actually gotten better or not.

  • So I'm actually gonna leave the threshold the same, with the hope that it actually does improve. Same thing with total turns.

  • I'm gonna leave everything the same, make the same amount of ships, all that.

  • And instead, what I want to do is come down to where we actually make our, uh, our choice.

  • So `choice = secrets.choice(range(len(direction_order)))`.

  • So what we're gonna say is, if, ah, `secrets.choice(range(int(1/random_chance))) == 1`, then we want to say `choice = secrets.choice`...

  • Um, well, in fact, basically this. I think that's what I want.

  • So basically, if it's one out of, you know, 15% or whatever, as you make that, uh, range, the question is, if that number is equal to one, then we'll make a random movement.

  • Otherwise, what do we want to say? So `else`, we're gonna say `model_out = model.predict`. And then we want to predict... uh, prediction always takes a list, even if it's one thing we want to predict.

  • We need to predict the NumPy array conversion of `surroundings` with a `.reshape`, negative one.

  • And then it's negative one by the size, by the size, by three channels.

  • So in this case, we can say `len(surroundings)`, comma, `len(surroundings)`, comma, three.

  • Let me zoom out a little bit so we can see the full line: `model_out = model.predict`, NumPy array conversion, `.reshape`, negative one by 33 by 33 by three.

  • Now this returns... because you pass a list, it also returns a list.

  • So we want the zeroth element.

  • So that's the actual prediction.

  • Now, even though we were using scalars, it is still gonna output to us a one-hot vector, a one-hot array.

  • So we want to say `prediction = np.argmax(model_out)`. Now, `logging.info`:

  • We want to save this.

  • We're gonna say, f-string, prediction...

  • And that is gonna be `direction_order[prediction]`, and then `choice = prediction`.
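
The "batch of one in, first result out, then argmax" flow described above can be sketched in plain Python. Everything here is a stand-in: `StubModel` is a hypothetical replacement for the loaded Keras model, the direction labels in `DIRECTION_ORDER` are assumed, and the small `argmax` helper plays the role that `np.argmax` has in the video.

```python
DIRECTION_ORDER = ["north", "south", "east", "west", "still"]  # hypothetical labels


class StubModel:
    """Hypothetical stand-in for a trained model: one score list per input."""

    def predict(self, batch):
        # Always favour index 2, once for each item in the batch.
        return [[0.1, 0.1, 0.6, 0.1, 0.1] for _ in batch]


def argmax(scores):
    """Index of the largest score (what np.argmax gives for a 1-D input)."""
    return max(range(len(scores)), key=lambda i: scores[i])


# A 33 x 33 grid with 3 channels per cell, filled with zeros for the sketch.
surroundings = [[[0, 0, 0] for _ in range(33)] for _ in range(33)]

model = StubModel()
model_out = model.predict([surroundings])[0]  # batch of one in, first result out
prediction = argmax(model_out)
choice = prediction

print(f"prediction: {DIRECTION_ORDER[prediction]}")
```

The model returns one score per possible move, so the index of the largest score doubles as the index into the direction list.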

  • So then we leave all of this other stuff.

  • And so as we build the data, the data is built with either our random movement or not.

  • Um, just because my brain is going slow today...

  • I don't know about you guys.

  • Uh, let's import... whoops, `import secrets`, paste this, uh, `for i in range`...

  • Let's say 100.

  • I'm gonna tab this over.

  • I'm gonna print... or actually, if that equals one, print "yes".

  • I apologize for this.

  • I gotta I gotta make sure I'm doing this right.

  • So we're just gonna iterate over this 100 times.

  • If that is equal to one... and then let's say `random_chance` equals... it should be about 15 of those.

  • Uh, let's do five percent, 0.05. Should be about five.

  • Obviously, we'd have to run this a ton, and it's random. Five.

  • And then this should be about a 10% chance. About 10 should be the average.

  • What we see seems a little low, but it's probably correct. Let's do 0.3.

  • So this should be an average of about 30. Looks to be about right.
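
The quick sanity check above can be repeated with more trials, alongside the exact probability the expression implies. The function names `is_random_move` and `effective_chance` are made up for this sketch. One thing it suggests: because `int()` truncates, only rates whose reciprocal is a whole number come out exact. With this expression, 0.15 works out to 1 in 6 (about 16.7%), 0.3 to 1 in 3, and 0.35 to 1 in 2.

```python
import secrets


def is_random_move(random_chance):
    """The check from the video: True when a draw from range(int(1/chance)) is 1.

    Assumes 0 < random_chance <= 1.
    """
    return secrets.choice(range(int(1 / random_chance))) == 1


def effective_chance(random_chance):
    """Exact probability that is_random_move returns True."""
    buckets = int(1 / random_chance)
    # With a single bucket the only possible draw is 0, so 1 never comes up.
    return 1 / buckets if buckets >= 2 else 0.0


trials = 10_000
for chance in (0.05, 0.1, 0.15, 0.25, 0.3, 0.35):
    hits = sum(is_random_move(chance) for _ in range(trials))
    print(chance, hits / trials, round(effective_chance(chance), 3))
```

Each printed row shows the requested rate, the observed rate over the trials, and the exact rate, so any gap between the first and last columns comes from the truncation rather than from randomness.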

  • Okay.

  • Just want to make sure that logic was sound.

  • So in this case, the question is, if this is equal to one, great, that's our 35% chance or whatever.

  • Um, okay, then, blah, blah, blah, make the choice, run the game.

  • Everything else stays the same.

  • We're not changing any of that.

  • So hot diggity.

  • Let's run it.

  • So, as it happens,

  • we need to change run game, our Python version.

  • No longer is it the one that we have.

  • It's sentdebot-ml now, dash... dash ML. And, uh, right before I make a mistake, let's just highlight one of these.

  • Replace all.

  • Great.

  • Um, are we ready?

  • Well, I think we are ready.

  • I also think... oh, actually, as we run the game, we'll see it.

  • So this won't obfuscate what comes out.

  • So hey, let's run it.

  • I forget if `python` straight up runs what we want... so, `python run_game.py`.

  • Let's see what happens.

  • So on initial... uh, shoot.

  • I know one thing that's probably causing a problem, I think.

  • Let me just break this.

  • What did that say?

  • Communication failed.

  • Uh, okay, let's read one of these. Okay, that... okay.

  • There's nothing in there.

  • And then what we'll do is we'll go into the replays.

  • I can't... I'll check in a second.

  • Surely we're not timing out there, right?

  • This is gonna be a... if you don't give me any error, this is gonna be a huge pain in my tuchus.

  • Okay.

  • Okay, so we are timing out, okay?

  • I feel like I'm about to sneeze, apologies.

  • Fight it.

  • Okay, so now, uh, no timeout.

  • So, `--no-timeout`.

  • Now, here's a little trick.

  • Um, what I have found is the first time the model is loaded in the first time.