  • What's going on, everybody, and welcome to part three of the self-driving cars with Carla tutorials.

  • In this tutorial, and in probably at least two more after it, we're going to talk about doing reinforcement learning, specifically deep Q-learning, with the Carla environment.

  • Now, to do this, we kind of have to change the architecture a little bit, in terms of how we're approaching this problem.

  • We want to approach this problem with everything set up to work well with the concept of reinforcement learning.

  • So there's a million ways that we could do this.

  • But with the introduction of OpenAI into the world (now known as ClosedAI), they kind of set the standard for how to approach and work with environments to do reinforcement learning, and that methodology has persisted and tends to be how everybody handles reinforcement learning, whether or not they're actually using the OpenAI environments themselves.

  • Those are things like Gym, or Retro (I think that's the name now), and I forget the other one, Atari maybe. Anyway, not important.

  • But basically, you wanna have some sort of environment that has a couple of methods.

  • So we're gonna basically keep all this code here for now, but we're gonna have to get rid of a lot of stuff, because we can't really run this as a script anymore.

  • We probably need to use object-oriented programming.

  • So we are probably done with all of this.

  • I do kind of want to leave process_img. We're gonna just convert that to a method rather than a function, but that's okay.

  • IM_WIDTH, IM_HEIGHT... well, I'm removing those. I'm just gonna delete those.

  • Mostly it's just this block of text that I'm determined to never write again anyway.

  • Basically, the standard for reinforcement learning environments is to have some sort of .step() method on the environment, where you pass an action to that step.

  • And maybe if the syntax was working, it would look better.

  • So, anyways, you'll have some sort of step method where you pass an action.

  • You'll do something with that action and then collect any information, like determining what the reward was and stuff like that.

  • And then when you're done, you return the next observation, the reward, and whether or not we're done.

  • So this will be like a flag.

  • It'll be a Boolean flag, basically, so this will be true or false.

  • It's true when we've reached the end of the environment: either we were successful in what we tried to do, or we ran out of time, or we died for whatever reason.

  • And then finally there's a fourth value. Usually you just see it as an underscore, but it's just extra info.

  • So if there happens to be extra info, it goes there. This is, again, just kind of the standard.

  • We generally almost always care about observation, reward and whether or not things were done.

  • But in any environment, we tend to throw this in that way. The reason why we want to use this standard, and continue to use it, is that there are different reinforcement learning models that you can work with, and whether someone built theirs with, you know, OpenAI Gym or some other environment or whatever, it follows the same interface.

  • You can easily just swap these models around to all kinds of different environments. Some environments might want to return some extra info, for who knows why, so we just leave that there so that it's possible.

  • So some sort of extensibility is possible. But anyways, this is what we want to do.

  • So we need a step method, and then we also want a reset method.

  • We call reset either at the very, very beginning of the environment or after we've returned a done flag.

  • If we want to run another episode, so to speak, we'll reset and run another episode.

  • Okay, so that's what we want to do.
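
As a rough sketch of that step/reset contract, here is a tiny made-up environment and an episode loop. CoinFlipEnv, run_episode, and the reward rule are hypothetical names and choices used only to show the shape of the interface; this is not the Carla environment built in this tutorial.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment that follows the reset/step convention."""

    MAX_STEPS = 10  # assumed episode length, for illustration only

    def reset(self):
        # Called at the very start and after every done flag.
        self.steps = 0
        return 0  # initial observation

    def step(self, action):
        self.steps += 1
        observation = random.choice([0, 1])           # next observation
        reward = 1 if action == observation else -1   # made-up reward rule
        done = self.steps >= self.MAX_STEPS           # Boolean done flag
        return observation, reward, done, None        # None = no extra info


def run_episode(env, policy):
    """Run one episode with any environment exposing reset() and step()."""
    observation = env.reset()
    total_reward, done = 0, False
    while not done:
        action = policy(observation)
        observation, reward, done, _ = env.step(action)
        total_reward += reward
    return total_reward
```

Because run_episode only relies on reset and step, the same loop should work with any environment that returns the same four values.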

  • Now what we need to do is actually do it.

  • So what I'm gonna do is just come down again.

  • I'm just gonna leave this process_img code here because I'm gonna pretty much need it. I'm just gonna add a bunch of self. in front of everything, so I'm gonna leave that there.

  • But we're gonna start some new code here. At the top of this script we're gonna have a bunch of starting variables and constants, but for now I'll just have SHOW_PREVIEW, and I'm gonna set that to be False for now.

  • This is whether or not we want to display the actual camera from Carla. You wouldn't always want to display that one.

  • You know, if you have a worthy machine, you'd want to run many agents at the same time.

  • But displaying that preview is gonna hog computing resources, CPU and/or GPU. I believe OpenCV purely uses the CPU, so it's gonna lock up CPU.

  • Either way, it's gonna use resources nonetheless, and we're probably gonna be maxing things out, so we'd rather not.

  • But if we want to use it for debugging purposes, we can turn this flag to True just to see a couple of previews and then turn it back off, maybe.

  • Anyway, I'm just gonna set it to False for now, but you definitely would want to be able to see what's happening sometimes.

  • So then we're gonna have a class, and we're gonna call this CarEnv, for car environment.

  • And now I'm gonna just set some of our initial values.

  • So first we're gonna have SHOW_CAM.

  • And for now, that'll just equal SHOW_PREVIEW.

  • We might also have a show-every setting, so that even if SHOW_PREVIEW is set to True, maybe it only shows every, you know, I don't know...

  • 10,000 episodes or something. No, not 10,000.

  • That's a big number.

  • 100 episodes or something like that.
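
The show-every idea is only floated here, not implemented. One possible sketch follows; the function name should_show_camera and the show_every parameter are hypothetical.

```python
def should_show_camera(episode, show_preview, show_every=100):
    """Return True only when previews are enabled and this is an Nth episode."""
    return show_preview and episode % show_every == 0
```

With show_every=100, episodes 0, 100, 200 and so on would show the preview, and nothing is shown while show_preview is False.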

  • So SHOW_CAM equals SHOW_PREVIEW for now. Then we're gonna have STEER_AMT.

  • We're gonna set that to 1.0.

  • Basically, there are three actions we can take.

  • They'll be 0, 1, or 2.

  • Or actually, for steering, it'll be negative 1, 0, or 1.

  • And basically, we can either steer fully left, go straight, or steer fully right.

  • But later, you might want to make this cumulative, so that the steering wheel slowly turns one way.

  • And then if you want to go the other way, you slowly turn the other way, depending on what kind of frames per second we get.

  • We could probably do much more fancy stuff there.

  • But for now, we're gonna do full turns every single time.
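
A small sketch of the two steering ideas just described: the full-turn mapping used for now, and the cumulative variant mentioned as a later option. The function names, the action ordering (0 = left, 1 = straight, 2 = right), and the 0.1 increment are assumptions for illustration.

```python
STEER_AMT = 1.0  # full steering lock, as set in the class


def steer_for_action(action, steer_amt=STEER_AMT):
    """Map a discrete action 0, 1, 2 to a steer value of -1, 0, +1 times steer_amt."""
    return (action - 1) * steer_amt


def cumulative_steer(current, action, increment=0.1):
    """Hypothetical alternative: nudge the wheel a little each step, clamped to [-1, 1]."""
    new_value = current + (action - 1) * increment
    return max(-1.0, min(1.0, new_value))
```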

  • Then what we're going to say is we're going to throw in an im_... in fact, I guess we'll just toss this up here.

  • So IM_WIDTH (I totally just deleted all this stuff) is 640, and IM_HEIGHT is 480.

  • I was wondering how that was getting typed in there.

  • Okay, I see.

  • Cool.

  • So we've got those.

  • And then, for now, I guess we can set the class attributes to be equal to these as well.

  • So im_width we'll set equal to IM_WIDTH, and im_height we'll set to IM_HEIGHT.

  • Okay, so then I think we'll just say front_camera, and for now we'll say that's None.

  • And I think that's it.

  • That's all of our, like, basically, initial values that we want to set.
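
Put together, the constants and initial class values described so far might look like the sketch below. The identifiers are spelled the way they are usually written in code (SHOW_PREVIEW, STEER_AMT, and so on); since they were only spoken aloud, the exact spelling is an assumption.

```python
SHOW_PREVIEW = False  # whether to display the camera feed
IM_WIDTH = 640
IM_HEIGHT = 480


class CarEnv:
    SHOW_CAM = SHOW_PREVIEW   # for now, simply mirrors the module-level flag
    STEER_AMT = 1.0           # full turns every time
    im_width = IM_WIDTH
    im_height = IM_HEIGHT
    front_camera = None       # stays None until the camera delivers a frame
```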

  • Now what we're gonna do is define __init__, and pass self here.

  • What we need is basically all that starting code to connect. I deleted it, but it's not too much anyway. So we need to connect to the server.

  • So self.client is going to be carla.Client, and we're gonna connect to localhost on port 2000. Then self.client.set_timeout, and we're gonna set that to be two seconds.

  • Then self.world is equal to self.client.get_world(), and blueprint_library equals self.world.get_blueprint_library().

  • So okay, we've got the blueprint library.

  • Now we want to grab our car, so self.model_3 is equal to blueprint_library.filter, and we're gonna filter for model3.

  • Actually, it's just model3, no underscore there, and then the zeroth index.

  • So with that, we probably have everything we need when we first initialize.

  • We just wanna have access to our blueprint library.

  • Well, at least in this case, we just grabbed it for the car.

  • In fact, I think we should just make this self.blueprint_library, because we're probably gonna need to access that elsewhere, since we need the sensors as well.

  • So I'll just do self.blueprint_library.
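
A sketch of the __init__ just dictated. The carla_module parameter is a hypothetical convenience, not part of the tutorial: it lets the class be exercised without a running simulator, and it defaults to the real carla package when that is installed.

```python
try:
    import carla  # only available where the CARLA Python API is installed
except ImportError:
    carla = None


class CarEnv:
    def __init__(self, carla_module=carla):
        # Connect to the simulator on localhost:2000 with a 2 second timeout.
        self.client = carla_module.Client("localhost", 2000)
        self.client.set_timeout(2.0)
        self.world = self.client.get_world()
        # Kept on self because the sensors need the blueprint library later too.
        self.blueprint_library = self.world.get_blueprint_library()
        # filter() returns the matching blueprints; take the first one.
        self.model_3 = self.blueprint_library.filter("model3")[0]
```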

  • Okay, so with that, we're ready to do our reset method.

  • So we're gonna define reset, and again here just pass self.

  • Nothing else needs to be passed; step is what's going to take in the action.

  • So at the beginning of every reset environment, what do we need?

  • Well, we're gonna say self.collision_hist.

  • That'll be an empty list, because the collision sensor returns a list of these collision events.

  • Basically, if we have any collision event, we're going to go ahead and reset.

  • At least that's what we're gonna do to start. Later, I might customize that. I haven't looked too deeply into the collision sensor.

  • The collision sensor basically reports some sort of magnitude value, and sometimes, if you just drive over a pretty insubstantial curb, where you're totally fine, even though in theory it might throw the car out of alignment, it says that was a collision.

  • And I've even seen it call a collision if you simply go uphill really quickly.

  • Like maybe the bumper scraped the ground because you went too fast.

  • I'm not really sure, but I've seen it register that as a collision as well, so we might want to require the magnitude to be above some value, right?

  • But for now, if anything is in this list, so if any collision is detected, we're just going to say, hey, you failed.
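
The magnitude requirement is left as a possible later refinement. One way it could be sketched, assuming each collision event carries an impulse vector (CARLA collision events expose a normal_impulse with x, y, z components): the function name and the threshold are hypothetical, and plain tuples stand in for the impulse vectors.

```python
import math


def significant_collisions(impulses, min_magnitude):
    """Keep only (x, y, z) impulses whose magnitude exceeds min_magnitude."""
    return [
        (x, y, z)
        for (x, y, z) in impulses
        if math.sqrt(x * x + y * y + z * z) > min_magnitude
    ]
```

An episode would then only count as failed when this filtered list is non-empty, instead of failing on any entry at all.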

  • Then self.actor_list is also going to be an empty list.

  • And again, we just always wanna track actors so we can clean them up at the end.

  • So then we say self.transform equals random.choice of self.world.get_map().get_spawn_points().

  • Okay, don't forget your open and close parentheses.

  • get_map() is a method, and get_spawn_points() is, again, a method.

  • Okay, so we've got the transform. Now what we're gonna do is self.vehicle equals self.world.spawn_actor, and we want to spawn self.model_3 from above, and we're gonna spawn it at self.transform.

  • So we've just spawned an actor.

  • What do we need to do? We need to self.actor_list.append(self.vehicle).
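
The first part of reset, as described up to this point, might be sketched like this. To keep the block standalone, world and model_3 are passed into the constructor here, which is an assumption; in the tutorial they come from the CARLA client set up in __init__.

```python
import random


class CarEnv:
    def __init__(self, world, model_3):
        # Assumption for this sketch: world and blueprint are handed in directly.
        self.world = world
        self.model_3 = model_3

    def reset(self):
        self.collision_hist = []  # any entry here means the episode failed
        self.actor_list = []      # track actors so they can be cleaned up later

        # Pick a random spawn point and spawn the car there.
        self.transform = random.choice(self.world.get_map().get_spawn_points())
        self.vehicle = self.world.spawn_actor(self.model_3, self.transform)
        self.actor_list.append(self.vehicle)
```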

  • Okay, so now that we've done that, we want to get our RGB camera.

  • So that's the next thing.

  • I'm going to go ahead and say self.rgb_cam equals self.blueprint_library.find, and then sensor.camera.rgb. Secretly, it's actually an RGB alpha camera.

  • Then self.rgb_cam.set_attribute, and we're going to say image_size_x is the f-string of self.im_width.

  • Cool. And then what I'm gonna do is copy this and paste, paste. We'll do three of these.

  • So you've got your x, and then y is your height, self.im_height.

  • And then this will be the field of view, fov, and I'm just gonna hard-code this to be 110 for now.
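
The three set_attribute calls could be sketched as a small helper. The helper name configure_rgb_camera is hypothetical; in the tutorial these lines sit directly inside reset and use self.rgb_cam.

```python
def configure_rgb_camera(blueprint_library, im_width, im_height, fov=110):
    """Look up the RGB camera blueprint and set its size and field of view."""
    rgb_cam = blueprint_library.find("sensor.camera.rgb")
    # Blueprint attributes are set as strings, hence the f-strings.
    rgb_cam.set_attribute("image_size_x", f"{im_width}")
    rgb_cam.set_attribute("image_size_y", f"{im_height}")
    rgb_cam.set_attribute("fov", f"{fov}")
    return rgb_cam
```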

  • Okay, so just like our vehicle, what do we need to do?

  • We need to specify the transform.

  • So I'm just going to say transform equals carla.Transform. That's a capital T.

  • Did I write transform somewhere else? Where's the other transform?

  • Okay, I don't really see it. Anyway, carla dot capital-T Transform.

  • I'm just curious why it wanted to lowercase that T, but I'm sure when we go to run this we'll see errors.

  • And then carla.Location.

  • And this is, again, I believe, a relative position, so it's just relative to where we attach it.

  • We want to move this, so that's the transform.

  • Then what we want to say is self.sensor is equal to self.world.spawn_actor, with self.rgb_cam, so it spawns the actor, then transform, and then attach_to equals self.vehicle.

  • So now we've got this camera on the front of our car.

  • We've got a new actor.

  • What do we do?

  • Well, we append to the actor list. So copy, paste.

  • And in this case, it's not self.vehicle, it is self.sensor.
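
A sketch of attaching the camera. The offsets x=2.5 and z=0.7 are assumed values, since no numbers are given here; the helper name attach_camera and the carla_module parameter are hypothetical and exist so the sketch can be run without the simulator.

```python
def attach_camera(carla_module, world, rgb_cam, vehicle, actor_list):
    """Spawn the camera relative to the vehicle and track it for cleanup."""
    # Location is relative to the parent actor because of attach_to below.
    transform = carla_module.Transform(carla_module.Location(x=2.5, z=0.7))
    sensor = world.spawn_actor(rgb_cam, transform, attach_to=vehicle)
    actor_list.append(sensor)
    return sensor
```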

  • So once we've done that, the next thing we want to do is say self.sensor.listen, and we'll pass a lambda.

  • What have I done?

  • Oh, okay.

  • lambda data: self.process_img(data).
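
The listen call registers a callback that runs for every frame the sensor produces. In this sketch process_img is only a placeholder that stores the frame, and start_listening is a hypothetical wrapper; the real method converts the raw image before storing it, and the exact method name is assumed from how it is spoken.

```python
class CarEnv:
    front_camera = None

    def process_img(self, image):
        # Placeholder: the tutorial's version converts the raw image first.
        self.front_camera = image

    def start_listening(self, sensor):
        # The sensor calls this lambda with each new frame.
        sensor.listen(lambda data: self.process_img(data))
```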

  • Okay, so then we need to convert process_img to be a method inside of our environment, which we will do.

  • I'm just gonna go and finish what we're doing here, and then we can do that pretty quickly.

  • So once we've done that, we've got our car. We've spawned the car.

  • We've got our camera, we've spawned the camera. We've got all that.

  • Whenever you create this car and you spawn it, I don't know why they did this, but they did (thanks, Carla people): when you spawn a car, it actually falls from the sky.

  • Maybe they did that because it was the easiest way to overcome issues where you spawned and were clipping a little bit, like maybe your tires were clipping and you couldn't move.

  • I don't know. I'm sure there's a great reason for it.

  • But you fall from the sky, and when you fall from the sky, one issue is you actually can't drive yet.

  • The other issue is when you hit the ground.

  • Sometimes the collision sensor registers that as a collision because you just fell from the sky.

  • And then also, initially, your RGB camera hasn't yet... I don't know if it's initializing or what, but it doesn't actually start pulling in data right away.

  • So sometimes, as soon as you try to grab imagery from the RGB camera, you cannot. It just returns None.

  • And this is for a variable amount of time.

  • I couldn't figure it out: sometimes it would immediately return imagery, and other times it would take some time.

  • I don't know why, but it's a thing.

  • So we have to handle that sort of nonsense occurring.
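
How this gets handled is not spelled out at this point. One common approach, shown here as an assumption, is to poll until the camera callback has stored a first frame; the class name, method name, and timeout are all hypothetical.

```python
import time


class CameraWaiter:
    front_camera = None  # set by the camera callback once a frame arrives

    def wait_for_first_frame(self, poll_interval=0.01, timeout=10.0):
        """Block until front_camera is set, or raise if it takes too long."""
        deadline = time.monotonic() + timeout
        while self.front_camera is None:
            if time.monotonic() > deadline:
                raise TimeoutError("camera produced no frame in time")
            time.sleep(poll_interval)
        return self.front_camera
```

The timeout is there so a camera that never starts does not hang the reset forever.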

  • So we're listening. Then what we're gonna say is self.vehicle.apply_control.

  • I left out one more thing, too: the duration that it takes for the car to actually start doing things is also variable.

  • So, like the camera, there are just weird things that happen.

  • And I didn't really discover this. Daniel seemed to discover that if you just apply some control, even if you don't do anything, if you just send in the control command, it makes the car react more quickly.

  • Anyway, so I'm gonna throw this in.

  • I have no words.

  • Anyways, carla.VehicleControl: you already know how to do vehicle control.

  • So we're just gonna say throttle equals 0.0 and then brake equals 0.0.
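
The do-nothing control command could be sketched as below. The helper name nudge_vehicle and the carla_module parameter are hypothetical; in the tutorial this is a single line inside reset using self.vehicle.

```python
def nudge_vehicle(carla_module, vehicle):
    """Send an empty control command so the freshly spawned car responds sooner."""
    vehicle.apply_control(carla_module.VehicleControl(throttle=0.0, brake=0.0))
```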

  • Definitely more research is required here to figure out exactly what to feed Carla to make it act as quickly as possible.

  • Because the faster we can run through episodes, the faster we can train these models.