  • What's going on, everybody.

  • Welcome to a video that's kind of an overview of the current high-end cloud GPU providers, as well as a bit of a tutorial on how to actually make efficient use of these servers,

  • so you're not wasting time and money doing preprocessing, or even a dataset download or upload to a server, which can take a very, very long time and just waste hundreds of dollars.

  • So with that, we've got a lot of information to cover.

  • Let's get into it.

  • So the first thing we want to do is, as quickly as possible, compare all these providers.

  • There are so many numbers that you've got to think about and look at that it can get super confusing.

  • Hopefully, I can make it really easy for you.

  • First of all, most providers have a range of different GPUs that they offer, but the flagship GPU that pretty much everyone is offering is the V100, the Tesla V100 from NVIDIA. This GPU has 16 gigabytes of VRAM per GPU.

  • There is a 32-gigabyte variant too, which you can see an example of here.

  • I think AWS is the only one offering it.

  • I'm not positive.

  • I'll try to remember to look as we go through the prices, but you have to be spending a lot of money.

  • This is a $30-an-hour machine.

  • So I don't think many of you guys were looking for that.

  • But it is.

  • I think AWS also has the most GPU VRAM per single machine that you can have.

  • So this is before you might think about doing distributed TensorFlow or PyTorch or whatever, which is always a huge mess.

  • So if you want the largest models possible, AWS is still gonna be the victor.

  • But in terms of the V100, it's $3.06 per V100 GPU. You can sign up for one-year and three-year reserved prices, which lowers it.

  • But you'd be a fool to sign up for a GPU for three years, or even probably one year.

  • So I'm not really gonna look at those prices. Moving on to Azure: also V100 GPUs. One thing to start paying a little bit of attention to is CPU and RAM. For a lot of deep learning tasks, this might be irrelevant to you; you might not actually need very much at all.

  • Like maybe the only thing you're doing is like io or something like that from, like a data set.

  • But if you're doing something like reinforcement learning, the more VRAM the better, and often with reinforcement learning, depending on what sort of environment you're using, the more CPU the better.

  • But it might not always be the case that more CPU is better.

  • So that is going to be a per-use-case scenario that I can't possibly dive into for you guys, but basically, Azure is exactly the same price.

  • But they do have varying amounts of RAM and storage,

  • and how many CPU cores you get; they all vary.

  • So pay attention if any of those things start to matter to you. Moving along.

  • Google Cloud also has the V100.

  • Well, actually, they're cheaper.

  • I had the first numbers in my head.

  • I thought they were $3 as well, but it's $2.48.

  • I wonder if they were always $2.48. Anyway, $2.48 per GPU,

  • so actually cheaper than both AWS and Azure.

  • And they have up to 128 gigabytes. What did Azure have? Only four GPUs.

  • So, less. Wow, not very many.

  • So actually, Azure is kind of losing in terms of how much is possible, although I wonder, because this one's a little more expensive.

  • So what?

  • Also, we have the same amount of RAM.

  • Same cores.

  • I wonder if this is the 32-gigabyte variant of the V100.

  • Not sure.

  • Anyway, moving on to Paperspace, my longtime favorite cloud GPU provider, simply because you spin it up and it's ready to go.

  • It's just super simple.

  • You've got a virtual desktop already.

  • It's just super easy.

  • It's easy to get going on Paperspace.

  • Their V100 is $2.30 an hour.

  • So even this is the cheapest so far.

  • Another reason why I really enjoyed going with them.

  • Also, they offer other GPUs like the P6000, which has 24 gigabytes of GPU memory, which is fun to play with.

  • So then, moving along.

  • Oh, one thing that's kind of a downside for Paperspace, though, is you only have one GPU per machine.

  • So you're maxing out at either 16 gigabytes of VRAM here or 24 on the P6000.

  • But you can't have multiple GPUs per machine.

  • So that's kind of a bummer. Moving on to Linode, which just started their GPU plans: rather than the V100, they have the RTX 6000, which is kind of an interesting move.

  • Uh, so at first I had to take some time to look in what the difference is.

  • So the V100 GPU is obviously 16 gigabytes of VRAM. The RTX 6000 is 24 gigabytes of VRAM, so a bit more VRAM.

  • Also, the other thing that matters is how quickly we can process

  • tensors, which are basically arrays. The V100 has 112 tensor TFLOPS.

  • The RTX 6000 has 130 tensor TFLOPS.

  • So more operations per second and more memory, for half the price, at least of AWS and Azure.
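
  • To put those numbers side by side, here is a rough sketch using the per-GPU hourly prices quoted in this video. The AWS figure is taken as $3.06, Azure as the same price, and Linode as the $1.50 an hour that comes up later; the provider labels and the function names are just for illustration, and real prices will have changed since.

```python
# Per-GPU hourly prices (USD) as quoted in the video; treat them as a snapshot.
PRICE_PER_GPU_HOUR = {
    "AWS (V100)": 3.06,
    "Azure (V100)": 3.06,        # "exactly the same price" as AWS
    "Google Cloud (V100)": 2.48,
    "Paperspace (V100)": 2.30,
    "Linode (RTX 6000)": 1.50,   # figure mentioned later in the video
}


def training_cost(price_per_hour, hours, gpus=1):
    """Cost in dollars of renting `gpus` GPUs for `hours` hours."""
    return round(price_per_hour * hours * gpus, 2)


def cheapest(prices):
    """Name of the provider with the lowest hourly price."""
    return min(prices, key=prices.get)


if __name__ == "__main__":
    # Compare a hypothetical 24-hour training run on a single GPU.
    for provider, price in sorted(PRICE_PER_GPU_HOUR.items(), key=lambda kv: kv[1]):
        print(f"{provider:22s} ${training_cost(price, 24):7.2f} for 24 hours")
    print("Cheapest per GPU hour:", cheapest(PRICE_PER_GPU_HOUR))
```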

  • It's a little closer to Paperspace, but again, you can have up to four of these RTX 6000s.

  • So, uh, that's a pretty amazing offering.

  • A TV on is with you.

  • I don't know if they're even.

  • You've got to be operating at a loss at this stage.

  • But anyway, super cool.

  • So this is who we're gonna end up using.

  • I do have a referral link for Linode, and sometimes people get a little fidgety when you have referral links.

  • Honestly, I have a relationship with Amazon, Azure, Google, Paperspace, and Linode, all of them.

  • They all want the air time.

  • It just so happens that right now, Linode is offering the most absurd deal possible.

  • So we're gonna go with Linode.

  • So that's what I'm going to use.

  • But if you have a different provider, or you're watching this later and someone else has a better offer,

  • you can use the same methodology that we're gonna use on any of these providers.

  • So hopefully it will still be of use even after Linode is maybe no longer offering the best deal possible.

  • So first of all,

  • you'll need to go ahead and create an account and log in. Again,

  • you can use linode.com.

  • If you go to linode.com/sentdex, you should get here, and you can just sign up for an account there.

  • You'll get a $20 credit, but I think you actually have to spend $20 then you will get your credit.

  • So So keep that in mind, but you'll still at some point get a $20 credit.

  • So Ah, once you do that, let me go ahead and log into my account.

  • You should be looking at something like this when you get to your dashboard.

  • So how do we want to efficiently do this?

  • Your GPU server is going to be many dollars an hour, as opposed to other types of servers, where we're talking pennies per hour.

  • So the first thing we want to do is set up a simple virtual private server that is going to serve as our sort of house of data.

  • So at least on Linode, if you go to Create, we're gonna create a Linode.

  • This is their name for a VPS.

  • So we'll go there.

  • We want Ubuntu for this.

  • We can use 19.04, but, hopefully I don't forget, what we want is 18.04 for at least our GPU server.

  • And in fact, I'll just I'll just get in the habit.

  • So 18, where is 18?

  • Oh, here it is.

  • And we want to do that for the long-term support.

  • It makes updating later down the road much easier.

  • Also, it has long-term support.

  • So, Ubuntu 18.04. Pick a region.

  • In my case, I'm gonna go with Dallas, Texas.

  • I haven't checked all the regions, but... actually, not Dallas, Texas, because Texas is the problem.

  • Newark, New Jersey.

  • So with all of these providers, they don't offer GPUs in every region.

  • They have more regions for CPU, RAM, and storage

  • than they do for GPUs.

  • So some of these places have, like, 40 locations around the world that you could choose from, but not for GPUs.

  • And so, in Linode's case, I happen to know they have GPU instances in Newark, New Jersey.

  • So I'm gonna put everything in Newark, New Jersey.

  • So what we're building right now is where we're going to store our data. You could have a data storage VPS,

  • and you could have a separate data preprocessing VPS, and so on.

  • I wouldn't recommend it, at least on Linode.

  • The thing that makes the most sense would be to do your storage and pre processing, probably on the same machine.

  • But we'll talk about that.

  • Maybe later, if I if I remember.

  • But for now, I'm just gonna go with a standard

  • Linode, this 2-gigabyte one.

  • You could even go with the Nanode for even cheaper.

  • But I kind of don't like one gigabyte of memory, because sometimes that's not even enough to install certain packages.

  • So I'm gonna go with this one.

  • Um, and then we'll come down here.

  • I'm gonna call this data server.

  • Uh, and then we're gonna make a password.

  • Jellyfish.

  • It's a great password.

  • And you might want to think about backups.

  • I've never regretted having backups, but this is just a quick machine that I'm just using for an example.

  • So I think we're ready.

  • So we're gonna go ahead and spin this one up.

  • So again, if you're following along, I would strongly recommend you.

  • You probably use Newark, New Jersey, and then later you can look for regions that are maybe closer to you.

  • Although it shouldn't really matter where it is for cloud GPU stuff.

  • I don't think it matters.

  • unless you're uploading the dataset from your local machine, maybe.

  • But either way, it's not gonna cost you very much money in this case, because this is a $10-a-month server, as opposed to, you know, $1.50 an hour, which is $1.50 times 24 per day.

  • You know, you're paying more than that per day, more than double that per day.
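
  • As a quick sanity check on that comparison, here is a small sketch that puts both servers on a per-day basis. The $10 a month and $1.50 an hour figures are the ones from the video; the 30-day month is an assumption made only to do the division.

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # assumption, used only to turn a monthly price into a daily one


def per_day_from_monthly(monthly_price):
    """Approximate daily cost of a server billed by the month."""
    return monthly_price / DAYS_PER_MONTH


def per_day_from_hourly(hourly_price):
    """Daily cost of a server billed by the hour and left running all day."""
    return hourly_price * HOURS_PER_DAY


data_server_per_day = per_day_from_monthly(10.0)  # the small data server
gpu_per_day = per_day_from_hourly(1.50)           # the GPU server

print(f"Data server: about ${data_server_per_day:.2f} per day")
print(f"GPU server:  ${gpu_per_day:.2f} per day")
```

  • Under those assumptions, one day of GPU time costs more than three months of the data server, which is why the slow steps belong on the cheap machine.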

  • Anyway.

  • Uh, let's go ahead and create.

  • Did I forget something?

  • Do I need punctuation?

  • How about capital-J Jellyfish,

  • exclamation mark? What do you think about that?

  • Is that a good password?

  • If I forget to destroy this machine,

  • somebody remind me, because you all now know the password to my lovely IP address.

  • Right here.

  • Uh, okay, So we're gonna wait for this to set up.

  • So basically, what we're going to do is have a volume that we're going to attach to the server.

  • And again, this is a concept that exists

  • definitely on AWS,

  • definitely on Google Cloud.

  • I haven't used Azure to any serious degree, but I'm willing to bet they do it too. Basically all these providers have this sort of structure where, yes, you could have a VPS, but you can also spin up totally separate storage-container-type things,

  • because often you need more storage than is available on some of these pre-packaged virtual private servers.

  • So, uh, get lost.

  • Okay, so this looks like it's good to go.

  • Here's your I P address.

  • We can copy that.

  • And, um, I actually don't think we need to log in just yet.

  • The other thing, the next thing that we want to do is we wanna have storage.

  • Like, in this case, I'm actually not going to need more than 50 gigabytes of storage.

  • If you don't need more than 50 gigabytes of storage,

  • then you don't have to do the next step, right.

  • And as you go bigger on your Linodes, you'll get more storage on Linode.

  • Also, your GPU machine probably has a pretty hefty amount of storage.

  • But that machine, we just want to spin it up and use it as quickly as possible.

  • So we're trying to avoid the download or upload of a dataset while that machine has to be on, and you're being billed so long as those machines exist, whether it's on AWS, Azure, or Google.

  • If those machines exist, you're getting billed for them.

  • So even if you turn them off, that GPU is still being dedicated to you, and you're still paying $1.50 an hour or $3 an hour or $30 an hour, so you want to use it as quickly as possible.

  • So we've got our data server.

  • I believe it's probably online.

  • Now what we're gonna do is go to Volumes.

  • We're gonna add a volume.

  • I'm gonna call this ml storage. Size 20 is fine. Again,

  • this is just an example.

  • But, you know, you probably want, like, a terabyte or like, 10 terabytes or something.

  • How freaking big can I go? Over 1,000 gigabytes?

  • Yeah, but I don't want 10 terabytes; that's a lot of money per month.

  • Anyway, you know, you might want 500 gigabytes or something reasonable.

  • Anyway, you also could, in theory, have your storage on some other provider.

  • But part of what we're trying to do is we wanna have our volume in our data source.

  • We wanna have that in the exact same region.

  • So when we go to transfer that data, it has the fastest transfer rate possible.

  • So hopefully you can upload and download data at 100 plus megabytes per second.

  • If if it's like from your local machine, you might only get, like, five or 15 or something terrible is gonna take forever.

  • So we're trying to get this to be as quick as possible.
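
  • To see why the transfer rate matters so much, here is a rough sketch of how long a dataset takes to move at the rates mentioned above, and what that waiting would cost if it happened on a GPU server. The 500-gigabyte size and the $1.50 hourly price are only example inputs, and 1 gigabyte is treated as 1,000 megabytes.

```python
def transfer_hours(dataset_gb, rate_mb_per_s):
    """Hours needed to move `dataset_gb` gigabytes at `rate_mb_per_s` megabytes per second."""
    seconds = dataset_gb * 1000 / rate_mb_per_s
    return seconds / 3600


def idle_gpu_cost(dataset_gb, rate_mb_per_s, gpu_price_per_hour):
    """Dollars spent if a GPU server sits waiting for the whole transfer."""
    return transfer_hours(dataset_gb, rate_mb_per_s) * gpu_price_per_hour


if __name__ == "__main__":
    for rate in (100, 15, 5):  # same-region transfer versus typical home upload rates
        hours = transfer_hours(500, rate)
        cost = idle_gpu_cost(500, rate, 1.50)
        print(f"{rate:>3} MB/s: {hours:5.1f} hours, about ${cost:.2f} of GPU time")
```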

  • And at least here, there's an even better method that I'll show you guys near the end, but we'll go with 100. Region: again,

  • we're gonna go with Newark, New Jersey.

  • We can just select a Linode.

  • You can change this later, but I'm gonna say data server.

  • Does that make sense?

  • Um uh, I don't really care about tag.

  • Um, I don't even think you have to do it.

  • So let's go ahead and hit.

  • Submit there, and we should get our volume.

  • So now what we want to do is we need to run these commands.

  • And I guess I'll just make another window here: linode.com, go log in.

  • What I want is... so there's our volume. Click on the server.

  • I'm trying to get the right piece.

  • So here's our IP address.

  • Boop.

  • So now, if you're on Mac or Linux, just open your terminal.

  • If you're on Windows, either download a program called PuTTY or, as I'm doing, use Bash on Ubuntu on Windows.

  • You could go to, like, the app store, I think, and get it, or do you just enable it?

  • I don't even know.

  • It might even be there by default.

  • Now, I honestly do not know.

  • Um I just know I have it, and it makes it easier to do stuff like this.

  • So, uh, again use whatever, uh, there's a 1,000,000 options for Windows, basically.

  • But if you're on Mac or Linux, just open the terminal.

  • If you're on Windows and you're having a problem with this step, feel free to post a comment below or join us on Discord.

  • That's discord.gg/sentdex.

  • We can definitely help you out, or go to the text-based version of the tutorial.

  • There are instructions there.

  • So we're gonna ssh root@, and then right-click to paste in that address.

  • And then the first time you connect, you're gonna get a message like this, which is basically saying: hey, we've never seen this fingerprint before.

  • So if this is the first time you're connected to that server, then this is totally fine.

  • If you don't think this is the first time you've connected to that IP address, something is wrong.

  • That should be a red flag, but for now, yes, it's correct.

  • Cool.

  • Now the password: capital-J Jellyfish, exclamation mark.

  • So if we could get a different type of that organists, but what I'm getting maybe Okay, just take a second.

  • I was like, No, Uh, okay.

  • So now what we want to do is run at least these three commands.

  • And then we could also add this as well.

  • This is a little less essential. So first,

  • this command here actually makes the file system on the actual volume itself, because we've attached this volume to this Linode.

  • Let's call it that.

  • So first we want to make a file system on that volume, because it's almost like it's its own drive right now, right?

  • And so first, we're gonna add that.

  • So I'm gonna come over here and just right click.

  • Uh, boop done.

  • And if I forget to?

  • If I forget to say later, um, we're gonna show you guys two methods for kind of moving your data around and this is a command you would only want to run one time per volume.

  • Just remember that.

  • Okay, so then this one, mkdir.

  • So now that we've got the file system on that volume,

  • here

  • we're just gonna make a directory for where we want to access that storage on our actual machine.

  • So I'm gonna just copy and paste that in, which makes that directory, and now we want to basically mount that volume to that directory.

  • So now, paste again.

  • So it's just like pointing this location to this location.

  • So now our data server has access to all that 20 gigabytes of storage or 500 or 10,000 or whatever.
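
  • The three steps just described (make a file system once, make a directory, mount the volume on it) can be sketched as a small helper that only builds and prints the commands. The device path and mount point below are assumptions modeled on how Linode typically names volumes; copy the exact commands your own volume page shows rather than these.

```python
import shlex


def volume_setup_commands(device, mount_point, first_time=True):
    """Build the commands for preparing and mounting a block storage volume.

    mkfs.ext4 wipes the volume, so it is only included the first time
    the volume is ever used.
    """
    commands = []
    if first_time:
        commands.append(["mkfs.ext4", device])           # create the file system (once per volume)
    commands.append(["mkdir", "-p", mount_point])        # directory where the storage will appear
    commands.append(["mount", device, mount_point])      # point the directory at the volume
    return commands


# Hypothetical names: the real device path depends on the label you gave the volume.
DEVICE = "/dev/disk/by-id/scsi-0Linode_Volume_ml-storage"
MOUNT_POINT = "/mnt/ml-storage"

if __name__ == "__main__":
    for command in volume_setup_commands(DEVICE, MOUNT_POINT):
        print(" ".join(shlex.quote(part) for part in command))
```

  • Passing first_time=False leaves out the mkfs step, which is what you would want when re-attaching a volume that already holds data.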

  • So cool.

  • So now that we have that done, we could also edit fstab with nano.

  • Here.

  • So, nano,

  • and it's actually /etc/fstab, or F-stab, whatever you wanna call it.

  • Come down here, paste that in, and let me

  • confirm that.

  • Cool.

  • Yeah.

  • All set. Ctrl+X, yes to save, Enter.

  • Done.

  • So now, if you reboot this machine or whatever, this will always be there for you.
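
  • For reference, an fstab line has six whitespace-separated fields: device, mount point, file system type, options, dump flag, and fsck pass. Here is a small sketch that assembles one; the default options and the example paths are assumptions in the style Linode suggests for volumes, so prefer the exact line shown on your own volume page.

```python
def fstab_entry(device, mount_point, fs_type="ext4",
                options="defaults,noatime,nofail", dump=0, fsck_pass=2):
    """Return one /etc/fstab line for mounting `device` at `mount_point` on boot.

    The `nofail` option lets the machine finish booting even if the
    volume happens to be detached.
    """
    return f"{device} {mount_point} {fs_type} {options} {dump} {fsck_pass}"


if __name__ == "__main__":
    # Hypothetical device path and mount point, for illustration only.
    print(fstab_entry("/dev/disk/by-id/scsi-0Linode_Volume_ml-storage", "/mnt/ml-storage"))
```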

  • So now we have the storage, and we can get there by just changing directory into the ml storage directory under /mnt.

  • And there we are.

  • Now, uh, move this aside.

  • And in fact, um um, I don't want to do this good.

  • The next thing we want is the, uh I guess we could Google it, but I could find it that way.

  • We're gonna We just need a data set just to kind of show you guys how it works.

  • Cats versus dogs, Microsoft.

  • I'm sure I can find it.

  • Yes.

  • Okay, so click on that.

  • Uh, and then here's your download.

  • Uh, I think this will take us to get to a new page.

  • And actually, what I want to do is right.

  • Click this copy link address, and then I'm gonna come over here and inside of Mount Ml storage.

  • What we're going to run is, uh you can't see that here.

  • We're gonna say w get really I'm pretty sure I copied that.

  • Let me try that one more time.

  • Uh, copy, link address.

  • Come over here.

  • Boop.

  • Cool.

  • So that will download again to our data server.

  • So hopefully here, you can see Yes.

  • Even here we're getting about, I don't know, 30, 40 megabytes a second,

  • or maybe it's megabits a second.

  • Feel free to correct me.

  • So now, once that's done, we basically have this dataset on our server.

  • Now, the next thing we could do is either keep it in zipped form, or we could unzip it.

  • And so I think I'm actually just gonna unzip it.

  • I always forget the tar extract command to do that.

  • So instead, what we're going to run is sudo apt-get install unzip.

  • Wait for that any day now, Bond.

  • Huh?

  • I guess while we wait on that, shout-out to the following channel members.

  • Let me boot that, uh, one years, Iago Lima Rodrigo's Silva average.

  • Eat.

  • Thank you guys for one year of membership.

  • It's a very long time.

  • Omni Crux, Miguel Latorre Perata.

  • And how young?

  • Whoa.

  • Six months of membership.

  • You guys are amazing.

  • Okay, we've got our, uh, unzip.

  • So now we want to, hopefully, over here, unzip the Kaggle

  • cats and dogs file.

  • Very good.

  • Well, this could take a while.

  • Okay, well, while we wait on that, let me set this aside.

  • So the next thing that we're going to do is... once we have this, this is in theory our dataset, so we're pretty much done.

  • We're just gonna wait.

  • I wish I could remember if this started on cats or started on dogs.

  • I guess we'll find out.

  • I think it's about 12,000 images per thing.

  • We're done.

  • Okay, Let me make sure that worked as intended.

  • I cannot read that. This is PetImages.

  • And then I have no idea what's supposed to be here.

  • I've picked a poor color scheme.

  • So okay, we've got basically this directory; it looks like it's PetImages.

  • That's what contains basically all of our training data.
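
  • If you would rather check the extraction from Python than squint at terminal colors, here is a minimal sketch that unzips an archive and counts the files in each folder. It uses only the standard library; the archive name and destination you pass in are up to you, and nothing here is specific to this dataset.

```python
import zipfile
from collections import Counter
from pathlib import Path


def extract_and_count(zip_path, dest):
    """Extract `zip_path` into `dest` and count extracted files per parent folder name."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)

    counts = Counter()
    for path in Path(dest).rglob("*"):
        if path.is_file():
            counts[path.parent.name] += 1  # e.g. one count per class folder
    return dict(counts)
```

  • Calling it on the downloaded zip with a fresh destination directory should give one entry per class folder, which is a quick way to confirm both classes came through.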

  • So now see, all this time that we spent Luckily, this didn't take too long.

  • This isn't actually a very large dataset, right?

  • I think it's like 400 megabytes or something.

  • The typical data set that you're gonna actually do deep learning on, though, is often like 50 gigabytes or 500 gigabytes.

  • Right?

  • It takes a very long time to get that data to your server, either through downloading or even worse, if you have to upload that data.

  • Almost everyone has a pretty poor upload rate, at least in the United States.

  • Some parts of the world are blessed, and other parts of the world aren't.

  • They don't have any Internet.

  • So So depending on where you live, this might vary for you, but in general, I find my upload rate is terrible and it really is very painful as I upload to a server that I'm paying many dollars an hour for.

  • So anyway, we have all of our data, and it's on the data

  • VPS.

  • So now what we want to do is actually spin up our GPU server. Come over here.

  • Okay, we're going to go to StackScripts and then Community StackScripts.

  • And then, probably, if you just type sentdex, you'll find it.

  • Maybe.

  • Yes.

  • So what we want is the sentdex TensorFlow GPU and PyTorch setup.

  • No.

  • Like I said, first, let me do the okay.

  • First will click on it.

  • Like I said, you can do this on any of the providers.

  • You don't have to use Linode, but you should, because right now they're the best.

  • But this is called a StackScript on Linode, and StackScripts on Linode are basically shell scripts with a little bit of Linode variables involved.

  • Okay, that's it.

  • And this StackScript is truly a shell script that you could run.

  • There are no Linode variables being passed here.

  • So if you go to the text based version of the tutorial.

  • You you literally can just copy and paste this.

  • It's a shell script.

  • And then whenever you get onto your server, you would first just chmod +x the name of the script, and then you would just run it with ./ and then, you know, script.sh. Okay, simple. Again,
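
  • The same two steps, marking the script executable and then running it, can be sketched from Python like this. It is roughly what chmod +x followed by ./script.sh does in the shell; the function names are made up for this example, and it assumes a Unix-like machine where the script starts with a shebang line.

```python
import os
import stat
import subprocess


def make_executable(path):
    """Add execute permission for user, group, and others (roughly `chmod +x path`)."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def run_script(path):
    """Run the script directly (like `./script.sh`) and return the completed process.

    Raises subprocess.CalledProcessError if the script exits with an error.
    """
    return subprocess.run(
        [os.path.abspath(path)], check=True, capture_output=True, text=True
    )
```

  • A setup script like this one takes several minutes, so in practice you would likely drop capture_output and let it print to the terminal as it goes.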

  • If you're having trouble with that, you don't really understand any of that.

  • Come to discord.gg/sentdex, and we'd be happy to help you out.

  • So anyway, coming back over here.

  • It's just convenient on Linode that we can just save this.

  • And it's kind of like on Paperspace, where you can just spin up the machine.

  • It's ready to go, because setting up TensorFlow can take a while: you have to get cuDNN and the CUDA Toolkit.

  • You've got to get tensorflow-gpu.

  • You've got to get all the other dependencies you're gonna need.

  • I think Python 3.6 is now the default on Linode, or on Ubuntu, but you should probably be on 3.7. At least that's what I would prefer to be on. So, you know, getting that all set up, getting pip, there are just so many things you've got to get.

  • So it's just wasted time.

  • And it's not only the server time, it's your time as well.

  • So anyways, this will just automatically do all of it for you.

  • So I'm gonna say Deploy New Linode from this StackScript.

  • You also could have clicked those, like, little three dots.

  • But that's what I'm gonna do.

  • It's already selected Ubuntu 18.04 for us. Again,

  • for region, do Newark, New Jersey.

  • If you do another region, you might see something.

  • So let me just say... well, I'll just say Newark,

  • New Jersey, for now, and I'll try to remember to switch it. But then come down to Linode Plan.

  • Choose GPU.

  • I'm gonna go with the $1,000-a-month option, but you could use the other ones

  • if you have something that needs that much. Linode label: I'm going to say RTX 6000.

  • Uh, password.

  • Jelly fish.

  • Exclamation Mark.

  • Um, please don't forget about that coming over here.

  • What else do we need?

  • I think we're good to go. So the only other thing I want to show you guys is if I was to select Dallas, Texas, which I have tried before, and I click Create.

  • If this works, I'm gonna be mad.

  • But I don't think it will.

  • Yes, you'll get this.

  • This little warning here.

  • It says you are not authorized.

  • Take this action.

  • What they mean to say is: we don't have any GPUs there.

  • So anyway, coming back to Newark, New Jersey, they do have GPUs there, although if enough people follow this tutorial on release, they might not, so be careful.

  • Anyway, Newark, New Jersey.

  • Cool.

  • Then we're good to go. Again,

  • You could do backups, but $40 a month is pretty expensive.

  • Uh, and our methodology will kind of have almost backups fully created.

  • You would just need to do a couple of basic things, and you'd have your own backup, so Yeah.

  • Anyway, I'm not gonna do that.

  • Create.

  • That's good.

  • Linode GPUs are billed,

  • I want to say, prorated to the hour.

  • Some places are prorated to the second; AWS, for example,

  • will prorate your bill.

  • So if you use 30 seconds on Linode,

  • you will be billed for an hour.

  • So $1.50, let's say, whereas on Amazon, if you use 30 seconds, you would be billed for whatever, well, it's actually $3.06,

  • so 30 seconds of $3.06 an hour. You do the math.
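
  • Doing that math: here is a sketch of how billing granularity changes what 30 seconds of use costs. It simply rounds usage up to the billing unit and ignores any minimum charge a provider may apply, so treat it as an approximation; the prices are the ones quoted in the video, with the AWS figure taken as $3.06.

```python
import math


def billed_cost(seconds_used, price_per_hour, granularity_seconds):
    """Cost when usage is rounded up to whole billing units.

    granularity_seconds=3600 models per-hour billing, 1 models per-second billing.
    Provider minimum charges are not modeled here.
    """
    units = math.ceil(seconds_used / granularity_seconds)
    billed_seconds = units * granularity_seconds
    return price_per_hour * billed_seconds / 3600


if __name__ == "__main__":
    hourly = billed_cost(30, 1.50, 3600)   # billed for the full hour
    per_second = billed_cost(30, 3.06, 1)  # billed for only the 30 seconds
    print(f"Hourly billing at $1.50/hour:     ${hourly:.2f}")
    print(f"Per-second billing at $3.06/hour: ${per_second:.4f}")
```

  • For a long training run the rounding hardly matters; it only shows up when you spin machines up and down for very short jobs.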

  • Okay, so currently our server is spinning up.

  • It's busy, it's got its There's actually quite a few things that needs to do.

  • because first it has to spin up and get everything allocated, and then it actually has to run the StackScript.

  • So even though we don't have to run that ourselves anymore, it still does have to run and actually set things up.

  • So things are getting downloaded and stuff like that.

  • I actually don't remember exactly how long this will take.

  • So probably what I'll do is pause the video for now. It does say booting, but it will first boot, then run the StackScript, and then reboot, so it could take a little bit. We'll see.

  • Let me think: we could connect to it, since it has been created and booted,

  • but I just know

  • it can't possibly be ready.

  • It'll have to reboot after the StackScript runs.

  • Anyway, I'm gonna pause for now and then pick back up once I'm confident the machine is ready, so I'll just test to make sure TensorFlow is up and all that.

  • Okay.

  • And we are back.

  • It has finally rebooted.

  • I knew that it would reboot at the end of that StackScript.

  • So I was just kind of waiting for the reboot.

  • It took 10 minutes. Honestly, to set up all that stuff myself,

  • I usually take about 30 to 45 minutes, because you have to download cuDNN and the CUDA Toolkit,

  • drivers, and all the other things.

  • And then I usually end up forgetting certain dependencies when I go to run my scripts for the first time.

  • It takes me a really long time.

  • So 10 minutes is really good.

  • You just start it,

  • you walk away.

  • It's all good.

  • So anyways, yeah, so you'll have to wait 10 minutes if you're following along.

  • Otherwise, you're good to go.

  • So now here is a fresh Bash on Ubuntu on...

  • I don't know.

  • Anyways, going are the letters.

  • What I want is my IP address for the GPU server.

  • I want to make sure everything worked.

  • So ssh root@ that IP. I've already connected once, but otherwise you'll see that same warning as before if it's your

  • first time connecting. Password:

  • what was it? Jellyfish.

  • Don't forget to nuke.

  • Okay, so the first thing we're gonna do is check python3.7.

  • That does indeed open Python for me.

  • Let's make sure we're in view for everybody, and import tensorflow

  • as tf. Despite the future warnings, it does work, and you can ignore these. It's just that NumPy is involved with TensorFlow, as you might expect.

  • And there are certain things that are going to have to change before the next update to NumPy, which will eventually deprecate certain things that TensorFlow is using from NumPy.

  • But right now, it's totally fine.

  • And don't worry, it will be fine in the future.

  • As of python 3.7, various deprecation and future warnings are much more visible than they ever were before.

  • Which I kind I don't know, how I feel about that.

  • But anyway, that's why, if you've been using 3.7, you're probably seeing more of these deprecation warnings than you've ever seen before.

  • Anyway, looks like it works now.
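
  • Beyond checking that the import succeeds, it can be worth confirming that TensorFlow actually sees the GPU. Here is a hedged sketch: it quiets the FutureWarning noise during import, returns None if TensorFlow is not installed, and falls back to the older experimental call because which one exists depends on the TensorFlow version. The function name is made up for this example.

```python
import warnings


def gpu_names():
    """Return names of the GPUs TensorFlow can see, or None if TensorFlow is missing."""
    with warnings.catch_warnings():
        # Hide the NumPy-related FutureWarnings that show up on import.
        warnings.simplefilter("ignore", category=FutureWarning)
        try:
            import tensorflow as tf
        except ImportError:
            return None

    try:
        devices = tf.config.list_physical_devices("GPU")
    except AttributeError:
        # Older releases only expose this under tf.config.experimental.
        devices = tf.config.experimental.list_physical_devices("GPU")
    return [device.name for device in devices]


if __name__ == "__main__":
    print(gpu_names())
```

  • An empty list on a GPU server would suggest the driver or CUDA setup did not finish, which is worth knowing before you copy any data over.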

  • We want to test it, but we first we need some data.

  • Also, our data is not on the server yet, right?

  • So the first thing that we're gonna do is, I guess, just mkdir and then forward slash, deep learning. Cool. And then we'll change directory into deep learning.

  • Cool.

  • And now we want our data.

  • So I'm going to use secure copy and paste to move our data set from here to our machine.

  • Now, this is one of two methods that you can use.

  • I'm going to use this method first, because this method will always work, and then I'll show you the method that I would use anywhere you have this movable volume, like we have right now on Linode, because it will make way more sense.

  • But for now, we're currently on the data server, which currently has the volume mounted to it.

  • We have our zip file, which extracted to PetImages. I'm sorry about the colors, but anyways, PetImages is there.

  • And now what

  • we want to do...

  • Yes, we're already in this.

  • Now what we want to do is move PetImages to our GPU server, and the way we're gonna do that is with scp.

  • So, uh, let me pull up.

  • Yeah, so to do scp,

  • you'll start with scp.

  • And then what do we want to move?

  • We want to move PetImages. If it was just the Kaggle cats and dogs zip, you would just say scp and then this.

  • Really?

  • Oh, it just took a second.

  • Yeah, hit Tab.

  • You can hit Tab and it auto-completes, but it was taking a while anyway.

  • Okay, so scp.

  • So if it's just a single zipped-up file, you can do that.

  • But I actually want to move this directory that we extracted to save time.

  • So I should say scp -r, for recursive, PetImages.

  • And now we're gonna scp that entire directory.

  • And then where do we want to scp it

  • to?

  • We want to scp it to root@ the IP address for our... can I get it with this?

  • Possibly Is it here?

  • Yes, right here for our GPU server.

  • So copy.

  • Come over here.

  • Root at 45 7 Okay.

  • Cool.

  • At that address, colon, and then the location on that server.

  • So the location was like, deep learning or something.

  • Me break.

  • So just slash deep learning.

  • Yes, PWD is working directory, but also said it right here.

  • Anyway, so just slash deep learning slash deep learning.

  • Cool.

  • I don't think you would need that.

  • Trailing slash moment.

  • Throw it in.

  • There has to be cool.

  • Enter.
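
The command assembled over the last few lines can be sketched as follows. This is a minimal illustration that only builds and prints the command; `GPU_SERVER_IP` is a placeholder (the address is never fully read out in the video), and the helper name `build_scp_command` is hypothetical.

```python
import shlex


def build_scp_command(source, user, host, dest_dir, recursive=False):
    """Return the argument list for an scp upload (nothing is executed here)."""
    args = ["scp"]
    if recursive:
        args.append("-r")  # copy a directory and everything inside it
    args.append(source)
    # Remote target has the form user@host:path
    args.append(f"{user}@{host}:{dest_dir}")
    return args


# GPU_SERVER_IP is a placeholder; substitute the address shown in your dashboard.
cmd = build_scp_command("PetImages", "root", "GPU_SERVER_IP", "/deeplearning/", recursive=True)
command_line = shlex.join(cmd)
print(command_line)  # prints: scp -r PetImages root@GPU_SERVER_IP:/deeplearning/
```

To actually run the transfer from Python, the list could be handed to `subprocess.run(cmd, check=True)`; typing the printed command into the shell, as in the video, is equivalent.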

  • Hopefully this goes.

  • Again, this is the first time that the data server has connected to the GPU server.

  • So again we get the same little fingerprint warning thing.

  • Yes. Password: jellyfish.

  • And now we wait for all these things to transfer. With this operation, it's questionable which would actually have been quicker to do.

  • And I also wonder if SCP has a distributed option, because that would also be much quicker.

  • This is gonna take quite a while, actually.

  • This would not be the most efficient.

  • It would be more efficient to move the zip file, which I think took 30 or 40 seconds to download. We'll move that one at the end of this as well, because I think that would be kind of a curious thing.

  • It's still going.

  • Uh, the extraction was definitely quicker.

  • So here's what I'm gonna do.

  • Let's break that.

  • Just Ctrl+C to break it.

  • I'm gonna come over to our GPU server.

  • I'm going to ls.

  • There's PetImages.

  • I'm gonna rm -r, for recursive, PetImages.

  • See you later, freak.

  • And instead, what we're gonna do is, rather than SCP recursively... which will be useful, especially if you're trying to move information back from your GPU server, like model files, log files, and other things.

  • So you still want to know the recursive SCP.

  • But that is taking way too long.

  • So instead, let's do scp... in fact, let's just do this.

  • And then, rather than PetImages, let's scp, and we'll just move the kagglecatsanddogs zip file to the same location.

  • Now this should go quite quickly.
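
The same idea (one archive instead of thousands of small files) also applies when moving model files and logs back from the GPU server. A minimal sketch using Python's standard library, assuming a hypothetical `results` directory; the demo runs on a throwaway temporary directory:

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def bundle_directory(directory, archive_stem):
    """Zip `directory` into `<archive_stem>.zip` and return the archive path."""
    directory = Path(directory)
    return shutil.make_archive(
        str(archive_stem),
        "zip",
        root_dir=str(directory.parent),  # archive paths are relative to this
        base_dir=directory.name,         # only this directory is included
    )


# Demo on a throwaway directory standing in for model and log output.
work = Path(tempfile.mkdtemp())
results = work / "results"  # hypothetical output directory
results.mkdir()
for i in range(3):
    (results / f"log_{i}.txt").write_text("epoch data\n")

archive = bundle_directory(results, work / "results")

# Count the regular files inside the archive (directory entries end with "/").
with zipfile.ZipFile(archive) as zf:
    file_names = [name for name in zf.namelist() if not name.endswith("/")]
print(archive, len(file_names))
```

The single `.zip` can then be copied with a plain `scp`, with no `-r` needed.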

  • Hopefully. Even now it's at 30 megabytes a second.

  • It's terrible.

  • I wonder if I didn't...

  • I'm pretty sure both of these are in the exact... this should be way quicker.

  • It should be way quicker.

  • Both of these are in Newark, New Jersey. Our volume gets attached, but it's also in Newark, New Jersey. These are not typical numbers that you would see.

  • It's going so slow.

  • I can't decide if it's going so slow because of our previous transfer.

  • It might be, like, a firewall issue. We should be getting 100 at least.

  • I'm curious: if anyone else is following along, post below what number you're seeing, especially if you don't do this initial thing. But it's gotta be a firewall or something.

  • I've done this quite a few times, and usually when you're transferring locally like that, because it's within the same region, they should all be on the same network, so we should be getting somewhere between 100 and 300 megabits a second. This should not take so long.

  • So this is kind of a bummer. Megabits or megabytes, someone correct me there.
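
On the megabits versus megabytes question: one byte is eight bits, so a rate in MB/s is eight times larger when expressed in Mbit/s. A small sketch of the arithmetic, using the roughly 11.2 MB/s rate mentioned in this video and a hypothetical 1000 MB file size (the real archive size is not stated here):

```python
def megabytes_to_megabits(mb_per_s):
    """Convert MB/s to Mbit/s (1 byte = 8 bits)."""
    return mb_per_s * 8


def megabits_to_megabytes(mbit_per_s):
    """Convert Mbit/s to MB/s."""
    return mbit_per_s / 8


def transfer_seconds(size_mb, mb_per_s):
    """Time to move size_mb megabytes at a steady mb_per_s."""
    return size_mb / mb_per_s


observed_mb_per_s = 11.2   # rate quoted in the video, assumed to be in MB/s
example_size_mb = 1000     # hypothetical file size, for illustration only

observed_mbit_per_s = megabytes_to_megabits(observed_mb_per_s)
expected_low_mb_per_s = megabits_to_megabytes(100)   # low end of "100 to 300 megabits"
seconds_needed = transfer_seconds(example_size_mb, observed_mb_per_s)

print(round(observed_mbit_per_s, 1), expected_low_mb_per_s, round(seconds_needed, 1))
```

If the progress meter was indeed showing megabytes per second, 11.2 MB/s works out to about 90 Mbit/s, which is close to the low end of the 100 to 300 megabit range expected above.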

  • Um, doing it.

  • That's too bad.

  • That took a while.

  • It really did.

  • Luckily, I have an even better solution for you. But first, let's finish up the SCP version, because even I can't... I still can't upload data at a solid 12 megabits a second, or megabytes a second.

  • So many people are gonna be like, it's this idiot.

  • Okay, so now, this is our data server.

  • See you later, freak.

  • Now we have the zip file, so again we're gonna have to sudo apt-get install unzip, and then we'll unzip it, which is much quicker than that.

  • Now, though, I do wonder... sudo unzip kaggle...

  • I do wonder if we were throttled before we started moving that data, or if there's some sort of network issue right now, because that's just a terrible local transfer rate.

  • Anyway, that's already done.
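
The video installs and uses the `unzip` command-line tool. As an alternative that needs no extra package, the extraction step could be sketched with Python's built-in `zipfile` module. The tiny archive below is a stand-in created just for the demo; its member names are illustrative, not the real dataset contents.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract every member of a zip archive into dest_dir; return the file count."""
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest_dir)
        return sum(1 for info in zf.infolist() if not info.is_dir())


# Build a tiny stand-in archive so the sketch runs anywhere.
work = Path(tempfile.mkdtemp())
demo_zip = work / "demo.zip"
with zipfile.ZipFile(demo_zip, "w") as zf:
    zf.writestr("PetImages/Cat/0.jpg", b"not a real image")
    zf.writestr("PetImages/Dog/0.jpg", b"not a real image")

data_dir = work / "data"
extracted = extract_archive(demo_zip, data_dir)
print(extracted, sorted(p.name for p in (data_dir / "PetImages").iterdir()))
```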

  • So cool.

  • So now we have that directory.

  • The next thing I want us to do... I will post a link in the description. Basically, this is the text-based version of this tutorial.

  • If we scroll down to here, we get just a quick example script that we'll run.

  • It's just gonna train this model.

  • Basically.

  • So coming over to my Bash on Ubuntu on Windows on Linux on macOS.

  • Oh, dear, I didn't want to do that.

  • We want to nano ml-example.py and then right-click in there.

  • And this is just our quick script.

  • I don't really think I need to explain it, but basically, it's just gonna do some of the preprocessing.

  • Again, this is something you might actually want to do on a preprocessing server rather than on your GPU server that you're paying a bunch of money for.

  • But it does just so happen to be the case that your GPU server has a lot of processing power, so I'll let you make that decision. For now, we'll just run it here.

  • But the same thing applies.

  • You could just, you know, run the preprocessing, then SCP it over.

  • I promise the SCP rate should not normally be 11.2 MB per second.

  • So anyway, Ctrl+X, yes to save. Cool.

  • Then we can just run python3.7 ml-example.py, which again will run the preprocessing step first, so that could take a little bit.

  • I don't remember how long.

  • Okay, so first we're printing out some of the errors.

  • So for whatever reason, in this training dataset, some of the images just don't actually load.

  • So anyway, what you're seeing right now for the errors is just right here.

  • For some reason, we're unable to read them with OpenCV. I probably should have used tqdm or something to show where we were in the process.

  • That would've been smart, huh?
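
A minimal sketch of that pattern: skip files that fail to load, and report progress along the way. This is not the script from the video. It uses plain `print` instead of tqdm so it runs with the standard library only, and `fake_loader` is a hypothetical stand-in for a real image reader. (OpenCV's `cv2.imread` returns `None` for a file it cannot read, which is the case handled below.)

```python
def load_all(paths, loader, report_every=1000):
    """Apply loader to each path, skipping failures and reporting progress."""
    loaded, failed = [], []
    total = len(paths)
    for index, path in enumerate(paths, start=1):
        try:
            item = loader(path)
            if item is None:  # some readers signal failure by returning None
                raise ValueError("loader returned None")
            loaded.append(item)
        except Exception as error:
            failed.append((path, str(error)))
        if index % report_every == 0 or index == total:
            print(f"{index}/{total} processed, {len(failed)} skipped")
    return loaded, failed


def fake_loader(path):
    """Stand-in for an image reader: names containing 'corrupt' are unreadable."""
    if "corrupt" in path:
        return None
    return path.upper()


paths = ["cat_1.jpg", "corrupt_2.jpg", "dog_3.jpg"]
loaded, failed = load_all(paths, fake_loader, report_every=2)
```

Collecting the failures in a list, rather than printing each one as it happens, keeps the progress output readable and still leaves a record of which files were skipped.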

  • But we can see at least where we are in dogs.

  • Although it's out of order.

  • That's weird.

  • Whew.

  • Okay, I think we're okay.

  • So now I can see we're loading things. Let me move this over.

  • So this is training our model for a few epochs, and it is exceptionally fast.

  • This is hilariously fast.

  • Just in case you've never trained this model I highly encourage you to take this exact code and train it locally.

  • Okay, so we trained the cats versus dogs model. The in-sample accuracy is 91.

  • The validation accuracy is 74.

  • Basically.

  • So we definitely overfit.

  • But we weren't really trying to do anything special.

  • We just wanted to run it.

  • But that's insanely fast, to do a two-second epoch.

  • For some reason, when I ran this script locally, I was getting, like, 24-second epochs, which is very odd, because I run an RTX Titan GPU, which has the exact same tensor TFLOPS and, I want to say, the same VRAM, 24.

  • Yeah, I think the same VRAM.

  • So anyway, I'm not sure why it was so lightning fast on Linode, but I'm definitely curious to dig more into that, because this GPU was very comparable to the GPUs on Paperspace.

  • So it's not like this GPU is super slow. It's a little faster than the V100. So I don't know why these are so fast.

  • Anyway, I'd love to look more into that. And also, if you want to see me compare this RTX 6000 to a V100, like on Paperspace, let me know.

  • The other thing I wouldn't mind checking at some point is many GPUs.

  • So the V100 was meant to be a server GPU that you would just have a bunch of.

  • So I do wonder if four RTX 6000s still significantly outperform four V100s.

  • But that's a very expensive test.

  • So depending on how many people really want to see that, let me know, because I'm curious.

  • But it's also expensive, so we'll see anyway.

  • So that's that.

  • That is basically the entire workflow, and it should work everywhere. Now, because of what you just saw with that SCP delay and all that, sometimes these things can take a while to transfer, for who knows why. I'm gonna wager... I don't know.

  • I just don't think SCP should have hit any sort of firewall or something.

  • I don't know what's going on.

  • There may be something else.

  • You're sharing the network with people, so maybe the network is just under super heavy load. I don't know.

  • It should be faster than that, though, to be honest.

  • Okay, so the other thing that I would show applies at least on Linode and other places that have separate volumes like this.

  • One thing you can do is detach and reattach pretty quickly.

  • And that is the case on Linode.

  • So rather than doing all the SCP nonsense, you can have an ML storage volume, right? First you attach it to your data server and do your preprocessing.

  • So in this case, let's call extracting preprocessing. And then we can attach it to our GPU server.

  • So, actually, what I'm gonna do is cd out of the deeplearning directory, and then I'm gonna rm -r deeplearning.

  • See you later, freak.

  • And now what I'm gonna do is come over here.

  • And it would probably be wise to sudo shutdown -h now.

  • And I'm gonna shut down this server as well.

  • sudo shutdown now, because we're gonna move the volume.

  • And if we move the volume while the server is on... at least right now, I'm under the impression that nothing is modifying that volume.

  • But you might make the mistake one day, and then that volume gets corrupted or whatever.

  • And if you don't have backups, you're gonna be really sad.

  • So just go ahead and shut them both off.

  • And then what I'm gonna do is I'm gonna pop over here.

  • I'm gonna wait until this says that they're offline.

  • Okay, we're gonna come to the volumes, and then what I'm gonna do is click this little arrow here and say Detach.

  • So currently, ML storage has all that training data already on it.

  • So now I'm gonna detach it from that data server.

  • But the data is still gonna be on this volume. So, Detach.

  • And then we'll wait for that to get detached. It should be pretty quick.

  • Okay, now we want to attach, and we're going to attach it to our GPU server. Save. Now, once it's attached, we have to kind of sort of set it up again.

  • But don't run that first command, because you will recreate the file storage and basically just overwrite it. It's the exact same command.

  • So if we go to Show Configuration, don't run that command.

  • This is the one we want to run.

  • So now what I'm gonna do is come down...

  • Whoops.

  • Do do do do do... and reconnect again to our GPU server. Jellyfish.

  • And we're going to run this command here, and then the second command here.

  • I'm not gonna run the other command, just to save time, but you can run that as well.

  • That's the, I'm gonna call it, fstab one.

  • Okay, so now the ML storage is there. So if we change directory into /mnt, forward slash, our ML storage volume, and then we list out that directory...

  • Guess what?

  • There's our PetImages.
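
Before a training run writes models and logs to the volume, it can be worth confirming that the path really is a mounted volume and not just an empty directory on the server's own disk. A minimal sketch, assuming a POSIX system; the helper names are hypothetical, and the real mount path (something under `/mnt`) depends on your volume label:

```python
import os
import tempfile


def require_mounted(path):
    """Return path if it is a mount point; otherwise raise RuntimeError."""
    if not os.path.ismount(path):
        raise RuntimeError(f"{path} is not a mount point; is the volume attached?")
    return path


def run_output_dir(volume_path, run_name):
    """Create (if needed) and return a per-run directory on the mounted volume."""
    out = os.path.join(require_mounted(volume_path), run_name)
    os.makedirs(out, exist_ok=True)
    return out


# Demo: a freshly created temporary directory is an ordinary directory, not a mount.
plain_dir = tempfile.mkdtemp()
rejected = False
try:
    require_mounted(plain_dir)
except RuntimeError as error:
    rejected = True
    print(error)
```

In a training script, `run_output_dir` would be called once with the volume's mount path, and model checkpoints and logs written under the returned directory.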

  • So now what we could do is train a model and save the model right to this storage here.

  • And then again, what we would do is come over here once we're done.

  • Once the model and logs and all that have saved to our ML storage volume, click here, click on that one, Detach.

  • I crack myself up.

  • And then you would move it back to your data VPS, right, to do your preprocessing or just to store it.

  • So then you go over to your Linodes and, actually... oh yeah, I just tripped myself out there, because I was like, I never booted those servers back up.

  • But whenever you attach or detach, it runs a reboot sequence.

  • So I removed it from the data server.

  • So it rebooted the data server after it was removed.

  • And then the GPU server: when I attached it, it booted the GPU server. But for a second there, I was like, wait, I never turned it back on. Anyway.

  • Okay, so now what I want to do, before I forget, is actually turn both of these guys off: power off, and power off.

  • But basically, at the end of training a model, you'd detach that volume, power off the GPU server, and nuke it.

  • So basically you spend 10 minutes really spinning it up and setting it up, and then the attaching and detaching of the volume is nearly instant.

  • It's pretty quick. Or you can use SCP, which is variable.

  • It's not just Linode.

  • That's not the first time SCP has been quite slow for me, even locally, so sometimes it can really bite you in the butt, and that kind of sucks.

  • But anyway, at least on Linode, that's how you can avoid even that.

  • You just simply detach and attach the entire volume. So, okay, that was a ton of information.

  • Hopefully, you guys have learned something useful. If you want to keep using the cloud, at least right now, Linode is the place to go.

  • Again, if you haven't created an account up to this point, you can sign up for Linode and get a $20 credit at linode.com/sentdex. Questions, comments, concerns, whatever, feel free to leave those below.

  • Also, you can join the Discord. That's discord.gg/sentdex, if you've got questions that you want answered in there as well. So anyway, that's all for now.


Cloud GPUs Tutorial (comparing & using)

  • Published by 林宜悉 on January 14, 2021