

  • Jabril: John-Green-bot, are you serious?!

  • I made this game and you beat my high score?

  • John-Green-bot: Pizza!

  • Jabril: So John-Green-bot is pretty good at Pizza Jump, but what about this new game we made, TrashBlaster?

  • John-Green-bot: Hey, that's me!

  • Jabril: Yeah, let's see what you've got.

  • John-Green-bot: That's not fair, Jabril!!

  • Jabril: It's okay, John-Green-bot, we've got you covered.

  • Today we're gonna design and build an AI program to help you play this game like a pro.

  • INTRO

  • Hey, I'm Jabril and welcome to Crash Course AI!

  • Last time, we talked about some of the ways that AI systems learn to play games.

  • I've been playing video games for as long as I can remember.

  • They're fun, challenging, and tell interesting stories where the player gets to jump on goombas

  • or build cities or cross the road or flap a bird.

  • But games are also a great way to test AI techniques because they usually involve simpler

  • worlds than the one we live in.

  • Plus, games involve things that humans are often pretty good at like strategy, planning,

  • coordination, deception, reflexes, and intuition.

  • Recently, AIs have become good at some tough games, like Go or StarCraft II.

  • So our goal today is to build an AI to play a video game that our writing team and friends

  • at Thought Cafe designed called TrashBlaster!

  • The player's goal in TrashBlaster is to swim through the ocean as a little virtual

  • John-Green-bot, and destroy pieces of trash.

  • But we have to be careful, because if John-Green-bot touches a piece of trash, then he loses and

  • the game restarts.

  • Like in previous labs, we'll be writing all of our code using a language called Python

  • in a tool called Google Colaboratory.

  • And as you watch this video, you can follow along with the code in your browser from the

  • link we put in the description.

  • In these Colaboratory files, there's some regular text explaining what we're trying

  • to do, and pieces of code that you can run by pushing the play button.

  • These pieces of code build on each other, so keep in mind that we have to run them in

  • order from top to bottom, otherwise we might get an error.

  • To actually run the code and experiment with changing it, you'll have to either click “Open in playground” at the top of the page or open the File menu and click “Save a Copy to Drive”.

  • And just an FYI: you'll need a Google account for this.

  • So to create this game-playing AI system, first, we need to build the game and set up

  • everything like the rules and graphics.

  • Second, we'll need to think about how to create a TrashBlaster AI model that can play

  • the game and learn to get better.

  • And third, we'll need to train the model and evaluate how well it works.

  • Without a game, we can't do anything.

  • So we've got to start by generating all the pieces of one.

  • To start, we're going to need to fill up our toolbox by importing some helpful libraries,

  • such as PyGame.

  • Steps 1.1 and 1.2 load the libraries, and step 1.3 saves the game so we can watch it later.

  • This might take a second to download.

  • The basic building blocks of any game are different objects that interact with each other.

  • There's usually something or someone the player controls and enemies that you battle

  • -- All these objects and their interactions with one another need to be defined in the

  • code.

  • So to make TrashBlaster, we need to define three objects and what they do: a blaster,

  • a hero, and trash to destroy.

  • The blaster is what actually destroys the trash, so we're going to load an image that

  • looks like a laser-ball and set some properties.

  • How far does it go, what direction does it fly, and what happens to the blast when it

  • hits a piece of trash?

  • Our hero is John-Green-bot, so now we've got to load his image, and define

  • properties like how fast he can swim and how a blast appears when he uses his blaster.

  • And we need to load an image for the trash pieces, and then code how they

  • move and what happens if they get hit by a blast, like, for example, total destruction

  • or splitting into 2 smaller pieces.
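That split-or-destroy rule can be sketched in a few lines of Python. The function name, the integer size representation, and the exact split rule below are illustrative assumptions, not the lab's actual code:

```python
def on_trash_hit(size, min_size=1):
    """What happens when a blast hits a piece of trash: larger pieces
    split into 2 smaller pieces, while the smallest pieces are
    destroyed outright. Returns the pieces that replace the one hit."""
    if size > min_size:
        return [size - 1, size - 1]  # split into 2 smaller pieces
    return []                        # total destruction
```

So a size-3 piece becomes two size-2 pieces, while a size-1 piece simply disappears from the game.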

  • Finally, all these objects are floating in the ocean, so we need a piece of code to generate

  • the background.

  • The shape of this game's ocean is toroidal, which means it wraps around, and if any object

  • flies off the screen to the right, then it will immediately appear on the far left side.
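A minimal sketch of that toroidal wrap-around, assuming positions are plain numbers (the function name is made up for illustration):

```python
def wrap_position(x, y, width, height):
    """Toroidal screen: any object that leaves one edge
    immediately reappears on the opposite edge."""
    return x % width, y % height
```

Python's `%` operator handles negative coordinates too, so an object drifting off the left edge at x = -5 on an 800-pixel-wide screen reappears at x = 795.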

  • Every game needs some way to track how the player's doing, so we'll show the score too.

  • Now that we have all the pieces in place, we can actually build the game and decide

  • how everything interacts.

  • The key to how everything fits together is the run function.

  • It's a loop of checking whether the game is over; moving all the objects; updating

  • the game; checking whether our hero is okay; and making new trash.

  • As long as our hero hasn't bumped into any trash, the game continues.
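The loop just described can be sketched like this; `game` is a hypothetical object whose method names are assumptions standing in for the lab's real code:

```python
def run(game, max_steps=10_000):
    """The main loop: keep moving objects, updating the game,
    checking on the hero, and spawning trash until the game ends."""
    steps = 0
    while not game.is_over() and steps < max_steps:
        game.move_objects()        # move hero, blasts, and trash
        game.update()              # redraw and update the score
        if game.hero_hit_trash():  # hero touched trash: game over
            break
        game.maybe_spawn_trash()   # keep new trash coming
        steps += 1
    return steps
```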

  • That's pretty much it for the game mechanics.

  • We've created a hero, a blaster, trash, and a scoreboard, and code that controls their

  • interactions.

  • Step 2 is modeling the AI's brain so John-Green-bot can play!

  • And for that, we can turn back to our old friend the neural network.

  • When I play games, I try to watch for the biggest threat because I don't want to lose.

  • So let's program John-Green-bot to use a similar strategy.

  • For his neural network's input layer, let's consider the 5 pieces of trash that are closest

  • to his avatar.

  • (And remember, the closest trash might actually be on the other side of the screen!)

  • Really, we want John-Green-bot to pay attention to where the trash is and where it's going.

  • So we want the X and Y positions relative to the hero, the X and Y velocities relative

  • to the hero, and the size of each piece of trash.

  • That's 5 inputs for each of the 5 pieces of trash, so our input layer is going to have 25 nodes.

  • For the hidden layers, let's start small and create 2 layers with 15 nodes each.

  • This is just a guess, so we can change it later if we want.

  • Because the output of this neural network is gameplay, we want the output nodes to be

  • connected to the movement of the hero and shooting blasts.

  • So there will be 5 nodes total: an X and Y for movement, an X and Y direction for aiming

  • the blaster, and whether or not to fire the blaster.

  • To start, the weights of the neural network are initialized to 0, so the first time John-Green-bot

  • plays he basically sits there and does nothing.
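Putting the layer sizes and the zero-initialized weights together, the network could be sketched like this. NumPy and the tanh activation are assumptions for illustration; the lab's actual activation function may differ:

```python
import numpy as np

# Layer sizes from the video: 25 inputs (5 features for each of the
# 5 nearest trash pieces), two hidden layers of 15, and 5 outputs
# (move X/Y, aim X/Y, and whether to fire the blaster).
LAYER_SIZES = [25, 15, 15, 5]

def init_brain():
    """All weights start at 0, which is why an untrained
    John-Green-bot just sits there and does nothing."""
    return [np.zeros((a, b)) for a, b in zip(LAYER_SIZES, LAYER_SIZES[1:])]

def think(brain, inputs):
    """Feed the 25 trash features forward through the network."""
    x = np.asarray(inputs, dtype=float)
    for w in brain:
        x = np.tanh(x @ w)  # tanh keeps every activation in [-1, 1]
    return x                # 5 output values
```

With zero weights every output is 0, so nothing moves until the genetic algorithm starts nudging the weights.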

  • To train his brain with regular supervised learning, we'd normally say what the best

  • action is at each timestep.

  • But because losing TrashBlaster depends on lots of collective actions and mistakes, not

  • just one key moment, supervised learning might not be the right approach for us.

  • Instead, we'll use reinforcement learning strategies to train John-Green-bot based on

  • all the moves he makes from the beginning to the end of a game, and we'll evolve

  • a better AI using a genetic algorithm, which is commonly referred to as a GA.

  • To start, we'll create some number of John-Green-bots with empty brains

  • (let's say 200), and we'll have them play TrashBlaster.

  • They're all pretty terrible, but because of luck,

  • some will probably be a little bit less terrible.

  • In biological evolution, parents pass on most of their characteristics to their offspring

  • when they reproduce.

  • But the new generation may have some small differences, or mutations.

  • To replicate this, we'll use code to take the 100 highest-scoring John-Green-bots and

  • clone each of them as our reproduction step.

  • Then, we'll slightly and randomly change the weights in those 100 cloned neural networks,

  • which is our mutation step.

  • Right now, we'll program a 5% chance that any given weight will be mutated, and randomly

  • choose how much that weight mutates (so it could be barely any change or a huge one).

  • And you could experiment with this if you like.
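The mutation step might look like the sketch below, treating a brain as nested lists of weights. The Gaussian nudge is an assumption; the lab may pick mutation sizes differently:

```python
import copy
import random

def mutate(brain, rate=0.05, scale=1.0):
    """Each weight has a `rate` (5%) chance of being nudged by a
    random amount, which can be barely any change or a huge one.
    Returns a mutated copy so the unmutated original survives too."""
    clone = copy.deepcopy(brain)  # reproduction: clone the parent
    for layer in clone:
        for row in layer:
            for i in range(len(row)):
                if random.random() < rate:
                    row[i] += random.gauss(0, scale)
    return clone
```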

  • Mutation affects how much the AI changes overall, so it's a little bit like the learning rate

  • that we talked about in previous episodes.

  • We have to try and balance steadily improving each generation with making big changes that

  • might be really helpful (or harmful).

  • After we've created these 100 mutant John-Green-bots, we'll combine them with the 100 unmutated

  • original models (just in case the mutations were harmful) and have them all play the game.

  • Then we evaluate, clone, and mutate them over and over again.

  • Over time, the genetic algorithm usually produces AIs that are gradually better at whatever they're being asked to do, like playing TrashBlaster.

  • This is because models with better mutations will be more likely to score high and reproduce

  • in the future.
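The evaluate-clone-mutate cycle boils down to a short loop. Everything here follows the description above, but the code itself is an illustrative sketch, with `fitness_fn` and `mutate_fn` standing in for the lab's real functions:

```python
def evolve(population, fitness_fn, mutate_fn, generations=10):
    """Score every bot, keep the top-scoring half, add a mutated
    clone of each survivor, and repeat."""
    for _ in range(generations):
        ranked = sorted(population, key=fitness_fn, reverse=True)
        survivors = ranked[: len(ranked) // 2]       # e.g. top 100 of 200
        mutants = [mutate_fn(s) for s in survivors]  # mutated clones
        population = survivors + mutants             # back to full size
    return population
```

As a toy example, if each "bot" is just a number, fitness is the number itself, and mutation adds 1, the best scores creep upward with every generation.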

  • ALL of this stuff, from building John-Green-bot's neural network to defining mutation for our genetic algorithm, is in this section of code.

  • After setting up all that, we have to write code to carefully define what doing “better” at the game means.

  • Destroying a bunch of trash?

  • Staying alive for a long time?

  • Avoiding off-target blaster shots?

  • Together, these decisions about what “better” means define an AI model's fitness.

  • Programming this function is pretty much the most important part of this lab, because how

  • we define fitness will affect how John-Green-bot's AI will evolve.

  • If we don't carefully balance our fitness function, his AI could end up doing some pretty

  • weird things.

  • For example, we could just define fitness as how long the player stays alive, but then

  • John-Green-bot's AI might play “TrashAvoider” and dodge trash instead of playing TrashBlaster and destroying trash.

  • But if we define the fitness to only be related to how many trash pieces are destroyed, we

  • might get a wild hero that's constantly blasting.

  • So, for now, I'm going to try a fitness function that keeps the player alive and blasts

  • trash.

  • We'll define the fitness as +1 for every second that John-Green-bot stays alive, and

  • +10 for every piece of trash that is zapped.

  • But it's not as fun if the AI just blasts everywhere, so let's also add a penalty

  • of -2 for every blast he fires.
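Those three rules translate directly into a one-line scoring function (the name and signature are just for illustration):

```python
def fitness(seconds_alive, trash_destroyed, blasts_fired):
    """The fitness function described above: +1 per second survived,
    +10 per piece of trash zapped, and -2 per blast fired."""
    return seconds_alive + 10 * trash_destroyed - 2 * blasts_fired
```

So a bot that survives 30 seconds, zaps 5 pieces of trash, and fires 10 blasts scores 30 + 50 - 20 = 60.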

  • The fitness for each John-Green-bot AI will be updated continuously as he plays the game,

  • and it'll be shown on the scoreboard we created earlier.

  • You can take some time to play around with this fitness function and watch how John-Green-bot's

  • AI can learn and evolve differently.

  • Finally, we can move on to Step 3 and actually train John-Green-bot's AI to blast some trash!

  • So first, we need to start up our game.

  • And to kick off the genetic algorithm, we have to define how many randomly-wired John-Green-bot

  • models we want in our starting population.

  • Let's stick with 200 for now.

  • If we waited for each John-Green-bot model to start, play, and lose the game, this training process could take DAYS.