  • Hey, John-Green-bot.

  • I've been thinking really hard about a HUGE life decision.

  • I want to adopt a pet, and I've narrowed it down to either a cat or a dog.

  • But there are so many great cats and dogs on adoption websites.

  • John-Green-bot: The Grey Parrot (Psittacus erithacus) has an average lifespan in captivity

  • of 40 to 60 years.

  • Jabril: Yeah, birds are great and all but I was thinking maybe a cat or a dog.

  • John-Green-bot: Turtles will need a tank approximately 7.5 to 15 times their shell length in centimeters.

  • Jabril: Yeah, you're no help.

  • Come on Spot and Mr. Cuddles.

  • It looks like I'm going to have to figure this out myself, and by myself I mean

  • make an AI figure it out.

  • Today we're going to train an AI to go through the list of pets and make the best decision

  • for me based on data!

  • That'll make things less stressful. Surely nothing will go wrong with this, right?

  • INTRO

  • Hey, I'm Jabril and welcome to Crash Course AI.

  • Today we're going to build a fairly simple AI program to find out if adopting a cat or

  • a dog will make me happier.

  • This is a pretty subjective question, and if I use data from the internet, I'll have

  • a lot of strong opinions.

  • So, I'll conduct my own survey where I collect data about people's cats and dogs and their

  • happiness.

  • I don't care what pet I get, as long as it makes me happy, so I won't even include cat

  • and dog labels in the model.

  • Like in previous labs, I'll be writing all of my code using a language called Python

  • in a tool called Google Colaboratory.

  • And as you watch this video, you can follow along with the code in your browser from the

  • link we put in the description.

  • In these Colaboratory files, there's some regular text explaining what I'm trying

  • to do, and pieces of code that you can run by pushing the play button.

  • These pieces of code build on each other, so keep in mind that you have to run them

  • in order from top to bottom, otherwise you might get an error.

  • To actually run the code or make changes to it, you'll have to either click "Open

  • in playground" at the top of the page or open the File menu and click "Save a Copy

  • to Drive".

  • And one last time, I'll give you this FYI: you'll need a Google account for this.

  • Creating this AI to help me decide between a cat and a dog should be pretty simple, so

  • there are only a couple of steps: First, I have to gather the data.

  • I have to decide on a few features that could predict if a cat or dog makes people happy.

  • Then, I'll make a survey that asks about these features, and go out in the world and

  • ask people if their pet fits these features and makes them happy.

  • It might be a little biased or imperfect, but I think it'll be juuust finnne to help

  • me make my decision.

  • Second, I have to build an AI model to predict if a specific pet makes people happy.

  • Because I'm not collecting a massive amount of data, it's helpful to use a small model

  • to prevent overfitting.

  • So I'll plan on using a neural network with just one hidden layer.

  • And for our final step, I can go through an adoption website of adorable cats and dogs,

  • put in their features, and let the AI decide which pet will make me happy.

  • No more stressing about this tough decision, the machines have my back!

  • Step 1.

  • Instead of importing a dataset this time, we've got to create our own!

  • So, browsing through some adoption websites, the most common features I saw

  • that are important to me are cuddly, soft, quiet (especially when I'm trying to sleep),

  • and energetic (because playing with an energetic pet might remind me to get up from my computer

  • a little more).

  • In the AI I'm programming, I'll use these four values to predict their answer to "does

  • your pet make you happy most of the time: yes or no?"

  • For the data collection part of this process, I gave this five-question survey of yes/no

  • questions to 30 people who own one cat or one dog.

  • I want to avoid bias based on the kind of pet, so I put everyone's answers into one

  • big list.

  • Every row is one person's response, and yes's are represented as 1 and no's as 0.

  • By representing the answers as numbers, I can use them directly as features in my model.

  • The first four questions are my input features and the last question about happiness is my

  • label.

  • And I'm not using cat or dog labels anywhere in my model.
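As a rough sketch of the encoding described above, the snippet below turns yes/no answers into 1s and 0s and separates the four feature columns from the happiness label. The three rows and the names `responses`, `encode`, `features`, and `labels` are invented for illustration; they are not the real survey data or the notebook's code.

```python
# Hypothetical survey rows, invented for illustration (not the real survey data).
# Assumed column order: cuddly, soft, quiet, energetic, happy
responses = [
    ["yes", "yes", "no", "yes", "yes"],
    ["no", "yes", "yes", "no", "no"],
    ["yes", "no", "yes", "yes", "yes"],
]


def encode(answer):
    """Map a yes/no answer to 1 or 0."""
    return 1 if answer == "yes" else 0


# Every row is one person's response, now as numbers.
rows = [[encode(answer) for answer in row] for row in responses]

features = [row[:4] for row in rows]  # first four questions: input features
labels = [row[4] for row in rows]     # last question: the happiness label

print(features)
print(labels)
```

Note that nothing in `features` says whether the pet is a cat or a dog, which matches the pooled list described in the video.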

  • I also have to split this dataset into the training set and the testing set.

  • The training set is used to train the neural network, and the testing set is kept hidden

  • from the neural network during training, so I can use it to check the network's accuracy later.
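One way to picture the split is the sketch below, which shuffles the rows and holds some of them back for testing. The 20% test fraction, the fixed seed, the helper name `split_dataset`, and the placeholder rows are all assumptions; the transcript does not say how the notebook divides the 30 responses.

```python
import random


def split_dataset(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and hold back a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    # The first n_test shuffled rows become the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


# 30 placeholder rows standing in for the 30 survey responses (invented values).
rows = [[i % 2, (i // 2) % 2, (i // 4) % 2, (i // 8) % 2, i % 2] for i in range(30)]

train_rows, test_rows = split_dataset(rows)
print(len(train_rows), len(test_rows))
```

The test rows are never shown to the model during training, so they can be used afterwards as a check.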

  • Step 2.

  • Now that I have a dataset, I need to build a neural network to help make predictions.

  • And if you did episode 5's Neural Network Lab (when I digitized John-Green-bot's handwriting),

  • this step will sound familiar because I'm using the same tools.

  • I'm going to use a multi-layer perceptron neural network or MLP.

  • As a refresher, this neural network has an input layer for features, some number of hidden

  • layers to learn representations, and a final output layer to make a prediction.

  • The hidden layers find relationships between the features that help it make accurate predictions.

  • Like in the Neural Networks Lab, we're going to import a library called SKLearn (which

  • is short for scikit-learn).

  • SKLearn includes a bunch of different machine learning algorithms, but I'll just be using

  • its Multi-Layer Perceptron algorithm.

  • You can easily change the number of hidden layers and other parts of the model, but I'll

  • start with something simple: four input features, one hidden layer, and two outputs.

  • We'll set our hidden layer to four neurons, the same size as our input.

  • SKLearn will actually take care of counting the size of my input and output automatically,

  • so I only have to specify the size of the hidden layer.
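A minimal sketch of that setup, assuming scikit-learn's `MLPClassifier` is the multi-layer perceptron being used (the transcript names the library but not the class). Only the hidden layer is specified; the input and output sizes are worked out later from the data passed to `fit`.

```python
from sklearn.neural_network import MLPClassifier

# One hidden layer with four neurons. The tuple has one entry per hidden layer,
# so (4,) means "a single hidden layer of size 4".
model = MLPClassifier(hidden_layer_sizes=(4,))

print(model.hidden_layer_sizes)
```

Adding a second hidden layer would only mean a longer tuple, for example `(4, 4)`.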

  • Over the span of one epoch of training this neural network, the hidden layer will pick

  • up on patterns in the input features, and pass a prediction to one of two output neurons:

  • yes, happiness OR no, unhappiness.

  • The code in our Colab notebook calls this an "iteration" because an iteration and

  • an epoch are the same thing in the algorithm we're using.

  • As the model loops through the data, it predicts happiness based on the features, compares

  • its guess to the actual survey results, and updates its weights and biases to give a better

  • prediction in the future.

  • And over multiple epochs of the same training dataset, the neural network's predictions

  • should keep getting better!

  • We'll just go with 1000 epochs for now.
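The training step might look roughly like the sketch below, again assuming scikit-learn's `MLPClassifier`. The training data here is invented: it lists every combination of the four yes/no features and labels them with a made-up rule, purely so the example runs. `max_iter=1000` caps training at 1000 iterations, which for scikit-learn's default solver means 1000 passes over the training data.

```python
from itertools import product

from sklearn.neural_network import MLPClassifier

# Hypothetical training data: every combination of the four yes/no features.
# Assumed column order: cuddly, soft, quiet, energetic
X_train = [list(combo) for combo in product([0, 1], repeat=4)]

# Invented labeling rule, for illustration only: "happy" if cuddly or energetic.
y_train = [1 if (cuddly or energetic) else 0
           for cuddly, soft, quiet, energetic in X_train]

# One hidden layer of four neurons, at most 1000 epochs.
# random_state is fixed only to make the sketch repeatable.
model = MLPClassifier(hidden_layer_sizes=(4,), max_iter=1000, random_state=0)
model.fit(X_train, y_train)

# n_iter_ reports how many iterations the solver actually ran.
print(model.n_iter_)
```

Training can stop before the cap if the loss stops improving, so `n_iter_` may be smaller than 1000.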

  • Now, I can test my AI on my original training data to see how well it captured that information,

  • and on the testing data I set aside.

  • The output here lets us know how good our neural network is at guessing if these pet

  • features predict owner happiness.

  • And it looks like our model got 100% correct on the testing data and 85% correct on the

  • training data!
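"Percent correct" here is plain accuracy: the share of predictions that match the survey answers. The sketch below works through that arithmetic on made-up predictions and labels; the numbers and the helper name `accuracy` are illustrative and are not the notebook's results.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Invented example: five predictions, four of which match the actual answers.
predicted = [1, 0, 1, 1, 0]
actual = [1, 0, 0, 1, 0]

print(accuracy(predicted, actual))  # 4 correct out of 5 -> 0.8
```

A single accuracy number says nothing about which kinds of examples the model gets wrong, which matters for what happens next.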

  • Well guys, thanks for tuning in, but I think this project is almost over!

  • Everything was easy to do, performance looks great.

  • I'll just put in some pet features and let it help me with this big life decision!

  • Man, AI really is awesome.

  • Step 3.

  • Let's see... here's a pet I could adopt.

  • The description says it's cuddly, soft, quiet at night, and isn't that energetic.

  • Let's put in those features and see what the model says.
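Putting in a pet's features could look like the sketch below. It assumes scikit-learn's `MLPClassifier`, an assumed feature order of cuddly, soft, quiet, energetic, and a model trained on invented data, so its answer will not necessarily match what the model in the video says.

```python
from itertools import product

from sklearn.neural_network import MLPClassifier

# Hypothetical training data and labeling rule, invented so the sketch runs.
X_train = [list(combo) for combo in product([0, 1], repeat=4)]
y_train = [1 if (cuddly or energetic) else 0
           for cuddly, soft, quiet, energetic in X_train]

model = MLPClassifier(hidden_layer_sizes=(4,), max_iter=1000, random_state=0)
model.fit(X_train, y_train)

# The pet from the description: cuddly, soft, quiet at night, not that energetic.
pet = [1, 1, 1, 0]

# predict expects a list of rows, so the single pet is wrapped in a list.
prediction = model.predict([pet])[0]
print("happy" if prediction == 1 else "not happy")
```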

  • What?

  • Why not?

  • It seemed nice...

  • But I guess that's why I programmed an AI, so I wouldn't be swayed by my FLAWED human

  • judgment!

  • Let's move on to the next one.

  • Let's see, this pet isn't cuddly, isn't soft, isn't quiet, and is really energetic...

  • but let's see what my AI says.

  • Yes?!

  • I'm not so sure that pet would've made me happy, but my AI model had 100% accuracy

  • on the testing set!

  • I think I'm gonna test a few more...

  • Ok, so I've tested a bunch of animals and something weird is happening.

  • The AI rarely told me that adopting a cat would make me happy, but it almost always

  • said a dog would make me happy.

  • Maybe everyone I surveyed hates their cats?

  • But, that seems unlikely.

  • Besides, I never even told my AI what a cat is!

  • I combined all the surveys into one big dataset without "cat" or "dog" labels!

  • And I only taught the model about whether a pet is soft, cuddly, quiet, or energetic.

  • Both cats and dogs can have all of those traits, right?

  • Is there a war between cats and AIs that I don't know about, and THAT'S why it's biased?

  • Hey, John-Green-bot...

  • Do you guys hate cats?!

  • John-Green-bot: No, Jabril.

  • We love hairy babies...

  • Jabril: Ugh, I don't understand!!!!

  • So, obviously, AI doesn't have a grudge against cats.

  • I collected the survey data and I built the AI, so if something went wrong and introduced

  • an anti-cat bias, it's on me, and I can figure out what it is.

  • So I should go back to analyze the data and my model design.

  • First, I'll look for patterns and correlations in my data by hand and make sure there's

  • nothing fishy going on.

  • This means a new step!

  • Step 4.

  • What's weird is that the model's predictions don't seem to make sense to me despite the

  • high performance.

  • Specifically, I'm noticing a bias towards dogs.

  • So there might be something strange about the data.

  • Earlier, I decided to just pool all the survey results together, but now I'll split them apart.

  • Now I can create plots that compare the percentage of dog owners I surveyed who are happy, the

  • percentage of cat owners who are happy, and the percentage of all the people who are happy

  • with their pet (no matter what kind).

  • To do this, I just need to compute the number of happy dog owners divided by the total number

  • of dog owners, the same for cat owners, and the same for everyone I surveyed.
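The calculation described above can be sketched as follows. The records, the `pet` and `happy` field names, and the helper `percent_happy` are hypothetical stand-ins; the real survey responses are not given in the transcript.

```python
# Hypothetical survey records, invented for illustration (not the real survey data).
survey = [
    {"pet": "dog", "happy": 1},
    {"pet": "dog", "happy": 0},
    {"pet": "dog", "happy": 1},
    {"pet": "dog", "happy": 0},
    {"pet": "cat", "happy": 1},
    {"pet": "cat", "happy": 1},
    {"pet": "cat", "happy": 0},
    {"pet": "cat", "happy": 1},
]


def percent_happy(records):
    """Number of happy owners divided by total owners, as a percentage."""
    return 100 * sum(r["happy"] for r in records) / len(records)


dog_owners = [r for r in survey if r["pet"] == "dog"]
cat_owners = [r for r in survey if r["pet"] == "cat"]

print("dogs:", percent_happy(dog_owners))
print("cats:", percent_happy(cat_owners))
print("all: ", percent_happy(survey))
```

Splitting by pet type is only for analysis here; the pet type still never enters the model as a feature.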

  • Interesting.

  • According to my survey results, cats make people really happy.

  • But when I put in the features for a cat, my AI usually says it won't make the owner

  • happy.

  • How can I have such good accuracy at predicting happiness and always be wrong about cats?!

  • I still don't have answers about why the data is skewed towards dogs, so I guess

  • I should look at who even filled out my survey?