
  • Hi, I'm Lawrence from the TensorFlow team, and in this video we'll continue looking at how to do text classification.

  • Continuing from what we learned in part one, we'll be classifying the IMDb dataset to build a model that can infer whether a movie review is positive or negative.

  • Up to this point, we've done the preprocessing of the data, getting it into arrays of numeric values that can then be used to train a neural network.

  • But to design a neural network that can learn from this, we have to use something called an embedding.

  • In an embedding, a word is converted into a vector in multi-dimensional space, with the theory that words of similar sentiment will point in a similar direction in that space.

  • Now you might think, Wait a second.

  • How does a word get converted into a vector?

  • What would that look like?

  • Well, let's look at a very simplified example.

  • So say you're a fan of Regency-era romances like those of Jane Austen.

  • Yeah, I know, I know.

  • Take characters from Pride and Prejudice and plot them on a 2D chart, where one axis is the gender derived from their title, and the other is an estimation of their position in society based on their title.

  • So we'll look at one of my favorites, Mr Collins.

  • Now, he's obviously male, and from his title, Mr, he's probably not nobility, so we'll plot him in blue.

  • Now, if we look at Mr Darcy and plot him in red, we'll see very similar results.

  • But what if we add Lady Catherine de Bourgh in orange?

  • We can see that she's a lady from her title, as well as nobility, and from these vectors we already have a bit of an understanding about these characters.

  • Now let's see what happens if I add a third dimension, and that is the perceived richness of these characters: how much money they have.

  • We'll now see that there's a huge difference between Mr Darcy and Mr Collins, and this gives us a rough idea of how words, when translated into a vector space, can have sentiment derived from them.

  • You can see that Mr Darcy has more in common with Lady Catherine than he does with Mr Collins, despite both of them being referred to as Mr.

  • This process is called embedding.

  • And there are a number of algorithms that can handle this for you in TensorFlow and Keras.

  • You can use an embedding layer to automatically figure out the right axes for a plot like this, and to start sorting your words into vectors like these in order to derive sentiment.
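
To make the idea concrete: an embedding layer is essentially a trainable lookup table, where each word index selects one row of a weight matrix. Here's a minimal numpy sketch; the matrix values are random stand-ins for what training would actually learn, and the word indices are made up.

```python
import numpy as np

vocab_size = 10000   # number of words in our vocabulary
embedding_dim = 16   # dimensions per word, as in the video

# In Keras this matrix is the trainable weight of the Embedding layer;
# here we just fill it with random values as a stand-in.
rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(vocab_size, embedding_dim))

# A "review" is a sequence of word indices...
review = np.array([12, 530, 7, 530])

# ...and embedding it is just a row lookup.
embedded = embedding_matrix[review]
print(embedded.shape)  # (4, 16): one 16-dimensional vector per word
```

During training, those rows get nudged so that words appearing in similar contexts end up pointing in similar directions.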

  • That's pretty cool, right?

  • So let's see it in action.

  • Here you can see how the model is built using Keras.

  • The first layer is an embedding, where we're asking it to take the 10,000 words that we have and figure out vectors for them in 16 dimensions.

  • The next line then flattens these into a one-dimensional vector, and this is fed into a dense layer with 16 nodes.

  • As there were 16 embedding dimensions, this then outputs to a one-node layer.

  • This has a sigmoid activation that squashes the result into a value between zero and one.

  • In our case, one is a good review, and zero is a bad one.
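
The layer stack just described can be sketched in Keras like this. This is a sketch rather than the exact notebook code, and `max_length` is an assumed padded review length of 256, as in the original tutorial notebook.

```python
import tensorflow as tf

vocab_size = 10000  # words in the vocabulary
max_length = 256    # assumed padded review length

model = tf.keras.Sequential([
    tf.keras.Input(shape=(max_length,), dtype="int32"),
    # Learn a 16-dimensional vector for each of the 10,000 words.
    tf.keras.layers.Embedding(vocab_size, 16),
    # Flatten the (max_length, 16) output into one long vector.
    tf.keras.layers.Flatten(),
    # Dense layer with 16 nodes, matching the 16 embedding dimensions.
    tf.keras.layers.Dense(16, activation="relu"),
    # Single output node: sigmoid squashes it to a value between 0 and 1.
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.summary()
```

Whatever comes out of the final node can then be read directly as the review's predicted sentiment.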

  • Next up, we'll compile the model, giving it an optimizer and a loss function. We'll use a standard Adam optimizer, and as we want only two values, one and zero, as the output, binary cross-entropy is a good way to calculate loss.
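
Binary cross-entropy suits this problem because the label is either 0 or 1, and it heavily penalizes confident wrong predictions. A small numpy sketch of the formula, with made-up predictions:

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    # Average of -[y*log(p) + (1-y)*log(1-p)] over all examples.
    y_pred = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([1.0, 0.0, 1.0])

# A confident, correct model scores a low loss...
print(binary_cross_entropy(y_true, np.array([0.9, 0.1, 0.8])))
# ...while a confident, wrong model scores a much higher one.
print(binary_cross_entropy(y_true, np.array([0.1, 0.9, 0.2])))
```

In Keras this is just `model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])`.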

  • A common practice, when you have a lot of data to work with, is to hold some of it back rather than testing against your test data right away.

  • So in this case we have 25,000 training records, and I'm going to take out a portion of them, say about 10,000, to use for validation.

  • In other words, we'll validate against those until we have a finished, trained model.

  • Once the model is trained and we're happy with the loss values, we can then test against the test set.

  • This helps prevent introducing bias into our model.

  • So here I can create a partial training set of values and labels, with the rest held back for validation, and I'll train the model with them.

  • In the workbook you can see the split, and that's really easy to tweak if you want.
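
The hold-out split described here looks roughly like this. The variable names follow the original tutorial notebook, and the data is simulated with zeros so the sketch is self-contained.

```python
import numpy as np

# Stand-ins for the 25,000 padded IMDb reviews and their labels.
train_data = np.zeros((25000, 256), dtype=np.int64)
train_labels = np.zeros(25000, dtype=np.int64)

# Hold out the first 10,000 examples for validation...
x_val = train_data[:10000]
y_val = train_labels[:10000]

# ...and train on the remaining 15,000.
partial_x_train = train_data[10000:]
partial_y_train = train_labels[10000:]

print(partial_x_train.shape, x_val.shape)  # (15000, 256) (10000, 256)
```

Changing the `10000` boundary is all it takes to tweak the split.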

  • We'll then train it with the model.fit call, and it's set up to train for 40 epochs.

  • It will take just a few moments, and once we've trained, we can then evaluate against the test data and labels.

  • This shows that we're getting about 87 and a half percent accuracy.
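
Under the hood, the accuracy metric for a sigmoid output just thresholds each prediction at 0.5 and compares it with the label. A numpy sketch with made-up predictions:

```python
import numpy as np

# Made-up sigmoid outputs and true labels for eight reviews.
y_pred = np.array([0.9, 0.2, 0.6, 0.4, 0.8, 0.3, 0.7, 0.1])
y_true = np.array([1,   0,   1,   1,   1,   0,   0,   0])

# Threshold at 0.5: above is a "positive review", below is "negative".
predicted_labels = (y_pred >= 0.5).astype(int)

accuracy = np.mean(predicted_labels == y_true)
print(accuracy)  # 0.75: six of the eight predictions match
```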

  • It's not bad.

  • It's not great, but it's not bad.

  • The rest of the notebook then details plotting the loss function to see if you're overfitting.

  • I'm just going to step over that for now.

  • Once it's done, I would like to demonstrate how this would look with a new review.

  • I'm going to add a code block here, and I'm going to create two new reviews.

  • The first one will be just a bunch of random words, so score-wise the review could be just about anything.

  • The second one will be filled entirely with the value 530, which happens to be the index for the word "brilliant".

  • I'll call this the biased review.

  • I'll evaluate these with model.predict, and now we can see the results.

  • The random one, which was made up of a jumble of random words, scored 0.34, which is what you'd probably expect, but the one with the review made up entirely of the word "brilliant" will, of course, be a positive review, and you can see it scores a perfect one. And that's it!
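
Building those two synthetic reviews looks something like this. This is a self-contained numpy sketch with an assumed padded length of 256; in the notebook you would then pass each array to `model.predict`.

```python
import numpy as np

max_length = 256   # assumed padded review length
vocab_size = 10000

rng = np.random.default_rng(42)

# A review made of a jumble of random word indices...
random_review = rng.integers(0, vocab_size, size=(1, max_length))

# ...and a "biased" review: the word "brilliant" (index 530) repeated.
biased_review = np.full((1, max_length), 530)

print(random_review.shape, biased_review.shape)  # (1, 256) (1, 256)
# In the notebook: model.predict(random_review), model.predict(biased_review)
```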

  • In these videos, you saw how to build a text sentiment classifier.

  • You can try all of the steps yourself in the workbook, which is linked in the description below.

  • In the next video in this series, you'll then switch gears, and you'll look at how to do regression in TensorFlow.

  • I'll see you there.

  • And whatever you do, don't forget to hit that subscribe button for more videos.

  • Thank you.



Designing a neural network | Text Classification Tutorial Pt. 2 (Coding TensorFlow)

Published by 林宜悉 on January 14, 2021