KARMEL ALLISON: Hi and welcome to Coding TensorFlow.
I'm Karmel Allison, and I'm here to guide you
through a scenario using TensorFlow's high-level APIs.
This video is the third and final part
of a three-part series.
In the first video, we looked at data
and how to prepare your data for machine learning.
We then moved on to optimizing your data for machine learning
with TensorFlow and Keras, including
building a simple model.
And here's the model we defined.
We will start with a simple sequential model, which
strings together the modular Keras
layers, hooking the output of each
into the input of the next.
Our first layer will do all of the data
transformations we just discussed,
and then we go through some standard densely connected
layers.
Our final layer here will output the class predictions
for each of the four wilderness areas
that we are interested in.
Notice here that we are just establishing the architecture
of our model; we haven't yet
hooked it up to any data.
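As a rough sketch, the model definition might look something like this. The layer sizes here are illustrative, and feature_columns stands in for the feature columns defined in the previous video:

```python
import tensorflow as tf

# Sketch of the model described above; the layer sizes are illustrative,
# and feature_columns is assumed to be the list from the previous video.
model = tf.keras.Sequential([
    # The first layer applies the feature-column data transformations.
    tf.keras.layers.DenseFeatures(feature_columns),
    # Standard densely connected layers.
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(16, activation='relu'),
    # Output layer: class predictions for the four wilderness areas.
    tf.keras.layers.Dense(4, activation='softmax'),
])
```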
Once we have the layer architecture established,
we can compile the model, which adds the optimizer, loss,
and metrics we are interested in.
TensorFlow provides a number of optimizer and loss choices,
which we could explore if we wanted to.
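Compiling might look something like this; the particular optimizer and loss here are just illustrative choices:

```python
# Attach an optimizer, a loss, and the metrics we want to track.
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```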
And finally the rubber meets the road.
We pass our data set into our model, and we train.
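In code, that is a single call; train_dataset stands in for the tf.data.Dataset built in the previous video, and the epoch count is arbitrary:

```python
# Train the compiled model directly on a tf.data.Dataset of
# (features, label) batches.
model.fit(train_dataset, epochs=10)
```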
Now, in a real-world situation with large data sets,
we would likely want to leverage hardware accelerators
like GPUs or TPUs.
And we may even want to distribute training
across multiple GPUs or nodes.
You can find out more about using distribution strategies
for this in the links included in the description below.
So for now, I will just point out that the same code will
work in distributed settings and when
eager execution is disabled.
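As a minimal sketch of what that looks like with tf.distribute, assuming a single machine with multiple GPUs (build_model is a hypothetical helper returning the Sequential model above):

```python
# Replicate the model across the available local GPUs.
strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
    # Build and compile inside the strategy scope; the training
    # call itself is unchanged.
    model = build_model()  # hypothetical helper from above
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

model.fit(train_dataset, epochs=10)
```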
In the meantime, we will just wait
for this to finish training.
Once it's done training, you can test it.
And while that's pretty straightforward,
we first need to load in our validation data.
It's important here that we use the same processing procedure
for our test data that we did for our training data.
So maybe we'll define a function that we can use in both cases
to ensure repeatability.
We call the evaluate method of our model
with the validation data, which returns the loss and accuracy
that we get on our test data.
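A sketch of that shared loading function and the evaluate call; the file names, label column, and batch size are all assumptions for illustration:

```python
def load_dataset(file_pattern, batch_size=64):
    # Shared loader so training and test data go through exactly
    # the same parsing and batching steps.
    return tf.data.experimental.make_csv_dataset(
        file_pattern,
        batch_size=batch_size,
        label_name='wilderness_area',  # assumed label column
        num_epochs=1)

test_dataset = load_dataset('covertype_test.csv')  # assumed file name

# Returns the loss and the accuracy metric we compiled with.
loss, accuracy = model.evaluate(test_dataset)
```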
Note here that because we took care of our data
transformations using feature columns,
we know that the transformation of our input validation data
will happen in the same way as it did for our training
data, which is critical to ensuring repeatable results.
So now we've validated our model on independent,
held-out data that was processed in the same way as our training
data.
And we can pretend for a minute that we
are happy with the results, and we
are ready to deploy this model.
There is a lot of tooling required
for real-world deployment, which the library TFX makes possible.
I put a link to this in the description below.
TensorFlow provides a model saving format
that works with the suite of TensorFlow products,
including TensorFlow Serving and TensorFlow.js.
The TensorFlow SavedModel includes a checkpoint with all
of the weights and variables, and it also includes the graph
that we built for training, evaluating, and predicting.
Keras now natively exports to the TensorFlow
SavedModel format for serving.
This saved model is a fully contained serialization
of your model, so you can load it back into Python later
if you want to retrain or reuse your model.
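In TF 2.x, exporting and reloading might look roughly like this; the path is an arbitrary example:

```python
# Saving a Keras model to a directory writes the SavedModel format,
# including the weights and the graphs needed for inference.
model.save('saved_models/covertype/1')

# Load it back into Python later to retrain or reuse it.
reloaded = tf.keras.models.load_model('saved_models/covertype/1')
```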
And now we've gone through all of the critical stages
to build a model for our data in TensorFlow.
And maybe we're done.
It could be that this is the perfect model, and we're happy.
But then we'd all have to go and find new jobs,
so let's assume that we want to improve the accuracy
of the model we have built.
There are lots of places we might decide to make changes.
We could go and collect more data.
We could change the way we process and parse the data.
We could change the model architecture,
add or remove layers.
We could change the optimizer or the loss.
We could try different hyperparameters, and so on.
What I will show you today is how
to use the same data and features to try out
one of TensorFlow's canned estimators, which
are built-in implementations of some more
complex models, including those that don't fit nicely
into a layer-based architecture.
So what if we wanted to shake things
up and try a different model?
To rewind, this is the model we had,
a densely connected neural network.
Let's try changing it.
Here we are using the same feature columns,
but we're configuring one of the TensorFlow canned estimators.
We are using the DNNLinearCombinedClassifier,
which is also known as the wide and deep model.
And that gives us an architecture
that looks something like this, allowing
us to trivially leverage all the research that
went into developing this model structure.
This model combines traditional linear learning
with deep learning, and so we can
feed in our categorical data directly to the linear half
and then configure a DNN with two dense layers
for the numerical data.
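Configuring that might look something like this; the split into categorical_columns and numeric_columns and the hidden-unit sizes are assumptions:

```python
# Wide-and-deep: a linear model over the categorical features combined
# with a two-layer DNN over the numerical features.
estimator = tf.estimator.DNNLinearCombinedClassifier(
    linear_feature_columns=categorical_columns,  # wide (linear) half
    dnn_feature_columns=numeric_columns,         # deep half
    dnn_hidden_units=[256, 16],                  # two dense layers
    n_classes=4)                                 # four wilderness areas
```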
We can then train the wide and deep model
just as we did with our other model.
Notice here that the canned estimator
expects an input function rather than a data set directly.
Estimators control their own sessions and graphs
so that at the time of distribution they can build
and replicate graphs as necessary.
So our input function here gives the estimator
the instructions for getting our data set
and producing the tensors that we want,
and the estimator will then call this function in its own graph
and session when necessary.
Here we wrap the same data loading function
that we used in our previous model in a lambda
with the correct file names preconfigured
so that the estimator can call this function at runtime
to get the appropriate features and labels.
We can use the same strategy to evaluate using test data.
And lo and behold, if we run this for 20 epochs,
we have another trained model that we can compare to the first.
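Putting the last few steps together, a sketch of training and evaluating the estimator with the lambda-wrapped input function (the file names are the same assumptions as before):

```python
# The estimator calls the input function in its own graph and session,
# so we hand it a zero-argument callable rather than a data set.
estimator.train(
    input_fn=lambda: load_dataset('covertype_train.csv'))

# Same strategy for evaluation on the test data.
eval_results = estimator.evaluate(
    input_fn=lambda: load_dataset('covertype_test.csv'))
print(eval_results)
```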
Note that this is just one of a number of canned estimators
that TensorFlow offers.
There are boosted trees, models for time series analysis,
RNNs, WALS, and more as well.
Note that for estimators, we first
have to tell the model what shape and type of tensor
to expect at inference time.
For that, we have to define an input receiver function.
It sounds confusing.
And I'm not going to lie.
It's a little confusing.
We want a function that builds the tensor shapes
that we expect at serving time.
Luckily, we can use this convenience function,
which just needs the shapes of the tensors we
will want to run inference on.
Because we are in eager mode, we can just
grab a row from our data set and use
that to tell the convenience function what to expect.
Here we grab the features from the first row
and don't do any additional processing
because we are assuming at inference time the data will
look the same, just without labels.
In real world scenarios, your inference data
may need additional processing, such as parsing
from a live request string.
And the input receiver function is where
you would encode that logic.
We can then use the function returned
by build_raw_serving_input_receiver_fn to generate
a saved model that can be used in TF Serving, TF Hub,
and elsewhere, just as we did with Keras.
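A sketch of that export flow: grab one batch of features in eager mode to describe the expected tensors, then export (the export path is an arbitrary example):

```python
# In eager mode we can pull one batch from the data set and use its
# features to describe the tensors expected at inference time.
features, _ = next(iter(test_dataset))

# Convenience function that builds matching placeholders for serving.
serving_input_fn = (
    tf.estimator.export.build_raw_serving_input_receiver_fn(features))

# Export a SavedModel usable from TF Serving, TF Hub, and elsewhere.
export_path = estimator.export_saved_model(
    'saved_models/wide_deep', serving_input_fn)
```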
So now we've made a full loop.
And if we had more time, we could keep going
and try out some more of these paths.
I'll leave you to explore that, and I
hope that this series has been as fun for you
as it was for me.
Remember, if you have any questions,
please leave them in the comments below,
and don't forget to hit that subscribe button.