TensorFlow Extended (TFX) Post-training Workflow (TF Dev Summit '19)

  • [MUSIC PLAYING]

  • CHRISTINA GREER: Hi.

  • My name is Christina and I'm a software engineer

  • on the Google Brain team.

  • I'm here today to tell you about some tools

  • that my team and I have built to help

  • make the end-to-end lifecycle of the machine learning pipeline

  • easier.

  • I'm going to start by talking about model analysis

  • and validation.

  • These are two different components in TFX,

  • but they are very similar in how they're actually executed.

  • The main difference is how you as an end user will use them.

  • I'm going to start by talking about the evaluator.

  • So why is model evaluation important?

  • Well, for one thing, we have gathered data.

  • We've cleaned that data.

  • We've trained a model.

  • But we really want to make sure that model works.

  • And so, model evaluation can help

  • you assess the overall quality of your model.

  • You also may want to analyze how your model is performing

  • on specific slices of the data.

  • So in this case, with the Chicago taxi example

  • that Clemens started this off with,

  • why are my tip predictions sometimes wrong?

  • Slicing the data and looking at where you're doing poorly

  • can be a real benefit, because it identifies some low hanging

  • fruit where you can get gains in accuracy by adding more data

  • or making some other changes to make some of these segments

  • improve.

  • You also want to track your performance over time.

  • You're going to be continuously training models and updating

  • them with fresh data, so that your models don't get stale.

  • And you want to make sure that your metrics are improving

  • over time and not regressing.

  • And model evaluation can help you with all of this.

  • The component of TFX that supports this

  • is called the evaluator.

  • And it is based on a library called TensorFlow Model

  • Analysis.

  • From the pipeline perspective, you

  • have inputs, which is your eval set that

  • was generated by your ExampleGen.

  • You have the trainer outputting a saved model.

  • You also need to specify the slices of your data

  • that you find most interesting, so that the evaluator can

  • precompute metrics for these slices of data.

  • Your data then goes into the evaluator.

  • And a process is run to generate metrics

  • for the overall slice and the slices that you have specified.

  • The output of the evaluator is evaluation metrics.

  • This is a structured data format that has your data,

  • the slices you specified, and the metrics that correspond

  • to each one of these slices.
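
As a concrete illustration, here is a minimal sketch of how the evaluator might be wired into a TFX pipeline in Python. It assumes the modern TFX/TFMA APIs, hypothetical handles `example_gen` and `trainer` for the upstream components, and illustrative feature names from the Chicago taxi example; exact names vary across TFX releases.

```python
import tensorflow_model_analysis as tfma
from tfx.components import Evaluator

# Ask for metrics on the overall dataset plus one slice per trip start hour
# (label and feature names here are illustrative, based on the taxi example).
eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(label_key='tips')],
    slicing_specs=[
        tfma.SlicingSpec(),                                  # the overall slice
        tfma.SlicingSpec(feature_keys=['trip_start_hour']),  # slice by hour of day
    ],
)

evaluator = Evaluator(
    examples=example_gen.outputs['examples'],  # eval split produced by ExampleGen
    model=trainer.outputs['model'],            # SavedModel produced by the Trainer
    eval_config=eval_config,
)
```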

  • The TensorFlow Model Analysis library

  • also has a visualization tool that

  • allows you to load up these metrics

  • and dig around in your data in a user friendly way.
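
In a notebook, loading and exploring those metrics might look roughly like this; a sketch assuming the TFMA notebook extensions are installed, that `output_path` points at the evaluator's output, and using the illustrative `trip_start_hour` slicing feature from above.

```python
import tensorflow_model_analysis as tfma

# Load the metrics the evaluator wrote out, then browse them interactively,
# broken down by the hour-of-day slice.
eval_result = tfma.load_eval_result(output_path)
tfma.view.render_slicing_metrics(eval_result, slicing_column='trip_start_hour')
```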

  • So going back to our Chicago taxi example,

  • you can see how the model evaluator can help you look

  • at your top line objective.

  • How well can you predict trips that result in large tips?

  • The TFMA visualization shows the overall slice of data here.

  • The numbers are probably small, but accuracy is 94.7%.

  • That's pretty good.

  • You'd get an A for that.

  • But maybe you want to say 95%.

  • 95% accuracy sounds a lot better than 94.7%.

  • So maybe you want to bump that up a bit.

  • So then you can dig into why your tip predictions are

  • sometimes wrong.

  • We have sliced the data here by the hour of day

  • that the trip starts on.

  • And we've sorted by poor performance.

  • When I look at this data, I see that trips

  • that start at around 2:00 or 3:00 AM

  • are performing quite poorly.

  • Because of the statistics generation tool

  • that Clemens talked about, I do know

  • that the data is sparse here.

  • But if I didn't know that, perhaps I would think,

  • maybe there's something that people

  • that get taxis at 2:00 or 3:00 in the morning

  • might have in common that causes erratic tipping behavior.

  • Someone smarter than me is going to have to figure that one out.

  • You also want to know if you can get better

  • at predicting trips over time.

  • So you are continuously training these models on new data,

  • and you're hoping that you get better.

  • So the TensorFlow Model Analysis tool

  • that powers the evaluator and TFX

  • can show you the trends of your metrics over time.

  • And so here you see three different models

  • and the performance of each, with accuracy and AUC.
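
A sketch of what that might look like with the TFMA notebook tooling, assuming `output_paths` lists the eval output directories of several pipeline runs; exact function names vary a bit between TFMA releases.

```python
import tensorflow_model_analysis as tfma

# Load results from multiple runs and render the metrics as a time series.
eval_results = tfma.load_eval_results(
    output_paths, mode=tfma.constants.MODEL_CENTRIC_MODE)
tfma.view.render_time_series(eval_results)
```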

  • Now I'm going to move on to talking

  • about the ModelValidator component.

  • With the evaluator, you were an active user.

  • You generated the metrics.

  • You loaded them up in the UI.

  • You dug around in your data.

  • You looked for issues that you could

  • fix to improve your model.

  • But eventually, you're going to iterate.

  • Your data is going to get better.

  • Your model's going to improve.

  • And you're going to be ready to launch.

  • You're also going to have a pipeline continuously

  • feeding new data into this model.

  • And every time you generate a new model with new data,

  • you don't want to have to do a manual process of pushing

  • this to a server somewhere.

  • The ModelValidator component of TFX

  • acts as a gate that keeps you from pushing

  • bad versions of your model, while allowing you to automate

  • pushing of quality models.

  • So why is model validation important?

  • We really want to avoid pushing models with degraded quality,

  • specifically in an automated fashion.

  • If you train a model with new data

  • and the performance drops, but say

  • it increases in certain segments of the data

  • that you really care about, maybe you

  • make the judgment call that this is an improvement overall.

  • So we'll launch it.

  • But you don't want to do this automatically.

  • You want to have some say before you do this.

  • So this acts as your gatekeeper.

  • You also want to avoid breaking downstream components.

  • If your model suddenly started outputting something

  • that your server binary couldn't handle,

  • you'd want to know that also before you push.

  • The TFX component that supports this

  • is called the ModelValidator.

  • It takes very similar inputs and outputs to the model evaluator.

  • And the libraries that compute the metrics

  • are pretty much the same underneath the hood.

  • However, instead of one model, you provide two--

  • the new model that you're trying to evaluate and the last good

  • evaluated model.

  • It then runs on your eval split data

  • and compares the metrics on the same data between the two

  • models.

  • If your metrics have stayed the same or improved,

  • then you go ahead and bless the model.

  • If the metrics that you care about have degraded,

  • you will not bless the model.

  • You get some information about which metrics failed,

  • so that you can do some further analysis.

  • The outcome of this is a validation outcome.

  • It just says blessed if everything went right.
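
A minimal sketch of how this looked as a pipeline component around the time of this talk (in newer TFX releases the same check has been folded into the Evaluator itself, via a baseline model); `example_gen` and `trainer` are the same hypothetical handles as before.

```python
from tfx.components import ModelValidator

# Compares the freshly trained model against the last blessed model
# on the eval split, and emits a blessing only if the metrics hold up.
model_validator = ModelValidator(
    examples=example_gen.outputs['examples'],
    model=trainer.outputs['model'],
)
```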

  • Another thing to note about the ModelValidator

  • is that it allows you to do next day eval of your previously

  • pushed model.

  • So maybe the last model that you blessed

  • was trained on older data.

  • The ModelValidator evaluates it on the new data as well.

  • And finally, I'm going to talk about the pusher.

  • The pusher is probably the simplest component

  • in the entire TFX pipeline.

  • But it does serve quite a useful purpose.

  • It has one input, which is that blessing that you

  • got from the ModelValidator.

  • And then, if you passed your validation,

  • the pusher will copy your SavedModel into a file system

  • destination that you've specified.

  • And now you're ready to serve your model

  • and make it useful to the world at large.
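
Sketched as a pipeline component, under the same assumptions about the upstream handles; `serving_model_dir` is an illustrative destination path.

```python
from tfx.components import Pusher
from tfx.proto import pusher_pb2

serving_model_dir = '/path/to/serving_model_dir'  # illustrative destination

# Copies the SavedModel to the serving directory, but only if the
# ModelValidator produced a blessing.
pusher = Pusher(
    model=trainer.outputs['model'],
    model_blessing=model_validator.outputs['blessing'],
    push_destination=pusher_pb2.PushDestination(
        filesystem=pusher_pb2.PushDestination.Filesystem(
            base_directory=serving_model_dir)),
)
```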

  • I'm going to talk about model deployment next.

  • So this is where we are.

  • We have a trained SavedModel.

  • A SavedModel is a universal serialization format

  • for TensorFlow models.

  • It contains your graph, your learned variable weights,

  • your assets like embeddings and vocabs.

  • But to you, this is just an implementation detail.
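
For example, exporting and reloading a SavedModel with the TF 2.x API might look like this; the numeric subdirectory is the version convention TensorFlow Serving expects, and the model and paths are illustrative stand-ins.

```python
import tensorflow as tf

# A trivial stand-in model; in the pipeline, this is whatever the Trainer produced.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(3,))])

# Export: the graph, trained weights, and assets (vocabularies, embeddings)
# are all bundled into one SavedModel directory.
tf.saved_model.save(model, '/tmp/chicago_taxi_model/1')

# The same directory can be reloaded later, or pointed at by a model server.
reloaded = tf.saved_model.load('/tmp/chicago_taxi_model/1')
```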

  • Where you really want to be is you have an API.

  • You have a server that you can query

  • to get answers in real time or provide

  • those answers to your users.

  • We provide several deployment options.

  • And many of them are going to be discussed

  • at other talks in the session.

  • TensorFlow.js is optimized for serving in the browser

  • or on Node.js.

  • TensorFlow Lite is optimized for mobile devices.

  • We already heard a talk about how Google Assistant is using

  • TensorFlow Lite to support model inference on their Google Home

  • devices.

  • TensorFlow Hub is something new.

  • And Andre is going to come on in about five minutes

  • and tell you about that, so I'm not going to step on his toes.

  • I'm going to talk about TensorFlow Serving.

  • So if you want to put up a REST API that

  • serves answers for your model, you

  • would want to use TensorFlow Serving.

  • And why would you want to use this?

  • For one thing, TensorFlow Serving

  • has a lot of flexibility.

  • It supports multi-tenancy.

  • You can run multiple models on a single server instance.

  • You can also run multiple versions of the same model.

  • This can be really useful when you're

  • trying to canary a new model.

  • Say you have a tried and tested version of your model.

  • You've created a new one.

  • It's passed your evaluator.

  • It's passed your validation.

  • But you still want to do some A/B testing with real users

  • before you completely switch over.

  • TensorFlow Serving supports this.

  • We also support optimization with GPU and TensorRT.

  • And you can expose a gRPC or a REST API.

  • TensorFlow Serving is also optimized for high performance.

  • It provides low latency, request batching--

  • so that you can optimize your throughput

  • while still respecting latency requirements--

  • and traffic isolation.

  • So if you are serving multiple models on a single server,

  • a traffic spike in one of those models

  • won't affect the serving of the other.

  • And finally, TensorFlow Serving is production-ready.

  • This is what we used to serve many

  • of our models inside of Google.

  • We've served millions of QPS with it.

  • You can scale in minutes, particularly

  • if you use the Docker image and scale up on Kubernetes.

  • And we support dynamic version refresh.

  • So you can specify a version refresh policy

  • to either take the latest version of your model,

  • or you can pin to a specific version.

  • This can be really useful for rollbacks

  • if you find a problem with the latest version

  • after you've already pushed.

  • I'm going to go into a little bit more detail

  • about how you might deploy a REST API for your model.

  • We have two different options for doing this presented here.

  • The first option, the top command, uses Docker,

  • which we really recommend.

  • It requires a little bit of ramp up at the beginning,

  • but you will really save time in the long run

  • by not having to manage your environment

  • and not having to manage your own dependencies.

  • You can also run locally on your own host,

  • but then you do have to manage the environment and dependencies yourself long term.

  • I'm going to go into a little bit more detail on the Docker

  • run command.

  • So you start with Docker run.

  • You choose a port that you want to bind your API to.

  • You provide the path to the saved model that

  • was generated by your trainer.

  • Hopefully, it was pushed by the pusher.

  • You provide the model name.

  • And you tell Docker to run the TensorFlow Serving binary.
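
The shape of that command, written out as a Python sketch: paths and the model name are placeholders, `tensorflow/serving` is the public Docker image, and 8501 is its default REST port.

```python
import subprocess
import time

import requests

# Start TensorFlow Serving in Docker: publish the REST port, mount the
# SavedModel directory into the container, and give the model a name.
subprocess.Popen([
    'docker', 'run', '-p', '8501:8501',
    '--mount', 'type=bind,source=/path/to/serving_model_dir,target=/models/chicago_taxi',
    '-e', 'MODEL_NAME=chicago_taxi',
    '-t', 'tensorflow/serving',
])
time.sleep(10)  # give the server a moment to load the model

# Once the server is up, predictions are one REST call away.
example_features = {'trip_start_hour': 2}  # illustrative; must match the model's inputs
response = requests.post(
    'http://localhost:8501/v1/models/chicago_taxi:predict',
    json={'instances': [example_features]},
)
print(response.json())
```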

  • Another advantage of using Docker

  • is that you can easily enable hardware acceleration.

  • If you're running on a host with a GPU

  • and the Nvidia Docker image installed,

  • you can modify this command line by a few tokens,

  • and then be running on accelerated hardware.
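
The change being described might look roughly like this, assuming a host with the NVIDIA container toolkit installed; the exact flags depend on your Docker and nvidia-docker versions.

```python
# Same launch as before, with two changes: expose the host GPUs to the
# container, and switch to the GPU build of the serving image.
gpu_serving_command = [
    'docker', 'run', '--gpus', 'all',
    '-p', '8501:8501',
    '--mount', 'type=bind,source=/path/to/serving_model_dir,target=/models/chicago_taxi',
    '-e', 'MODEL_NAME=chicago_taxi',
    '-t', 'tensorflow/serving:latest-gpu',
]
```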

  • If you need even further optimization,

  • we now support optimizing your model

  • for serving using TensorRT.

  • TensorRT is a platform from NVIDIA

  • for optimized deep learning inference.

  • Your Chicago taxi example that we've been using here

  • probably wouldn't benefit from this.

  • But if you had, say, an image recognition model, a ResNet,

  • you could really get some performance boosting

  • and cost savings by using TensorRT.

  • We provide a command line that allows

  • you to convert the saved model into a TensorRT

  • optimized model.
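
One way to do that conversion in Python, assuming TensorFlow was built with TensorRT support; the talk demonstrated a command-line tool, and this is the equivalent TF 2.x converter API with illustrative paths.

```python
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Rewrite the SavedModel so that supported subgraphs run through TensorRT.
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir='/path/to/serving_model_dir/1')
converter.convert()
converter.save('/path/to/tensorrt_model_dir/1')
```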

  • So then again, a very simple change to that original command

  • line.

  • And you're running on accelerated GPU hardware

  • with TensorRT optimization.

  • So to put it all together again, we

  • introduced TensorFlow Extended or TFX.

  • We showed you how the different components that TFX consists of

  • can work together to help you manage

  • the end-to-end lifecycle of your machine learning pipeline.

  • First, you have your data.

  • And we have tools to help you make sense of that

  • and process it and prepare it for training.

  • We then support training your model.

  • And after you train your model, we

  • provide tools that allow you to make sense

  • of what you're seeing, of what your model's doing,

  • and to make improvements.

  • Also, to make sure that you don't regress.

  • Then we have the pusher that allows

  • you to push to various deployment options

  • and make your model available to serve users in the real world.

  • To get started with TensorFlow Extended,

  • please visit us on GitHub.

  • There is also more documentation at TensorFlow.org/tfx.

  • And some of my teammates are running a workshop tomorrow.

  • And they'd love to see you there.

  • You don't need to bring a laptop.

  • We have machines that are set up and ready to go.

  • And you can get some hands-on experience

  • using TensorFlow Extended.

  • [MUSIC PLAYING]
