TensorFlow Extended (TFX): Machine Learning Pipelines and Model Understanding (Google I/O'19)

  • [LOGO MUSIC]

  • KONSTANTINOS KATSIAPIS: Hello everyone.

  • My name is Gus Katsiapis.

  • And together with my colleagues Kevin Haas and Tulsee Doshi,

  • we will talk to you today about TensorFlow Extended,

  • and the topic covers two areas.

  • Let's go into that.

  • I think most of you here that have used ML in the real world

  • realize that machine learning is a lot more than just a model.

  • It's a lot more than just training a model.

  • And especially, when machine learning powers your business

  • or powers a product that actually affects your users,

  • you absolutely need reliability.

  • So, today, we'll talk about how, in the face

  • of massive data and the real world, you build

  • applications that use machine learning and are

  • robust to the world that they operate in.

  • And today's talk will actually have two parts.

  • The first part will be about TensorFlow Extended, otherwise

  • known as TFX.

  • This is an end-to-end machine learning platform.

  • And the second part of the talk will

  • talk about model understanding and how you can actually

  • get insights into your business by understanding

  • how your model performs in real-world situations.

  • OK.

  • So let's get started with the first part, TensorFlow

  • Extended, otherwise known as TFX.

  • We built TFX at Google.

  • We started building it approximately two and a half

  • years ago.

  • And a lot of the knowledge that went into building TFX actually

  • came from experience we had building other machine

  • learning platforms within Google that preceded it,

  • that preceded TensorFlow even.

  • So TFX has had a profound impact on Google,

  • and it's used throughout several Alphabet companies

  • and also in several products within Google itself.

  • And several of those products that you can see here--

  • like Gmail or Ads, et cetera, or YouTube--

  • have pretty large scale.

  • So they affect billions of users, et cetera.

  • So this is one more reason for us

  • to pay extra attention to building systems that

  • use ML reliably.

  • Now, when we started building TFX,

  • we had published a paper about it,

  • and we promised we would eventually

  • make it available to the rest of the world.

  • So over the last few years, we've

  • been open sourcing aspects of it, and several of our partners

  • externally have actually been pretty successful

  • deploying this technology, several of the libraries we

  • have offered over time.

  • So just to call out an interesting case study

  • that Twitter made, they actually made a fascinating blog post

  • where they spoke about how they ranked tweets with TensorFlow

  • and how they used TensorFlow Hub in order

  • to do transfer learning and shared word embeddings

  • and share them within their organization.

  • And they also showcased how they use TensorFlow Model Analysis

  • in order to have a better understanding of their model--

  • how their model performs not just globally

  • over the population, but several slices of their population

  • that were important to the business.

  • So we'll be talking about more of this

  • later, especially with the model understanding talk.

  • Now I think most of you here are either software developers

  • or software engineers or are very much

  • familiar with software processes and technologies.

  • So I think most of you probably recognize

  • several of the themes presented in this slide,

  • like scalability, extensibility, modularity, et cetera.

  • But, my conjecture is that most people think

  • about those concepts in terms of code and how to build software.

  • Now, with the advent of machine learning,

  • we are building applications that

  • are powered by machine learning, which

  • means that those applications are powered by data--

  • fundamentally are powered by data.

  • So if you just think about code and you don't think about data,

  • you're only thinking about half of the picture,

  • 50% of the picture.

  • So you can optimize one half amazingly.

  • But if you don't think about the other half,

  • your system cannot be better than that other half.

  • So I would actually encourage everyone just

  • to take each of those concepts and see,

  • how does this concept apply to data as opposed to just code?

  • And if you can apply these concepts to both data and code,

  • you can build a holistic application

  • that is actually robust and powers your products.

  • So we will actually go into each of those

  • individually and see how they apply to machine learning.

  • OK.

  • Let's start with scalability.

  • Most of you know that, when you start your business,

  • it might be small.

  • But the reality is that, as your business grows,

  • so might your data.

  • So, ideally, you want a solution that

  • is able to over time scale together with your business.

  • Ideally, you would be able to write a single library

  • or piece of software

  • that could operate on a laptop

  • because you want to experiment quickly,

  • but it could also operate on a beefy machine

  • with tons of processors or a ton of accelerators.

  • And you could also scale it over hundreds or thousands

  • of machines if you need to.

  • So this flexibility in terms of scale is quite important.

  • And ideally, each time you hop from kilobytes to megabytes

  • to gigabytes to terabytes, ideally

  • you wouldn't have to use different tools because you

  • have a huge learning curve each time you change your technology

  • under the covers.

  • So the ideal here is to have a machine learning platform that

  • is able to work on your laptop but can also scale

  • on any cloud you would like.
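
A minimal sketch of that laptop-to-cloud flexibility, using Apache Beam (the data-processing layer that several TFX libraries build on): the pipeline logic stays the same and only the runner options change. The project, region, and bucket values below are hypothetical placeholders.

```python
# Sketch: the same Beam pipeline code runs locally or on a cloud runner;
# only the pipeline options change. Cloud values are placeholders.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def build_and_run(options: PipelineOptions) -> None:
    with beam.Pipeline(options=options) as p:
        (p
         | "Create" >> beam.Create([1, 2, 3, 4])
         | "Square" >> beam.Map(lambda x: x * x)
         | "Print" >> beam.Map(print))

# Laptop-scale run:
build_and_run(PipelineOptions(runner="DirectRunner"))

# Cloud-scale run (same code, different options):
# build_and_run(PipelineOptions(
#     runner="DataflowRunner",
#     project="my-gcp-project",          # hypothetical
#     region="us-central1",
#     temp_location="gs://my-bucket/tmp"))
```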

  • OK.

  • Now, let's talk about extensibility.

  • So everyone here understands that you

  • can have libraries and components that make up

  • your system, and you can have things

  • that work out of the box.

  • But, you always want to customize it a little bit

  • to meet your goals.

  • You always want to put custom business logic in some part

  • of your application.

  • And this is similar for machine learning.

  • So if you think about a concrete example,

  • when you feed data into a machine learning model,

  • you need to do multiple transformations

  • to put the data in a format that the model expects.

  • So as a developer of an ML application,

  • you want to have the transformation flexibility

  • that an ML platform can provide to you--

  • whether that's bucketizing, creating vocabularies,

  • et cetera.

  • And that's just one example, but this applies pervasively

  • throughout the ML process.

  • OK.

  • Let's talk a little bit about modularity.

  • All of you probably understand the importance

  • of having nice APIs and reusable libraries that

  • allow you to build bigger and bigger systems.

  • But, going back to our original question,

  • how does this apply to artifacts produced by machine learning

  • pipelines?

  • How does this apply to data?

  • So ideally, I would be able to reuse the reusable components

  • of a model that was trained to recognize images and take

  • that part--

  • the reusable part of it--

  • and put it in my model that predicts kinds of chairs.

  • So ideally, we would be able to reuse parts of models

  • as easily as we reuse libraries.

  • So check out TensorFlow Hub, which actually allows

  • you to reuse the reusable parts of machine learning models

  • and plug them into your own infrastructure.

  • And going a step further, how does this apply to artifacts?

  • So machine learning platforms usually

  • produce lots of data artifacts, whether that's

  • statistics about your data or something else.

  • And many times, those operate in a continuous fashion.

  • So data continuously arrives into the system,

  • and you have to continuously produce

  • models that mimic reality quickly,

  • that understand reality faster.

  • And, if you have to redo the computation from scratch,

  • you sometimes cannot keep up with real time.

  • So somehow you need to be able to take artifacts that

  • are produced over time and merge them easily

  • and quickly so that you can keep up with

  • real time as new data arrives.

  • OK.

  • Moving on to stability, most of us

  • are familiar with unit tests, integration tests, regression

  • tests, performance tests.

  • All of those are about code.

  • What does it mean to write a test about data?

  • ML and data are very much intertwined.

  • What is the equivalent of a unit test or an integration test?

  • If we take a step back, when we write tests,

  • we encode expectations of what happens in code.

  • So when we deal with applications that have both

  • code and data, we need to write expectations both in terms

  • of the code-- how the code behaves--

  • and in terms of what the shape of the data is

  • that goes into this black-box process.

  • So I would say that the equivalent of a unit test

  • or an integration test for data is writing expectations

  • about types of the data that goes into your system,

  • the distributions you expect, the values that are allowed,

  • the values that are not allowed, et cetera.

  • And we'll be discussing those later.
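
A brief sketch of what such data expectations can look like in code, using TensorFlow Data Validation; the file paths and the "country" feature are hypothetical, and a real schema would be curated further by hand.

```python
# Sketch of "unit tests for data": encode expectations (types, domains,
# required presence) as a TFDV schema, then check new data against it.
import tensorflow_data_validation as tfdv
from tensorflow_metadata.proto.v0 import schema_pb2

train_stats = tfdv.generate_statistics_from_csv(data_location="train.csv")
schema = tfdv.infer_schema(statistics=train_stats)  # inferred starting point

# Tighten an expectation: only these values are allowed for "country".
tfdv.set_domain(schema, "country",
                schema_pb2.StringDomain(value=["US", "CA", "MX"]))

# Later, validate a new batch of data against those expectations.
new_stats = tfdv.generate_statistics_from_csv(data_location="new_batch.csv")
anomalies = tfdv.validate_statistics(statistics=new_stats, schema=schema)
tfdv.display_anomalies(anomalies)  # lists missing features, out-of-domain values, etc.
```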

  • If we take this a step further, code oftentimes

  • gives us very strong contracts.

  • When you have a sorting algorithm,

  • you can expect that it will do exactly

  • what the contract promises.

  • When you have a machine learning model,

  • basically you can think of it as a function that was generated

  • automatically through data.

  • So you don't have as strong contracts as we had before.

  • So in order to set expectations about how those black box

  • functions that were created by data behave,

  • we need to set expectations about what the data that

  • went into them was.

  • And this relates a lot to the previous stability

  • point we mentioned.

  • OK.

  • Moving on a little bit more from a systems view perspective,

  • ideally when you use an application,

  • you won't have to code everything yourself.

  • You would be able to reuse several components out

  • of the box that accept well-defined configuration,

  • and the configuration is ideally also

  • flexible enough for you to parameterize it a little bit

  • and customize it to your needs.

  • So reuse as many black boxes as possible, but touch them up

  • when you need to.

  • This is very similar to machine learning.

  • Ideally, I wouldn't have to rebuild everything

  • from scratch.

  • I would be able to reuse components, libraries, modules,

  • pipelines.

  • And I would be able to share them

  • with multiple folks within my business or publicly.

  • OK.

  • Now, once you have a system and it's actually

  • responsible for the performance of your product

  • or your business, you need to know what's going on.

  • So you need to be able to continuously monitor it and get

  • alerted when things are not moving as one would expect.

  • And, once again, here I would say that data

  • is a first-class citizen.

  • So unexpected changes to the data coming into your system--

  • whether that's your training system or your serving system--

  • need to be monitored because, otherwise,

  • you cannot know what the behavior of the system will be

  • especially in the absence of those strong contracts we

  • discussed earlier.

  • So data needs to be a first-class citizen,

  • and models need to be first-class citizens as well.

  • And we need to be able to monitor

  • the performance of the models as well over time.

  • Now, if we apply those concepts, we can build better software.

  • And if we apply the concepts we just

  • discussed to machine learning, we

  • can build better software that employs machine learning.

  • So this should, in principle, allow

  • us to build software that is safer.

  • And safe is a very generic word, but let

  • me try to define it a little bit here.

  • I would call it maybe a robustness

  • to environment changes.

  • So, as the data that undergirds your system changes,

  • you should be able to have a system that

  • is robust to it, that performs properly

  • in light of those changes, or at least you get notified when

  • those changes happen so that you change the system to become

  • more robust in the future.

  • And how do we do this?

  • It's actually hard.

  • I think the reality is that all of us

  • build on collective knowledge and experience.

  • When we reuse libraries, we build

  • on the shoulders of giants that built those libraries.

  • And machine learning is not any different.

  • I think the world outside is complex.

  • And if you build any system that tries to mimic it,

  • it has to mirror some of the complexities.

  • Or, if you build a system that tries to predict it,

  • it has to be able to mirror some of those complexities.

  • So what comes to the rescue here is, I would say,

  • automation and best practices.

  • So if we apply some best practices for machine learning

  • that have been proven to be useful in various other

  • circumstances, they're probably useful for your business

  • as well.

  • And many of those are difficult. We

  • oftentimes spend days or months debugging situations.

  • And then, once we are able to do that, we can encode the best

  • practices we learn into the ML software

  • so that you don't need to reinvent the wheel,

  • basically, in this area.

  • And the key here is that learning

  • from the pitfalls of others is very important to building

  • your own system.

  • OK.

  • So how do we achieve those in machine learning?

  • Let's look into a typical user journey

  • of using ML in a product and what this looks like.

  • In the beginning, you have data ingestion.

  • As we discussed, data and code are

  • tightly intertwined in machine learning.

  • And sometimes, in order to change your machine learning algorithm,

  • you have to change your data and vice versa.

  • So these need to be tightly intertwined,

  • and you need something that brings the data into the system

  • and applies the best practices we discussed earlier.

  • So it shuffles the training data so that downstream processes

  • can operate efficiently and learn faster.

  • Or, it splits your data into the training

  • and eval sets in a way that makes sure

  • that there is no leakage of information during this split.

  • For example, if you have a financial application,

  • you want to make sure that information from the future

  • does not leak into the past data you're training on,

  • just as a concrete example.

  • OK.

  • Once we are able to generate some data,

  • we need to make sure that data is of good quality.

  • And why is that?

  • The answer is very simple.

  • As we discussed, garbage in, garbage out.

  • This applies especially well to machine

  • learning because ML is this kind of black box

  • that is very complicated.

  • So if you put things in that you don't quite understand,

  • there is no way for you to be able to understand

  • the output of the model.

  • So data understanding is actually

  • required for model understanding.

  • We'll be talking about this later as well.

  • Another thing here is that, if you catch errors

  • as early as possible, that is critical for your machine

  • learning application.

  • You reduce your wasted time because now you've

  • identified the error early on where

  • it's easier to actually spot.

  • And you actually decrease the amount of computation

  • you perform.

  • I don't need to train a very expensive model

  • if the data that's going into it is garbage.

  • So this is key.

  • And I think the TL;DR here is that you should treat data

  • as you treat code.

  • They're a first-class citizen in your ML application.

  • OK.

  • Once you have your data, sometimes you

  • need to massage it in order to fit it

  • into your machine learning algorithm.

  • So oftentimes, you might need to build vocabularies

  • or normalize your features in order

  • to fit them into your neural network or your linear algorithm.

  • And, in order to do that, you often

  • need full passes over the data.

  • So the key thing is that, when you train,

  • you do these full passes over the data.

  • But when you serve, when you evaluate your model,

  • you actually have one prediction at a time.

  • So how can we create a hermetic representation

  • of a transformation that requires a full pass

  • over the data, and be able to apply that hermetic representation

  • at serving time so that my training and serving

  • ways of doing things are not different?

  • If those are different, my model would produce bogus predictions.

  • So we need to have processes that ensure those things are

  • hermetic and equivalent.

  • And ideally, you would have a system that does that for you.
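
This is the problem TensorFlow Transform is aimed at: full-pass analysis results are baked into a transform graph that serving applies identically. A minimal sketch follows, with made-up feature names.

```python
# Sketch of a tf.Transform preprocessing_fn. The full-pass statistics
# (mean/variance, vocabulary) are computed once over the training data and
# embedded in a transform graph that serving reuses, avoiding skew.
# Feature names ("price", "shoe_color") are hypothetical.
import tensorflow_transform as tft

def preprocessing_fn(inputs):
    return {
        # Needs the global mean and variance: a full pass over the data.
        "price_normalized": tft.scale_to_z_score(inputs["price"]),
        # Needs a vocabulary built over all values: also a full pass.
        "shoe_color_id": tft.compute_and_apply_vocabulary(inputs["shoe_color"]),
    }
```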

  • OK.

  • Now that we have data, now that we

  • have data that is of good quality

  • because we've validated it, and now

  • that we have transformations that

  • allow us to feed the data into the model,

  • let's train the model.

  • That's where the magic happens, or so we think,

  • when in reality everything else before it is actually needed.

  • But the chain doesn't stop here.

  • Once you produce a model, you actually

  • need to validate the quality of the model.

  • You need to make sure it passes thresholds

  • that you think are sufficient for your business

  • to operate.

  • And ideally, you would do this not just globally--

  • how does this model perform on the total population of data--

  • but also how it performs on each individual slice of the user

  • base you care about--

  • whether that's different countries

  • or different populations or different geographies

  • or whatever.

  • So, ideally, you would have a view

  • not just of how the model does holistically,

  • but of each slice you're interested in.

  • Once you've performed this validation,

  • you now have something that we think is of good quality,

  • but we want to have a separation between our training system--

  • which oftentimes operates at large scale

  • and at high throughput-- from our serving system that

  • actually operates with low latency.

  • When you try to evaluate the model,

  • sometimes you want a prediction immediately--

  • within seconds or milliseconds even.

  • So you need a separation between your training

  • part of the system and your serving part of the system,

  • and you need clear boundaries between those two.

  • So you need something that takes a model,

  • decides whether it's good or not,

  • and then pushes it into production.

  • Once your model is in production,

  • you're now able to make predictions and improve

  • your business.

  • So as you can see, machine learning

  • is not just about the middle part there.

  • It is not just about training your model.

  • Machine learning is about the end-to-end thing

  • of using it in order to improve an application

  • or improve a product.

  • So TFX has, over the course of several years,

  • open sourced several libraries and components

  • that make use of those libraries.

  • And the libraries are very, very modular,

  • going back to the previous things we discussed.

  • And they can be stitched together into your existing

  • infrastructure.

  • But, we also offer components that understand the context

  • they are in-- and Kevin will be talking about this later--

  • so they can connect to each other

  • and operate in unison as opposed to operating as isolated pieces.

  • And we also offer some horizontal layers

  • that allow you to do those connections, both in terms

  • of configuration-- like you have a single way

  • to configure your pipeline-- and in terms of data

  • storage and metadata storage.

  • So those components understand both the state

  • of the world, what exists there, and how they can

  • be connected with each other.

  • And we also offer an end-to-end system.

  • So we give you the libraries that

  • allow you to build your system if you want, build

  • your car if you will, but we also offer a car itself.

  • So we offer an end-to-end system that

  • gives you well-defined configuration,

  • simple configuration for you to use.

  • It has several components that work out

  • of the box to help you with the user journey we just discussed,

  • which as we saw has multiple phases.

  • So we have components that work for each of those phases.

  • And we also have a metadata store

  • in the bottom that allows you to track basically

  • everything that's happening in this machine learning platform.

  • And, with this, I would like to invite

  • Kevin Haas to talk a little bit more

  • about TFX and its internals.

  • KEVIN HAAS: Thanks, Gus.

  • So my name is Kevin Haas, and I'm

  • going to talk about the internals of TFX.

  • First a little bit of audience participation.

  • How many out there know Python?

  • Raise your hands.

  • All right.

  • Quite a few.

  • How many know TensorFlow?

  • Raise your hands.

  • Oh, quite a few.

  • That's really good.

  • And how many of you want to see code?

  • Raise your hands.

  • A lot less than the people who know Python.

  • OK.

  • So I'm going to get to code in about five minutes.

  • First, I want to talk a little bit about what TFX is.

  • First off, let's talk about the component.

  • The component is the basic building block

  • of all of our pipelines.

  • When you think about a pipeline, it's an assembly of components.

  • So, using this, we have ModelValidator.

  • ModelValidator is one of our components,

  • and it is responsible for taking two models--

  • the model that we just trained as part of this pipeline

  • execution and the model that runs in production.

  • We then measure the accuracy of both of these models.

  • And if the newly-trained model is

  • better than the model in production,

  • we then tell a downstream component

  • to push that model to production.

  • So what we have here is two inputs-- two models--

  • and an output-- a decision whether or not to push.

  • The challenge here is the model that's

  • in production has not been trained

  • by this execution of the pipeline.

  • It may not have been trained by this pipeline at all.

  • So we don't have a reference to this.

  • Now, an easy answer would be, oh, just hard

  • code it somewhere in your config,

  • but we want to avoid that as well.

  • So we get around this by using another project that Google

  • put together, called ML Metadata.

  • With ML Metadata, we're able to get

  • the context of all prior executions

  • and all prior artifacts.

  • This really helps us a lot in being able to say,

  • what is the current production model

  • because, we can query the ML Metadata store

  • and ask for the URI, the artifact,

  • of the production model.

  • Once we have that, then now we have both inputs

  • that are able to go to the validator to do our testing.

  • Interestingly enough, it's not just the production model

  • that we get from ML Metadata, but we get all of our inputs

  • from ML Metadata.

  • If you look here, the trainer when it emits a new model

  • writes it to ML Metadata.

  • It does not pass it directly to the next component

  • of ModelValidator.

  • When ModelValidator starts up, it gets both of the models

  • from ML Metadata, and then it writes back the validation

  • outcome back to the ML Metadata store.

  • What this does is it does two things.

  • One, it allows us to compose our pipelines a lot more loosely,

  • so we don't have tightly coupled pipelines.

  • And two, it separates the orchestration layer

  • from how we manage state and the rest of our metadata

  • management.
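
A rough sketch of that kind of lookup against the ML Metadata (MLMD) Python API; the "Model" artifact type name and the "blessed" property are assumptions about how a pipeline might record its models, not guaranteed names.

```python
# Sketch: query ML Metadata for previously recorded model artifacts, e.g. to
# find the model currently blessed for production. The type name and the
# "blessed" custom property are illustrative assumptions.
from ml_metadata.metadata_store import metadata_store
from ml_metadata.proto import metadata_store_pb2

config = metadata_store_pb2.ConnectionConfig()
config.sqlite.filename_uri = "/tmp/metadata.sqlite"  # placeholder path
config.sqlite.connection_mode = 3                    # READWRITE_OPENCREATE
store = metadata_store.MetadataStore(config)

models = store.get_artifacts_by_type("Model")
blessed = [m for m in models
           if "blessed" in m.custom_properties
           and m.custom_properties["blessed"].int_value == 1]
for m in blessed:
    print(m.uri)  # the URI a validator could read the production model from
```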

  • So a little bit more about how a component is configured.

  • So we have three main phases of our component.

  • It's a very specific design pattern

  • that we've used for all of TFX, and we

  • find it works really well with machine learning pipelines.

  • First, we have the driver.

  • The driver is responsible for additional scheduling

  • and orchestration that we may use to determine

  • how the executor behaves.

  • For example, if we're asked to train

  • a model using the very same examples as before,

  • the very same inputs as before, the very

  • same runtime parameters as before,

  • and the same version of the estimator,

  • we can kind of guess that we're going

  • to end up with the same model that we started with.

  • As a result, we may choose to skip the execution

  • and return back a cached artifact.

  • This saves us both time and compute.

  • Next, we have the executor.

  • What the executor is responsible for

  • is the business logic of the component.

  • In the case of the ModelValidator,

  • this is where we test both models and make a decision.

  • In the case of the trainer, this is

  • where we train the model using TensorFlow.

  • Going all the way back to the beginning of the pipeline,

  • in the case of ExampleGen, this is

  • where we extract the data out of either a data store

  • or out of BigQuery or out of the file system

  • in order to generate the examples that the rest

  • of the pipeline trains with.

  • Finally we have the publishing phase.

  • The publishing phase is responsible for writing back

  • whatever happened in this component

  • back to the metadata store.

  • So if the driver decided to skip work, we write that back.

  • If an executor created one or more new artifacts,

  • we also write that back to ML Metadata.

  • So artifacts and metadata management

  • are a crucial element of managing our ML pipelines.

  • If you look at this, this is a very simple task dependency

  • graph.

  • We have transform.

  • When it's done, it calls trainer.

  • It's a very simple finish-to-start dependency

  • graph.

  • So while we can model our pipelines this way,

  • we don't because task dependencies alone

  • are not the right way to model our goals.

  • Instead, we actually look at this as a data dependency

  • graph.

  • We're aware of all the artifacts that

  • are going into these various components,

  • and the components will be creating new artifacts as well.

  • In some cases, we can actually use artifacts

  • that were not created by the current pipeline run or even

  • by this pipeline configuration.

  • So what we need is some sort of system

  • that's able to both schedule and execute components but also be

  • task-aware and maintain a history of all

  • the previous executions.

  • So, what's the metadata store?

  • First off, the metadata store will keep information

  • about the trained models--

  • so, for example, the artifacts and the types of artifacts

  • that have been produced by previous components.

  • Second, we keep a list of all the components

  • and all the versions, all the inputs and all

  • the runtime parameters that were provided to these components.

  • So this gives us history of what's happened.

  • Now that we have this, we actually

  • have this bi-directional graph between all of our artifacts

  • and all of our components.

  • What this does is it gives us a lot of extra capabilities

  • where we can do advanced analytics and metrics off

  • of our system.

  • An example of this is something called lineage.

  • Lineage is where you want to know how all your data has

  • passed through the system and all of the components.

  • So, for example, what were all the models that

  • were created from a particular data set?

  • That is something we can answer with ML Metadata.

  • On the flip side, what were all the components

  • or all the artifacts that were created

  • by a very particular version of a component?

  • We can answer that question as well using ML Metadata.

  • This is very important for debugging,

  • and it's also very important for inquiries

  • when somebody says, how has this data been consumed

  • by any of your pipelines?

  • Another example where prior state helps us

  • is warm starting.

  • Warm starting is a case in TensorFlow

  • where you incrementally train a model using

  • the checkpoints in the weights of a previous version

  • of the model.

  • So here, because we know that the model has already

  • been warm started by using ML Metadata,

  • we're able to go ahead and warm start the model, saving us

  • both time and compute.

  • Here's an example of TensorFlow Model Analysis, also known

  • as TFMA.

  • This allows us to visualize how a model's

  • been performing over time.

  • This is a single model on a time series graph,

  • so we can see whether the model has

  • been improving or degrading.

  • Tulsee is going to be talking a little bit more about TFMA

  • in a bit.

  • Finally, we have the ability to reuse the components.

  • Now, I mentioned before caching is great.

  • Caching is great in production, but it's also great

  • when you're building your model the first time.

  • When you think about it, you have your pipeline,

  • you're probably working with a static data set,

  • and you're probably not changing your features that much.

  • You're probably spending most of your time on the estimator.

  • So with TFX, we more or less cache all of that part.

  • So as you're running your pipeline,

  • your critical path is on the model itself.

  • So you're probably wondering, how do I develop models in this?

  • This is where the code comes in.

  • So at the core, we use TensorFlow estimators.

  • You can build an estimator using a Keras inference graph.

  • You can build it using a canned estimator,

  • or you can go ahead and create your own

  • using the low-level ops.

  • We don't care.

  • All we need is an estimator.

  • Once we have the estimator, the next thing we do

  • is to return it from a callback function.

  • In the case of trainer, we have a trainer callback function.

  • And you'll see here the estimator

  • plus a couple more parameters are passed back

  • to the caller of the callback function.

  • This is how we call TensorFlow for the train and evaluate.

  • Finally, you add your code, pretty much

  • the file that had the callback function and the estimator,

  • into this TFX pipeline.

  • So you'll see there in the trainer, there's a module file,

  • and the module file has all the information.

  • This is what we use to call back to your function.

  • This will generate the SavedModel.
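
A hedged sketch of what such a module file can look like for an Estimator-based trainer. The callback name and the returned keys follow the pattern in TFX's published examples of that era and may differ by version; the feature columns and input function are illustrative stand-ins.

```python
# Sketch of a trainer module file: TFX calls back into this function and
# receives the estimator plus train/eval specs. Returned keys follow TFX's
# example code of the time and may differ across versions; features and the
# input function are placeholders (a real one reads the transformed examples
# referenced by trainer_fn_args).
import tensorflow as tf

def trainer_fn(trainer_fn_args, schema):
    feature_columns = [tf.feature_column.numeric_column("price")]  # hypothetical

    def input_fn():
        features = {"price": tf.constant([[10.0], [20.0]])}  # tiny in-memory stand-in
        labels = tf.constant([[0], [1]])
        return tf.data.Dataset.from_tensor_slices((features, labels)).batch(2)

    estimator = tf.estimator.DNNClassifier(
        hidden_units=[16, 8], feature_columns=feature_columns)

    return {
        "estimator": estimator,
        "train_spec": tf.estimator.TrainSpec(input_fn=input_fn, max_steps=100),
        "eval_spec": tf.estimator.EvalSpec(input_fn=input_fn),
        # TFX's evaluator additionally expects an eval receiver fn in practice.
    }
```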

  • I was also talking about data dependency graphs.

  • If you notice here on the graph, we

  • don't actually explicitly say transform runs

  • and then the trainer runs.

  • Instead, the dependency graph is implied

  • by the fact that the outputs of Transform

  • are required as inputs for the Trainer.

  • So this allows us to couple task scheduling along

  • with our metadata awareness.
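
In configuration terms, that implied edge looks roughly like the sketch below: the Trainer consumes Transform's outputs, so the orchestrator knows Transform must run first. Parameter and output key names follow recent TFX releases and have varied over time; the module files are placeholders, and example_gen is assumed from the earlier ExampleGen sketch.

```python
# Sketch: no explicit "transform, then trainer" ordering is declared; the edge
# comes from the Trainer consuming Transform's output artifacts.
from tfx.components import SchemaGen, StatisticsGen, Trainer, Transform
from tfx.proto import trainer_pb2

# example_gen is assumed from the earlier ExampleGen sketch.
statistics_gen = StatisticsGen(examples=example_gen.outputs["examples"])
schema_gen = SchemaGen(statistics=statistics_gen.outputs["statistics"])

transform = Transform(
    examples=example_gen.outputs["examples"],
    schema=schema_gen.outputs["schema"],
    module_file="preprocessing.py")            # placeholder module file

trainer = Trainer(
    module_file="trainer_module.py",            # placeholder module file
    examples=transform.outputs["transformed_examples"],
    transform_graph=transform.outputs["transform_graph"],
    schema=schema_gen.outputs["schema"],
    train_args=trainer_pb2.TrainArgs(num_steps=1000),
    eval_args=trainer_pb2.EvalArgs(num_steps=100))
```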

  • At this point, we've talked about components,

  • we've talked about the metadata store,

  • and we've talked about how to configure a pipeline.

  • So now, what I'll do is talk about how

  • to schedule and execute using some of the open source

  • orchestrators.

  • So we've modified our internal version of TFX

  • to support Airflow and Kubeflow, two popular open source

  • orchestrators.

  • We know that there are additional orchestrators, not all of which

  • are open source, so we built an interface

  • that allows additional orchestrators to be added.

  • And we'd love contributions.

  • So if somebody out there wants to go ahead and implement

  • on a third or a fourth orchestrator, please let us

  • know via GitHub, and we'll help you out.

  • Our focus is primarily on extensibility.

  • We want to be able to allow people

  • to extend ML pipelines, build new graphs as opposed

  • to the ones that we've been showing today,

  • and then also add new components,

  • because the components that you need for your particular

  • machine learning environment

  • may not all be ones that we provide.

  • So putting it all back together, this is the slide

  • that Gus showed earlier.

  • At the very beginning, we have the ExampleGen component

  • that's going to extract and generate examples.

  • Then, we go through the phases of data validation.

  • And assuming that the data is good,

  • then we go ahead and do feature engineering.

  • Provided that the feature engineering completes,

  • we train a model, validate the model,

  • and then push it to one or more systems.

  • This is our typical ML pipeline.

  • While some of the components can change--

  • the graph structure can change--

  • that's pretty much how we do it.

  • And what you see here on the right is Airflow,

  • and on the left is Kubeflow.

  • We use the very same configuration

  • for both of these implementations.

  • And the key thing that we're looking for in TFX

  • is portability.

  • We want portability where the same configuration

  • of a pipeline can move between orchestrators.

  • We also want to be portable across the on-prem,

  • local machine, and public cloud boundaries.

  • So going from my single machine up to running in Google Cloud

  • with, for example, Dataflow is really

  • a one-line change in my pipeline just to reconfigure Beam.
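
A sketch of where that shows up in the pipeline definition: the component list and metadata config stay the same, and the orchestrator runner plus beam_pipeline_args are what you swap. Names and import paths vary by TFX version; the paths, project, and bucket values are placeholders, and the components list refers to the earlier sketches.

```python
# Sketch: one pipeline definition handed to different orchestrators; scaling
# the data processing to Dataflow is a change to beam_pipeline_args.
from tfx.orchestration import metadata, pipeline
from tfx.orchestration.beam.beam_dag_runner import BeamDagRunner

tfx_pipeline = pipeline.Pipeline(
    pipeline_name="shoe_ctr_pipeline",          # hypothetical name
    pipeline_root="/tmp/tfx/pipelines",
    components=[example_gen, statistics_gen, schema_gen,
                transform, trainer],             # from the earlier sketches
    metadata_connection_config=metadata.sqlite_metadata_connection_config(
        "/tmp/tfx/metadata.sqlite"),
    beam_pipeline_args=[
        "--runner=DirectRunner",
        # Swap to Dataflow to scale out, e.g.:
        # "--runner=DataflowRunner", "--project=my-gcp-project",
        # "--region=us-central1", "--temp_location=gs://my-bucket/tmp",
    ])

BeamDagRunner().run(tfx_pipeline)
# The Airflow and Kubeflow runners accept the same pipeline definition.
```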

  • So that's it for how the internals of TFX works.

  • Next up, my colleague Tulsee will be talking

  • about model understanding.

  • TULSEE DOSHI: Awesome.

  • Thank you.

  • Hi, everyone.

  • My name is Tulsee, and I lead product for the ML Fairness

  • Effort here at Google.

  • ML Fairness, like many goals related to modeling,

  • benefits from debugging and understanding

  • model performance.

  • So, today, my goal is to walk through a brief example

  • in which we leverage the facets of TFX

  • that Gus and Kevin spoke about earlier to better understand

  • our model performance.

  • So let's imagine that you're an e-tailer,

  • and you're selling shoes online.

  • So you have a model, and this model

  • predicts click-through rates that

  • help inform how much inventory you should order.

  • A higher click-through rate implies a higher need

  • for inventory.

  • But, all of a sudden, you discover

  • that the AUC and prediction accuracy have

  • dropped on men's dress shoes.

  • Oops.

  • This means that you may have over-ordered certain shoes

  • or under-ordered certain inventory.

  • Both cases could have direct impact on your business.

  • So what went wrong?

  • As you heard in the keynote yesterday,

  • model understanding is an important part

  • of being able to understand and improve

  • possible causes of these kinds of issues.

  • And today, we want to walk you through how

  • you can leverage TFX to think more about these problems.

  • So first things first, you can check your inputs

  • with the built-in TF Data Validation component.

  • This component allows you to ask questions like,

  • are there outliers?

  • Are some features missing?

  • Is there something broken?

  • Or, is there a shift in distribution

  • in the real world that changes the way your users

  • behave and might be leading to this output?

  • For example, here's a screenshot of what

  • TensorFlow Data Validation might look like for your example.

  • Here, you can see all the features

  • that you're using in your data set--

  • for example, price or shoe size.

  • You can see how many examples these features cover.

  • You can see if any percent of them are missing.

  • You can see the mean, the standard deviation, min,

  • median, max.

  • And you can also see the distribution

  • of how those features map across your data set.

  • For example, if you look at shoe size,

  • you can see the distribution from sizes 0 to 14.

  • And you can see that sizes 3 to 7

  • seem to be a bit underrepresented.

  • So maybe we don't actually have that much data for kids' shoes.
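
The view described here can be produced with a couple of TensorFlow Data Validation calls in a notebook; the TFRecord path below is a placeholder.

```python
# Sketch: compute and visualize per-feature statistics (counts, missing %,
# mean/std-dev/min/median/max, and value distributions) in a notebook.
import tensorflow_data_validation as tfdv

stats = tfdv.generate_statistics_from_tfrecord(
    data_location="/data/shoe_examples.tfrecord")   # placeholder path
tfdv.visualize_statistics(stats)  # renders the feature table and histograms

# Comparing two datasets (e.g. training vs. serving) side by side:
# tfdv.visualize_statistics(lhs_statistics=train_stats,
#                           rhs_statistics=serving_stats,
#                           lhs_name="train", rhs_name="serving")
```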

  • You can now take this a step further--

  • so going beyond the data to actually ask

  • questions about your model and your model performance

  • using the TensorFlow Model Analysis.

  • This allows you to ask questions like,

  • how does the model perform on different slices of data?

  • How does the current model performance

  • compare to previous versions?

  • With TensorFlow Model Analysis, you

  • get to dive deep into slices.

  • One slice could be men's dress shoes.

  • Another example could be what you see here,

  • where you can actually slice over different colors of shoes.

  • The graph in this example showcases how many examples

  • have a particular feature value,

  • and the table below allows you to deep dive into the metrics

  • that you care about to understand not just how

  • your model is performing overall,

  • but actually taking that next step

  • to understand where your performance may be skewed.

  • For example, for the color brick,

  • you can actually see an accuracy of 74%.

  • Whereas, for shoes of color light gray,

  • we have an accuracy of about 79%.
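
A hedged sketch of configuring such a sliced evaluation with TensorFlow Model Analysis; the label key, feature name, and paths are illustrative, and TFMA's API surface has shifted between releases.

```python
# Sketch: evaluate a model overall and sliced by a feature, then render the
# per-slice metrics in a notebook. Names and paths are hypothetical.
import tensorflow_model_analysis as tfma

eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(label_key="clicked")],
    slicing_specs=[
        tfma.SlicingSpec(),                              # overall
        tfma.SlicingSpec(feature_keys=["shoe_color"]),   # per color
    ])

eval_result = tfma.run_model_analysis(
    eval_shared_model=tfma.default_eval_shared_model(
        eval_saved_model_path="/models/shoe_ctr/eval",   # placeholder
        eval_config=eval_config),
    eval_config=eval_config,
    data_location="/data/eval_examples.tfrecord")        # placeholder

tfma.view.render_slicing_metrics(eval_result, slicing_column="shoe_color")
```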

  • Once you find a slice that you think

  • may not be performing the way you think it should,

  • you may want to dive in a bit deeper.

  • You actually now want to start understanding

  • why this performance went off.

  • Where is the skew?

  • Here, you can extend this with the what-if tool.

  • The what-if tool allows you to understand

  • the input your model is receiving

  • and ask and answer what-if questions.

  • What if the shoe was a different color?

  • What if we were using a different feature?

  • With the what-if tool, you can select a data point

  • and actually look at the features and change them.

  • You can play with their feature values

  • to be able to see how the example might change.

  • Here, for example, we're selecting a viewed data point.

  • You can go in and change the value of the feature

  • and actually see how the classification output would

  • change as you change these values.

  • This allows you to test your assumptions

  • to understand where there might be correlations

  • that you didn't expect the model to pick up

  • and how you might go about tackling them.

  • The what-if tool is available as part of the TFX platform,

  • and it's part of the TensorBoard dashboard.

  • You can also use it as a Jupyter widget.

  • Training data, test data, and trained models

  • can be provided to the what-if tool

  • directly from the TFX Metadata store

  • that you heard about earlier.
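
A rough sketch of attaching the What-If Tool to a handful of examples and a prediction function in a notebook; the examples, features, and predict function are illustrative stand-ins rather than a real model.

```python
# Sketch: load the What-If Tool as a Jupyter widget over a list of
# tf.train.Example protos and a custom prediction function.
import tensorflow as tf
from witwidget.notebook.visualization import WitConfigBuilder, WitWidget

def make_example(price, shoe_color):
    return tf.train.Example(features=tf.train.Features(feature={
        "price": tf.train.Feature(float_list=tf.train.FloatList(value=[price])),
        "shoe_color": tf.train.Feature(bytes_list=tf.train.BytesList(
            value=[shoe_color.encode()])),
    }))

examples = [make_example(59.0, "brick"), make_example(79.0, "light gray")]

def predict_fn(example_protos):
    # Placeholder: call your real model here; returns [P(no click), P(click)].
    return [[0.7, 0.3] for _ in example_protos]

config = (WitConfigBuilder(examples)
          .set_custom_predict_fn(predict_fn)
          .set_label_vocab(["no_click", "click"]))
WitWidget(config, height=600)
```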

  • But, the interesting thing here is

  • CTR is really just the model's proxy objective.

  • You want to really understand CTR

  • so that you can think about the supply you should buy,

  • your inventory.

  • And so, your actual business objectives

  • depend on something much larger.

  • They depend on revenue.

  • They depend on cost.

  • They depend on your supply.

  • So you don't just want to understand

  • when your CTR is wrong.

  • You actually want to understand how getting this wrong

  • could actually affect your broader business--

  • this misprediction cost.

  • So in order to figure this out, you

  • decide you want to join your model predictions

  • with the rest of your business data

  • to understand that larger impact.

  • This is where some of the component functionality of TFX

  • comes in.

  • You could actually create a new component

  • with a custom executor.

  • You can customize this executor such

  • that you can actually join your model predictions

  • with your business data.

  • Let's break that down a bit.

  • So you have your trainer.

  • You train a model.

  • And then, you run the evaluator to be able to get the results.

  • You can then leverage your custom component

  • to join in business data and export every prediction

  • to a SQL database where you can actually quantify this cost.
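
The join itself is ordinary business logic; a custom executor would wrap something like the sketch below, which joins per-example predictions with order data and writes a misprediction cost to SQL. The table names, columns, and cost formula are all hypothetical.

```python
# Sketch of the business logic a custom executor could wrap: join model
# predictions with business data and quantify misprediction cost in SQL.
import sqlite3

def export_costs(predictions, db_path="business.db"):
    """predictions: iterable of (product_id, predicted_ctr, actual_ctr) rows."""
    conn = sqlite3.connect(db_path)
    conn.execute("""CREATE TABLE IF NOT EXISTS prediction_costs (
                        product_id TEXT, predicted_ctr REAL, actual_ctr REAL,
                        unit_cost REAL, misprediction_cost REAL)""")
    for product_id, predicted, actual in predictions:
        # Assumes an existing "inventory" table keyed by product_id.
        row = conn.execute(
            "SELECT unit_cost FROM inventory WHERE product_id = ?",
            (product_id,)).fetchone()
        unit_cost = row[0] if row else 0.0
        # Hypothetical cost model: CTR error scaled by the unit cost.
        cost = abs(predicted - actual) * unit_cost
        conn.execute("INSERT INTO prediction_costs VALUES (?, ?, ?, ?, ?)",
                     (product_id, predicted, actual, unit_cost, cost))
    conn.commit()
    conn.close()
```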

  • You can then take this a step further.

  • With the what-if tool, we talked a little bit

  • about the importance of assumptions, of testing what

  • you think might be going wrong.

  • You can take this farther with the idea

  • of understandable baselines.

  • Usually, there are a few assumptions

  • we make about the ways we believe that our models should

  • be performing.

  • For example, I might believe that weekends and holidays

  • are likely to have a higher CTR with my shoes than weekdays.

  • Or, I may have the hypothesis that certain geographic regions

  • are more likely to click on certain types of shoes

  • than others.

  • These assumptions are rules that I have about how

  • my models should perform.

  • So I could actually create a very simple rule-based model

  • that could encode these prior beliefs.

  • This baseline expresses my priors,

  • so I can use them to tell when my model might

  • be uncertain or wrong, or to dive in

  • deeper if my expectations were in fact wrong.

  • They can help inform when the deep model is overgeneralizing

  • or when maybe my expectations are overgeneralizing.

  • So here again, custom components can help you.

  • You can actually build a baseline trainer

  • with these simple rules whose evaluator also

  • exports to your SQL database.

  • Basically, you stamp each example

  • with its baseline prediction.

  • So now, you can run queries not just over the model predictions

  • and your business data, but also over your baseline assumptions

  • to understand where your expectations are violated.
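
Such a baseline can be as simple as a hand-written rule; the sketch below encodes the weekend/holiday prior mentioned above so its prediction can be stamped onto each example alongside the learned model's. The CTR numbers and holiday list are illustrative.

```python
# Sketch of a rule-based baseline encoding a prior belief: weekends and
# holidays get a higher expected CTR than weekdays. Values are not tuned.
import datetime

HOLIDAYS = {datetime.date(2019, 12, 25), datetime.date(2019, 7, 4)}  # hypothetical

def baseline_ctr(date: datetime.date) -> float:
    if date in HOLIDAYS or date.weekday() >= 5:  # Saturday/Sunday
        return 0.08   # assumed "high CTR" prior
    return 0.05       # assumed weekday prior

# Stamp each example with the baseline prediction before exporting to SQL,
# so queries can compare model output, baseline, and business outcomes.
print(baseline_ctr(datetime.date(2019, 5, 11)))  # a Saturday -> 0.08
```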

  • Understandable baselines are one way

  • of understanding your models, and you

  • can extend even beyond this by building custom components

  • that leverage new and amazing research techniques

  • in model understanding.

  • These techniques include [INAUDIBLE],

  • which Sundar touched on yesterday

  • in the keynote, but also path-integrated gradients,

  • for example.

  • Hopefully, this gave you one example of many ways

  • that we hope that the extensibility of TFX

  • will allow you to go deeper and extend the functionality

  • for your own business goals.

  • Overall, with TensorFlow Extended you can leverage many

  • out-of-the-box components for your production model needs.

  • This includes things like TensorFlow Data Validation

  • and TensorFlow Model Analysis.

  • TFX also provides flexible orchestration and metadata,

  • and building on top of the out-of-the-box components can

  • help you further your understanding of the model such

  • that you can truly achieve your business goals.

  • We're excited to continue to expand TensorFlow

  • Extended for more of your use cases and see how you extend

  • and expand it as well.

  • We're excited to continue to grow the TFX

  • community with you.

  • And come to our office hours tomorrow

  • if you're still around at I/O to talk to us

  • and learn more, and for us to be able to learn

  • from you about your needs and use cases as well.

  • Thank you so much for joining us today,

  • and we hope this was helpful.

  • [APPLAUSE]

  • [MUSIC PLAYING]
