
  • Hi, I'm Robert Crowe,

  • and today I'm going to be talking about TensorFlow Extended, also known as TFX,

  • and how it helps you put your amazing machine learning models

  • into production.

  • This is the final episode of our five-part series

  • on real-world machine learning in production.

  • We've covered a lot so far in episodes 1 through 4,

  • so if you haven't seen those yet, I'd really recommend watching them.

  • In today's episode, we'll be looking at an example

  • of how model understanding is critical for meeting your business goals.

  • Let's get started.

  • ♪ (music) ♪

  • We've talked about how TFX and TensorFlow Model Analysis

  • let you do deep analysis of your model's performance.

  • Let's look at why that's important.

  • In this example we have an online retailer who is selling shoes.

  • They're using a model to predict click-through rates

  • and using those predictions to decide how much inventory to order

  • for each of their products.

  • Everything seems to be working great,

  • when suddenly, they discover that their model's AUC

  • and prediction accuracy for a particular part of their product line,

  • men's dress shoes,

  • have started getting much worse than they were before.

  • Now, how much inventory should they order for men's dress shoes?

  • If these are high-end dress shoes,

  • the cost could be a significant part of their business.

  • That's why doing deep analysis of your model's performance,

  • not just once, but on an ongoing basis, is critical for your business.

  • TFX creates pipelines that enable that kind of ongoing deep analysis.

  • Remember that it's not just overall model performance.

  • Mispredictions on different parts of your data

  • do not have the same uniform cost to your business.

  • The data that you have is almost never the data that you wish you had,

  • and your model's objectives, things like AUC,

  • are really just proxies for your actual business objectives,

  • things like knowing how much inventory to order.

  • Finally, the real world doesn't stand still,

  • so your data and business conditions are constantly changing.

  • That's why you need to continue to monitor and analyze

  • how your model reacts to changes.

  • One way to look at this is to think about a triangle

  • which we call the ML Insights Triangle.

  • We found that usually when there's a problem

  • with your model's performance for your business,

  • it's because an assumption was violated.

  • The question is, which one?

  • So, what are these assumptions?

  • First, has something about the realities of our business changed?

  • Maybe we have a new supplier or a new product has been released.

  • Maybe our customers' behavior has changed.

  • All of these can affect our business,

  • and how well our models perform for our business.

  • Have we started getting bad data?

  • Maybe a sensor has gone bad,

  • or a service endpoint started getting flaky,

  • or maybe a software update has broken something,

  • or maybe the feature set that we've been using

  • isn't working for the current business conditions,

  • or maybe the problem really is with our model.

  • Maybe we need to change the architecture

  • or create an ensemble with a rules-based system

  • or just retune the hyperparameters.

  • When things go wrong, you need to start investigating

  • to look for potential problems.

  • The place to start is always with your data

  • because if your data isn't right, nothing will be right.

  • Fortunately, TFX builds tools and processes

  • for investigating your data right into your pipelines

  • with the StatisticsGen, SchemaGen and ExampleValidator components

  • and the tools provided by TensorFlow Data Validation.

  • You should look for outliers and missing values in your data

  • and also look for changes in the distributions

  • for each of your features.

  • For example, seasonality and trend can affect your data over time,

  • and if you don't look for it, you might not be aware of it.
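As a rough illustration of the checks just described, here is a minimal sketch in plain Python. It does not use the TFDV API; the `prices` feature, the function names, and the 1.5 × IQR outlier rule are all assumptions chosen for the example.

```python
import statistics

def missing_rate(values):
    """Fraction of examples where the feature is missing (None)."""
    return sum(v is None for v in values) / len(values)

def iqr_outliers(values, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR], ignoring missing entries."""
    present = sorted(v for v in values if v is not None)
    q1, _, q3 = statistics.quantiles(present, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in present if v < low or v > high]

# Hypothetical "price" feature for one product line, with two missing values
# and one suspicious entry.
prices = [89.0, 95.0, None, 110.0, 99.0, 105.0, 2500.0, None, 92.0, 101.0]

rate = missing_rate(prices)
outliers = iqr_outliers(prices)
print(f"missing rate: {rate:.0%}, outliers: {outliers}")
```

Running the same checks on each new batch of data, and comparing the results with earlier batches, is one simple way to notice when a feature starts to misbehave.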

  • TensorFlow Data Validation, or TFDV,

  • provides visualization tools like this for investigating your data

  • and making comparisons between the data you're seeing now

  • and the data you were seeing last week or last month.

  • These are really valuable when you are trying to dive into your data.

  • You could also look for particular combinations of features,

  • regions of your loss surface where your data may be sparse.

  • Coverage of your feature space is important for model performance

  • and it will change over time as your data changes.

  • In regions where coverage is sparse,

  • you may need to focus on collecting examples

  • to fill in those spaces,

  • which might require creating new features or eliminating features

  • that aren't providing good predictive information.

  • This can often be a result of changes in your business conditions.

  • Perhaps a new shoe came on the market

  • or someone bought TV media that shifted CTRs from one brand to another.

  • Change is a constant in business and in life

  • and it's a constant for your data too.
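To make the idea of comparing this week's data with last month's concrete, here is a small sketch in plain Python, not the TFDV API. It measures how far a categorical feature's distribution has moved using the largest change in any category's share (an L-infinity distance). The `brand` values and the threshold are hypothetical.

```python
from collections import Counter

def category_shares(values):
    """Return each category's share of the examples."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}

def l_infinity_distance(baseline, current):
    """Largest absolute change in share for any single category."""
    base = category_shares(baseline)
    cur = category_shares(current)
    categories = set(base) | set(cur)
    return max(abs(base.get(c, 0.0) - cur.get(c, 0.0)) for c in categories)

# Hypothetical "brand" feature: clicks shift from one brand to another,
# for example after a TV campaign.
last_month = ["acme"] * 50 + ["zenith"] * 30 + ["orbit"] * 20
this_week = ["acme"] * 20 + ["zenith"] * 60 + ["orbit"] * 20

DRIFT_THRESHOLD = 0.1  # assumed value; tune it for your own data
distance = l_infinity_distance(last_month, this_week)
drifted = distance > DRIFT_THRESHOLD
print(f"distance={distance:.2f}, drifted={drifted}")
```

A single number per feature like this is easy to track over time and to alert on, though it says nothing about why the distribution moved.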

  • Another way to investigate the problem

  • is to really dig into your model's performance.

  • Fortunately, TFX builds tools and processes

  • for doing deep analysis of your model's performance

  • right into your pipelines with the Evaluator component

  • and the tools provided by TensorFlow Model Analysis or TFMA.

  • It's really important to look at not just the top level metrics for your model,

  • but your model performance on individual slices of your data.

  • What slices make sense?

  • Try to think about combinations of features,

  • regions of your loss surface that define different parts of your data.

  • Look at edge cases and corner cases.

  • Look at important subsets of your data

  • or critical, but rare situations.

  • There is an art to understanding your data and how it reflects your business

  • and TFMA gives you tools like this

  • to explore and evolve your understanding of it.
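The point about slices can be shown with a toy example. The sketch below is plain Python rather than TFMA; the `category`, `clicked`, and `score` fields and the data are invented for illustration. It computes AUC overall and per slice, and shows how a healthy top-level number can hide a slice that is doing no better than chance.

```python
from collections import defaultdict

def auc(labels, scores):
    """AUC as the chance a random positive outscores a random negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        return None  # AUC is undefined without both classes
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def auc_by_slice(examples, slice_key):
    """Group examples by one feature and compute AUC within each group."""
    groups = defaultdict(list)
    for ex in examples:
        groups[ex[slice_key]].append(ex)
    return {
        name: auc([e["clicked"] for e in rows], [e["score"] for e in rows])
        for name, rows in groups.items()
    }

# Hypothetical click-through predictions for two product categories.
examples = [
    {"category": "sneakers", "clicked": 1, "score": 0.9},
    {"category": "sneakers", "clicked": 0, "score": 0.2},
    {"category": "sneakers", "clicked": 1, "score": 0.8},
    {"category": "sneakers", "clicked": 0, "score": 0.3},
    {"category": "mens_dress", "clicked": 1, "score": 0.4},
    {"category": "mens_dress", "clicked": 0, "score": 0.6},
    {"category": "mens_dress", "clicked": 1, "score": 0.7},
    {"category": "mens_dress", "clicked": 0, "score": 0.5},
]

overall = auc([e["clicked"] for e in examples], [e["score"] for e in examples])
per_slice = auc_by_slice(examples, "category")
print(overall, per_slice)
```

The pairwise AUC here is quadratic in the number of examples, which is fine for a sketch but not for production-sized evaluation sets.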

  • We also make the What-If Tool

  • available for exploring and experimenting with your data and your model.

  • It's a great tool for doing what-if experiments

  • to see how your model responds to changes

  • and in the process developing a better understanding

  • of your model and your data.

  • The results it displays aren't exact

  • because it only works with samples of your data,

  • but it can give you approximate results that can point you in the right direction.

  • It works in both TensorBoard and Jupyter Notebooks

  • and pulls in data from ML Metadata,

  • so that we can compare the results we have today

  • with last week or last month.

  • But remember, no model is 100% accurate all the time.

  • What matters is the cost to your business.

  • So to really understand the misprediction cost,

  • you need to join it with your business data

  • and calculate how much the inaccuracies in your model's objectives,

  • which are really just proxies for your business objectives,

  • end up costing you.

  • Without doing that, you have no way of knowing

  • if changes in your model's performance are a little problem

  • or a big problem or an emergency.
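One way to picture joining model errors with business data is the sketch below. Everything in it is an assumption made for illustration: the product lines, the unit counts, the unit costs, and the simplification that over-ordering and under-ordering cost the same per unit.

```python
# Hypothetical model-driven order sizes versus what actually sold.
predictions = {
    "sneakers": {"predicted_units": 1000, "actual_units": 950},
    "mens_dress": {"predicted_units": 400, "actual_units": 250},
}

# Hypothetical business data, kept outside the model's evaluation results.
business = {
    "sneakers": {"unit_cost": 40.0},
    "mens_dress": {"unit_cost": 180.0},
}

def misprediction_cost(predictions, business):
    """Join prediction error with unit cost, per product line."""
    costs = {}
    for line, p in predictions.items():
        unit_error = abs(p["predicted_units"] - p["actual_units"])
        costs[line] = unit_error * business[line]["unit_cost"]
    return costs

costs = misprediction_cost(predictions, business)
for line, cost in sorted(costs.items(), key=lambda item: -item[1]):
    print(f"{line}: {cost:,.0f}")
```

In this made-up data the men's dress shoes error is three times larger in units but more than ten times larger in cost, which is the kind of difference that model metrics alone would not show.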

  • So that's how TFX helps you manage your models,

  • manage your ML applications and manage your business.

  • TFX is the framework that Google and Alphabet companies use

  • for our production ML and now it's available for everyone to use.

  • For more information on TFX, visit us at tensorflow.org/tfx,

  • clone the repos on GitHub,

  • and don't forget to comment and like us below.

  • And thanks for watching.

  • ♪ (music) ♪


Model Understanding and Business Reality (TensorFlow Extended)

Published by 林宜悉 on January 14, 2021