  • [MUSIC PLAYING]

  • JACQUELINE PAN: Hi, everyone.

  • I'm Jackie, and I'm the Lead Program Manager on ML Fairness

  • here at Google.

  • So what is ML fairness?

  • As some of you may know, Google's mission

  • is to organize the world's information

  • and make it universally accessible and useful.

  • Every one of our users gives us their trust.

  • And it's our responsibility to do right by them.

  • And as the impact and reach of AI

  • has grown across societies and sectors,

  • it's critical to ethically design and deploy

  • these systems in a fair and inclusive way.

  • Addressing fairness in AI is an active area

  • of research at Google, from fostering

  • a diverse and inclusive workforce that

  • embodies critical and diverse knowledge to training models

  • to remove or correct problematic biases.

  • There is no standard definition of fairness,

  • whether decisions are made by humans or by machines.

  • Far from a solved problem, fairness in AI

  • presents both an opportunity and a challenge.

  • Last summer, Google outlined principles

  • to guide the responsible development and use of AI.

  • One of them directly speaks to ML fairness

  • and making sure that our technologies don't create

  • or reinforce unfair bias.

  • The principles further state that we

  • seek to avoid unjust impacts on people related

  • to sensitive characteristics such as race, ethnicity,

  • gender, nationality, income, sexual orientation, ability,

  • and political or religious belief.

  • Now let's take a look at how unfair bias might be created

  • or reinforced.

  • An important step on that path is

  • acknowledging that humans are at the center of technology

  • design, in addition to being impacted by it.

  • And humans have not always made product design decisions

  • that are in line with the needs of everyone.

  • For example, because female body-type crash test dummies

  • weren't required until 2011, female drivers

  • were more likely than male drivers

  • to be severely injured in an accident.

  • Band-Aids have long been manufactured in a single

  • color--

  • a soft pink.

  • In this tweet, you see the personal experience

  • of an individual using a Band-Aid that matches his skin

  • tone for the first time.

  • A product that's designed and intended for widespread use

  • shouldn't fail for an individual because of something

  • that they can't change about themselves.

  • Products and technology should just work for everyone.

  • These choices may not have been deliberate,

  • but they still reinforce the importance

  • of being thoughtful about technology

  • design and the impact it may have on humans.

  • Why does Google care about these problems?

  • Well, our users are diverse, and it's important

  • that we provide an experience that works equally

  • well across all of our users.

  • The good news is that humans, you, have the power

  • to approach these problems differently,

  • and to create technology that is fair and more

  • inclusive for more people.

  • I'll give you a sense of what that means.

  • Take a look at these images.

  • You'll notice that the label "wedding" was applied

  • to the images on the left, but not to the image on the right.

  • The labels in these photos demonstrate

  • how one open source image classifier trained on the Open

  • Images Dataset does not properly recognize wedding traditions

  • from different parts of the world.

  • Open datasets, like Open Images, are a necessary and critical

  • part of developing useful ML models,

  • but some open source datasets have

  • been found to be geographically skewed based on how

  • and where they were collected.
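
One way to surface this kind of geographic skew is to audit where a dataset's examples actually come from before training on it. The sketch below is a minimal illustration, assuming each example carries a hypothetical `region` metadata field; real datasets record provenance differently, if at all.

```python
# Minimal sketch of a geographic-skew audit. The `region` field and the
# example records are hypothetical placeholders, not Open Images metadata.
from collections import Counter

def region_distribution(examples):
    """Return each region's share of the examples that have region metadata."""
    counts = Counter(ex["region"] for ex in examples if ex.get("region"))
    total = sum(counts.values())
    return {region: count / total for region, count in counts.most_common()}

examples = [
    {"image_id": "001", "label": "wedding", "region": "north_america"},
    {"image_id": "002", "label": "wedding", "region": "north_america"},
    {"image_id": "003", "label": "wedding", "region": "western_europe"},
    {"image_id": "004", "label": "wedding", "region": "south_asia"},
]

for region, share in region_distribution(examples).items():
    print(f"{region}: {share:.0%}")  # e.g. north_america: 50%
```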

  • To bring greater geographic diversity

  • to Open Images, last year, we enabled the global community

  • of Crowdsource app users to photograph the world

  • around them and make their photos available to researchers

  • and developers as a part of the Open Images Extended Dataset.

  • We know that this is just an early step on a long journey.

  • And to build inclusive ML products,

  • training data must represent global diversity

  • along several dimensions.

  • These are complex sociotechnical challenges,

  • and they need to be interrogated from many different angles.

  • It's about problem formulation and how

  • you think about these systems with human impact in mind.

  • Let's talk a little bit more about these challenges

  • and where they can manifest in an ML pipeline.

  • Unfairness can enter the system at any point in the ML

  • pipeline, from data collection and handling to model training

  • to end use.

  • Rarely can you identify a single cause of or a single solution

  • to these problems.

  • Far more often, various causes interact in ML systems

  • to produce problematic outcomes.

  • And a range of solutions is needed.

  • We try to disentangle these interactions

  • to identify root causes and to find ways forward.

  • This approach spans more than just one team or discipline.

  • ML fairness is an initiative to help address these challenges.

  • And it takes a lot of different individuals

  • with different backgrounds to do this.

  • We need to ask ourselves questions like, how do people

  • feel about fairness when they're interacting with an ML system?

  • How can you make systems more transparent to users?

  • And what's the societal impact of an ML system?

  • Bias problems run deep, and they don't always

  • manifest in the same way.

  • As a result, we've had to learn different techniques

  • of addressing these challenges.

  • Now we'll walk through some of the lessons that Google

  • has learned in evaluating and improving our products,

  • as well as tools and techniques that we're

  • developing in this space.

  • Here to tell you more about this is Tulsee.

  • TULSEE DOSHI: Awesome.

  • Thanks, Jackie.

  • Hi, everyone.

  • My name is Tulsee, and I lead product for the ML Fairness

  • effort here at Google.

  • Today, I'll talk about three different angles in which we've

  • thought about and acted on fairness

  • concerns in our products, and the lessons

  • that we've learned from that.

  • We'll also walk through our next steps, tools, and techniques

  • that we're developing.

  • Of course, we know that the lessons

  • we're going to talk about today are only

  • some of the many ways of tackling the problem.

  • In fact, as you heard in the keynote on Tuesday,

  • we're continuing to develop new methods,

  • such as [INAUDIBLE], to understand our models

  • and to improve them.

  • And we hope to keep learning with you.

  • So with that, let's start with data.

  • As Jackie mentioned, datasets are a key part

  • of the ML development process.

  • Data trains a model and informs what

  • a model learns from and sees.

  • Data is also a critical part of evaluating the model.

  • The datasets we choose to evaluate on indicate what we

  • know about how the model performs,

  • and when it performs well or doesn't.
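
To make that concrete: the same model can look very different depending on which evaluation set you score it on. The toy numbers below are invented purely for illustration; the point is only that a narrow evaluation set can hide failures that a broader, more representative one would reveal.

```python
# Toy illustration: one set of model predictions scored against a narrow
# eval set versus a broader one. All labels/predictions here are made up.
def accuracy(labels, predictions):
    """Fraction of examples the model got right."""
    return sum(l == p for l, p in zip(labels, predictions)) / len(labels)

# Narrow eval set, drawn from the same slice of users as the training data.
narrow_labels = [1, 1, 0, 1, 0, 1]
narrow_preds  = [1, 1, 0, 1, 0, 1]

# Broader eval set that also includes underrepresented cases.
broad_labels = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]
broad_preds  = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]

print(f"narrow eval accuracy: {accuracy(narrow_labels, narrow_preds):.0%}")  # 100%
print(f"broad eval accuracy:  {accuracy(broad_labels, broad_preds):.0%}")    # 70%
```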

  • So let's start with an example.

  • What you see on the screen here is a screenshot

  • from a game called Quick Draw that

  • was developed through the Google AI Experiments program.

  • In this game, people drew images of different objects

  • around the world, like shoes or trees or cars.

  • And we use those images to train an image classification model.

  • This model could then play a game

  • with the users, where a user would draw an image

  • and the model would guess what that image was of.

  • Here you see a whole bunch of drawings of shoes.

  • And actually, we were really excited,

  • because what better way to get diverse input

  • from a whole bunch of users than to launch something globally

  • where a whole bunch of users across the world

  • could draw images for what they perceived an object

  • to look like?

  • But what we found as this model started to collect data

  • was that most of the images that users

  • drew of shoes looked like that shoe in the top right,

  • the blue shoe.

  • So over time, as the model saw more and more examples,

  • it started to learn that a shoe looked a certain way

  • like that top right shoe, and wasn't

  • able to recognize the shoe in the bottom right,

  • the orange shoe.

  • Even though we were able to get data

  • from a diverse set of users, the shoes

  • that the users chose to draw or the users

  • who actually engaged with the product at all

  • were skewed, and led to skewed training data

  • in what we actually received.
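
Once this kind of skew has been identified, one common mitigation is to rebalance the training set, for example by oversampling the underrepresented examples. The sketch below is a minimal illustration that assumes each drawing carries a hypothetical `style` tag; it is not how Quick Draw itself was built.

```python
# Minimal sketch of rebalancing a skewed training set by oversampling.
# The `style` tags and file names are hypothetical placeholders.
import random
from collections import defaultdict

def oversample_to_balance(examples, group_key, seed=0):
    """Duplicate examples from smaller groups until every group is the same size."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for ex in examples:
        groups[ex[group_key]].append(ex)
    target = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        balanced.extend(rng.choices(members, k=target - len(members)))
    rng.shuffle(balanced)
    return balanced

drawings = (
    [{"image": f"sneaker_{i}.png", "style": "sneaker"} for i in range(90)]
    + [{"image": f"heel_{i}.png", "style": "heel"} for i in range(10)]
)
balanced = oversample_to_balance(drawings, group_key="style")
print(len(balanced))  # 180: both styles are now equally represented
```

Oversampling only copies what was already collected, so it is no substitute for gathering more diverse drawings in the first place; it simply keeps the majority style from dominating what the model learns.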

  • This is a social issue first, which

  • is then exacerbated by our technical implementation.

  • Because when we're making classification decisions that

  • divide up the world into parts, even if those parts are

  • what is a shoe and what isn't a shoe,

  • we're making fundamental judgment calls

  • about what deserves to be in one part

  • or what deserves to be in the other.

  • It's easier to deal with when we're talking about shoes,

  • but it's harder to talk about when we're

  • classifying images of people.

  • An example of this is the Google Clips camera.

  • This camera was designed to recognize memorable moments

  • in real-time streaming video.

  • The idea is that it automatically

  • captures memorable motion photos of friends, of family, or even

  • of pets.

  • And we designed the Google Clips camera

  • to have equitable outcomes for all users.

  • It, like all of our camera products,

  • should work for all families, no matter who or where they are.

  • It should work for people of all skin tones,

  • all age ranges, and in all poses,

  • and in all lighting conditions.

  • As we started to build this system,

  • we realized that if we only created training data that

  • represented certain types of families,

  • the model would also only recognize

  • certain types of families.

  • So we had to do a lot of work to increase our training data's

  • coverage and to make sure that it would recognize everyone.

  • We went global to collect these datasets, collecting datasets

  • of different types of families, in different environments,

  • and in different lighting conditions.

  • And in doing so, we were able to make sure

  • that not only could we train a model that

  • had diverse outcomes, but that we could also

  • evaluate it across a whole bunch

  • of different variables like lighting or space.

  • This is something that we're continuing to do,

  • continuing to create automatic fairness tests for our systems

  • so that we can see how they change over time

  • and to continue to ensure that they are inclusive of everyone.
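
A simple version of such a test might compare a metric like recall across groups and flag the model when the gap exceeds a chosen threshold. The sketch below is purely illustrative: the group names, the metric, the threshold, and the numbers are assumptions, not a description of Google's actual test infrastructure.

```python
# Minimal sketch of an automated fairness check based on a recall gap.
def recall(labels, predictions):
    """Fraction of positive examples the model actually caught."""
    positives = [p for l, p in zip(labels, predictions) if l == 1]
    return sum(positives) / len(positives) if positives else 0.0

def recall_gap(results_by_group):
    """Largest difference in recall between any two groups."""
    recalls = {group: recall(labels, preds)
               for group, (labels, preds) in results_by_group.items()}
    return max(recalls.values()) - min(recalls.values()), recalls

# Hypothetical per-group labels and model predictions.
results = {
    "group_a": ([1, 1, 1, 0, 1], [1, 1, 1, 0, 1]),
    "group_b": ([1, 1, 1, 0, 1], [1, 0, 1, 0, 0]),
}

gap, recalls = recall_gap(results)
print(recalls)           # {'group_a': 1.0, 'group_b': 0.5}
if gap > 0.2:            # the 0.2 threshold is an illustrative choice
    print(f"FAIL: recall gap of {gap:.2f} exceeds the allowed threshold")
```

Run on each new model candidate, a check like this lets regressions for any one group surface before launch rather than after.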

  • The biggest lesson we've learned in this process

  • is how important it is to build training and evaluation

  • datasets that represent all the nuances of our target

  • population.

  • This means both making sure that the data we collect

  • is diverse and representative, and also

  • taking into account the different contexts

  • in which users are providing us this data.

  • Even if you have a diverse set of users,

  • that doesn't mean that the images of shoes you get

  • will be diverse.

  • And so thinking about those nuances and the trade-offs that

  • might occur when you're collecting your data

  • is super important.

  • Additionally, it's also important to reflect

  • on who that target population might leave out.

  • Who might not actually have access to this product?

  • Where are the blind spots in who we're reaching?

  • And lastly, how will the data that you're collecting

  • grow and change over time?

  • As our users use our products, they very