[MUSIC PLAYING]
LILY PENG: Hi everybody.
My name is Lily Peng.
I'm a physician by training and I work on the Google medical--
well, Google AI health-care team.
I am a product manager.
And today we're going to talk to you about a couple of projects
that we have been working on in our group.
So first off, I think you'll get a lot of this,
so I'm not going to go over this too much.
But because we apply deep learning
to medical information, I kind of wanted
to just define a few terms that get used quite a bit
but are somewhat poorly defined.
So first off, artificial intelligence-- this
is a pretty broad term and it encompasses that grand project
to build a nonhuman intelligence.
Machine learning is a particular type
of artificial intelligence, I suppose,
that teaches machines to be smarter.
And deep learning is a particular type
of machine learning which you guys have probably
heard about quite a bit and will hear about quite a bit more.
So first of all, what is deep learning?
So it's a modern reincarnation of artificial neural networks, which were actually invented in the 1960s.
It's a collection of simple trainable units, organized
in layers.
And they work together to solve or model complicated tasks.
So in general, with smaller data sets and limited compute,
which is what we had in the 1980s and '90s,
other approaches generally work better.
But with larger data sets and larger model sizes
and more compute power, we find that neural networks
work much better.
So there's actually just two takeaways
that I want you guys to get from this slide.
One is that deep learning trains algorithms
that are very accurate when given enough data.
And two, that deep learning can do this
without feature engineering.
And that means without explicitly writing the rules.
So what do I mean by that?
Well, in traditional computer vision, we spend a lot of time writing the rules that a machine should follow to make a certain prediction.
In convolutional neural networks,
we actually spend very little time in feature
engineering and writing these rules.
Most of the time is spent on data preparation, numerical optimization, and model architecture.
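To make that concrete, here is a minimal sketch in TensorFlow of what "no feature engineering" looks like in practice. It is not any model from the talk; the architecture, shapes, and class count are arbitrary. The point is simply that we declare layers of trainable units and an optimizer, and never hand-write recognition rules.

    # A minimal sketch only: a tiny convolutional network with no
    # hand-engineered features. Architecture and shapes are arbitrary.
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(128, 128, 3)),        # raw pixels go in
        tf.keras.layers.Conv2D(32, 3, activation="relu"),  # simple trainable units...
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),  # ...organized in layers
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(5, activation="softmax"),    # e.g. a 5-class prediction
    ])

    # The human effort goes into data preparation and numerical optimization,
    # which compile() and fit() drive, not into writing rules.
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(train_images, train_labels, epochs=10)  # with a prepared dataset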
So I get this question quite a bit.
And the question is, how much data is enough data
for a deep neural network?
Well in general, more is better.
But there are diminishing returns beyond a certain point.
And a general rule of thumb is that we
like to have about 5,000 positives per class.
But the key thing is good and relevant data--
so garbage in, garbage out.
The model will predict very well what you ask it to predict.
So when you think about where machine learning,
and especially deep learning, can make the biggest impact,
it's really in places where there's
lots of data to look through.
One of our directors, Greg Corrado, puts it best.
Deep learning is really good for tasks that you've done 10,000
times, and on the 10,001st time, you're just sick of it and you
don't want to do it anymore.
So this is really great for health care in screening
applications where you see a lot of patients
that are potentially normal.
It's also great where expertise is limited.
So here on the right you see a graph
of the shortage of radiologists kind of worldwide.
And this is also true for other medical specialties,
but radiologists are sort of here.
And we basically see a worldwide shortage of medical expertise.
So one of the screening applications
that our group has worked on is with diabetic retinopathy.
We call it DR because it's easier
to say than diabetic retinopathy.
And it's the fastest growing cause of preventable blindness.
All 450 million people with diabetes are at risk and need
to be screened once a year.
This is done by taking a picture of the back
of the eye with a special camera, as you see here.
And the picture looks a little bit like that.
And so what a doctor does when they get an image like this
is they grade it on a scale of one to five, from no disease, so healthy, to proliferative disease, which is the end stage.
And when they do grading, they look for sometimes very subtle findings, little things called microaneurysms that are outpouchings in the blood vessels of the eye. And that indicates how badly your diabetes is affecting your vision.
So unfortunately in many parts of the world,
there are just not enough eye doctors to do this task.
So in India, where we work with a couple of partners, there is a shortage of 127,000 eye doctors nationwide.
And as a result, about 45% of patients
suffer some sort of vision loss before the disease is detected.
Now as you recall, I said that this disease
was completely preventable.
So again, this is something that should not be happening.
So what we decided to do was we partnered
with a couple of hospitals in India,
as well as a screening provider in the US.
And we got about 130,000 images for this first go around.
We hired 54 ophthalmologists and built a labeling tool.
And then the 54 ophthalmologists actually
graded these images on this scale,
from no DR to proliferative.
The interesting thing was that there was actually
a little bit of variability in how doctors call the images.
And so we actually got about 880,000 diagnoses in all.
And with this labeled data set, we put it through a fairly well-known convolutional neural net. This is called Inception. I think a lot of you guys may be familiar with it.
It's generally used to classify cats and dogs for our photo app
or for some other search apps.
And we just repurposed it to do fundus images.
So the other thing that we learned
while we were doing this work was
that while it was really useful to have
this five-point diagnosis, it was also
incredibly useful to give doctors
feedback on housekeeping predictions like image quality,
whether this is a left or right eye,
or which part of the retina this is.
So we added that to the network as well.
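For readers who want a sense of what "repurposing" Inception with extra housekeeping outputs could look like, here is a hedged sketch in TensorFlow/Keras. It is not the published model; the head names, class counts, and losses are assumptions made purely for illustration.

    # Illustrative only: an ImageNet-pretrained Inception backbone with a
    # 5-point DR head plus auxiliary "housekeeping" heads. Head names and
    # class counts are made up for this sketch.
    import tensorflow as tf

    base = tf.keras.applications.InceptionV3(
        weights="imagenet", include_top=False, input_shape=(299, 299, 3))
    features = tf.keras.layers.GlobalAveragePooling2D()(base.output)

    dr_grade = tf.keras.layers.Dense(5, activation="softmax",
                                     name="dr_grade")(features)
    quality = tf.keras.layers.Dense(2, activation="softmax",
                                    name="image_quality")(features)
    laterality = tf.keras.layers.Dense(2, activation="softmax",
                                       name="left_or_right_eye")(features)

    model = tf.keras.Model(inputs=base.input,
                           outputs=[dr_grade, quality, laterality])
    model.compile(optimizer="adam",
                  loss={"dr_grade": "sparse_categorical_crossentropy",
                        "image_quality": "sparse_categorical_crossentropy",
                        "left_or_right_eye": "sparse_categorical_crossentropy"})
    # model.fit(fundus_images, {"dr_grade": grades, "image_quality": q_labels,
    #                           "left_or_right_eye": side_labels})  # hypothetical data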
So how well does it do?
So this is the first version of our model
that we published in a medical journal in 2016 I believe.
And right here on the left is a chart
of the performance of the model in aggregate
over about 10,000 images.
Sensitivity is on the y-axis, and then 1 minus specificity
is on the x-axis.
So sensitivity is the proportion of patients who have the disease that the model correctly calls as having the disease. And then specificity is the proportion of patients who don't have the disease that the model, or the doctor, correctly calls disease-free.
And you can see you want something
with high sensitivity and high specificity.
And so on this chart, up and to the left is good.
And you can see here on the chart
that the little dots are the doctors that
were grading the same set.
So we get pretty close to the doctor.
And these are board-certified US physicians.
And these are ophthalmologists, general ophthalmologists
by training.
In fact, if you look at the F score, which is a combined measure of sensitivity and precision, we're just a little better than the median ophthalmologist in this particular study.
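As a quick reference for these metrics, here is a toy calculation, with invented counts rather than the study's numbers, showing how sensitivity, specificity, and an F score fall out of a 2x2 confusion matrix:

    # Toy counts, not study data: 100 patients with disease, 900 without.
    tp, fn = 90, 10     # diseased patients the model calls correctly vs. misses
    tn, fp = 850, 50    # healthy patients the model calls correctly vs. flags

    sensitivity = tp / (tp + fn)   # 0.90 -- higher is "up" on the ROC chart
    specificity = tn / (tn + fp)   # ~0.94 -- higher is "to the left" on the chart
    precision = tp / (tp + fp)     # positive predictive value
    f_score = 2 * precision * sensitivity / (precision + sensitivity)

    print(sensitivity, specificity, precision, f_score)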
So since then we've improved the model.
So last year, around December 2016, we were sort of on par with generalists.
And then this year--
this is a new paper that we published--
we actually used retinal specialists
to grade the images.
So they're specialists.
We also had them argue when they disagreed
about what the diagnosis was.
And you can see that when we trained the model using that as the ground truth, the model predicted it quite well, too.
So this year we're sort of on par
with the retina specialists.
And this weighted kappa thing is just
agreement on the five-class level.
And you can see that, essentially, we're somewhere between the ophthalmologists and the retina specialists-- in fact, right in among the retina specialists.
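The weighted kappa mentioned here is a standard agreement statistic for ordinal grades. A minimal scikit-learn sketch, with made-up grades rather than study data, looks like this:

    # Quadratic-weighted kappa between two sets of 5-point DR grades.
    # The grades below are invented for illustration.
    from sklearn.metrics import cohen_kappa_score

    model_grades = [1, 2, 2, 3, 5, 1, 4, 2]
    expert_grades = [1, 2, 3, 3, 5, 1, 4, 1]

    kappa = cohen_kappa_score(model_grades, expert_grades, weights="quadratic")
    print(kappa)  # 1.0 means perfect agreement; 0 means chance-level agreement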
Another thing that we've been working on, beyond improving the models, is actually trying to have the networks explain how they're making their predictions.
So again, taking a play out of the playbook from the consumer world, we started using this technique called show me where. And this is where, given an image, we actually generate a heat map of where the relevant pixels are for a particular prediction.
So here you can see a picture of a Pomeranian.
And the heat map shows you that there
is something in the face of the Pomeranian
that makes it look Pomeranian-y.
And on the right here, you kind of have an Afghan hound,
and the network's highlighting the Afghan hound.
So using this very similar technique,
we applied it to the fundus images
and we said, show me where.
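The talk doesn't spell out the exact attribution method behind "show me where," so as a stand-in, here is one common way such a heat map can be built: a plain gradient saliency map over the input pixels.

    # A hedged sketch of a gradient saliency map: which pixels most change
    # the predicted probability of a given class. Not necessarily the
    # method behind the figures in the talk.
    import tensorflow as tf

    def saliency_map(model, image, class_index):
        """image: one preprocessed image tensor of shape (H, W, 3)."""
        image = tf.expand_dims(tf.convert_to_tensor(image, tf.float32), 0)
        with tf.GradientTape() as tape:
            tape.watch(image)
            score = model(image)[0, class_index]
        grads = tape.gradient(score, image)              # d(score)/d(pixels)
        heat = tf.reduce_max(tf.abs(grads), axis=-1)[0]  # collapse color channels
        return heat / (tf.reduce_max(heat) + 1e-8)       # normalize to [0, 1]

    # heat = saliency_map(dr_model, fundus_image, class_index=2)  # hypothetical inputs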
So this is a case of mild disease.
And I can tell it's mild disease because--
well, it looks completely normal to me.
I can't tell that there is any disease there.
But a highly trained doctor would be able to pick out little things called microaneurysms, where the green spots are.
Here's a picture of moderate disease.
And this is a little worse because you can see
some bleeding at the ends here.
And actually, I don't know if I can point it out, but there's bleeding there.
And the heat map--
so here's a heat map.
You can see that it picks up the bleeding.
But there are two artifacts in this image.
So there is a dust spot, just like a little dark spot.
And then there is this little reflection
in the middle of the image.
And you can tell that the model just ignores them, essentially.
So what's next?
We trained a model.
We showed that it's somewhat explainable.
We think it's doing the right thing.
What's next?
Well, we actually have to deploy this into health-care systems.
And we're partnering with health-care providers
and companies to bring this to patients.
And actually Dr. Jess Mega, who is going to speak after me, is going to share a little more detail about this effort.
So that was a screening application.
And here's an application in diagnosis
that we're working on.
So in this particular example, we're talking about breast cancer-- specifically, about metastases of breast cancer into nearby lymph nodes.
So when a patient is diagnosed with breast cancer
and the primary breast cancer is removed,
the surgeon spends some time taking out what we call lymph nodes so that we can examine them to see whether or not the breast cancer has metastasized to those nodes.
And that has an impact on how you treat the patient.
So reading these lymph nodes is actually not an easy task.
And in fact, in about 24% of biopsies, when pathologists went back to look at them, the nodal status changed. Which means that if it was originally read positive, it was re-read as negative, and if negative, re-read as positive.
So that's a really big deal.
It's one in four.
The interesting thing is that there
was another study published that showed
that a pathologist with unlimited time,
not overwhelmed with data, actually
is quite sensitive, so 94% sensitivity in finding
the tumors.
When you put a time constraint on the pathologist, the sensitivity drops. And people start overlooking where little metastases may be.
So in this picture there's a tiny metastasis right there.
And that's usually small things like this that are missed.
And this is not surprising given that so much information
is in each slide.
So one of these slides, if digitized,
is about 10 gigapixels.
And that's literally a needle in a haystack.
The interesting thing is that pathologists can actually find 73% of the cancers, if they spend all their time looking for them, with zero false positives per slide.
So we trained a model that can help with this task.
It actually finds about 95% of the cancer lesions
and it has eight false positives per slide.
So clearly an ideal system is one that is very sensitive, using the model, but also quite specific, relying on the pathologist to actually look over the false positives and rule them out.
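Whole-slide images are far too large to feed to a network at once, so the usual approach, sketched below with a hypothetical patch classifier rather than the published pipeline, is to tile the slide, score each patch, and hand the pathologist a heat map of suspicious regions:

    # Illustrative sketch: slide is an RGB array (H, W, 3); patch_model is a
    # trained Keras classifier returning P(tumor) for a 299x299 patch.
    import numpy as np

    def slide_heatmap(slide, patch_model, patch=299, stride=299):
        h, w, _ = slide.shape
        rows = (h - patch) // stride + 1
        cols = (w - patch) // stride + 1
        heat = np.zeros((rows, cols))
        for i in range(rows):
            for j in range(cols):
                tile = slide[i * stride:i * stride + patch,
                             j * stride:j * stride + patch]
                heat[i, j] = patch_model.predict(tile[np.newaxis, ...],
                                                 verbose=0)[0, 0]
        return heat  # high values mark regions for the pathologist to review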
So this is very promising and we're
working on validation in the clinic right now.
In terms of reader studies, how this actually
interacts with the doctor is really quite important.
And clearly there are applications to other tissues.
I talked about lymph nodes, but we have some early studies
that actually show that this works for prostate cancer,
as well, for Gleason grading.
So in the previous examples, we talked about how deep learning can produce algorithms that are very accurate.
And they tend to make calls that a doctor might already make.
But what about predicting things that doctors don't currently
do from imaging?
So as you recall from the beginning of the talk,
one of the great things about deep learning
is that you can train very accurate algorithms
without explicitly writing rules.
So this allows us to make completely new discoveries.
So the picture on the left is from a paper
that we published recently where we
trained deep-learning models to predict a variety
of cardiovascular risk factors.
And that includes age, self-reported sex,
smoking status, blood pressure, things that doctors generally
consider right now to assess the patient's cardiovascular risk
and make proper treatment recommendations.
So it turns out that we can not only
predict many of these factors, and quite accurately,
but we can actually directly predict a five-year risk
of a cardiac event.
So this work is quite early, really preliminary, and the AUC for this prediction is 0.7.
What that number means is that if given two pictures-- one picture of a patient who did not have a cardiovascular event and one picture of a patient who did-- the model picks the right one about 70% of the time. Most doctors are at around 50%, because it's basically a coin flip-- it's hard to do based on a retinal image alone.
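That reading of AUC as a pairwise comparison can be checked directly. Here is a toy example, with invented scores rather than study data, where the ranking-based estimate matches scikit-learn's roc_auc_score:

    # AUC as "probability the model ranks an event patient above a
    # non-event patient." Scores and labels below are invented.
    import numpy as np
    from sklearn.metrics import roc_auc_score

    y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])   # 1 = had a cardiac event
    y_score = np.array([0.2, 0.4, 0.5, 0.7, 0.3, 0.6, 0.8, 0.9])

    auc = roc_auc_score(y_true, y_score)

    # Estimate the same quantity by checking every (event, non-event) pair.
    event = y_score[y_true == 1]
    non_event = y_score[y_true == 0]
    pairwise = np.mean([e > n for e in event for n in non_event])

    print(auc, pairwise)  # both 0.75 here; ties would count as 0.5 in the AUC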
So why is this exciting?
Well normally when a doctor tries
to assess your risk for cardiovascular disease,
there are needles involved.
So I don't know if anyone has gotten blood cholesterol
screening.
You fast the night before and then we take some blood samples
and then we assess your risk.
So again, I want to emphasize that this is really early on.
But these results support the idea
that we may be able to use something
like an image to make new predictions that we couldn't
make before.
And this might be able to be done in sort
of a noninvasive manner.
So I've given a few examples, three examples
of how deep learning can really increase both availability
and accuracy in health care.
And one of the things that I want to kind of also
acknowledge here is the reason why this has become
more and more exciting is, I think, because TensorFlow
is open source.
So this kind of open standard for general machine learning is being applied everywhere.
So I've given examples of work that we've done at Google, but there's a lot of very similar work being done across the community at other medical centers.
And so we're really excited about what
this technology can bring to the field of health care.
And with that, I'd like to introduce Jess Mega.
Unlike me, she is a real doctor.
And she's the chief medical officer at Verily.
JESSICA MEGA: Well thank you all for being here.
And thank you Lily for kicking us off.
I think the excitement around AI and health care
could not be greater.
As you heard, my name is Jess Mega.
I'm a cardiologist and am so excited to be
part of the Alphabet family.
Verily grew out of Google and Google X.
And we are focused solely on health care and life sciences.
And our mission is to take the world's health information
and make it useful so that patients live healthier lives.
And the example that I'll talk about today focuses on diabetes
and really lends itself to the conversation that Lily started.
But I think it's very important to pause
and think about health data broadly.
Right now, any individual who's in the audience today
has several gigabytes of health data.
But if you think about health in the years
to come and think about genomics,
molecular technologies, imaging, sensor data,
patient-reported data, electronic health records
and claims, we're talking about huge sums
of data, gigabytes of data.
And at Verily and at Alphabet, we're committed to staying ahead of this so that we can help patients.
The reason we're initially focusing some of our efforts on diabetes is that this is an urgent health issue.
About 1 in 10 people has diabetes.
And when you have diabetes, it affects how you handle sugar-- glucose-- in the body.
And if you think about prediabetes,
the condition before someone has diabetes,
that's one in three people.
That would be the entire center section of the audience today.
Now, when your body handles glucose in a different way, you can have downstream effects. You heard Lily talk about diabetic retinopathy. People can have problems with their heart and kidneys, and peripheral neuropathy.
So this is the type of disease that we need to get ahead of.
But we have two main issues that we're trying to address.
The first one is an information gap.
So even the most adherent patient with diabetes-- and my grandfather was one of them-- would check his blood sugar only four times a day.
And I don't know if anyone today has been
able to have any of the snacks.
I actually had some of the caramel popcorn.
Did anyone have any of that?
Yeah, that was great, right, except probably
our biology and our glucose is going up and down.
So if I didn't check my glucose in that moment,
we wouldn't have captured that data.
So we know biology is happening all of the time.
When I see patients in the hospital as a cardiologist,
I can see someone's heart rate, their blood pressure, all
of these vital signs in real time.
And then people go home, but biology is still happening.
So there's an information gap, especially with diabetes.
The second issue is a decision gap.
You may see a care provider once a year, twice a year,
but health decisions are happening every single day.
They're happening weekly, daily, hourly.
And how do we decide to close this gap?
At Verily we're focusing on three key missions.
And this can be true for almost every project we take on.
We're thinking about how to shift
from episodic and reactive care to much more proactive care.
And in order to do that and to get to the point
where we can really use the power of that AI,
we have to do three things.
We have to think about collecting the right data.
And today I'll be talking about continuous glucose monitoring.
How do you then organize this data so that it's in a format
that we can unlock and activate and truly help patients?
So whether we do this in the field of diabetes
that you'll hear about today or with our surgical robots,
this is the general premise.
The first thing to think about is the collection of data.
And you heard Lily say garbage in, garbage out.
We can't look for insights unless we understand
what we're looking at.
And one thing that has been absolutely revolutionary
is thinking about extremely small biocompatible
electronics.
So we are working on next-generation sensing.
And you can see a demonstration here.
What this will lead to, for example, with extremely small continuous glucose monitors-- where we're partnering to create some of these tools-- is more-seamless integration.
So again, you don't just have a few glucose values; instead, we understand how your body-- or the body of someone with type 2 diabetes-- is handling sugar in a more continuous fashion.
It also helps us understand not only
what happens at a population level
but what might happen on an individual level
when you are ingesting certain foods.
And the final thing is to really try to reduce the cost of devices so that we can really democratize health.
The next aim is, how do we organize all of this data?
And I can speak both as a patient and as a physician.
The thing that people will say is, data's amazing,
but please don't overwhelm us with a tsunami of data.
You need to organize it.
And so we've partnered with Sanofi
on a company called Onduo.
And the idea is to put the patient at the center of their care and help simplify diabetes management. This really gets to the heart of helping someone be happier and healthier.
So what does it actually mean?
What we try to do is empower people
with their glucose control.
So we turned to the American Diabetes Association and looked at the glucose ranges that are recommended.
People then get a graph that shows you
what your day looks like and the percentage of time
that you are in range--
again, giving a patient or a user
that data so they can be the center of their decisions--
and then finally tracking steps through Google Fit.
The next goal then is to try to understand
how glucose is pairing with your activity and your diet.
So here there's an app that prompts for a photo of the food.
And then using image recognition and using Google's TensorFlow,
we can identify the food.
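Underneath, that food-photo step is an image classification call. The product's actual model and food taxonomy aren't public, so here is a generic hedged sketch using an off-the-shelf ImageNet-pretrained network in TensorFlow:

    # Illustrative only: classify a meal photo with a generic pretrained
    # model. The real product's model and food labels are not public.
    import numpy as np
    import tensorflow as tf

    model = tf.keras.applications.MobileNetV2(weights="imagenet")

    def identify_food(image_path):
        img = tf.keras.utils.load_img(image_path, target_size=(224, 224))
        x = tf.keras.utils.img_to_array(img)[np.newaxis, ...]
        x = tf.keras.applications.mobilenet_v2.preprocess_input(x)
        preds = model.predict(x, verbose=0)
        # Returns (class_id, label, probability) triples, e.g. a 'banana' label.
        return tf.keras.applications.mobilenet_v2.decode_predictions(preds, top=3)[0]

    # print(identify_food("meal_photo.jpg"))  # hypothetical photo from the app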
And this is where the true personal insights
start to become real.
Because if you eat a certain meal, it's helpful to understand how your body ends up responding to it.
And there's some really interesting preliminary data suggesting that the microbiome may change the way I respond to a banana, for example, versus the way you might respond.
And that's important to know because all
of a sudden those general recommendations that we make
as a doc-- so if someone comes to see me in clinic
and they have type 2 diabetes I might say, OK,
here are the things you need to do.
You need to watch your diet, exercise,
take your oral medications. I need you to also take insulin.
You've got to see your foot doctor, your eye
doctor, your primary-care doctor,
and the endocrinologist.
And that's a lot to integrate.
And so what we try to do is also pair
all of this information in a simple way with a care lead.
This is a person that helps someone on their journey
as this information is surfaced.
And if you look in the middle of what I'm showing you here-- at what the care lead and what the person are seeing-- you'll see a number of different lines. And I want us to drill down and look into that.
This is showing you the difference between the data you might see with episodic glucose checks and what you're seeing with the continuous glucose monitor enabled by this new sensing.
And so let's say we drill down into this continuous glucose
monitor and we look at a cluster of days.
This is an example.
We might start to see patterns.
And as Lily mentioned, this is not the type of thing
that an individual patient, care lead, or physician would end up
digging through, but this is where
you start to unlock the power of learning models.
Because what we can start to see is a cluster
of different mornings.
We'll make a positive association
that everyone's eating incredibly healthy here
at Google I/O, so maybe that's a cluster of the red mornings.
But we go back into our regular lives and we get stressed
and we're eating a different cluster of foods.
But instead of, again, giving general advice,
we can use different models to point out,
it seems like something is going on.
With one patient, for example, we
were seeing a cluster around Wednesdays.
So what's going on on Wednesdays?
Is it that the person is going and stopping
by a particular location, or maybe there's
a lot of stress that day.
But again, instead of giving general care, we can start to target care in the most comprehensive and actionable way.
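As a rough illustration of the kind of pattern-finding being described-- not the product's actual models-- here is a sketch that clusters synthetic day-long glucose traces so the recurring "off" days separate out:

    # Synthetic CGM data: 28 days of readings every 5 minutes, with every
    # 7th day (say, the Wednesdays) running higher. K-means is a stand-in
    # for whatever models the product actually uses.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(42)
    n_days, readings_per_day = 28, 288
    baseline = 100 + 20 * np.sin(np.linspace(0, 2 * np.pi, readings_per_day))
    days = baseline + rng.normal(0, 10, size=(n_days, readings_per_day))
    days[::7] += 40   # the recurring high days

    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(days)
    print(labels)     # the high days should land in their own cluster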
So again, thinking about what we're talking about,
collecting data, organizing it, and then activating it
and making it extremely relevant.
So that is the way we're thinking about diabetes care,
and that is the way AI is going to work.
We heard this morning, in another discussion, that we've got to think about the problems that we're going to solve and use these tools to really make a difference.
So what are some other ways that we can think
about activating information?
And we heard from Lily that diabetic retinopathy
is one of the leading causes of blindness.
So even if we have excellent glucose care,
there may be times where you start to have end organ damage.
And I had mentioned that elevated glucose
levels can end up affecting the fundus and the retina.
Now we know that people with diabetes
should undergo screening.
But earlier in the talk I gave you the laundry list of what we're asking patients who have diabetes to do.
And so what we're trying to do with this collaboration with Google is figure out, how do we actually get ahead of the problem and think about an end-to-end solution, so that we recognize and bring down the challenges that exist today?
Because in terms of getting screened, one issue is accessibility, and the other one is having access to optometrists and ophthalmologists.
And this is a problem in the United States as well as in the developing world. So this is a global problem, not something just local.
This is something that we think very globally about when
we think about the solution.
We looked at this data earlier and this idea
that we can take algorithms and increase both the sensitivity
and specificity of diagnosing diabetic retinopathy
and macular edema.
And this is data that was published in "JAMA"
as Lily nicely outlined.
The question then is, how do we think
about creating this product?
Because the beauty of working at places like Alphabet, and working with partners like all of you here today, is that we can think about what problem we're solving and create the algorithms.
But we then need to step back and say, what does it
mean to operate in the space of health care
and in the space of life science?
We need to think about the image acquisition, the algorithm,
and then delivering that information
both to physicians as well as patients.
So what we're doing is taking this information
and now working with some of our partners.
There's a promising pilot that's currently ongoing both here as
well as in India, and we're so encouraged to hear
the early feedback.
And there are two pieces of information
I wanted to share with you.
One is that, looking at these early observations, we're seeing higher accuracy with AI than with a manual grader.
And the thing that's important as a physician--
I don't know if there are any other doctors in the room,
but the piece I always tell people is there's
going to be room for health-care providers.
What these tools are doing is merely helping us do our job.
So sometimes people ask me, is technology and AI
going to replace physicians or replace the health-care system?
And the way I think about it is, it just augments the work
we do.
If you think about the stethoscope--
so I'm a cardiologist, and the stethoscope
was invented about 200 years ago.
It doesn't replace the work we do.
It merely augments the work we do.
And I think you're going to see a similar theme as we continue
to think about ways of bringing care in a more
effective way to patients.
So the first thing here is that the AI was performing better
than the manual grader.
And then the second thing is to think
about that base of patients.
How do we truly democratize care?
And so the other encouraging piece from the pilot
was this idea that we could start
to increase the base of patients treated with the algorithm.
Now, I would love to say that it's really easy to do everything in health care and life science.
But as it turns out, it takes a huge village
to do this kind of work.
So what's next?
What is on the path to clinical adoption?
And this is what makes it incredibly exciting
to be a doctor working with so many talented technologists
and engineers.
We need to now partner with different clinical sites
that I noted here.
We also partner deeply with the FDA,
as well as regulatory agencies in Europe and beyond.
And one thing at Verily that we've decided to do
is to be part of what's called the FDA precertification
program.
We know that bringing new technologies and new algorithms
into health care is critical, but we now
need to figure out how to do that in a way that's
both safe and effective.
And I'm proud of us at Alphabet for really
staying ahead of that and partnering
with groups like the FDA.
The second thing that's important to note is that we at Verily partner deeply with Google, as well as with other partners like Nikon and Optos.
All of these pieces come together
to try to transform care.
But I know that if we do this correctly,
there's a huge opportunity not only in diabetes but really
in this entire world of health information.
It's interesting to think about, as a physician who spends most of my time taking care of patients in the hospital: how can we start to push more of the access to care outside of the hospital?
But I know that if we do this well
and if we stay ahead of it, we can close this gap.
We can figure out ways to become more preventative.
We can collect the right information.
We can create the infrastructure to organize it.
And most importantly, we will figure out how to activate it.
But I want everyone to know here,
this is not the type of work that we can do alone.
It really takes all of us together.
And we at Verily, we at Google, and we at Alphabet
look forward to partnering with all of you.
So please help us on this journey.
Lily and I will be here after these talks.
We're happy to chat with all of you.
And thank you for spending time at I/O.
[MUSIC PLAYING]