FEMALE SPEAKER: Please join me in
welcoming Mr. Kenneth Cukier.
[APPLAUSE]
KENNETH CUKIER: Thank you very much.
You can probably appreciate the fact that I've got a lot
of trepidation coming here to talk to you folks for the
obvious reason that I'm wearing a suit.
And the truth is I had a breakfast this morning at the
Council on Foreign Relations to talk to them about the
international implications and the foreign-policy
implications of big data.
That leads to the second trepidation and the context of
my remarks.
So the second trepidation is that this is a sort of
homecoming for the book.
Because my journey, so to speak, in the world of big
data started at Google and started at the
Googleplex in 2009.
It was you folks who opened up the kimono to what you were
doing in very small little slivers.
I never got the full picture.
But I was able to cobble it all together and see something
and then give a label to it.
Luckily, there were a couple of labels that we
were thinking of.
And I reached for one that wasn't a popular term at the
time, and the term was big data.
And that was really helpful.
It was the cover story of "The Economist"
in February of 2010.
It was called "The Data Deluge", because they thought
they would sell it better than saying "big
data." But big data--
it was basically all about that and about what
you guys are doing.
And so it brings me great fear to walk into a room, because
you guys have been doing it for so long.
And that brings me into the context of my
conversation today.
I want it to be a conversation.
I was obviously just at the Council on Foreign Relations
thinking about this in ways that I am sure your engineers
never thought about it 10 years ago.
I may have heard a snort.
But here's the thing.
Many of you were thinking of it as a technological issue
when people around the world think of it in terms of the
competitiveness of nations.
Our book, which is being released today in America, has
already been available in China, where it's been a
best-seller.
And when we hear questions from Chinese journalists to
us, they're all talking about the national project that
they're on.
Is this the way for us to leapfrog the West?
Is this one area of technology, unlike the
internet and computing, where we can lead?
So the implications of this are vast.
And the implications are more than just technological.
I'm at a technology company-- in fact, the pioneer, in many
respects, of big data.
But I want to explain that I'm here as a journalist, as
someone who's looked in at your world and now can serve
as a sort of a filter.
And what I'd like to do is show you that world from a
non-engineer's perspective, from someone who just is
curious about the world and society and thinks deeply
about these issues.
Now there's a second disclosure I have to make, and
that is not only am I talking about big data, but my
presentation is big data.
Because there are 70 slides.
On top of it, I haven't actually really seen the
slides except for once or twice, because they just
arrived in my inbox this morning from someone who was
putting it together for me.
This is actually the recipe for disaster, so please have
forbearance.
I'm going to go really quickly, and I'm probably
going to skip through a couple of these slides.
So let me start with a story, and the story is the story of
a company called Farecast.
And it begins in the year 2003.
A guy named Oren Etzioni at the University of Washington
is on an airplane.
And he asks people how much they paid for the seats.
And it turns out, of course, one person paid one fare,
and one person paid another fare.
But this made Oren Etzioni really, really upset.
And the reason why is that he took the time to book his air
ticket long in advance, figuring he was going to pay
the least amount of money.
Because that's the way the system worked.
And then he realized actually that that wasn't the case.
When he figured that out, he was really upset.
And he figured, if only I could know the meaning
behind airfare madness.
How would I know if a price I'm being presented with at an
online travel site is a good one or a bad one?
And then he came up with the insight.
Because he's like you-- he's a computer scientist--
he realized actually--
that's actually just an information problem.
And I bet I can get the information.
All I would need is one simple thing--
the flight price record of every single flight in
commercial aviation in the United States for every single
route, every flight, and to identify every seat, and to
identify how long in advance the ticket was bought for the
departure, and what price was paid, and just run it through
a couple computers, and then make a prediction on whether
the price is likely to rise or fall, and score my degree of
confidence in the prediction.
Pretty simple.
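[EDITOR'S SKETCH: The prediction described above can be illustrated
with a small example. This is a minimal sketch, not the method from
Etzioni's paper or from Farecast: it swaps in a plain frequency count
over past flights. The names FareRecord and predict, the record
fields, and the sample route code are all hypothetical, chosen only
for illustration.]

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FareRecord:
    """One observed price for one flight (hypothetical record layout)."""
    route: str        # e.g. "SEA-BOS" (illustrative value)
    flight_id: str    # identifies one specific departure
    days_before: int  # how many days before departure the price was seen
    price: float


def predict(records, route, days_before):
    """Return ("rise" | "fall", confidence), or None if there is no history.

    "fall" means that, historically, waiting past `days_before` usually
    turned up a cheaper price; "rise" means it usually did not.
    Confidence is the share of past flights that agree with the call.
    """
    # Group prices per flight on this route: flight_id -> {days_before: price}
    by_flight = defaultdict(dict)
    for r in records:
        if r.route == route:
            by_flight[r.flight_id][r.days_before] = r.price

    rises = falls = 0
    for prices in by_flight.values():
        if days_before not in prices:
            continue  # no quote at this lead time for this flight
        later = [p for d, p in prices.items() if d < days_before]
        if not later:
            continue  # nothing observed closer to departure
        if min(later) < prices[days_before]:
            falls += 1  # waiting would have found a cheaper fare
        else:
            rises += 1  # no cheaper fare appeared afterwards

    total = rises + falls
    if total == 0:
        return None
    if falls > rises:
        return ("fall", falls / total)
    return ("rise", rises / total)
```

[With three past flights on a route, two of which never got cheaper
after the 30-day mark, predict(records, route, 30) would answer
"rise" with a confidence of two thirds. A real system would need far
more history and a far richer model than this count.]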
So he scraped some data.
And it works pretty well.
And he runs a system.
It's great.
The academic paper that he writes is called "Hamlet--
To Buy or Not To Buy, That Is the Question." It works well,
but then he realizes, hey, this works so well, I'm going
to get more data.
And he gets more data, until he has 20 billion flight-price
records that he's crunching to make his prediction.
And now it works really well.
Now it's saving customers a lot of money.
It gets a little bit of traction, and Microsoft comes
knocking on the door.
He's in Washington.
He sells it for about $100 million--
not bad for a couple years' work, and a couple of PhDs in
computer science who were working with him.
But behind this, the key thing is this.
He took data that was generated for one purpose and
reused it for another.
When the Sabre database--
at the time probably the airline reservation system and
one of the biggest, actually the biggest civilian computer
project at its time when it was created in
the '50s and '60s--
was created by American Airlines and IBM,
they never imagined in a million years that the data of
the passenger manifest was going to become the raw
material for a new business, and a new source of value, and
a new form of economic activity.
And we're going to be creating markets with this data.
And if you want to understand what big data is, at least
from a person looking into it--
because Google's been doing big data for a long time.
What we're seeing across society is what you folks have
been doing for years.
We're seeing that data is becoming a new
raw material of business.
It is the oil, if you will, of the information economy.
There's a lot of data around in the world today.
You know this.
The arresting statistics are obvious.
Whenever a big new sky survey--
a telescope, to you and me-- goes online,
it usually ends up collecting as
much data in the first night or two as in the history of
astronomy prior to it going online.
And obviously, the human genome, et cetera.
You all know the data about big data, so I won't spend too
much time there.
But what we see behind big data are three features of
society, or shifts in the way that we think about
information in the world--
more, messy, and correlations.
So more.
We're going from an environment where we've always
been information-starved--
we've never had enough information--
to one where that's no longer the operative
constraint.