>> [Narrator] Live from New York, it's The Cube
covering the IBM Machine Learning Launch Event
brought to you by IBM.
Here are your hosts, Dave Vellante and Stu Miniman.
>> Good morning everybody, welcome to the Waldorf Astoria.
Stu Miniman and I are here in New York City,
the Big Apple,
for IBM's Machine Learning Event #IBMML.
We're fresh off Spark Summit, Stu,
where we had The Cube, this by the way is The Cube,
the worldwide leader in live tech coverage.
We were at Spark Summit last week,
George Gilbert and I,
watching the evolution of so-called big data.
Let me frame, Stu, where we're at
and bring you into the conversation.
The early days of big data were all about
offloading the data warehouse and reducing the cost
of the data warehouse.
I often joke that the ROI of big data
is reduction on investment, right?
There's these big, expensive data warehouses.
It was quite successful in that regard.
What then happened is we started to throw
all this data into the data lake.
People would joke it became a data swamp,
and you had a lot of tooling
to try to clean that data,
and a lot of transforming and loading,
and the ETL vendors started to participate there
in a bigger way.
Then you saw the extension of these data pipelines
to try to do more with that data.
The Cloud guys have now entered in a big way.
We're now entering the Cognitive Era,
as IBM likes to refer to it.
Others talk about AI and machine learning
and deep learning,
and that's really the big topic here today.
What we can tell you is that the news goes out
at 9:00am this morning, and it was well known
that IBM is bringing machine learning
to its mainframe, the z mainframe.
Two years ago, Stu, IBM announced the z13,
which was really designed to bring
analytic and transaction processing together
on a single platform.
Clearly IBM is extending the useful life
of the mainframe by bringing things like Spark,
certainly what it did with Linux,
and now machine learning, to z.
I want to talk about Cloud, the importance of Cloud,
and how that has really taken over the world of big data.
Virtually every customer you talk to now
is doing work on the Cloud.
It's interesting to see now
IBM unlocking its transaction base,
its mission-critical data,
to this machine learning world.
What are you seeing around Cloud and big data?
>> We've been digging into this big data space
since before it was called big data.
One of the early things that really got me
interested and excited about it is that,
from the infrastructure standpoint,
storage has always been one of those costs
that we had to have,
and with the massive amounts of data,
the digital explosion we talked about,
keeping all that information,
or managing all that information,
was a huge challenge.
Big data was really that bit flip.
How do we take all that information
and make it an opportunity?
How do we get new revenue streams?
Dave, IBM has been at the center of this
and looking at the higher-level pieces
of not just storing data, but leveraging it.
Obviously huge in analytics, lots of focus
on everything from Hadoop and Spark and newer technologies,
but digging into how they can leverage up the stack,
which is where IBM has done a lot of acquisitions
in that space,
and they want to make sure they have a strong position
both in Cloud, which was renamed:
SoftLayer is now IBM Bluemix,
with a lot of services
including a machine learning service
that leverages the Watson technology,
and of course on-prem they've got the z
and the Power solutions
that you and I have covered for many years
at the IBM Edge show.
>> Machine learning obviously heavily leverages models.
In the early days of big data,
data scientists would build models,
and machine learning allows those models
to be perfected over time.
So there's this continuous process.
We're familiar with the world of batch,
and then the minicomputer brought in
the world of interactive,
so we're familiar with those types of workloads.
Now we're talking about a new emergent workload,
which is continuous:
continuous apps where you're streaming data in,
which is what Spark is all about.
The models that data scientists are building
can constantly be improved.
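To make that continuous pattern concrete, here is a minimal sketch, assuming PySpark Structured Streaming; the paths, schema, and column names are hypothetical, and this is only the general shape of scoring an incoming stream with a model that an offline job retrains and republishes, not IBM's implementation.

# Minimal sketch (illustrative, not IBM's z implementation): continuously
# score arriving transaction records with a Spark ML pipeline that an
# offline job retrains and saves to a known path.
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, DoubleType, StringType
from pyspark.ml import PipelineModel

spark = SparkSession.builder.appName("continuous-scoring").getOrCreate()

# Hypothetical schema for incoming transaction records.
schema = StructType([
    StructField("account_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("merchant_type", StringType()),
])

# New transaction files land here continuously (e.g. written by an ingest job).
stream = (spark.readStream
          .schema(schema)
          .json("/data/incoming/transactions"))

# Load the most recently published pipeline (feature prep + model in one object).
model = PipelineModel.load("/models/fraud/latest")

# Each micro-batch is scored as it arrives.
scored = model.transform(stream)

(scored.select("account_id", "prediction")
 .writeStream
 .outputMode("append")
 .format("console")        # in practice, a Kafka topic or database sink
 .start()
 .awaitTermination())

Improving the model then comes down to the offline retraining job overwriting the published path with a better version, which is the continuous loop being described here.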
The key is automation, right?
Being able to automate that whole process,
and being able to collaborate
between the data scientists, the data quality engineers,
even the application developers,
is something that IBM really tried to address
in its last big announcement in this area,
which was in October of last year:
the Watson Data Platform,
what they called DataWorks at the time.
So really trying to bring together
those different personas
in a way that they can collaborate together
and improve models on a continuous basis.
The use cases that you often hear in big data
and certainly initially in machine learning
are things like fraud detection.
Obviously ad serving has been a big data application
for quite some time.
In financial services, identifying good targets,
identifying risk.
What I'm seeing, Stu, is that the phase that we're in now
of this so-called big data and analytics world,
and now bringing in machine learning and deep learning,
is to really improve on some of those use cases.
For example, fraud's gotten much, much better.
Ten years ago, let's say, it took many, many months,
if you ever detected fraud.
Now you get it in seconds, or sometimes minutes,
but you also get a lot of false positives.
Oops, sorry, the transaction didn't go through.
Did you do this transaction?
Yes, I did.
Oh, sorry, you're going to have to redo it
because it didn't go through.
It's very frustrating for a lot of users.
That will get better and better and better.
We've all experienced retargeting from ads,
and we know how crappy they are.
That will continue to get better.
The big question that people have,
and it goes back to Jeff Hammerbacher's line
that the best minds of his generation
are spending their time getting people to click on ads, is this:
when will we see big data really start
to affect our lives in different ways,
like patient outcomes?
We're going to hear some of that today
from folks in health care and pharma.
Again, these are the things that people are waiting for.
The other piece is, of course, IoT.
What are you seeing, in terms of IoT,
in the whole data flow?
>> Yes, a big question we have, Dave, is:
where's the data?
And therefore, where does it make sense
to do that processing?
In big data we talked about how you've got
massive amounts of data,
so can we move the processing to that data?
With IoT, we've heard that there's going to be
massive amounts of data at the edge,
and I don't have the time or the bandwidth
or the need necessarily
to pull that back to some kind of central repository.
I want to be able to work on it there.
Therefore there's going to be a lot of data
worked at the edge.
Peter Levine did a whole video talking about how,
"Oh, Public Cloud is dead, it's all going to the edge."
That's a little bit hyperbolic as a statement;
we understand that there are plenty of use cases
for both the Public Cloud and for the edge.
In fact we see Google pushing machine learning in a big way
with TensorFlow,
one of those machine learning frameworks out there
that we expect a lot of people to be working on.
Amazon is putting effort into the MXNet framework,
which is once again an open-source effort.
One of the things I'm looking at in this space,
and where I think IBM can provide some leadership,
is which frameworks are going to become popular
across multiple scenarios.
How many winners can there be for these frameworks?
We already have multiple programming languages,
multiple Clouds.
How much of it is just API compatibility?
How much work is there,
and where are the repositories of data going to be,
and where does it make sense to do that
predictive analytics, that advanced processing?
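For a sense of what these frameworks standardize, here is a minimal sketch using TensorFlow's Keras API on toy data; the model and the data are purely illustrative, and MXNet and the other frameworks mentioned above expose broadly similar high-level abstractions.

# Illustrative only: a tiny model defined and trained behind a framework's
# high-level API. The data is synthetic and stands in for real features.
import numpy as np
import tensorflow as tf

x = np.random.rand(1000, 20).astype("float32")
y = (x.sum(axis=1) > 10).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x, y, epochs=3, batch_size=32)

The framework question raised above is essentially how many of these APIs developers will have to target, and how portable trained models will be between them.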
>> You bring up a good point.
Last year, last October, at Big Data NYC,
we had a special segment
with a data scientist panel.
It was great.
We had some rockstar data scientists on there
like Dez Blanchfield and Joe Caserta,
and a number of others.
They echoed what you always hear
when you talk to data scientists.
"We spend 80% of our time messing with the data,
"trying to clean the data, figuring out the data quality,
"and precious little time on the models
"and proving the models
"and actually getting outcomes from those models."
So things like Spark have simplified that whole process
and unified a lot of the tooling
around so-called big data.
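As a rough illustration of that unification, here is a minimal PySpark sketch; the file paths and column names are hypothetical, and it only shows the general idea that data cleaning, feature preparation, and model training can live in one pipeline on one engine.

# Illustrative only: cleaning, feature prep, and training in a single Spark job.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.ml import Pipeline
from pyspark.ml.feature import StringIndexer, VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("unified-prep-and-train").getOrCreate()

raw = spark.read.csv("/data/transactions.csv", header=True, inferSchema=True)

# Data cleaning: drop duplicates, fill missing amounts, filter bad records.
clean = (raw.dropDuplicates()
            .fillna({"amount": 0.0})
            .filter(F.col("amount") >= 0))

# Feature prep and modeling expressed in the same API, in the same job.
pipeline = Pipeline(stages=[
    StringIndexer(inputCol="merchant_type", outputCol="merchant_idx"),
    VectorAssembler(inputCols=["amount", "merchant_idx"], outputCol="features"),
    LogisticRegression(featuresCol="features", labelCol="is_fraud"),
])

model = pipeline.fit(clean)
model.write().overwrite().save("/models/fraud/latest")

The saved pipeline is the same kind of artifact a continuous scoring job, like the earlier streaming sketch, could reload.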
We're seeing Spark adoption increase.
George Gilbert, in part one and part two
of the big data forecast from Wikibon last week,
showed that we're still not on the steep part
of the S-curve in terms of Spark adoption.
Generically, we're talking about streaming as well,
which is included in that forecast,
and it forecasts that increasingly those applications
are going to become more and more important.
It brings you back to what IBM's trying to do:
bring machine learning
to this critical transaction data.
Again, to me, it's an extension of the vision
that they put forth two years ago,
bringing analytic and transaction data together
and actually processing it within that Private Cloud complex,
which is essentially what this mainframe is,
the original Private Cloud, right?
You were saying off-camera,
it's the original converged infrastructure.
It's the original Private Cloud.
>> The mainframe's still here, with lots of Linux on it.
We've covered it for many years:
you want your cool Linux, Docker, containerized,
machine learning stuff,
you can do that on the z series.
>> You want Python and Spark and R and Java,
and all the popular programming languages.
It makes sense.
It's not a huge growth platform;
it's kind of flat, down, up with the product cycle,
but it's alive and well,
and a lot of companies obviously run their businesses
on the z.
We're going to be unpacking that all day.
Some of the questions we have are:
what about Cloud?
Where does it fit?
What about Hybrid Cloud?
What are the specifics of this announcement?
Where does it fit?
Will it be extended?
Where does it come from?
How does it relate to other products
within the IBM portfolio?
And very importantly, how are customers
going to be applying these capabilities
to create business value?
That's something that we'll be looking at
with a number of the folks on today.
>> Dave, another thing: it reminds me of two years ago
when you and I did an event with the MIT Sloan School
on The Second Machine Age
with Andy McAfee and Erik Brynjolfsson,
talking about, as machines can help
with some of these analytics
and some of this advanced technology,
what happens to the people?
Talk about health care,
it's doctors plus machines most of the time.
As these two professors say,
it's racing with the machines.
What is the impact on people?
What's the impact on jobs?
And on productivity going forward?
It's a really interesting, hot space.
They talk about everything from autonomous vehicles,
advanced health care and the like.
This is right at the core of where the next generation
of the economy and jobs are going to go.
>> It's a great point,
and no doubt that's going to come up today
and some of our segments will explore that.
Keep it right there, everybody.
We'll be here all day covering this announcement,
talking to practitioners,
talking to IBM executives and thought leaders
and sharing some of the major trends
that are going on in machine learning,
the specifics of this announcement.
Keep it right there, everybody.
This is The Cube.
We're live from the Waldorf Astoria.
We'll be right back.