Hello, I'm Martin Kronberg and welcome to the IoT Developer Show.
In these episodes, we're taking a deep dive
into OpenVINO, Intel's new toolkit for AI
and computer vision development.
In the previous episode, we gave a high level
overview of OpenVINO.
And in this one, we will take a look
at OpenVINO in much more detail.
First things first, let's take a look
at the OpenVINO architecture.
OpenVINO is a set of tools designed
to help you build smart video applications.
These tools can be broken down into two parts, a deep learning
deployment toolkit and a traditional computer vision toolkit.
The deep learning toolkit consists
of the inference engine which runs the deep learning model,
a model optimizer used to convert and optimize
existing models from other frameworks,
and a set of prebuilt deep learning models.
Also included are libraries to optimize the running of models:
the Math Kernel Library for Deep Neural Networks
and the Compute Library for Deep Neural Networks,
which optimize deep neural networks on CPU and GPU, respectively.
On the other hand, the traditional computer vision
toolkit consists of OpenCV 3.3, which
is a popular library for computer vision,
the Intel Media SDK, used to leverage fast hardware
encode and decode of video, and OpenCL drivers and runtimes
in order to access the onboard Intel GPU effectively.
In order to give you guys a better understanding of how
all of these features work together,
I want to walk you through a sample deep neural network
computer vision workflow.
Let's say that I have a specific computer vision
application in mind
and that I want to use OpenVINO.
The first thing I can do is look online
to see what pretrained models exist for me to use.
If you can find a pretrained model that meets your needs,
it's going to save you a lot of time, versus having
to train one yourself.
I can go under software at Intel.com/OpenVINO and look
at all the models available.
We have models that detect people,
license plates, roadside objects,
even models that detect emotion on faces,
like we saw in the last episode.
If one of these does not fit my needs,
I can search for more models from any
of the popular frameworks including Caffe, TensorFlow,
or MXNet.
OpenVINO has a tool called the model downloader which
is a script that pulls all the necessary files for a model,
including topology, weights, and labels
and makes sure that their naming conventions are
compatible with the model optimizer.
Once I have that model downloaded,
I can use the model optimizer, which is a Python script,
to convert the model into the intermediate representation
format that the OpenVINO inference engine uses.
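As a rough sketch of the conversion step just described, the snippet below builds (but does not run) a model optimizer command line and lists the intermediate representation files one would expect back. The script location `mo.py`, the model file name, and the output folder are hypothetical placeholders. The `.xml` and `.bin` pair named after the input model is assumed to be the optimizer's default output naming.

```python
import shlex
from pathlib import PurePosixPath


def model_optimizer_command(mo_script, input_model, output_dir, data_type="FP32"):
    """Build an argv list for the model optimizer without executing it."""
    return [
        "python3", mo_script,
        "--input_model", input_model,   # trained model from the source framework
        "--output_dir", output_dir,     # where the converted files should land
        "--data_type", data_type,       # precision of the converted weights
    ]


def expected_ir_files(input_model, output_dir):
    """Return the assumed (.xml topology, .bin weights) pair for a model."""
    stem = PurePosixPath(input_model).stem
    out = PurePosixPath(output_dir)
    return str(out / (stem + ".xml")), str(out / (stem + ".bin"))


# Hypothetical paths, for illustration only.
cmd = model_optimizer_command("mo.py", "models/pedestrian.caffemodel", "ir")
print(" ".join(shlex.quote(part) for part in cmd))
print(expected_ir_files("models/pedestrian.caffemodel", "ir"))
```

Passing the list to `subprocess.run` would execute the conversion. Building the command as a list first keeps it easy to inspect and log.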
For this workflow example, let's say
that I'm building out a people tracker
and the pedestrian tracking model works for me.
So I'm going to use that.
Now that I have an idea of the model that I'm using,
how should I go about developing an OpenVINO app?
The first thing to think about is what IDE you will be using.
While there are many options, I would
suggest trying Intel System Studio 2019.
In this newly released version of our development platform,
integration with OpenVINO is super simple.
And if you have used Eclipse-based IDEs in the past,
it'll be very familiar to you.
In addition to the debugging capabilities
that you get with Intel System Studio,
you can also use it to leverage VTune, a powerful performance analyzer.
If you want to optimize your application,
VTune gives you a lot of insight into how
various process threads are performing on the CPU.
In fact, it can even tell you how
the various layers of your inference model are running.
So if you have a bottleneck happening
on one particular layer, a convolutional layer,
for example, you can work to reduce its complexity
or even send it to the GPU for processing to increase
overall performance.
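To make the bottleneck idea concrete, here is a minimal sketch that ranks layers by how much of the total inference time each one takes. The layer names and timings are made-up numbers, not output from VTune or from any real model. Only the ranking logic is the point.

```python
def rank_layers(layer_times_us):
    """Sort layers by time spent, slowest first, with each layer's share of the total."""
    total = sum(layer_times_us.values())
    if total == 0:
        return []
    ranked = sorted(layer_times_us.items(), key=lambda item: item[1], reverse=True)
    return [(name, time_us, time_us / total) for name, time_us in ranked]


# Hypothetical per-layer timings in microseconds.
timings = {"conv1": 1200, "conv2": 5400, "pool1": 300, "fc": 900, "detection_out": 600}

for name, time_us, share in rank_layers(timings):
    print(f"{name:>14} {time_us:6d} us {share:6.1%}")
```

The layer at the top of this list is the first candidate for simplification or for moving to another device.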
After you have your IDE set up, I would
say go on our GitHub to explore some of the reference
implementations there.
To get an idea of how to leverage this model,
I could take a look at the store traffic monitor reference
implementation for installation and deployment information.
Right now the sample is using OpenCV and FFmpeg
to do the video stream encoding and decoding.
However, I could use the Media SDK encode/decode functionality
to get a more optimized performance.
What decoding does is transform an MP4 or other video format
into pixel value arrays for each frame.
I'm going to be doing every operation on the image
on a frame by frame basis.
Now before I can run my inference model,
I want to do some image preprocessing,
let's say denoise, convert to grayscale, or resize.
I would use OpenCV to perform all those initial image
transformations on each of my frames.
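To show what two of these transformations do to the pixel values, here is a dependency-free sketch of grayscale conversion and nearest-neighbor resizing on a tiny frame stored as nested lists. It assumes pixels are in blue, green, red order, as OpenCV delivers them. In a real application `cv2.cvtColor` and `cv2.resize` would do this work far faster. The function names here are illustrative only.

```python
def to_grayscale(frame_bgr):
    """Convert rows of (blue, green, red) pixels to single luminance values."""
    return [
        [int(round(0.114 * b + 0.587 * g + 0.299 * r)) for (b, g, r) in row]
        for row in frame_bgr
    ]


def resize_nearest(image, new_height, new_width):
    """Resize a 2D image by picking the nearest source pixel for each output pixel."""
    old_height, old_width = len(image), len(image[0])
    return [
        [image[y * old_height // new_height][x * old_width // new_width]
         for x in range(new_width)]
        for y in range(new_height)
    ]


# A 2x2 test frame: white, black, pure red, pure blue (BGR order).
frame = [[(255, 255, 255), (0, 0, 0)],
         [(0, 0, 255), (255, 0, 0)]]

gray = to_grayscale(frame)
print(gray)
print(resize_nearest(gray, 4, 4))
```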
Next, I will put my processed frame
into the inference engine using the model I found earlier.
This will analyze the image and will
identify people in the frame as well as
the bounding boxes around them.
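As an illustration of turning that output into usable bounding boxes, the sketch below assumes each detection is a row of seven values: image id, label, confidence, then the box corners as fractions of the frame size. That layout is common for detection models but is an assumption here, so check the documentation of the model you actually use. The function name and the confidence threshold are also illustrative.

```python
def detections_to_boxes(detections, frame_width, frame_height, min_confidence=0.5):
    """Turn normalized detection rows into pixel-space boxes."""
    def to_pixel(value, size):
        # Scale a 0..1 coordinate to pixels and keep it inside the frame.
        return min(max(int(value * size), 0), size - 1)

    boxes = []
    for _image_id, label, confidence, x_min, y_min, x_max, y_max in detections:
        if confidence < min_confidence:
            continue  # drop weak detections
        boxes.append((
            int(label), confidence,
            to_pixel(x_min, frame_width), to_pixel(y_min, frame_height),
            to_pixel(x_max, frame_width), to_pixel(y_max, frame_height),
        ))
    return boxes


# Hypothetical raw output: one confident detection and one weak one.
raw = [(0, 1, 0.9, 0.25, 0.5, 0.75, 1.0),
       (0, 1, 0.2, 0.0, 0.0, 1.0, 1.0)]
print(detections_to_boxes(raw, frame_width=640, frame_height=480))
```

Each resulting tuple holds the label, the confidence, and the left, top, right, and bottom pixel coordinates, which is what a drawing call such as `cv2.rectangle` needs.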
Now, if I want to display those labels
and bounding boxes on my image, I
will use OpenCV again to draw that information
onto the frame.
And finally, I want to encode all of those frames
into an output video file or stream.
Once again, I can either use OpenCV and FFmpeg or Media SDK.
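The whole frame-by-frame flow described here can be sketched as a small loop with pluggable stages. This is only a skeleton under stated assumptions: the stage functions are stand-ins passed in by the caller, not OpenVINO, OpenCV, or Media SDK calls, and the demo uses strings in place of real frames.

```python
def run_pipeline(frames, preprocess, infer, draw, write):
    """Run decode output through preprocess, inference, drawing, and encoding.

    frames     -- iterable of decoded frames
    preprocess -- prepares one frame for the model (denoise, resize, ...)
    infer      -- returns detections for a prepared frame
    draw       -- overlays detections on the original frame
    write      -- hands the annotated frame to the encoder or output stream
    """
    count = 0
    for frame in frames:
        prepared = preprocess(frame)
        detections = infer(prepared)
        annotated = draw(frame, detections)
        write(annotated)
        count += 1
    return count


# Stand-in stages so the skeleton runs without any video or model.
output = []
processed = run_pipeline(
    frames=["frame0", "frame1"],
    preprocess=str.upper,
    infer=lambda prepared: ["person in " + prepared],
    draw=lambda frame, detections: (frame, detections),
    write=output.append,
)
print(processed, output)
```

Drawing on the original frame rather than the preprocessed one is a design choice in this sketch, so the output video keeps its full color and resolution. Keeping each stage as a separate function also makes it simple to swap the decode and encode steps between OpenCV with FFmpeg and the Media SDK.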
Now, let's take a look at the end result
of that whole pipeline.
Here we see a retail environment where
we can keep track of people as they enter and leave.
We can also track inventory by seeing when people pick bottles
off of a shelf.
And that's it for today's episode.
Tune in in two weeks, and we'll discuss the various hardware
kits that you can use to develop OpenVINO applications.
Thanks for watching.
Follow the links provided.
And see you guys next time.