Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér.
In an earlier episode, we showcased a technique for summarizing images not in a word, but
in an entire sentence that actually makes sense. If you were spellbound by those results,
you'll be out of your mind when you hear this one: let's turn it around, give the neural
network a sentence as an input, and ask it to generate images according to it. Not fetching
already existing images from somewhere, but generating entirely new images according to
these sentences. Is this for real?
This is an idea that is completely out of this world. A few years ago, if someone had proposed
such an idea and hoped that any useful result could come out of it, that person would have
immediately been transported to an asylum.
An important keyword here is "zero-shot" recognition. Before we get to the zero part, let's talk
about one-shot learning. One-shot learning refers to a class of techniques that can learn
something from one, or at most a handful of, examples. Deep neural networks typically need
to see hundreds of thousands of mugs before they can learn the concept of a mug. However,
if I show one mug to any of you Fellow Scholars, you will, of course, immediately get the concept
of a mug. At this point, it is amazing what these deep neural networks can do, but with
the current progress in this area, I am convinced that in a few years, feeding millions of examples
to a deep neural network to learn such a simple concept will be considered a crime.
Onto zero-shot recognition! The zero-shot part is pretty simple: it means zero training
samples. But this sounds preposterous! What it actually means is that we can train our
network to recognize birds, to know what tiny means, what the concept of blue is, and what
a crown is, and then ask it to show us an image of "a tiny bird with a blue crown".
Essentially, the neural network learns to combine these concepts and generate
new images by leaning on what it has already learned.
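To make this more concrete, here is a minimal sketch of what such a text-conditional image generator could look like, in the spirit of this work. This is an illustrative assumption, not the authors' exact architecture: the layer sizes, the 1024-dimensional sentence embedding, and all the names (TextToImageGenerator, text_dim, noise_dim) are made up for the example, and the sentence encoder is replaced by a random stand-in.

```python
import torch
import torch.nn as nn

class TextToImageGenerator(nn.Module):
    """Sketch of a text-conditional generator (illustrative, not the paper's exact net)."""

    def __init__(self, text_dim=1024, embed_dim=128, noise_dim=100):
        super().__init__()
        # Compress the sentence embedding before conditioning on it.
        self.embed = nn.Sequential(nn.Linear(text_dim, embed_dim), nn.LeakyReLU(0.2))
        # Upsample noise + compressed text embedding into a small 64x64 RGB image.
        self.net = nn.Sequential(
            nn.ConvTranspose2d(noise_dim + embed_dim, 256, 4, 1, 0),  # 1x1 -> 4x4
            nn.BatchNorm2d(256), nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1),                    # -> 8x8
            nn.BatchNorm2d(128), nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, 4, 2, 1),                     # -> 16x16
            nn.BatchNorm2d(64), nn.ReLU(True),
            nn.ConvTranspose2d(64, 32, 4, 2, 1),                      # -> 32x32
            nn.BatchNorm2d(32), nn.ReLU(True),
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh(),            # -> 64x64
        )

    def forward(self, noise, text_embedding):
        cond = self.embed(text_embedding)
        # Concatenate noise with the sentence condition, reshape to a 1x1 "image".
        z = torch.cat([noise, cond], dim=1).unsqueeze(-1).unsqueeze(-1)
        return self.net(z)

# Usage: one image for "a tiny bird with a blue crown", assuming some
# text encoder has already produced a 1024-dim sentence embedding.
g = TextToImageGenerator()
noise = torch.randn(1, 100)
sentence_embedding = torch.randn(1, 1024)  # stand-in for a real encoder
image = g(noise, sentence_embedding)       # shape: (1, 3, 64, 64)
```

The key design idea is the conditioning: because the noise vector changes while the sentence embedding stays fixed, the same sentence can yield many different plausible images of its concepts combined.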
I think this paper is a wonderful testament to why Two Minute Papers is such a staunch
advocate of deep learning, and why more people should know about these extraordinary works.
About the paper: it is really well written, and there are quite a few treats in there for
scientists, game theory and minimax optimization among other things. Cupcakes for my brain.
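For a taste of what that minimax refers to: a generative adversarial network pits a generator G against a discriminator D in a two-player game, where D tries to tell real images from generated ones and G tries to fool it. The standard objective from the original GAN formulation, which this line of work builds on, is

$$\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}\!\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z}\!\left[\log\big(1 - D(G(z))\big)\right],$$

and in the text-to-image setting, both networks additionally receive the sentence embedding as a condition.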
We will definitely talk about these topics in later Two Minute Papers episodes, so stay
tuned! But for now, you shouldn't just read the paper; you should devour it.
And before we go, let's address the elephant in the room: the output images are tiny because
this technique is very expensive to compute. Prediction: two papers down the line, it will
be done in a matter of seconds; two more papers down the line, it will do animations
in full HD. Until then, I'll sit here stunned by the results, and just frown and wonder.
Thanks for watching, and for your generous support, and I'll see you next time!