>> WELL, THE CHAT IS VERY SMALL,
I WANTED TO SEE YOU AND HEAR YOU, HALF THE TIME, I SET UP MY
MIC INCORRECTLY. WELCOME TO A SPECIAL FRIDAY TRAINING EPISODE,
I'M DAN SHIFFMAN, THERE'S A LOT OF THINGS THAT ARE EXCITING
ABOUT THIS EPISODE. SO SOMEBODY SAID HI, SO I'M
GOING TO TAKE THAT AS THINGS ARE WORKING. AND SO TODAY IS A
SPECIAL EPISODE. NUMBER ONE, WE ARE DOING A MACHINE LEARNING
PROJECT FROM START TO FINISH, TRAINING A MODEL ENTIRELY IN THE
CLOUD, GETTING THAT TRAINING MODEL BACK, AND THEN
IMPLEMENTING THAT MODEL IN THE BROWSER USING JAVASCRIPT. SO
ALL THOSE PIECES, THAT IS GOING TO HAPPEN, AND THE WHOLE THING
IS GOING TO TAKE AN HOUR AND A HALF. TO PRESENT ALL OF THIS TO
YOU, WE HAVE A GUEST. YINING SHI, YOU MIGHT REMEMBER HER FROM
THE CODING TRAIN TUTORIAL THAT SHE MADE, WHICH I WILL LINK TO. SHE
IS AN ARTIST AND RESEARCHER AND A CORE CONTRIBUTOR TO THE MACHINE
LEARNING 5 LIBRARY, THE ML5.JS LIBRARY. AS PART OF THAT,
SHE WROTE THE STYLE TRANSFER MODULE OF ML5.JS, AND THAT IS
WHAT SHE IS GOING TO PRESENT. SO YINING WILL BE HERE
IN A MINUTE, AFTER MY LONG INTRODUCTION. AND THIS VIDEO IS
SPONSORED BY SPELL. SPELL IS A CLOUD COMPUTING SERVICE FOR MACHINE
LEARNING. I DID AN INTRODUCTION TO SPELL: HOW TO
SET IT UP, WHAT IT DOES, WHAT ARE THE BASIC COMMANDS. IF YOU
ARE WATCHING THIS AS AN ARCHIVE, YOU MIGHT WANT TO WATCH IT FIRST
AND RETURN. IF YOU ARE WATCHING THIS LIVE, HAVE NOT SEEN THAT,
WE WILL HELP YOU GET SET UP WITH THAT. IF YOU WANT TO SIGN UP
FOR AN ACCOUNT AND FOLLOW ALONG, YOU CAN GET $100 IN FREEBIES.
YOU CAN GO TO SPELL.RUN/CODINGTRAIN.
OKAY, AND ALSO, THANK YOU TO
SPELL. SO WE HAVE CLOSED CAPTIONING: FOR THE FIRST TIME, I'M USING
REAL-TIME, HUMAN-WRITTEN CAPTIONS GENERATED BY WHITE
COAT CAPTIONING. YOUTUBE HAS AUTO CAPTIONS, BUT THIS IS TYPED
BY A PROFESSIONAL CAPTIONER IN REAL TIME AS I'M SPEAKING, I
THINK. THIS REMINDS ME OF THE ELEPHANT AND PIGGIE BOOK WHERE
YOU ARE IN A BOOK, AND THE CHARACTERS CAN MAKE THE READER SAY
WHATEVER THEY WANT. I CAN MAKE THE CAPTIONER TYPE BLUEBERRY, MANGO,
WATERMELON, THOSE WORDS SHOULD BE APPEARING. SO THANK YOU TO
SPELL.RUN FOR THE SPONSORSHIP, THANK YOU TO YINING FOR BEING
HERE, AND THANK YOU TO WHITE COAT CAPTIONING FOR THE
CAPTIONING SERVICES AND TO SPELL FOR PROVIDING THE FUNDS FOR
THOSE. AND I WILL BE TO THE SIDE WATCHING THE YOUTUBE
CHAT; I WILL TRY TO ANSWER QUESTIONS, AND MOSTLY WE WILL SAVE
QUESTIONS UNTIL THE END. IF THERE IS AN IMPORTANT KEY
QUESTION, I MIGHT INTERRUPT AND ASK THAT. AND ONE OTHER THING:
YINING WILL TELL YOU ABOUT THIS, BUT I CANNOT RESIST. TO TRAIN A
STYLE TRANSFER MODEL ON THE CLOUD, EVEN WITH A GPU, IT TAKES A LONG
TIME. SO WE ARE USING A COOKING SHOW MECHANIC: WE'RE GOING TO
START THE TRAINING PROCESS AND THEN HAVE A PRE-TRAINED MODEL
ALREADY IN THE OVEN, FULLY BAKED, TO SHOW YOU HOW IT WORKS. IF YOU
WATCH THIS TUTORIAL, YOU WILL BE ABLE TO TRAIN YOUR OWN STYLE
TRANSFER MACHINE LEARNING MODEL USING SPELL.RUN, AND IMPLEMENT
THAT MODEL IN THE
BROWSER. OKAY, SO I'M LOOKING IN THE
CHAT. THAT IS ALL OF MY INTRODUCTORY STUFF, YES. SO I
AM JUST GOING TO TRANSFER IT OVER TO YINING, I'M GOING TO
MUTE MY MICROPHONE, I WILL UNMUTE IT ONCE IN A WHILE ONCE I
HAVE SOMETHING IMPORTANT TO SAY. AND WE WILL GET STARTED, OKAY?
>> THANK YOU SO MUCH. HI, I'M YINING, AND I'M EXCITED TO
BE HERE TODAY TO TALK ABOUT STYLE TRANSFER.
AND I WANT TO THANK EVERYONE FOR WATCHING THIS VIDEO.
I HOPE YOU ENJOY THIS VIDEO.
LET'S GET STARTED! TODAY, WE ARE GOING TO TALK
ABOUT STYLE TRANSFER. WE ARE GOING TO DO FOUR THINGS
TODAY. WE WILL TALK ABOUT WHAT STYLE TRANSFER IS, HOW IT WORKS,
AND WE ARE GOING TO USE A PLATFORM CALLED SPELL TO TRAIN A NEW
STYLE TRANSFER MODEL, AND PORT THE MODEL INTO ML5.JS TO CREATE
AN INTERACTIVE DEMO. SPELL AND ML5.JS ARE BOTH TOOLS
THAT MAKE ML MORE APPROACHABLE FOR A BROAD AUDIENCE.
FOR OUR PROJECT TODAY, ML5JS ALLOWS US TO RUN OUR MODEL IN
THE BROWSER. BY THE WAY, ML5.JS IS A
JAVASCRIPT LIBRARY BASED ON TENSORFLOW.JS, SO THE MODEL THAT WE
TRAIN TODAY WOULD ALSO WORK IN TENSORFLOW.JS. AND SPELL PROVIDES
COMPUTING POWER FOR US TO TRAIN A MODEL FASTER. IF I TRAINED THE
MODEL ON MY OWN LAPTOP, IT MIGHT TAKE SEVERAL DAYS, BUT WITH A
REMOTE GPU PROVIDED BY SPELL, IT WILL ONLY TAKE A FEW HOURS.
LET ME SHOW YOU WHAT WE ARE
GOING TO BUILD AT THE END OF THIS VIDEO.
THIS IS A DEMO: HTTPS://YINING1023.GITHUB.IO/
STYLETRANSFER_SPELL/. THIS DEMO READS THE IMAGES IT
GETS FROM OUR WEBCAM, AND TRANSFERS THE STYLE OF THIS ARTWORK
ONTO THE WEBCAM IMAGE. THE STYLE IMAGE IS AN ANCIENT
CHINESE PAINTING CALLED FUCHUN SHANJU TU.
THE STYLE IMAGE DOESN'T HAVE TOO MANY COLORS, BUT IF YOU TRAIN
THE MODEL ON A STYLE IMAGE WITH A MORE OBVIOUS STYLE, YOU WILL
GET A MORE OBVIOUS RESULT. THIS IS THE DEMO THAT WE ARE
GOING TO BUILD
TODAY. BEFORE WE BUILD ANYTHING,
WHAT IS STYLE TRANSFER? STYLE TRANSFER IS THE TECHNIQUE
OF RECASTING THE CONTENT OF ONE IMAGE IN THE STYLE OF ANOTHER
IMAGE.
FOR EXAMPLE, HERE IS A
PHOTOGRAPH, THIS TECHNIQUE CAN EXTRACT THE CONTENT OF THE
PHOTO, AND THE STYLE OF THIS ART
WORK, AND COMBINE THE TWO TO CREATE A NEW IMAGE.
HERE ARE MORE EXAMPLES.
SO, HOW DOES IT WORK? STYLE TRANSFER WAS FIRST
INTRODUCED IN THE PAPER A NEURAL ALGORITHM OF ARTISTIC STYLE IN
2015 BY GATYS.
IN THE PAPER, THEY PROPOSED A SYSTEM THAT USES CONVOLUTIONAL
NEURAL NETWORKS TO SEPARATE AND
RECOMBINE CONTENT AND STYLE OF ARBITRARY IMAGES.
BY THE WAY, A CONVOLUTIONAL NEURAL NETWORK IS A DEEP NEURAL NETWORK USED TO
ANALYZE IMAGES. THE IDEA IS THAT IF WE TAKE A
CONVOLUTIONAL NEURAL NETWORK THAT IS TRAINED TO RECOGNIZE
OBJECTS WITHIN IMAGES THEN THAT
NETWORK HAS DEVELOPED SOME
INTERNAL REPRESENTATIONS OF THE CONTENT AND STYLE OF AN IMAGE.
MORE IMPORTANTLY, THE PAPER FINDS THAT THE CONTENT AND STYLE
REPRESENTATIONS OF AN IMAGE ARE SEPARABLE, WHICH MEANS WE CAN TAKE
THE CONTENT REPRESENTATION FROM ONE IMAGE AND THE STYLE
REPRESENTATION FROM ANOTHER TO GENERATE A BRAND-NEW IMAGE.
THE CNN THAT GATYS USED IS CALLED VGG. IT'S A NETWORK
CREATED BY THE VISUAL GEOMETRY GROUP AT OXFORD UNIVERSITY.
THIS CNN IS THE WINNER OF IMAGENET, AN OBJECT RECOGNITION
CHALLENGE IN 2014.
WE WILL SEE THE NAME VGG AGAIN WHEN WE TRAIN THE MODEL.
THAT'S BECAUSE WE ARE TRYING TO
GET REPRESENTATIONS OF AN IMAGE
FROM THIS VGG CONVOLUTIONAL
NEURAL NETWORK. NEXT, CONVOLUTIONAL NEURAL
NETWORKS WORK LIKE A STACK OF FILTERS, AND EACH LAYER HAS A
DIFFERENT REPRESENTATION OF AN IMAGE. AN INPUT IMAGE CAN BE
REPRESENTED AS A SET OF FILTERED IMAGES AT EACH LAYER IN THE CNN.
WE CAN VISUALIZE THE INFORMATION AT DIFFERENT LAYERS IN THE CNN
BY RECREATING THE INPUT IMAGE FROM ONE OF THE FILTERED IMAGES.
WE CAN SEE THAT IMAGES A, B, C, D, AND E ARE THE RECREATED IMAGES.
THEY ARE ALMOST
PERFECT. AS THE LEVEL GETS HIGHER AND
HIGHER, ALL OF THOSE DETAILED PIXEL INFORMATION IS LOST, BUT
THE HIGH LEVEL CONTENT OF THIS IMAGE IS STILL HERE.
FOR EXAMPLE, FOR THIS IMAGE E HERE, EVEN THOUGH WE CANNOT SEE
IT CLEARLY, WE CAN STILL TELL THERE'S A HOUSE IN THIS IMAGE.
SO THIS IS WHAT CONTENT REPRESENTATION LOOKS LIKE IN
THIS NETWORK. NEXT, WE WILL TALK ABOUT STYLE
REPRESENTATION.
ON TOP OF THE ORIGINAL CNN REPRESENTATIONS, THEY BUILT A NEW FEATURE SPACE THAT CAPTURES THE
STYLE OF AN INPUT IMAGE. THE STYLE REPRESENTATION
COMPUTES CORRELATIONS BETWEEN THE DIFFERENT FEATURES IN
DIFFERENT LAYERS OF THE CNN.
FOR DETAILED IMPLEMENTATION, WE CAN CHECK THE PAPER.
BUT AS THE LEVEL GETS HIGHER AND HIGHER, WE FIND THAT THE
RECREATED STYLE OF THE INPUT IMAGE MATCHES THIS ARTWORK BETTER
AND BETTER, BUT THE INFORMATION ABOUT THE GLOBAL ARRANGEMENT OF
THE SCENE IS LOST.
FOR EXAMPLE, FOR IMAGE D AND E, THE STYLE IS VERY CLEAR TO US
NOW. BUT WE CANNOT SEE IF THERE'S A HOUSE ON THIS PHOTO
ANYMORE, BECAUSE THE CONTENT REPRESENTATION IS
LOST. AND THEN, AFTER WE HAVE THE CONTENT REPRESENTATION OF THE
PHOTO AND THE STYLE REPRESENTATION OF THIS ARTWORK, WE'RE GOING
TO SYNTHESIZE A NEW IMAGE THAT MATCHES THOSE TWO AT THE SAME
TIME. THIS IS HOW STYLE TRANSFER WORKS.
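ROUGHLY, WHAT THE 2015 PAPER OPTIMIZES CAN BE WRITTEN AS A WEIGHTED SUM OF A CONTENT LOSS AND A STYLE LOSS. THE NOTATION BELOW FOLLOWS GATYS ET AL.; THE EXACT LAYERS AND WEIGHTS ARE DETAILS YOU WOULD CHECK IN THE PAPER:

```latex
% Total loss: a weighted sum of content and style terms (Gatys et al., 2015)
\mathcal{L}_{total}(\vec{p},\vec{a},\vec{x})
  = \alpha\,\mathcal{L}_{content}(\vec{p},\vec{x})
  + \beta\,\mathcal{L}_{style}(\vec{a},\vec{x})

% Content loss: squared difference between feature maps F (generated image x)
% and P (photo p) at one layer l of the VGG network
\mathcal{L}_{content}(\vec{p},\vec{x})
  = \tfrac{1}{2}\sum_{i,j}\bigl(F^{l}_{ij}-P^{l}_{ij}\bigr)^{2}

% Style loss: differences between Gram matrices G (from x) and A (from the artwork a),
% summed over several layers with weights w_l
G^{l}_{ij} = \sum_{k} F^{l}_{ik}F^{l}_{jk},\qquad
\mathcal{L}_{style}(\vec{a},\vec{x})
  = \sum_{l}\frac{w_{l}}{4N_{l}^{2}M_{l}^{2}}\sum_{i,j}\bigl(G^{l}_{ij}-A^{l}_{ij}\bigr)^{2}
```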
AND GENE KOGAN, THE CREATOR OF MACHINE LEARNING FOR ARTISTS, MADE
AN AMAZING DEMO VIDEO THAT TALKS ABOUT WHAT A CONVOLUTIONAL
NEURAL NETWORK IS AND WHAT IT SEES AT EACH LAYER, SO AFTER
WATCHING HIS VIDEO YOU WILL HAVE A BETTER UNDERSTANDING OF HOW
THIS CONVOLUTIONAL NEURAL NETWORK SEES IMAGES, HOW IT FILTERS THE
IMAGE, AND HOW IT GETS THE REPRESENTATIONS OUT OF AN IMAGE.
I HIGHLY RECOMMEND THAT YOU WATCH HIS VIDEO. AND GATYS'S PAPER OPENED UP A
NEW AREA OF RESEARCH, AND DIFFERENT KINDS OF STYLE TRANSFER HAVE
APPEARED IN THE LAST THREE YEARS.
WE ARE GOING TO TAKE A LOOK AT A FEW OF THEM HERE.
AND THEN WE ARE GOING
TO DIVE INTO TRAINING YOUR STYLE TRANSFER MODEL WITH
SPELL. IN 2016, THIS
PAPER CAME OUT. IT IS CALLED FAST STYLE
TRANSFER, AND IT SHOWS THAT A NEURAL NETWORK CAN APPLY A FIXED STYLE
TO ANY INPUT IMAGE IN
REALTIME. IT BUILDS ON THE
GATYS STYLE TRANSFER MODEL, BUT IT IS A LOT FASTER. THIS FAST
STYLE
TRANSFER HAS AN IMAGE TRANSFORMATION NETWORK AND A
LOSS CALCULATION NETWORK. TO TRAIN THIS NETWORK, WE NEED TO
PICK A FIXED STYLE IMAGE AND USE A LARGE BATCH OF DIFFERENT
CONTENT IMAGES AS TRAINING EXAMPLES. SO, IN THEIR PAPER,
THEY TRAINED THEIR NETWORK ON THE MICROSOFT COCO DATASET, WHICH IS
AN OBJECT RECOGNITION DATASET OF 18,000
IMAGES. TODAY, WE WILL USE A TENSORFLOW IMPLEMENTATION OF
THIS STYLE TRANSFER, SO WE ARE ALSO GOING TO USE THIS COCO
DATASET. WE ARE GOING TO DOWNLOAD THIS DATASET LATER.
AND HERE IS AN IMAGE FROM
THEIR PAPER. THIS IS THE ORIGINAL PHOTO, THIS IS THE GATYS RESULT,
AND THIS IS THE FAST STYLE TRANSFER RESULT, WHICH
WORKS A LOT FASTER. AND THE NEXT STYLE TRANSFER IS
FOR VIDEOS. THIS MODEL CAME OUT IN 2016,
TOO. WE MAY THINK THAT, SINCE WE KNOW HOW TO TRANSFER IMAGES, FOR
VIDEOS WE CAN JUST TRANSFER EACH FRAME OF THE VIDEO ONE BY ONE
AND THEN STITCH THOSE IMAGES TOGETHER TO MAKE A TRANSFERRED
VIDEO. BUT IF WE DO THAT, THE RESULT IS NOT GOOD: THE VIDEO WILL
FLICKER A LOT, BECAUSE THE MACHINE DOESN'T KNOW ANY INFORMATION
ABOUT THE PREVIOUS IMAGE. THE PAPER IMPROVED
FRAME-TO-FRAME STABILITY BY ADDING AN OPTICAL-FLOW ALGORITHM
THAT TELLS THE MACHINE THE
POSSIBLE MOTIONS FROM FRAME TO FRAME.
THIS IS ALSO CALLED TEMPORAL COHERENCE, SO THE TRANSFERRED
VIDEO WON'T FLICKER TOO MUCH.
SO WE CAN SEE SOME RESULTS HERE. THIS VIDEO IS NOT FLICKERING AT
ALL. AND THEY GOT AMAZING RESULTS FROM THEIR
MODEL. THIS IS THE TRANSFERRED VIDEO,
THE RESULT LOOKS
GREAT.
LET'S GO TO THE NEXT
MODEL. THIS IS A VERY COOL MODEL THAT APPEARED IN 2017, CALLED
DEEP PHOTO STYLE TRANSFER. THE STYLE TRANSFERS WE HAVE SEEN SO
FAR WORK REALLY WELL IF WE ARE LOOKING FOR SOME ARTISTIC,
PAINTING-LIKE RESULTS, BUT THEY ADD SOME
DISTORTION TO THE INPUT IMAGE. THEY DON'T LOOK REALISTIC.
BUT THIS DEEP PHOTO TRANSFER CAN
PRODUCE REALISTIC PHOTOS.
THIS IS THE INPUT IMAGE ON THE LEFT, IN THE MIDDLE IS THE
STYLE IMAGE, AND ON THE RIGHT IS THE OUTPUT IMAGE.
THE OUTPUT IMAGE LOOKS LIKE A REGULAR PHOTO TO ME, SO THE
RESULT IS REALLY GOOD. THEY USED AN AFFINE TRANSFORMATION
TO MAKE SURE THAT THE SHAPES ARE NOT DISTORTED DURING
TRANSFORMATION. THE RESULT IS AMAZING.
THIS IS THE NEXT STYLE TRANSFER. THIS IS SEMANTIC STYLE TRANSFER:
IT CAN PRODUCE SEMANTICALLY MEANINGFUL RESULTS, THE MACHINE
HAS AN UNDERSTANDING OF THE OBJECTS ON THE IMAGE.
IN THIS EXAMPLE, THE MACHINE RECOGNIZES THAT BOTH IMAGES HAVE
A NOSE, SO IT USES THIS INFORMATION IN THE
TRANSFORMATION PROCESS.
THERE ARE A LOT OF APPLICATIONS
OF THIS MODEL, FOR EXAMPLE, YOU CAN USE IT TO CONVERT A SKETCH
OR A PAINTING TO A PHOTO.
I THINK THE OUTPUT IS PRETTY GOOD.
THIS IS SEMANTIC STYLE TRANSFER. THE LAST STYLE TRANSFER IS VERY
SPECIAL. IT'S UNIVERSAL NEURAL STYLE
TRANSFER. FOR ALMOST ALL PREVIOUS STYLE TRANSFERS, THERE ARE SOME
ABSTRACT STYLE IMAGES THAT DON'T WORK WELL.
IF THE STYLE IMAGE IS VERY DIFFERENT FROM THE TRAINING IMAGES,
THE RESULTS WON'T BE VERY GOOD. FOR EXAMPLE, IF IT IS A BLACK
LINE DRAWING ON A WHITE BACKGROUND, YOU CANNOT GET A LOT OF STYLE
INFORMATION FROM THE LINES, BECAUSE THE NETWORK WAS TRAINED ON A
LOT OF OBJECT PHOTOS. BUT THIS MODEL CAN SOLVE THIS PROBLEM.
THIS NEW MODEL IS ALSO BASED ON A NEURAL NETWORK, BUT IT DOESN'T
NEED TO BE TRAINED ON SPECIFIC STYLE IMAGES; IT WORKS ON ANY
ARBITRARY STYLE.
IT USES AN AUTOENCODER, SO IT HAS AN ENCODE AND A DECODE PROCESS.
WE PUT THE INPUT IMAGE IN, WE ENCODE IT, AND AFTER WE DECODE IT,
IT CAN GIVE US BACK THE IMAGE. IT USES THE ENCODER ON BOTH THE
INPUT IMAGE AND THE STYLE IMAGE, THEN USES THE DECODER TO DECODE
THE COMPRESSED VERSIONS OF BOTH THE INPUT AND THE STYLE IMAGE.
IN THE END, YOU CAN GET THIS
RESULT. THIS IS TRULY AMAZING, I THINK
IN THE FUTURE, WE CAN PORT IT TO ML5JS AND PLAY WITH IT.
HERE ARE THE STYLE TRANSFER MODELS THAT WE TALKED ABOUT.
TODAY, WE ARE USING THE
TENSORFLOW IMPLEMENTATION THAT IS A COMBINATION OF GATYS' STYLE
TRANSFER, FAST-STYLE-TRANSFER, AND ULYANOV'S INSTANCE
NORMALIZATION. THIS TENSORFLOW IMPLEMENTATION
OF FAST-STYLE-TRANSFER IS MADE BY LOGAN ENGSTROM.
IF WE USE THIS CODE, WE SHOULD MAKE SURE TO GIVE CREDIT TO HIM.
NOW, FINALLY, WE ARE GOING TO USE SPELL TO TRAIN OUR OWN STYLE
TRANSFER MODEL.
THERE ARE FOUR STEPS THAT WE NEED TO DO. FIRST, PREPARING THE
ENVIRONMENT. SECOND, DOWNLOADING THE DATASETS: BECAUSE WE USE THE
VGG MODEL AND THE COCO DATASET, WHICH ARE LARGE, IT MIGHT TAKE AN
HOUR TO FINISH THIS ONE. THIRD, WE'RE GOING TO RUN THE STYLE
PYTHON SCRIPT TO TRAIN THE MODEL; I THINK IT WILL TAKE ABOUT TWO
HOURS AND SIX MINUTES. AND THEN, IN THE END, WE'RE GOING TO
CONVERT THIS TENSORFLOW SAVED MODEL INTO A FORMAT THAT WE CAN
USE IN TENSORFLOW.JS AND ML5.JS. AND THERE ARE DETAILED
INSTRUCTIONS HERE. IF YOU ARE CURIOUS, YOU CAN READ THE README THERE.
>> THERE WE
GO. >> FOR STEPS 1-3, YOU CAN CHECK OUT THE TUTORIAL, AND YOU CAN
FIND STEP-BY-STEP INSTRUCTIONS HERE: I'M GOING TO SWITCH TO THIS
PAGE, AND YOU CAN FOLLOW THE INSTRUCTIONS HERE. I'M GOING TO TALK
ABOUT THAT LATER. FIRST, WE WILL TRAIN THE STYLE TRANSFER MODEL
ON SPELL.
I WILL GO TO AN EMPTY
FOLDER. >> LIKE THIS?
>> I THINK THAT'S
GOOD, YES. >> THE FIRST STEP IS TO SET UP
THE ENVIRONMENT. SO WE'RE GOING TO GO TO OUR TERMINAL AND WE CAN
GO TO ONE OF THE DIRECTORIES. WE CAN FIND A FOLDER, SO ON MY
COMPUTER, I WILL JUST DO CD DEV/LIVESTREAM. THERE IS AN EMPTY
FOLDER WITH NOTHING IN IT YET. FIRST I NEED TO INSTALL SPELL.
BEFORE I DO THAT, I NEED TO INSTALL PIP. IT IS A PACKAGE
MANAGEMENT SYSTEM FOR PYTHON. IT IS LIKE NPM FOR JAVASCRIPT.
>> I DON'T KNOW IF I'M MUTED OR NOT, BUT YOU SHOULD MOVE THE
BOTTOM WHERE YOU ARE TYPING HIGHER UP BECAUSE THE CAPTIONS
ARE COVERING IT. SO IF YOU CAN MAKE YOUR TERMINAL WINDOW GO --
YEAH, THAT WORKS
TOO.
THIS IS MY TERMINAL WINDOW. BEFORE I INSTALL SPELL, I NEED
TO INSTALL PIP, THE PACKAGE MANAGEMENT SYSTEM FOR PYTHON. IT
IS LIKE NPM, THE NODE PACKAGE MANAGER, FOR JAVASCRIPT. IF
YOU DON'T HAVE PIP INSTALLED, WE CAN DO IT TOGETHER. I THINK I
ALREADY DID IT, SO IT IS FASTER FOR ME. SO I'M GOING TO SWITCH TO
THIS PAGE TO SEE ALL OF THOSE STEPS. SO FIRST, TO INSTALL PIP,
WE'RE GOING TO DOWNLOAD THIS GET-PIP PYTHON SCRIPT -- WE WILL MAKE
THIS BIGGER, TOO. AND NOW IF I TAKE A LOOK AT MY FOLDER, THERE'S A
GET-PIP PYTHON SCRIPT. AND THEN, I'M JUST GOING TO RUN MY SCRIPT:
PYTHON GET-PIP.PY, OR IF YOU ARE USING PYTHON 3, YOU CAN DO
PYTHON3 GET-PIP.PY. IF THIS IS THE FIRST TIME YOU HAVE INSTALLED
PIP, IT MIGHT TAKE A MINUTE. AND AFTER THIS IS SUCCESSFULLY
INSTALLED, WE'RE GOING TO PIP
INSTALL SPELL. I ALSO HAVE DONE THIS, SO IT
MIGHT BE FASTER FOR ME. SO HERE IT SAID ALL OF THE
REQUIREMENTS ARE SATISFIED BECAUSE I ALREADY DID IT ONCE.
SO NOW WE HAVE SPELL INSTALLED. IF I TYPE IN SPELL, I SHOULD BE
ABLE TO SEE A SET OF COMMANDS THAT I CAN DO. I CAN DO SPELL CP
TO COPY A FILE, OR I CAN DO SPELL RUN TO START A NEW RUN. AND I
CAN DO SPELL LOGIN TO LOG INTO SPELL FROM MY LOCAL COMPUTER.
MY SPELL USERNAME IS THIS, AND MY PASSWORD IS THIS. AND I AM
SUCCESSFULLY LOGGED INTO SPELL.
AND I CAN ALSO DO SPELL WHOAMI TO CHECK WHO IS LOGGED INTO SPELL,
AND IT SAYS THE USERNAME, THE EMAIL, CREATED AUGUST 13TH.
AND NOW WE HAVE SUCCESSFULLY SET UP SPELL.
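AS A QUICK RECAP, THE COMMANDS SO FAR LOOK ROUGHLY LIKE THIS. IT IS A SKETCH: YOUR FOLDER NAMES AND ACCOUNT DETAILS WILL DIFFER, AND THE GET-PIP.PY URL IS THE ONE FROM PIP'S OWN DOCUMENTATION:

```sh
# go to an empty working folder
cd dev/livestream

# install pip if you don't have it yet (use python3 if that's your interpreter)
curl -O https://bootstrap.pypa.io/get-pip.py
python get-pip.py

# install the Spell command-line client and log in
pip install spell
spell login     # prompts for your Spell username and password
spell whoami    # confirms which account is logged in
```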
NEXT, WE CAN PREPARE OUR ENVIRONMENT. AS I MENTIONED BEFORE, WE'RE
GOING TO USE THIS TENSORFLOW IMPLEMENTATION OF FAST STYLE
TRANSFER MADE BY LOGAN. SO NOW I'M GOING TO GO AHEAD AND
CLONE HIS
GITHUB REPOSITORY. SO I WILL DO GIT CLONE. AND
THEN I'M GOING TO GO TO HIS FOLDER, CD FAST STYLE TRANSFER.
AND NOW I'M HERE. THE NEXT STEP IS TO CREATE SOME
FOLDERS AND PUT IN OUR STYLE IMAGE. FIRST, I WILL CREATE A FOLDER
CALLED CKPT, FOR CHECKPOINT. I WILL CREATE A GITIGNORE FILE INSIDE OF THE
FOLDER. AND I'M ALSO GOING TO CREATE A FOLDER CALLED IMAGES
HERE. AND I'M ALSO GOING TO CREATE
ANOTHER FOLDER INSIDE OF THE IMAGES CALLED STYLE.
THIS IS THE FOLDER WHERE OUR STYLE IMAGE IS
LIVING. IF I TAKE A LOOK AT THIS REPO,
THIS IS THE NEW FOLDER THAT WE JUST CREATED, AND THIS IS THE
NEW FOLDER THAT WE CREATED IMAGES.
AND THE NEXT STEP IS TO FIND A STYLE IMAGE THAT THE MODEL CAN BE
TRAINED ON. AND WHEN WE ARE CHOOSING STYLE IMAGES, WE NEED TO MAKE
SURE THAT WE ARE ALLOWED TO USE THE ARTWORK, AND WE NEED TO GIVE
CREDIT TO THE IMAGES, BECAUSE WE DON'T WANT TO RUN INTO ANY COPYRIGHT
PROBLEM. I FOUND THIS PAINTING OF LOTUS
BY A CHINESE ARTIST NAMED
[SPEAKING IN CHINESE]. SO I GOT THIS IMAGE FROM
WIKIPEDIA, AND IF YOU HAVE ARTWORK THAT I CAN USE, YOU CAN
SHARE IT WITH ME AND I CAN TRAIN IT WITH SPELL AND SEND BACK THE
MODEL TO YOU IF YOU ALLOW ME TO USE YOUR
ARTWORK. IF THERE IS NO OTHER ARTWORK, WE
WILL TRAIN THIS AGAIN. I ALREADY TRAINED A MODEL ON THIS
IMAGE. >> THE CHAT IS BEHIND REAL TIME, SO I THINK YOU SHOULD
PROBABLY MOVE FORWARD WITH THAT IMAGE. PEOPLE WILL DO THEIR OWN
IMAGES FOLLOWING ALONG, AND WE WILL COME UP WITH A HASHTAG OR
SOMETHING AT THE END SO THAT PEOPLE CAN SHARE THEIR STYLE TRANSFER
MODELS ON TWITTER OR SOCIAL MEDIA. IT IS A GOOD PLACE TO
SHARE. >> OKAY, SOUNDS GOOD.
SO WE HAVE DECIDED TO USE THIS IMAGE. WHAT I'M GOING TO DO IS
TO PUT THIS IMAGE INTO
IMAGES/STYLE. SO I'M GOING TO GO TO THE FOLDER, AND I'M GOING TO
MAKE THIS BIGGER. I DON'T THINK I CAN MAKE THIS WINDOW BIGGER,
BUT I CAN PUT THIS STYLE IMAGE INTO IMAGES/STYLE. I'M GOING TO
COPY THIS IMAGE,
THIS IMAGE IS CALLED
FUTRIN.JPG. I JUST COPIED THIS IMAGE HERE.
SO NOW WE HAVE OUR STYLE IMAGE. THE ONE THING THAT WE STILL NEED
TO DO IS TO GIT ADD THOSE TWO FOLDERS AND COMMIT THESE CHANGES,
TO LET SPELL KNOW THAT WE MADE ALL THOSE CHANGES.
SO HERE I'M GOING TO DO GIT ADD IMAGES, AND ALSO ADD THAT CKPT
FOLDER. AND THEN I'M GOING TO COMMIT THESE CHANGES. COOL.
SO NOW WE HAVE PREPARED OUR ENVIRONMENT.
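PUT TOGETHER, THE ENVIRONMENT PREPARATION IS ROUGHLY THE FOLLOWING. IT IS A SKETCH: THE REPO URL IS LOGAN ENGSTROM'S FAST-STYLE-TRANSFER REPO ON GITHUB, AND YOUR-STYLE-IMAGE.JPG STANDS IN FOR WHATEVER STYLE IMAGE YOU CHOSE:

```sh
# clone the fast-style-transfer implementation and go inside it
git clone https://github.com/lengstrom/fast-style-transfer.git
cd fast-style-transfer

# create the checkpoint output folder (with a placeholder file so git tracks it)
# and the images/style folder for the style image
mkdir ckpt
touch ckpt/.gitignore
mkdir -p images/style

# copy your chosen style image into images/style
cp ~/Desktop/your-style-image.jpg images/style/

# add and commit the changes so Spell picks them up
git add images ckpt
git commit -m "Add checkpoint folder and style image"
```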
THIS IS DONE. WE CAN MOVE TO THE NEXT
STEP. WE NEED TO DOWNLOAD THE
DATASET.
IN ORDER TO TRAIN A MODEL, WE WILL NEED THE REQUIRED DATASETS.
FOR FAST STYLE TRANSFER, THEY ARE LISTED IN THE SETUP SCRIPT THAT
CAME WITH THE FAST STYLE TRANSFER GITHUB REPO WE CLONED.
NEXT WE ARE GOING TO RUN THIS SETUP SCRIPT.
AS YOU CAN SEE, IN THE SETUP SCRIPT, WE
ARE GOING TO CREATE A FOLDER CALLED DATA AND THEN GO INTO
THAT DATA FOLDER AND THEN GET THIS -- THE VGG MODEL, THE
CONVOLUTIONAL NEURAL NETWORK MODEL, BACK. AND WE WILL ALSO
MAKE A FOLDER AND THEN DOWNLOAD THIS COCO
DATASET.
AND UNZIP THE COCO DATASET. AS WE TALKED ABOUT BEFORE, VGG IS A CNN
FOR OBJECT RECOGNITION.
WE NEED IT TO GET REPRESENTATIONS OF THE IMAGE.
THAT'S WHY WE'RE GOING TO USE THIS VGG MODEL.
FAST-STYLE-TRANSFER USES COCO DATASET TO TRAIN THE NETWORK AND
OTHER OPTIMIZATION METHODS TO MAKE THE MODEL WORK IN
REAL-TIME. IT IS AN OBJECT RECOGNITION DATASET OF 18,000 IMAGES.
THIS COCO DATASET IS HUGE, SO IT MIGHT TAKE A WHILE, BUT WE ARE
JUST GOING TO DO IT. SO THIS IS WHAT THE SETUP SCRIPT LOOKS LIKE,
AND NEXT WE ARE JUST GOING TO RUN THIS SETUP
SCRIPT. IN OUR TERMINAL, WE WILL DO SPELL RUN, AND THIS IS THE
SCRIPT THAT WE'RE GOING TO RUN. BUT HERE, WE CAN ALSO SPECIFY THE
MACHINE TYPE BY USING THE --MACHINE-TYPE FLAG. A CPU MACHINE IS
FREE TO USE, SO WE ARE GOING TO RUN THIS SCRIPT ON A CPU.
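THE COMMAND LOOKS ROUGHLY LIKE THIS; THE --MACHINE-TYPE FLAG SPELLING AND THE SETUP.SH SCRIPT NAME ARE ASSUMPTIONS BASED ON WHAT WAS TYPED IN THE STREAM AND ON THE FAST-STYLE-TRANSFER REPO, SO CHECK THE TUTORIAL IF THEY HAVE CHANGED:

```sh
# start a run on a free CPU machine that downloads the VGG model and the COCO dataset
spell run --machine-type CPU "bash setup.sh"

# Ctrl-C only stops streaming the logs locally; the run keeps going on Spell's machines
spell ps    # list your runs and their statuses
```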
NOW YOU CAN SEE THE EMOJI AND THE RUN NUMBER, 15. THIS NUMBER IS
IMPORTANT TO US BECAUSE LATER WE ARE GOING TO USE THE OUTPUT OF
THIS RUN FOR OUR NEXT TRAINING RUN. OH --
IT IS DOWNLOADING THIS VGG MODEL. LET ME MAKE IT A LITTLE
BIT
SMALLER. I THINK AFTER DOWNLOADING THE
VGG
MODEL, IT IS GOING TO DOWNLOAD THE COCO DATASET. BUT HERE, I'M
GOING TO DO CONTROL-C TO EXIT. IT WON'T STOP THIS RUN; IT WILL
JUST STOP PRINTING ALL THOSE LOGS.
I TRIED THIS RUN ON SPELL BEFORE, AND IT TOOK ME ONE HOUR AND 30
MINUTES TO FINISH IT. I CAN ALSO LOG INTO SPELL TO SEE
MORE DETAILED INFORMATION ABOUT EACH RUN, BUT ALSO IN THE
TERMINAL, WE CAN DO SPELL PS. IT WILL LIST ALL OF THOSE RUNS
THAT I HAVE DONE
BEFORE. SO I HAVE 15 RUNS, AND THE LAST
ONE IS RUNNING, AND THIS IS THE COMMIT THAT I MADE.
AND THIS IS THE MACHINE TYPE. WE ARE JUST USING CPU.
BUT WE CAN ALSO LOG INTO THE SPELL WEBSITE, AND HERE I CAN
CLICK ON THIS RUN. AND HERE I
CAN SEE ALL THOSE -- ALL THE INFORMATION ABOUT EACH RUN.
THIS IS THE RUN THAT WE JUST DID, RUN 15.
AND IT WILL OUTPUT A FOLDER CALLED DATA. THESE ARE THE
LOGS, AND THIS IS THE CPU USAGE,
CPU MEMORY, SO THIS RUN WILL TAKE ABOUT 1.5 HOURS.
BUT LUCKILY, WE HAVE ANOTHER COMPLETE RUN. I THINK IT IS RUN
13. SO ON RUN 13, I RAN THE SAME COMMAND
SETUP HERE, AND IT IS ALREADY COMPLETED AND IT WILL OUTPUT A
FOLDER CALLED DATA, AND WE CAN CLICK ON THIS DATA TO SEE WHAT
KIND OF OUTPUT WE GOT. WE WILL SEE THAT WE GOT THIS,
LET ME MAKE IT
BIGGER. WE HAVE THIS VGG MODEL, WE'VE
ALSO GOT THE COCO DATASET. HERE IT IS TRAIN 2014.
SO NEXT, WE'RE GOING TO USE THE OUTPUT FROM THIS RUN TO TRAIN
OUR MODEL.
WE FINISHED THE SECOND STEP, DOWNLOADING THE DATASET. AND
WE'RE GOING TO MOVE TO THE NEXT STEP, TRAINING THE MODEL ON SPELL
WITH THE STYLE PYTHON SCRIPT. THIS IS THE COMMAND THAT WE'RE GOING TO RUN,
BUT LET'S TALK ABOUT THIS COMMAND BEFORE WE ACTUALLY
RUN IT. THIS COMMAND STARTS A NEW RUN,
AND IT USES THE DASH-DASH MOUNT FLAG TO MOUNT THE OUTPUT OF RUN 13.
RUN 13 OUTPUT A FOLDER CALLED DATA, AND WE'RE GOING TO USE THIS
MOUNT FLAG TO COPY THAT DATA FOLDER INTO THE FILE SYSTEM OF OUR NEXT RUN.
AND WE'RE GOING TO CALL THAT FOLDER DATASETS INSTEAD OF DATA.
SO THIS IS THE MOUNT COMMAND. WE CAN SEE MORE INFORMATION IN
SPELL'S DOCUMENTATION. AND THEN WE'RE GOING TO SPECIFY THE
MACHINE TYPE. I USED THE V100 MACHINE. WE CAN CHECK MORE
DETAILED MACHINE TYPES HERE, ON THE SPELL.RUN CORE CONCEPTS PAGE.
IT TALKS ABOUT THE AVAILABLE MACHINE TYPES THAT YOU CAN USE, AND
HERE THERE'S A PRICING TABLE THAT LISTS ALL THE MACHINE TYPES
THAT WE CAN USE. THE ONE THAT I USED YESTERDAY IS CALLED V100.
NORMALLY, IT WOULD TAKE ABOUT 12 HOURS TO TRAIN ON A K80 MACHINE,
AND ABOUT FOUR HOURS TO TRAIN ON THIS V100 MACHINE. BUT I TRIED
IT FOUR TIMES, AND IT ONLY TOOK ME TWO HOURS TO TRAIN ON THIS V100
MACHINE. THIS IS THE MACHINE TYPE.
AND WITH THE NEXT FLAG, WE SPECIFY THE FRAMEWORK, WHICH IS
TENSORFLOW. WE ALSO ADD TWO EXTRA PACKAGES THAT ARE NEEDED FOR
VIDEO TRANSFER, USING THE --APT AND --PIP FLAGS TO INSTALL THEM.
THEN WE'RE GOING TO RUN THE STYLE PYTHON SCRIPT, AND WE'RE GOING
TO TELL THE SCRIPT THAT WE WANT THE OUTPUT TO GO TO A FOLDER
CALLED CKPT (CHECKPOINT), AND WE'RE GOING TO TELL THE SCRIPT THE
PATH TO OUR STYLE IMAGE. AND THIS IS THE STYLE WEIGHT, WHICH
CONTROLS THE STYLE LOSS OF THE MODEL; IT IS 150 HERE, AND YOU CAN
READ MORE AT LOGAN'S GITHUB REPO ABOUT THE DEFAULT STYLE WEIGHT
AND OTHER OPTIONS. WE WILL ALSO SPECIFY THE TRAIN PATH. THIS IS THE PATH TO THE
COCO DATASET, AND THE PATH TO OUR VGG MODEL. WE DON'T NEED TO
CHANGE ANY OF THIS. THE ONLY THING WE NEED TO CHANGE IS OUR
RUN NUMBER, WHICH WOULD BE 13, BECAUSE RUN 13 DOWNLOADED ALL OF
THOSE DATASETS. AND WE'RE ALSO GOING TO CHANGE THE
STYLE IMAGE NAME TO OUR OWN IMAGE NAME, WHICH IS
FUTRAN.JPG. OKAY, LET'S DO THIS.
SO I COPIED AND PASTED THIS COMMAND. I WILL GO TO A CODE EDITOR
FIRST. I'M GOING TO REPLACE THIS WITH MY REAL STYLE IMAGE, WHICH
IS FUTRAN.JPG. AND ALSO I'M GOING TO REPLACE THIS, THE RUN NUMBER
OF THE SETUP RUN, WITH 13. THAT'S THE RUN THAT WE USED. AND THAT'S IT.
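PUT TOGETHER, THE TRAINING COMMAND LOOKS ROUGHLY LIKE THIS. TREAT IT AS A SKETCH: THE STYLE.PY OPTIONS (--CHECKPOINT-DIR, --STYLE, --STYLE-WEIGHT, --TRAIN-PATH, --VGG-PATH) COME FROM LOGAN ENGSTROM'S REPO, WHILE THE EXACT SPELL FLAG SPELLINGS AND THE --APT/--PIP PACKAGES ARE ASSUMPTIONS FROM WHAT WAS TYPED IN THE STREAM; REPLACE 13 WITH YOUR OWN SETUP RUN NUMBER AND THE IMAGE NAME WITH YOUR OWN STYLE IMAGE:

```sh
# mount run 13's "data" output as "datasets", ask for a V100 GPU machine,
# use the TensorFlow framework, and add the extra packages needed for video transfer
spell run \
  --mount runs/13/data:datasets \
  --machine-type V100 \
  --framework tensorflow \
  --apt ffmpeg \
  --pip moviepy \
  "python style.py --checkpoint-dir ckpt --style images/style/your-style-image.jpg --style-weight 150 --train-path datasets/train2014 --vgg-path datasets/imagenet-vgg-verydeep-19.mat"
```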
SO NOW WE SHOULD BE ABLE TO COPY AND PASTE THIS COMMAND AND RUN
IT IN OUR SPELL. AND, BY RUNNING THIS, WE ARE
GOING TO START A NEW RUN TO TRAIN THE
MODEL. LET'S JUST DO
IT. IT SAYS CASTING SPELL, MACHINE REQUESTED, RUN IS RUNNING, AND
MOUNTING, WHICH IS WHERE WE MOUNT THE DATA FOLDER INTO THIS RUN.
IT SAYS TESLA V100, THE MACHINE TYPE, AND I THINK IT WILL GIVE
MORE INFORMATION. BUT I'M GOING TO DO CONTROL-C TO STOP IT FROM
PRINTING ALL OF THOSE LOGS. AND WE CAN ALSO DO SPELL PS TO SEE OUR RUN.
SO NOW I ACTUALLY HAVE TWO RUNS RUNNING. THE
FIRST ONE IS THE SET-UP, AND WE'RE STILL WAITING FOR THAT TO
FINISH, AND THIS IS THE TRAINING SCRIPT.
THIS IS THE V100
MACHINE. THE ONE THING I FORGOT TO
MENTION, BECAUSE IT TAKES A WHILE TO FINISH THIS RUN, IN
SPELL, THERE'S A PLACE WE CAN SET NOTIFICATIONS SO IT WILL
SEND EMAILS WHEN THIS RUN TAKES TOO LONG OR IT COSTS TOO MUCH
MONEY. SO ON MY SPELL ACCOUNT, IF I GO
TO SETTINGS, AND THEN NOTIFICATIONS HERE, I CAN SET
SOME, LIKE, EMAIL NOTIFICATIONS SAYING, EMAIL ME IF THE RUN
EXCEEDS $20, THINGS LIKE THIS, IN CASE THE RUN TAKES TOO LONG.
SO WE CAN DO THIS. AND ALSO, IF YOU ARE CURIOUS
ABOUT THE VERSIONS OF PACKAGES AND FRAMEWORKS THAT WE HAVE IN
THE SPELL ENVIRONMENT, ONE THING THAT WE CAN DO IS
TO DO SPELL RUN "PIP FREEZE". IT WILL LOG ALL OF THOSE INSTALLED
PACKAGES FOR US. SO THIS IS A NEW RUN, TOO.
SO IT CASTS SPELL NUMBER 17.
THIS IS FINISHED, THE RUN TIME IS 10 SECONDS AND WE CAN SEE THE
PACKAGES, TENSORFLOW 1.10.1, THINGS LIKE THIS IF YOU ARE
CURIOUS ABOUT THE VERSIONS OF THE
FRAMEWORKS. YEAH, SO LET'S GO BACK TO SEE HOW OUR RUN IS DOING.
SO THIS IS THE RUN THAT I JUST
STARTED FOR TRAINING. IT HAS BEEN RUNNING FOR THREE
MINUTES, AND IT IS STILL
RUNNING.
IT WILL TAKE ABOUT TWO HOURS TO FINISH, BUT I HAVE A COMPLETE
ONE, WHICH IS RUN 14. RUN 14 TOOK TWO HOURS AND 6 MINUTES TO
FINISH. IT IS EXACTLY THE SAME KIND OF RUN, BUT WITH A DIFFERENT
STYLE IMAGE: I TRAINED THIS MODEL ON THIS LOTUS IMAGE.
AND THIS IS THE OUTPUT OF THIS
RUN. SO WHILE WE'RE WAITING FOR OUR RUN 16 TO FINISH, WE CAN USE
THIS RUN 14. RUN 14 OUTPUTS A NEW FOLDER CALLED CKPT (CHECKPOINT).
LET ME MAKE THIS BIGGER. IF WE OPEN THIS CKPT FOLDER, IF
EVERYTHING GOES WELL, WE SHOULD BE ABLE TO SEE FOUR FILES IN THIS
FOLDER: A CHECKPOINT FILE AND .DATA, .INDEX, AND .META FILES.
THIS IS THE FORMAT OF TENSORFLOW'S
SAVED MODEL. THE .META FILE STORES THE GRAPH INFORMATION, THE
.DATA FILE STORES THE VALUES OF THE VARIABLES INSIDE THE GRAPH,
THE .INDEX FILE IDENTIFIES THE CHECKPOINT, AND THE CHECKPOINT
FILE ONLY TELLS US THE MODEL PATH. FOR THE NEXT STEP, WE ARE
GOING TO COPY THOSE FILES BACK TO OUR LOCAL COMPUTER.
WE CAN USE SPELL LS TO LIST ALL OF THE OUTPUTS FOR US. SO I'M
GOING TO DO SPELL LS RUNS, AND THE RUN NUMBER IS 14, THE COMPLETED
TRAINING RUN. IF WE DO THIS, SPELL WILL TELL US THAT THE OUTPUT
IS A FOLDER CALLED CKPT. I ALSO WANT TO SEE WHAT IS INSIDE OF
CKPT, SO I CAN DO SPELL LS RUNS/14/CKPT. AND THEN IT LISTS ALL OF
THE FOUR FILES THAT WE SAW ON THE SPELL WEBSITE. WHAT WE'RE GOING
TO DO IS COPY ALL OF THE FILES BACK. SO I AM GOING TO CREATE A NEW
FOLDER CALLED SPELL MODEL, AND THEN I'M GOING TO GO INSIDE THAT
FOLDER, AND HERE I'M GOING TO USE SPELL CP TO COPY ALL OF THOSE
FOUR FILES. AND THE RUN NUMBER, AGAIN, IS 14.
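IN TERMINAL COMMANDS, GETTING THE MODEL BACK LOOKS ROUGHLY LIKE THIS (A SKETCH; 14 IS THE COMPLETED TRAINING RUN FROM THE STREAM, AND THE LOCAL FOLDER NAME IS UP TO YOU):

```sh
# see what the completed run produced
spell ls runs/14
spell ls runs/14/ckpt

# make a local folder and copy the four checkpoint files into it
mkdir spell-model
cd spell-model
spell cp runs/14/ckpt
```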
SO WE HIT ENTER, AND WE ARE COPYING THE FILES.
>> SHORT INTERMISSION, EVERYBODY. HAVE TWO AND A HALF HOURS
PASSED? WE'RE GOOD, WE'RE GOOD. IT IS A LITTLE BIT LESS THAN
AN HOUR, BECAUSE THE CAMERA STARTED BEFORE WE STARTED. AND
IF YOU ARE WONDERING IF THIS IS LIVE -- PEOPLE ARE LIKE, IS THIS
LIVE? >> SO THIS IS FINISHED. WE HAVE
SUCCESSFULLY COPIED ALL OF THE FOUR FILES, WHICH IS THE MODEL,
WHICH IS A TENSORFLOW SAVED MODEL BACK TO OUR LOCAL
COMPUTER. SO WE CREATED A NEW FOLDER INSIDE OF THE GITHUB REPO,
WHICH IS FINE. IF WE LIST THE FILES, WE CAN SEE
THE FILES ARE ON OUR LOCAL MACHINE. SO THIS IS HOW WE CAN
GET THE TRAINED MODEL BACK FROM SPELL'S REMOTE MACHINE.
AND ACTUALLY, WE CAN OPEN THAT FOLDER TO SEE WHAT THE FILES LOOK
LIKE. I'M GOING TO THAT DIRECTORY. I JUST CREATED THIS NEW FOLDER CALLED
SPELL MODEL. I'M GOING TO DRAG THIS MODEL OUT TO THE
DESKTOP. AND, AS WE CAN SEE, WE HAVE FOUR
FILES. THIS IS THE FORMAT OF THE TENSORFLOW SAVED MODEL. IF
WE OPEN THIS
CHECKPOINT FILE, THERE ARE ONLY TWO LINES IN THIS FILE. IT
TELLS US THE MODEL CHECKPOINT PATH, WHICH ENDS IN .CKPT.
THIS IS IMPORTANT INFORMATION, BECAUSE WE ARE GOING TO USE THIS
PATH FOR OUR NEXT STEP. SO JUST REMEMBER THE MODEL
CHECKPOINT PATH IS THIS.
OKAY. SO FAR, WE SET UP THE
ENVIRONMENT, WE DOWNLOADED THE DATASET, WE TRAINED THE MODEL
WITH THE STYLE PYTHON SCRIPT, WE COPIED OUR TRAINED MODEL BACK TO
OUR LOCAL COMPUTER, AND THEN THE LAST STEP IS TO CONVERT THE
MODEL TO A FORMAT THAT WE CAN USE IN TENSORFLOW.JS AND ML5.JS.
OKAY, LET'S DO THIS. AND BY THE WAY, THIS FOLDER IS THE TRAINED
MODEL THAT WE GOT, HERE ON THE DESKTOP.
OKAY, SO IF I GO BACK TO MY OLD
DIRECTORY, WHICH IS
LIVESTREAM HERE,
WE'RE GOING TO USE THE SCRIPTS FROM FAST-STYLE-TRANSFER-DEEPLEARNJS.
DEEPLEARN.JS IS THE FORMER NAME OF TENSORFLOW.JS. THIS REPO IS
BUILT BY REIICHIRO NAKANO, AND HIS WORK IS AMAZING. HE RECENTLY
CONTRIBUTED A NEW MODEL, SKETCHRNN, AS WELL. YOU SHOULD CHECK OUT
HIS WORK. WE'RE GOING TO
USE HIS SCRIPTS TO CONVERT THE TENSORFLOW MODEL INTO A MODEL WE
CAN USE IN
ML5.JS. THE WAY WE ARE GOING TO DO IT IS
TO CLONE HIS GITHUB
REPO.
AND THEN WE WILL GO INSIDE THE GITHUB REPO. AND WE'RE GOING TO
PUT ALL OF THE CHECKPOINT FILES THAT WE GOT INTO ONE OF THE
FOLDERS INSIDE OF THIS GITHUB REPO. I GO TO
FAST-STYLE-TRANSFER-DEEPLEARNJS -- NOT INTO THE SOURCE FOLDER,
JUST THE ROOT DIRECTORY. SO I WILL COPY OUR MODEL FOLDER INTO THE
ROOT DIRECTORY OF THIS GITHUB REPO. AND I JUST DID; IT IS HERE.
AND THEN WE'RE GOING TO RUN TWO PYTHON SCRIPTS. WE WILL DUMP THE
CHECKPOINT VARIABLES TO CONVERT THE FORMAT. SO WE WILL COPY AND PASTE
THIS COMMAND. SO I WILL PUT THIS IN THE CODE EDITOR FIRST. THIS
IS THE PYTHON SCRIPT I WILL RUN, AND THE OUTPUT DIRECTORY IS
SOURCE/CHECKPOINTS/OUR FOLDER NAME, WHICH IS SPELL MODEL. AND
THEN THE CHECKPOINT FILE IS IN THE ROOT DIRECTORY OF THE GITHUB
REPO, SO IT IS SPELL MODEL SLASH THE .CKPT PATH. THIS IS THE PATH
TO OUR MODEL, WHICH WE SAW BEFORE IN THE CHECKPOINT FILE. THAT'S
WHY WE HAVE THIS NAME HERE. OKAY. SO NOW I'M JUST GOING TO RUN
THIS SCRIPT. AND THEN YOU CAN SEE IT IS DONE.
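THE CONVERSION STEP LOOKS APPROXIMATELY LIKE THE FOLLOWING; THE SCRIPT NAMES, FLAGS, AND THE FNS.CKPT CHECKPOINT NAME BELOW ARE ASSUMPTIONS BASED ON THE FAST-STYLE-TRANSFER-DEEPLEARNJS README AND ON LOGAN ENGSTROM'S TRAINING SCRIPT, SO DOUBLE-CHECK THEM AGAINST THE REPO AND AGAINST YOUR OWN CHECKPOINT FILE:

```sh
# from the root of the fast-style-transfer-deeplearnjs repo:
# dump the TensorFlow checkpoint variables into a browser-friendly format
python scripts/dump_checkpoint_vars.py \
  --output_dir=src/ckpts/spell_model \
  --checkpoint_file=./spell-model/fns.ckpt   # check your checkpoint file for the exact name

# strip optimizer-only variables so the manifest keeps just what inference needs
python scripts/remove_optimizer_variables.py \
  --output_dir=src/ckpts/spell_model
```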
SO IT ACTUALLY CREATED ONE CHECKPOINT FILE, AND 49 OTHER
FILES. AND WE CAN GO THERE TO SEE WHAT THE OUTPUT IS. THE OUTPUT
LIVES IN THE SOURCE CHECKPOINTS FOLDER, AND THIS IS OUR MODEL.
YOU CAN SEE THAT WE GOT THE MANIFEST JSON, WHICH TELLS US THE
STRUCTURE OF THE GRAPH, AND ALSO 49 FILES THAT HOLD ALL THE VALUES
OF THE VARIABLES IN EACH LAYER. AND THIS IS THE FORMAT THAT WE
CAN USE IN ML5.JS AND TENSORFLOW.JS.
OKAY. SO NOW I'M JUST GOING TO COPY
THIS MODEL BACK TO MY
DESKTOP. I WILL RENAME IT AND DRAG IT TO
MY
DESKTOP.
SO FAR, WE GOT TWO MODELS. WE HAVE A TENSORFLOW SAVED MODEL
THAT CAN WORK IN TENSORFLOW, OF COURSE.
AND THEN WE ALSO GOT ANOTHER MODEL THAT CAN WORK IN ML5.JS
AND TENSORFLOW.JS. SO THIS IS WHAT WE
GOT TODAY. AND THE NEXT STEP IS TO RUN THIS
MODEL IN ML5.JS. HERE ARE TWO DEMOS: ONE ON THE ML5 WEBSITE, AND
WE ALSO HAVE THIS DEMO HERE WHERE YOU CAN SELECT A DIFFERENT
STYLE, UPLOAD AN IMAGE, AND CHANGE YOUR STYLE HERE. AND YOU CAN
UPLOAD THE IMAGE;
I'M GOING TO UPLOAD A
PHOTO.
THIS IS A PHOTO OF A CAT. I CLICK ON TRANSFER MY IMAGE, AND THIS
IS THE TRANSFERRED CAT. YOU CAN ALSO PLAY WITH DIFFERENT STYLES,
TOO. OH, I LIKE THIS ONE. AND ALSO, YOU CAN USE THE WEBCAM.
AND THEN YOU CAN CLICK THIS BUTTON AND SEE THE
TRANSFERRED VERSION OF THE IMAGES FROM THE
WEB CAM. SO YOU CAN GO THERE AND CHECK
THIS DEMO OUT. BUT NEXT, WE'RE JUST GOING TO RUN THIS MODEL IN
OUR P5 -- IN OUR ML5
DEMO. SO WE CAN DO THIS
QUICKLY. HERE, WE ARE JUST GOING TO CLONE
THIS GITHUB
REPO.
AND THEN WE WILL GO INSIDE THAT FOLDER,
STYLETRANSFER_SPELL AND WE WILL PUT THIS INSIDE OF THE CODE
EDITOR. AND IN THIS, IN ITS MODELS
FOLDER, THERE IS ONE MODEL THERE. WE ARE GOING TO ADD OUR
NEW MODELS INSIDE OF THIS FOLDER.
SO WHAT WE'RE GOING TO DO IS TO FIND THAT GITHUB
REPO. AND INSIDE OF MODELS, I'M GOING
TO COPY AND PASTE THIS MODEL IN. I'M GOING TO RENAME IT TO
LOTUS, BECAUSE THE NAME OF THE ART IS CALLED
LOTUS. AND NOW WE GO BACK TO THE CODE
EDITOR, WE HAVE A NEW MODEL HERE, AND WE CAN TAKE A LOOK AT
WHAT IS INSIDE OF THE INDEX.HTML.
SO TO BUILD THIS DEMO, WE NEED P5.JS, MAINLY TO GET THE VIDEO FROM
THE WEBCAM, AND WE ALSO NEED THE P5 DOM LIBRARY TO CREATE DOM
ELEMENTS FOR US, AND THEN IN THE END WE WILL USE THE ML5.JS
LIBRARY. WE HAVE STYLES HERE; WE CAN IGNORE THEM FOR NOW. AND WE
ARE RUNNING THE SKETCH.JS SCRIPT HERE. IN THE BODY, WE HAVE A
HEADER TAG AND A P TAG, WE ARE LINKING THE SOURCE OF THE ART STYLE
IMAGE, AND WE ARE SHOWING THE ART IMAGE. BUT I'M GOING TO CHANGE
THIS IMAGE TO THE LOTUS IMAGE, SINCE THAT IS THE PRE-TRAINED MODEL.
I'M GOING TO ADD THIS IMAGE INTO THIS
IMAGE FOLDER. SO HERE, WE CAN SEE
IMAGES/LOTUS. SO WE'RE GOING TO SHOW THAT IMAGE, AND IN THE END,
WE HAVE A CONTAINER DIV TO CONTAIN OUR CANVAS. AND NOW WE CAN GO
TO SKETCH.JS. I'M JUST GOING TO DELETE ALL THE CODE HERE, SO WE CAN DO IT
OURSELVES TOGETHER. SO TO BUILD THIS DEMO, WE NEED
THREE THINGS. WE NEED A VIDEO VARIABLE TO GET THE IMAGES FROM OUR
WEBCAM, SO WE HAVE VIDEO; WE ALSO NEED THE STYLE TRANSFER FROM THE
ML5 LIBRARY TO ALLOW US TO TRANSFER IMAGES, SO I'M GOING TO HAVE
ANOTHER VARIABLE CALLED STYLE; AND IN THE END WE NEED A VARIABLE
TO HOLD THE OUTPUT IMAGE.
AND IN P5, THERE'S A SETUP FUNCTION THAT IS CALLED ONCE IN THE
BEGINNING. IN THIS SETUP FUNCTION, WE ARE GOING TO USE P5.JS TO
CREATE A CANVAS THAT IS 320 WIDE AND 240 HIGH. AND WE'RE GOING TO
USE THE P5 DOM LIBRARY TO PUT THE CANVAS ELEMENT INSIDE OF A DIV
ELEMENT WHOSE ID IS CANVASCONTAINER. OKAY.
AND WE CREATE A CANVAS, THAT IS IT.
AND THEN WE'RE GOING TO CREATE THE VIDEO.
SO WE HAVE THIS FUNCTION CALLED CREATECAPTURE, AND IF WE PASS IN
THE UPPERCASE VIDEO CONSTANT, IT WILL TRY TO GET THE VIDEO FROM
YOUR WEBCAM. AND WE ARE ALSO GOING TO HIDE THE VIDEO, BECAUSE WE
DON'T NEED THE ORIGINAL VIDEO, ONLY THE TRANSFERRED VIDEO. SO
WE'RE ALSO GOING TO SAY VIDEO.HIDE. WE ARE ALSO GOING TO CREATE
THE RESULT IMAGE; THE P5 DOM LIBRARY HAS CREATEIMG, AND WE PASS A
STRING INTO IT. AND WE'RE ALSO GOING TO HIDE THIS IMAGE, BECAUSE
WE'RE GOING TO DRAW THE IMAGE ON THE CANVAS, SO WE DON'T REALLY
NEED THIS HTML IMAGE. IN THE END, WE'RE GOING TO USE ML5 TO
GET THE STYLE TRANSFER MODEL, RIGHT? SO STYLE EQUALS
ML5.STYLETRANSFER, AND WE ARE GOING TO PASS IN THE PATH TO THE
MODEL, WHICH IS MODELS/LOTUS. AND THEN WE CAN ALSO TELL THE STYLE
TRANSFER TO LOOK FOR INPUT FROM OUR VIDEO, SO WE ARE PASSING IN
THE VIDEO, AND ALSO WE HAVE A CALLBACK FUNCTION SAYING, OH, IF YOU
FINISH LOADING THIS MODEL, LET ME KNOW. THIS IS A CALLBACK
FUNCTION CALLED MODELLOADED, AND WE ARE GOING TO DEFINE THIS
FUNCTION. SO WE'RE GOING TO WRITE FUNCTION MODELLOADED, AND ONCE
THE MODEL IS LOADED, WE CAN JUST ASK THE STYLE TRANSFER TO
TRANSFER SOMETHING. BUT FIRST, I WANT TO CHANGE THE TEXT IN THIS
P TAG TO MODEL LOADED, JUST
TO LET PEOPLE KNOW THAT THE MODEL IS GOOD TO GO.
SO I'M GOING TO USE SELECT. THIS IS A FUNCTION FROM THE P5 DOM
LIBRARY TO SELECT AN HTML ELEMENT FROM THE DOM. I SELECT THE
ELEMENT WITH THE ID STATUS, AND THEN I CHANGE ITS HTML TO MODEL
LOADED. OKAY.
AND THEN ONCE THE MODEL IS LOADED, I'M GOING TO ASK THE
STYLE TO TRANSFER SOMETHING. SO I'M GOING TO SAY STYLE.TRANSFER,
AND I'M GOING TO PASS IN ANOTHER FUNCTION CALLED RESULT. THIS IS
A CALLBACK FUNCTION: ONCE THE MODEL SENDS ANYTHING BACK, THE
FUNCTION IS CALLED. SO WE WILL WRITE THIS FUNCTION. FUNCTION
RESULT GETS TWO THINGS: ONE IS ANY ERROR DURING THIS PROCESS,
WHICH GOES IN THE ERROR VARIABLE, AND THE OTHER IS THE OUTPUT
IMAGE. AND ONCE WE GET THE RESULT, WE ARE GOING TO SET AN
ATTRIBUTE ON THE RESULT IMAGE TO HOLD THIS IMAGE AS ITS SOURCE.
SO WE'RE GOING TO SET RESULTIMG.ATTRIBUTE AND COPY THE SOURCE OF
THE OUTPUT IMAGE INTO OUR RESULT IMAGE. AND AFTER WE GET THE
RESULT, WE WANT TO CALL THIS STYLE.TRANSFER AGAIN OVER AND
OVER TO SEE -- TO SEE MORE RESULTS.
SO WE'RE GOING TO DO STYLE.TRANSFER RESULT AGAIN.
AND ONE THING IS MISSING: WE DID UPDATE THE SOURCE OF THE RESULT
IMAGE, BUT THIS RESULT IMAGE IS HIDDEN, SO WE CANNOT SEE IT.
P5 HAS A FUNCTION CALLED DRAW, AND IT WILL RUN OVER AND OVER
AGAIN. IN THE DRAW FUNCTION, WE'RE GOING TO DRAW THIS RESULT
IMAGE. SO I'M JUST GOING TO SAY IMAGE, LOWERCASE I: IMAGE OF
RESULTIMG, FROM ORIGIN 0, 0, AND THE SIZE IS 320 BY 240.
THAT'S IT.
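PUTTING ALL OF THOSE PIECES TOGETHER, THE FINISHED SKETCH.JS LOOKS ROUGHLY LIKE THIS. IT IS A SKETCH USING THE P5.JS, P5.DOM, AND ML5.JS (CIRCA 0.1) APIS DESCRIBED ABOVE; THE MODELS/LOTUS PATH AND THE CANVASCONTAINER AND STATUS IDS MATCH WHAT THE STREAM'S INDEX.HTML USES, SO ADJUST THEM TO YOUR OWN FILES:

```javascript
// sketch.js -- webcam style transfer with p5.js and ml5.js
let video;     // webcam capture
let style;     // ml5 style transfer model
let resultImg; // hidden <img> that holds the latest transferred frame

function setup() {
  // create the canvas and place it inside the <div id="canvasContainer">
  const canvas = createCanvas(320, 240);
  canvas.parent('canvasContainer');

  // grab the webcam; hide the raw video since we only draw the transferred frames
  video = createCapture(VIDEO);
  video.size(320, 240);
  video.hide();

  // hidden image element that will receive each transferred frame
  resultImg = createImg('');
  resultImg.hide();

  // load the converted model and point the style transfer at the video
  style = ml5.styleTransfer('models/lotus', video, modelLoaded);
}

function draw() {
  // draw the most recent transferred frame onto the canvas
  image(resultImg, 0, 0, 320, 240);
}

function modelLoaded() {
  // update the <p id="status"> text and start transferring frames
  select('#status').html('Model Loaded!');
  style.transfer(gotResult);
}

function gotResult(err, img) {
  // copy the returned image's src into our hidden img, then ask for the next frame
  resultImg.attribute('src', img.src);
  style.transfer(gotResult);
}
```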
TO VIEW IT, WE NEED TO START A LOCAL SERVER, FOR EXAMPLE WITH
PYTHON -M HTTP.SERVER, AND IT STARTS THE SERVER AT LOCALHOST:8000.
SO NOW IF I GO TO LOCALHOST:8000, WE SHOULD BE ABLE TO SEE SOMETHING.
SO THE MODEL IS LOADED, AND THIS IS THE STYLE SOURCE. AS YOU CAN
SEE, THIS STYLE HAS MORE COLORS, SO THE RESULT IS A LITTLE BIT
BETTER THAN THE PREVIOUS MODEL. THIS IS THE DEMO THAT WE BUILT
TODAY. THESE ARE THE RESOURCES WE USED: THIS IS GATYS'S PAPER
FROM 2015, THIS IS THE WHAT NEURAL NETWORKS SEE VIDEO, THIS IS
THE STYLE TRANSFER TUTORIAL FROM SPELL, AND ML5.JS HAS A STYLE
TRANSFER TUTORIAL MADE BY CHRIS. AND I RECOMMEND
YOU TO CHECK THAT OUT, TOO. AND THIS IS THE LINK TO ML5.JS,
AND I ALSO WANT TO RECOMMEND THIS YOUTUBE CHANNEL BECAUSE I
LEARNED ABOUT A LOT OF MACHINE LEARNING PAPERS FROM IT.
AND I WANT TO GIVE CREDIT TO THOSE TWO PROJECT CREATORS. WE
USED THE TENSORFLOW IMPLEMENTATION OF THE FAST STYLE
TRANSFER MADE BY LOGAN ENGSTROM AND THE SCRIPT TO CONVERT THE
TENSORFLOW SAVED MODEL TO A FORMAT WE CAN USE IN
TENSORFLOW.JS AND ML5.JS, WHICH IS MADE BY REIICHIRO NAKANO.
AND, TO WRAP UP TODAY, WE TRAINED A STYLE TRANSFER MODEL
WITH SPELL AND RAN THIS MODEL WITH ML5.JS IN THE BROWSER. YOU CAN
CHECK OUT THE MODEL HERE. AND THAT'S IT. I HOPE YOU LIKED THE
VIDEO. AND IF YOU RUN INTO ANY ISSUES WHEN YOU ARE TRAINING OR
RUNNING THE MODEL,
YOU CAN LEAVE COMMENTS ON THE GITHUB. YEP.
>> COME OVER HERE, SO PEOPLE ARE ASKING SOME INTERESTING
QUESTIONS. AND
I'M GOING
TO DO A SHORT Q&A SESSION, AND WE WILL MONITOR THE CHAT AS WE ARE
TALKING A LITTLE BIT. SO ONE THING THAT SOMEBODY ASKED THAT IS
INTERESTING: IT IS RUNNING SLOW IN THE BROWSER, THOUGH IT IS
AMAZING THAT IT RUNS AT ALL. PEOPLE ASKED WHAT PERFORMANCE
CONSIDERATIONS ARE THERE, CAN THIS ACTUALLY RUN ON A MOBILE
PHONE? AND HOW FAR DID YOU PUSH THOSE
EXPERIMENTS? >> FOR NOW, IT WORKS WELL IN
CHROME. BUT I KNOW THAT TENSORFLOW.JS
SUPPORTS IOS AND OTHER OPERATING SYSTEMS, BUT IT HAS SLIGHTLY
DIFFERENT RESULTS ON DIFFERENT OPERATING SYSTEMS. SO I'M NOT SURE.
>> RIGHT. >> BUT, YOU KNOW, IN MY EXPERIENCE DOING THIS STUFF
OVER THE LAST 10-PLUS YEARS, THE THING THAT YOU ARE DOING NOW, IN
A COUPLE OF YEARS THAT WILL WORK ON THE SMALLER DEVICES. AND THEN
THE NEWER THING WILL BE SUPER FAST, AND LATER THAT WILL WORK ON
THE SMALLER DEVICES -- THIS STUFF IS ALL VERY CYCLICAL, AND IT IS
AMAZING THAT IT RUNS IN A BROWSER AT ALL. AND AGAIN, TO BE CLEAR,
THE TRAINING PROCESS HERE IS A THING THAT YOU CANNOT EASILY DO IN
THE BROWSER. THAT IS A THING THAT TOOK A VERY LONG TIME. YOU CAN
DO IT ON YOUR OWN COMPUTER, YOU CAN BUY A GPU, BUT USING A CLOUD
COMPUTING SERVICE IS ANOTHER OPTION, AND SPELL IS ONE OF MANY.
SPELL MAKES IT SUPER EASY BECAUSE YOU CAN JUST DO IT FROM THE
COMMAND-LINE INTERFACE RIGHT FROM YOUR
COMPUTER. THERE WAS ANOTHER QUESTION, I
DON'T KNOW IF YOU HAVE THE ANSWER TO THIS, BECAUSE I DON'T.
PEOPLE ARE CURIOUS, ONE THING I TALKED A LITTLE BIT ABOUT IN
MORE BEGINNING LEVEL MACHINE LEARNING TUTORIALS IS A LOSS
FUNCTION. WHAT IS THE -- DO YOU KNOW HOW
THE STYLE TRANSFER TRAINING PROCESS WORKS? LIKE HOW DOES IT
FIGURE OUT, LIKE, HOW WELL IT IS DOING?
>> IT DOES. SO, FOR FAST STYLE TRANSFER, IT HAS AN
IMAGE TRANSFORMATION NETWORK AND A LOSS CALCULATION NETWORK.
IT KIND OF -- I THINK I NEED TO CHECK THE PAPER IN
DETAIL, BUT IT CALCULATES THE LOSS AND THEN GOES BACK TO
MINIMIZE THE LOSS FUNCTION. >> I THINK WE'LL WRAP
UP. THIS WAS AN HOUR AND 20
MINUTES, I'M EXCITED TO SEE HOW REPLICABLE THIS IS FOR YOU. CAN
YOU CLONE THIS PYTHON CODE, CAN YOU PICK YOUR OWN STYLE IMAGE,
AND CAN YOU THEN RUN IT WITH ML5 ON YOUR WEBCAM AND STYLE YOUR
OWN FACE FROM THE WEBCAM? IF YOU ARE ABLE TO FOLLOW THIS AND DO
THIS -- THIS WAS SUGGESTED IN THE CODING TRAIN SLACK CHANNEL,
WHICH IS A SLACK CHANNEL FOR PATRONS OR MEMBERS -- USE THE HASHTAG
THIS.STYLE. AND PEOPLE ARE COMMENTING THAT YOU ARE FORGETTING THE
SEMICOLONS, WHICH YOU DON'T NEED. BUT THAT IS
FUNNY, I WAS -- THIS IS THE THING I ALWAYS FORGET.
>> YEAH, I USED SEMICOLONS ALL THE TIME UNTIL A COLLEAGUE SAID
YOU USUALLY ONLY NEED THEM IF IT IS NOT CLEAR, AND THEN I SWITCHED
TO NOT USING THEM. SO I'M GOOD WITH BOTH.
>> WE COULD BE HERE FOR THE NEXT THREE HOURS DISCUSSING IF YOU
SHOULD USE THEM OR NOT. SO THIS.STYLE, YOU CAN SHARE THINGS
YOU MAKE ON TWITTER WITH THAT HASHTAG, WHATEVER SOCIAL MEDIA
YOU USE, THERE'S A COMMENTS SECTION ONCE THE VIDEO IS
ARCHIVED. IN ADDITION, I WILL HOPEFULLY CREATE A PAGE ON
THECODINGTRAIN.COM WITH THE LINKS THAT YINING HAS SHOWN YOU HERE.
AND WE WILL HAVE ALL OF THE LINKS AND THE RESOURCES AND ALL
THE ARTISTS AND EVERYTHING, WE WILL UPDATE THE VIDEO
DESCRIPTION FOR THIS ARCHIVE FOR THE ARCHIVED VERSION OF THIS
LIVESTREAM AFTERWARDS AS WELL. THIS.STYLE -- I'M LOOKING TO SEE
IF THERE ARE ANY URGENT OR BURNING QUESTIONS.
WE CAN WAVE GOODBYE WITH THIS.STYLE. OKAY -- OH, CAN YOU GET THE
SLIDES? WHATEVER MATERIALS WE CAN PUBLISH, WE WILL PUBLISH AND
SHARE THE SLIDES AS WELL. AND I WANT TO MENTION, CAN I GO TO
YOUR BROWSER HERE? IF I GO TO YOUTUBE/CODINGTRAIN,
AND HOPEFULLY THIS IS NOT GOING TO --
>> YOU CAN CLOSE THIS. >> I DON'T WANT TO CLOSE IT. IT
IS SO WONDERFUL. YEAH, I WILL CLOSE IT.
SO YOU CAN SEE THAT NEXT UP, SCHEDULED FOR, WHOOPS, SCHEDULED
FOR OCTOBER 5, I THINK WE ARE GOING TO DO IT EARLIER. IT SAYS
8:00 AM PACIFIC TIME, OR 11:00 EASTERN. WE WILL DO ANOTHER
TUTORIAL WITH ALL OF THE SAME ELEMENTS: ML5, SPELL,
AND TENSORFLOW TO TRAIN SOMETHING CALLED AN LSTM, A LONG
SHORT TERM MEMORY NETWORK. THIS IS A KIND OF NEURAL NETWORK THAT
IS WELL-SUITED FOR SEQUENCES. SO IF YOU WANTED TO TRAIN A
MODEL TO KNOW ABOUT HOW CHARACTERS APPEAR NEXT TO EACH
OTHER IN TEXT OR MUSICAL NOTES APPEAR NEXT TO EACH OTHER IN A
SONG, OR HOW STROKES APPEAR IN SEQUENCE IN
A DRAWING, THERE ARE SO MANY POSSIBILITIES. WE WILL SHOW YOU
HOW TO TAKE A TEXT FROM YOUR FAVORITE AUTHOR AND TRAIN A
MACHINE LEARNING MODEL ON THE SPELL.RUN CLOUD COMPUTING SERVICE,
DOWNLOAD THE MODEL, AND THEN HAVE THE MODEL GENERATE NEW TEXT
IN THE STYLE OF THAT AUTHOR FROM THE BROWSER. THAT IS TWO WEEKS
FROM TODAY, AND NEXT FRIDAY I
WILL BE BACK. AND YEAH, SO STAY
TUNED. IF THERE WERE CERTAIN THINGS THAT YOU COULDN'T FOLLOW
TODAY, I DID WORKFLOW VIDEOS THAT COVER THE EXACT SAME STUFF --
RUNNING GIT, USING VISUAL STUDIO CODE, RUNNING STUFF FROM YOUR
TERMINAL -- AND I HAVE AN INTRO TO
SPELL VIDEO. SO A LOT OF YOU ARE LIKE, HOW DO I FIND SPELL
AGAIN? YOU CAN FIND THAT IN THE INTRO TO SPELL VIDEO. I WILL
LINK TO THAT. GREAT.
I'M GOING TO GO -- THIS IS THE AWKWARD PART. CJ HAS ANOTHER
WONDERFUL YOUTUBE CHANNEL, WITH ALL OF THESE THINGS THAT YOU CAN
DO IN OPEN BROADCASTER SOFTWARE, LIKE PRESS A BUTTON AND PLAY AN
OUTRO VIDEO. >> THANK YOU FOR WATCHING, BYE.
>> THANK YOU, EVERYONE. LOOK FORWARD TO HEARING FROM YOU IN
THE COMMENTS. >> THANK YOU TO SPELL, WHITE
COAT CAPTIONING, AND