[TRAIN HORN] Hello. And welcome to a coding challenge, Generating StyleGAN Rainbows With Runway ML, and passing those to p5.js to display them in the browser for our delight. And so this is basically making a version of software that I built for the I/O Festival. The talk that I gave at the I/O Festival will be out on the internet at some point soon. And if it is, I will include a link to it in this video's description. In that talk I had people play a game on stage. And when they finished the game, an AI rainbow was generated and tweeted to the I/O Rainbows account on Twitter. So I'm going to build a little mini version of this today.
And so the primary tool that's going to do the generating of the rainbows is something called Runway ML. So this is the first coding challenge that I'm doing with Runway. And I've talked in a couple of other videos and in a livestream more about what Runway is, how to download and install it, how to sign up for an account, how to get some free credits, and that sort of thing. So I'll refer you to those videos if you want to find out more. But if you don't have Runway downloaded, find a link to download it in the description.
If you've signed up for an account, and you're on the Browse Models page, you will find yourself here. And where you want to go next is you want to look for StyleGAN. Now, I see StyleGAN right here. But just in case you don't, I could type it in up here under Browse Models. I could click here. This is now giving me, "Generate photorealistic images of faces, landscape, and more." There's more information about the license for StyleGAN, and credits about who created and authored the original StyleGAN paper that you can find out about. But I just want to use StyleGAN. So I'm going to click Add to Workspace. And if you don't have a workspace already, you can click New Workspace. I have one called Coding Train Livestream. And then here I am. Now, I can generate a variety of types of things with StyleGAN.
And these are known as checkpoints. So there are checkpoints, you can see, for cars, and landscapes, and portraits. Wouldn't it be nice if there was a checkpoint for rainbows? And guess what. There is. So I'm going to click the rainbows checkpoint. Then I'm going to choose the input source, which is just going to be a vector. And I'll talk about what that is in a little bit as I get to writing some code to do this. And then what I want to do is click Run Remotely.
So this is very important for me to be clear. This requires cloud credits that you have to pay for. If you sign up through the link in this video's description, you'll get $10 in free credits. And there's actually a coupon code, CODINGTRAIN, with which you can get an additional $10 in credits as well. So that's certainly enough to run this example. Some models in Runway you can run locally on your computer without using cloud credits. But this one in particular only runs remotely at the moment. So I'm going to click Run Remotely. It's running. So I'm going to give it some time to start up.
So now we see it's starting to populate. And this is what's known up here as the latent space, this sort of space of imaginary rainbows that this generative model is producing. And this is one of the reasons why I love using Runway: I can actually just kind of browse around this space, and kind of have this 2D view of this multi-dimensional world of all of these rainbows. And I can do things like, oh, I really like this one. And I could just change the output here to preview. And so I can see it here. And I could download this one. And I now have my beautiful StyleGAN generated rainbows.
But what if I want to have these rainbows in my own software? What if I wanted to show them on a web page? Or if I want to tweet them from my Twitter bot? Or any other type of thing that you might be making, whether it's JavaScript, Processing, openFrameworks, or some other piece of software that you want to connect to Runway.
The way that you do that is through talking to Runway over the network. So over here in the top right, there is a Network tab. If I click on this, it's showing me a variety of different options. I can communicate with Runway over OSC. This is something I did in another video tutorial, communicating between Processing and Runway over OSC to demonstrate the PoseNet model for detecting human skeletal poses. I could also use Socket.IO for real-time web sockets. But really what I want to do is just an HTTP connection. I want to make an HTTP request. I want to post some data to Runway. And I want to receive the image back.
And, in fact, there's JavaScript code right here out of the box that I could copy and paste. So I encourage you to just actually go and grab this JavaScript code, and make your own example. But I am going to do this using the built-in p5 function, httpPost. So I'm going to write my own code for doing this, referencing everything that's here under the HTTP option.
My next step then is to go to the browser. And I am going to write this code in the p5 web editor. So the first thing that I might do is make sure this is running, that this works. OK. Great. Let's make a button. createButton. And I'm going to call that button rainbow. And I'm going to attach that button to an event called-- whoops. Let me hit stop here for a second-- generateRainbow. So when I run it, a button will appear. Rainbow. And then presumably when I click the button, a function called generateRainbow will be executed. So I need to write that function.
And in that function, this is where I want to send my request to Runway itself. So I need to make an HTTP post. So I probably want to look at httpPost. This is a p5-specific function. You could use fetch. I have some video tutorials about how to use the fetch function to make a post request. And you could do that here. But I'm going to use p5 in this example.
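The button wiring described above can be sketched roughly as follows. This is a minimal p5.js sketch, not the exact code from the video: the 400x400 canvas size and the log message are assumptions, and generateRainbow is still only a placeholder for the request that gets written next.

```javascript
// Minimal p5.js sketch: a button that calls generateRainbow() on click.
// p5.js calls setup() automatically once when the sketch starts.
function setup() {
  createCanvas(400, 400); // canvas size is an assumption
  // createButton() adds a <button> element to the page and returns it.
  const button = createButton('rainbow');
  // mousePressed() registers the function to run on every click.
  button.mousePressed(generateRainbow);
}

function generateRainbow() {
  // Placeholder: the HTTP POST to Runway will go here.
  console.log('rainbow requested');
}
```

Passing generateRainbow without parentheses hands p5 the function itself, so it runs on each click rather than once during setup.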
So I'm going to go to the httpPost reference page for p5. And this is going to show me the stuff that I need to include in the post request. So here's the stuff that I need. I need a path. Where am I posting this to? I need a data type, which is what kind of data am I sending along with this post request? And then any parameters, the data that needs to go along with the post request, as well as the callback and the error callback for getting information back. So I'm just going to take this and put it in the comments right here in my code as a reminder.
So what are the things? The path is-- Runway's telling me this. It's actually this. The server address. localhost, port 8000. So I'm going to paste that in here. The data type is JSON. That's the kind of data. The data I'm sending is what? So this is the data I need to send to Runway. And Runway's telling me about that here. Input specification. What does it expect? It expects a field called z, which is an array of 512 floats. What is that? And then it requires some other value called truncation. What is that?
So if you wanted to dive deeply into this input specification, you would probably want to do some more research on the StyleGAN model itself. Look at the paper. Look at the GitHub repo. And kind of understand more about the neural network's architecture, and its parameters, the hyperparameters that control its behavior. I think it's worth, though, taking a minute in this video tutorial to talk about what this z is, because it's a very important concept in machine learning.
So there is this machine learning model called StyleGAN. And it needs some kind of input in order to generate some kind of output. Now, the output that it generates is an image, 512x512. I mean, ultimately what it's outputting is just a whole lot of numbers. But those numbers can be interpreted as colors of pixels and repackaged as an image. So that's being done for you by Runway. Right?
We're seeing the output of it right here, packaged as an image. But what's the input? I mean, ultimately, in this particular example, I don't care about the input. I just want, give me a rainbow. Give me a rainbow. Give me a rainbow. But in order for the model to generate a rainbow, it's got to start from somewhere. And in essence, I could start with something random. But that random thing that I want to start with is something called a vector, referred to as z. And what it is is 512 numbers. So I have this list of 512 numbers, probably between 0 and 1. Generally, inputs to neural networks are normalized within some range.
And in a way, this is like a unique signature for a particular output. So if I want to just get any output, I can just make up a list of random numbers. And I would always get the same exact rainbow with the same set of numbers. So we could see that happen. Right? If I fix that set of numbers, I'll always get the same output. But what I could do is tweak these numbers a little bit, dial some up, dial some down. And that's going to change the output. And that's what you're seeing here in this space. What you're seeing here is rainbows that are attached to given z inputs. And Runway is being very clever about showing you similar ones in a two-dimensional flat space on a computer screen.
But actually, all of those rainbows that are generated live in 512-dimensional space. So that's kind of crazy, and mind-blowing, and very confusing. I think I have a video tutorial where I do something with four-dimensional space, and I can barely understand that. But this is kind of the weirdness of working in machine learning. You could imagine a three-dimensional space would just be full of rainbows in 3D. Like all over this room, there'd be rainbows everywhere. 2D. It's just on a poster, like look at all the rainbows.
But the only way to actually literally organize all of the rainbows generated by StyleGAN would be to have them all sitting in 512-dimensional space. Not a thing we can understand as human beings. So that's why Runway cleverly organizing them for you to look at in two-dimensional space is quite useful. But you could kind of walk through that space. Right? I could do a random walk from vector, to vector, to vector in that 512-dimensional space to produce an animation of kind of morphing, changing rainbows. And that's something you should really do after you watch this video, and share it with me. Because I would love to see that.
So one of the nice things about working in Runway is if I find a rainbow that I really like, for example-- oops. I can zoom in and out. This is nuts. This one's kind of crazy looking. Oh, look at this strange double rainbow. So let's use this one. So if I like this one, I can actually click here and export that vector, those 512 numbers, as JSON itself. So if I click here, and I click back here, I can see this is it. This is that JSON file. So I'm just going to call this rainbow.json.
Let's actually go to the web editor. This is sort of nuts what I'm doing. But why not? Let's add a file. Then I'm going to drop this file in here. And then I'm going to look at this. And we can see, look, this is just that array of numbers. And actually, why even bother making it a separate JSON file? Because I'm just going to say const z equals this array. So I actually just literally copy-pasted that array of numbers into my p5 sketch. It looks like, by the way, it's between negative 1 and 1. So I'm going to call this rainbowZ.
Now, where was I? I was somewhere. I was over here in Runway, because what I wanted to do was send that array of 512 floats as the z property in the data that I'm sending. So I'm going to do z: rainbowZ. And then I need truncation.
So truncation is a hyperparameter-- I spelled that totally wrong-- associated with StyleGAN. If you want to learn more about truncation, that's something you probably just want to read about in the paper itself. But it kind of changes the craziness factor, in a way, of the rainbow that you're going to get. And it's a number. I believe that is a number between 0 and 1. And I think the default that's being used right now, my guess is that it's 0.5. So it's possible I'm actually going to get a different rainbow out if I'm wrong about that truncation number.
But now I have the data. Then I need a callback and an error callback. So I want to post to a path. I want to post that data type. This is sort of silly to have this separate variable here. I can just put JSON right in here. Then I want to post that data. And I want to say, gotRainbow, or gotError. So I need two callbacks. So now I want to say, function gotRainbow(data). And let's just console.log the data to see if it comes back from Runway. All right. We're going to run this. I'm going to click the rainbow button. gotError is not defined. OK. Fine. I need to define the gotError function. gotError(error). console.log(error). This is good for some error checking. OK. Now I'm going to press this button. Ooh. I got an error. [BUZZER] [DING]
I found what I got wrong. So the server address is localhost, port 8000. But I want to make a post request to the query route. So this is actually what I need as the URL path. So I'm going to copy this. Go back to my code. We're going to hope that this fixes it. I'm going to put that in here. /query. Now I'm going to hit rainbow. Ah! Look at that. So it console logged something. What in the world? I know you might not believe this, but this is actually a rainbow right here. This is the strangest looking text version of a rainbow. But what's actually happening there? I'm getting an object. Oh, and it's got an image in it.
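The request assembled so far can be sketched like this. It is a hedged reconstruction rather than the code from the video: the full URL assumes Runway's HTTP server address is http://localhost:8000 as shown in its Network tab, buildQuery is a hypothetical helper added for illustration, the all-zero vector stands in for the 512 numbers exported from Runway, and 0.5 is the truncation guess mentioned above.

```javascript
// Assumption: Runway serves HTTP on localhost:8000 and accepts
// POST requests on its /query route.
const RUNWAY_URL = 'http://localhost:8000/query';

// Hypothetical helper: builds the request body described by the
// input specification (z: 512 floats, truncation: one number).
function buildQuery(z, truncation) {
  if (!Array.isArray(z) || z.length !== 512) {
    throw new Error('z must be an array of 512 numbers');
  }
  return { z: z, truncation: truncation };
}

function generateRainbow() {
  // Stand-in vector; in the video this array was pasted from Runway's export.
  const rainbowZ = new Array(512).fill(0);
  const data = buildQuery(rainbowZ, 0.5); // 0.5 is a guess at the default
  // p5.js: httpPost(path, datatype, data, callback, errorCallback)
  httpPost(RUNWAY_URL, 'json', data, gotRainbow, gotError);
}

// Called with the parsed JSON response when the request succeeds.
function gotRainbow(data) {
  console.log(data);
}

// Called when the request fails, for example when Runway is not running.
function gotError(error) {
  console.log(error);
}
```

Posting to the bare server address instead of the /query route is the mistake that produced the error in the video, so the route is kept in one constant here.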
But the image is just this sequence of all these characters. So this has to do with Base64 encoding. Let's go back to Runway to make sure I'm right about this. You can see this is the output. An image. And that image is a Base64 image. Base64 encoding, first of all, this is something that I've used in a couple of other videos where I've explained this more thoroughly. So I'll link to that in the video's description. But essentially, it's just a way of encoding all the colors of an image as ASCII characters. So instead of having numbers for the colors, we have unique characters that correspond to certain color values. The nice thing about using Base64 is the web speaks Base64. So I can create an image very easily in JavaScript, in p5, with the Base64 encoding of the image.
So rather than console.log that, let me try to do that. And you could also read more about it on the Base64 Wikipedia page. But let me go back here. And I'm just going to say createImg(data.image). So the image property of the data object that's coming back from Runway has the Base64 encoding in it. And p5's createImg function knows how to turn that into an image element that will appear on the web page. So let me bring this over here. Let me run this again. Let me hit rainbow. And there it is. Look. And it's the same one! It's the same one, because I gave it this exact vector.
But what might be more interesting here is why not make a random one each time? So I'm going to do this. When I post I'm going to create a variable called rainbowZ, which is an empty array. I'm going to loop all the way up to 512. And I'm going to say rainbowZ index i is a random number between negative 1 and 1. And that's going to be the rainbowZ. So now every time I press the rainbow button, it will be a different one. So now I'm getting random rainbows.
Now, here's the thing. They're kind of just, by default, making all these DOM elements. Maybe what I want to do is actually draw them onto the canvas.
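The two steps just described, a fresh random vector per request and an image element built from the response, might look roughly like this. randomZ is a hypothetical helper written with Math.random so that it runs outside p5 as well; inside a sketch, random(-1, 1) would serve the same purpose. The code also assumes that data.image arrives as a complete Base64 data URL (something like "data:image/jpeg;base64,..."), which is what lets createImg display it directly.

```javascript
// Hypothetical helper: returns `size` random floats in [-1, 1).
// rng can be swapped out; it must return numbers in [0, 1).
function randomZ(size = 512, rng = Math.random) {
  const z = [];
  for (let i = 0; i < size; i++) {
    z[i] = rng() * 2 - 1;
  }
  return z;
}

// Success callback for the POST request.
// Assumption: data.image is a Base64 data URL string.
function gotRainbow(data) {
  // createImg() adds an <img> element to the page; the browser
  // decodes the data URL on its own. The second argument is alt text.
  createImg(data.image, 'StyleGAN rainbow');
}
```

Every click then posts a different z, so every response is a different rainbow, and each one is appended to the page as a new image element.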
So maybe I'll make the canvas. They happen to be 512x512. So I'll make the canvas 512x512. What I'll do is put this in a variable called rainbowImage. I could push them into an array to save them. Then I'm going to say rainbowImage.hide(). So we don't actually see it. But I'll draw it onto the canvas. So now what this is doing is it's creating the image DOM element from the Base64 encoding, hiding it from the DOM, and then drawing it onto the canvas. So every time I press this rainbow button-- [STATIC]
Oh, silly JavaScript and your asynchronous nature, you. I think I can't draw the image right here, because it's not actually ready yet. So what I think that I'll do, since I happen to have a draw loop, is I'll move this here. And I will make this a global variable that I will declare at the top. I could do this in other ways. And then I'll just check. As long as rainbowImage exists, I will draw it. So now, this should give me every time I click the rainbow button-- oh, and I still want to hide it. Whoa. Whoa. Every time I click the rainbow button I get a new StyleGAN generated rainbow right here in p5.js in the web editor, being generated from Runway from the cloud.
Oh, we should do a diagram. Let's review all of the pieces in this example because there are a lot of them. So I have my own laptop that's sitting there on the table over there. And there is the web browser running. That's a thing that's running. And there is also the software Runway that's running. Now, Runway has spun up a local server at localhost:8000. The browser is actually making requests to the p5 web editor's server, which you don't necessarily have to do. I could just develop my JavaScript locally. But I'm actually writing my JavaScript code in the p5.js web editor. But it is executing and running that code locally in the browser. So this is kind of not a super important point. But it makes a post request to Runway.
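The fix for the timing problem described above can be sketched as follows: keep the image in a global variable and let the draw loop paint it whenever it exists. This is a minimal sketch under a few assumptions: the variable name rainbowImage follows the transcript, the gray background value is arbitrary, and the request code that calls gotRainbow is omitted.

```javascript
// Global, so both the callback and draw() can see it.
// It stays undefined until the first response arrives.
let rainbowImage;

function setup() {
  createCanvas(512, 512); // the generated rainbows are 512x512
}

// Success callback for the POST request to Runway.
function gotRainbow(data) {
  rainbowImage = createImg(data.image, 'StyleGAN rainbow');
  rainbowImage.hide(); // keep the <img> element itself off the page
}

// p5.js calls draw() repeatedly, so the image shows up on the first
// frame where the browser has finished loading it.
function draw() {
  background(220); // arbitrary gray, an assumption
  if (rainbowImage) {
    image(rainbowImage, 0, 0);
  }
}
```

Drawing inside the callback right after createImg tends to fail because the element has not finished loading yet; retrying on every frame sidesteps that without extra load handling.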
So when the code makes a post request to Runway, Runway, in turn, runs the StyleGAN model on a cloud GPU. You need to have credits to do that. The resulting rainbow is returned back to Runway, and then sent back to p5 and rendered in the browser. This diagram didn't turn out like I imagined it. I thought it would be more interesting. But these are the pieces. p5 and Runway are both running locally. But the actual StyleGAN model is running on a Runway server in the cloud that you have access to through your account.
Now, at some point you might realize, well, what if I wanted to create a website that would show StyleGAN rainbows? I mean, you can run Runway locally on your laptop. But then a website that's deployed somewhere else, how would you manage that? If somebody opens up your p5 sketch, it won't work unless they're running Runway themselves on their local computer. But stay tuned. I know that Runway is developing features to be able to deploy your server that's running the StyleGAN model to, like, a permanent URL in the cloud somewhere, that you could then have your JavaScript program access, and that other people could run without having to install Runway themselves. So that's something that you could stay tuned for and follow in the future development of Runway.
The other thing that's really important for me to mention here is that this StyleGAN model doesn't just exist by accident. So the StyleGAN architecture is something that comes from the original StyleGAN paper and pre-trained model. One of the founders and creators of Runway, Anastasis Germanidis, actually trained a particular checkpoint for StyleGAN with rainbows. And this was trained with 5,000 images tagged with the keyword rainbow, sorted for relevance from the Flickr API, using this scraping code to scrape from Flickr, from Sam Lavigne, antiboredom on GitHub.
So if you want to find out more about training your own checkpoint with StyleGAN I would refer you to these resources, which I'll include in the video's description.
OK. So what are you going to do with this? I hope that you use this StyleGAN rainbow model for something fun. But more likely, hopefully what you're taking away from this is the fact that you can write JavaScript code that connects to Runway running a machine learning model that's actually running in the cloud. It could be running locally also, depending on which model you're using from Runway, if it supports that. And then send a post request. Connect via WebSockets. Connect via OSC. Some network connection to Runway. Get the results back. And use that in your own web application.
I would love to see people figure out interesting ways to use this. Like how would you generate the rainbow vectors in such a way that you're kind of doing a random walk through that latent space, that 512-dimensional space? So that's something you could really think about and play with, and render something out perhaps. You might not even need to use JavaScript. You might be able to do this from Processing, for example. But if you make something with this, please share it with me. Go to thecodingtrain.com, to the coding challenge page associated with this particular example, which you'll find linked in this video's description. And may we fill the world with more and more generated rainbows. See you soon. Goodbye. [TRAIN HORN] [MUSIC PLAYING]
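One possible starting point for the random walk suggested above is sketched below. It is only an illustration, not something shown in the video: stepZ is a hypothetical helper, the step size of 0.05 is an arbitrary choice, and clamping to the range -1 to 1 assumes the range observed in the vector exported from Runway.

```javascript
// Hypothetical helper: one step of a random walk through latent space.
// Every component is nudged by at most ±stepSize, then clamped to [-1, 1].
function stepZ(z, stepSize = 0.05, rng = Math.random) {
  return z.map((value) => {
    const nudged = value + (rng() * 2 - 1) * stepSize;
    return Math.min(1, Math.max(-1, nudged));
  });
}

// Example: start from the center of the space and take ten small steps.
let walkZ = new Array(512).fill(0);
for (let i = 0; i < 10; i++) {
  walkZ = stepZ(walkZ);
  // In a sketch, each walkZ would be posted to Runway here, and the
  // returned images shown one after another as an animation.
}
```

Because each step moves the vector only slightly, consecutive outputs should tend to look similar, which is what would make the sequence read as a morphing rainbow rather than a series of unrelated images.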
Coding Challenge #150: AI Rainbows with Runway and p5.js. Published by 林宜悉 on January 14, 2021.