DAVID MALAN: Hello, world. This is CS50 on Twitch. My name is David Malan, and I'm here with CS50's own-- COLTON OGDEN: Colton Ogden. Thanks for joining, everybody. CS50 on Twitch. DAVID MALAN: Indeed. Really happy to be back here. Thank you so much for the invitation. COLTON OGDEN: The master jedi. DAVID MALAN: Yeah, so I hear. Nice to see everyone in the chat here already. We were watching as everyone was saying hello to each other, too, just a bit ago. COLTON OGDEN: Let me pull up the Twitch chat you were just talking about. I can see it over there. Let me actually scroll up. We have quite a few messages that we didn't quite read off just yet, but a lot of people in the chat already. DAVID MALAN: Yeah, Bhavik Knight-- he and I are now BFFs on Facebook. So nice to see you in the chat again. COLTON OGDEN: He deemed you the Master Yoda. DAVID MALAN: Oh, I see. Thank you. COLTON OGDEN: MKloppenburg, looking forward to this one. Bella Kirs, another-- oh, RealCuriousKiwi. That's Brenda. She's started joining us on-- DAVID MALAN: It is. Brenda from New Zealand. Yeah, Brenda, what time is it there back home in New Zealand? COLTON OGDEN: I reloaded the page and the entire chat disappeared. Guess that's how Twitch works. Yes, yeah, you'd have to go back to the actual video, scrub backwards, and it'll replay the chat for you at that moment in time, which is a feature or a bug, depending on how you look at it, I suppose. Yeah, a bunch of people. TZZEK is a new person, I think, with the-- what is that? The dog sensor flex emoji. I'm not actually sure which one that is. Music's pretty loud. We changed the audio a little bit. It should sound a little bit better for our voices. I realize now the music might have sounded a little bit louder as a result of that. But yeah, thanks everybody for joining us. So what are we talking about today?
DAVID MALAN: So today Colton asked me here to talk about regular expressions, which is a topic that in CS50, the open courseware course on edX and here on campus, we don't really spend time on, even though there are definitely some opportunities. So I thought we would introduce them as a solution to problems that you can perhaps tee up for us. COLTON OGDEN: Yeah, and you were my first intro to regular expressions, actually. I think for some scripts you needed me to do stuff and-- DAVID MALAN: Look at that. Coming full circle. COLTON OGDEN: It is. This is a thing called a regular expression, use it. And I was like, I don't know what a regular expression is, but here we are. DAVID MALAN: How about now? COLTON OGDEN: To a degree. Irene is here. She says, hi, David and Colton. And hi, everybody. DAVID MALAN: Nice to see you, Irene, and Brenda, 9:00 AM in New Zealand tomorrow, apparently. So you're a day ahead of us. COLTON OGDEN: Oh, and Minter27, finally, I caught you from the start. Alex Gabrielov, 'sup from Russia. DAVID MALAN: Wow, nice to see you, Alex. All the way here from Cambridge, Massachusetts. COLTON OGDEN: People all over the place. And of course Andre. Andre was actually here when I was testing audio. DAVID MALAN: Andre keeps asking the hard questions. So we're going to take questions maybe at the end, Andre. COLTON OGDEN: If you want, I can bring you in, and maybe we can start. People, feel free to definitely contribute questions, but we don't really have to say as much. People have lots of questions, typically. All programmers and Bhavik Knight. But we'll get started, and if people have questions, maybe they can contribute. DAVID MALAN: Yeah, absolutely. COLTON OGDEN: Oh, by the way, new feature. We now have the people that follow us. DAVID MALAN: Nice. COLTON OGDEN: Something we introduced yesterday. The chat is going to have this grey theme, but the follower theme is kind of in the middle and a little bit more transparent.
But buimik7 has now followed us. So thank you very much for buimik7 for the follow there. DAVID MALAN: Nice, nice little animation there. COLTON OGDEN: Sorry, what I was saying is I'll cut us in here. So I have your screen, if you want to maybe start us off, and if people have questions, they can provide it to us in the chat. DAVID MALAN: Sure, so it sounds like you had a good teacher all those years ago who introduced you to regular expressions. COLTON OGDEN: He was pretty good. He was pretty good. DAVID MALAN: What are regular expressions, then? COLTON OGDEN: I would describe regular expressions as a way of matching patterns in text. So being able to specify characters that can either be specific or generic for a class of characters, defining what's called a grammar, and although I don't know the super deep details on formal grammar definitions and whatnot, but I know that it is a grammar. Computer languages, parsers, typically use what are called grammars to verify that you're using the correct semantic details that define what C is, what Python, is et cetera, and extracts symbols out of your text. And you can do the same thing with the regular expressions. That's I guess how I think of regular expression. DAVID MALAN: Yeah absolutely. And a lot of the validation that you might be doing in any web programming that you've done, or if you've taken CS50 when we do a bit of this in Python and JavaScript, you might just be in the habit of checking things for equality or maybe emptiness. So if the user did not type something in, their input might be empty, or null, or the empty string, depending on the language. Otherwise, there's a value there which might pass validation, but there's so many opportunities to actually check did the user give you what you wanted them to give you. For instance, did they type an actual email address that has an at sign and a username and a domain name. Did they type in a phone number in a specific format that you care about? 
And so many other ways where you care not just about the presence of a string, but what it is formatted as. And even more powerfully, suppose that someone does type in, for instance, a phone number, but some people here in the US use parentheses, some people use hyphens, some people might use a plus and an area code. There's so many different ways you might type in a number. You can actually clean those up pretty readily using regular expressions. COLTON OGDEN: Indeed. And I know an example that's typically seen, I think, would be an email address, and something that's a simple identifier of what an email address is, is usually that at symbol, right? And normally a naive approach would be inspecting every character in a string, if you're in C or in Python, and saying if character is equal to at, well, then I can kind of guess that maybe this is an email address, but what if that's, for example, the first character? Then clearly that would be an invalid email address. Or the last character, that would be an invalid email address. So that naive approach can be very sloppy and prone to a lot of, I think, errors. DAVID MALAN: Yeah, absolutely. And now, Minter27, I see your green screen is a bit messed up. I think this is actually by design. You're hopefully seeing a white border around the bottom and the side, and then a big black box. That's actually my black terminal window. So if that's what you're seeing, that's intentional. We're going to start typing in the terminal window soon. COLTON OGDEN: And then we got MGuudow says hi from Denmark. DAVID MALAN: Hello from Cambridge, Massachusetts. All right, so shall we dive in? I thought we would start with Python, which is a language familiar to some folks, especially if you've been tuning into the stream lately. I'm going to go ahead and just open up, for instance, a hello.py program.
I happen to be using Vim, which is a command line text editor, similar in spirit to the more graphical Atom, or VSCode, or other such tools these days. And this will just allow me to stay within my terminal window environment, but you can follow along with any text editor, if you'd like. And let's just do something simple in Python. So for instance, if I want to get a user's name, I might say something like name = input, and then just prompt them for their name using Python's built-in input mechanism. For those of you who've been following along with CS50, you might know this function as get_string, which actually does a bit more error checking, but the idea is exactly the same. And then let's just keep it simple and say something like print, for instance, hello, and then print out the person's name. So no regular expressions yet, no conditions, no fanciness. Let's just make sure we're getting into the momentum of actually writing a program in Python. COLTON OGDEN: And shoutouts to fdc227. Thank you for the follow, as well. DAVID MALAN: Well, nice to see you as well. So let me go ahead and just run this. Python of hello.py. I'm technically going to use Python 3, the latest version. So let me be ever so specific there. It's asking me for my name, so I'll go ahead and type in Colton. And there we have it. Hello, Colton. Now if I run this program again, which I can do by hitting up in my terminal window's history, which you might know from a Linux or Mac computer or Windows 10, I can play this game again and type in my name, David. And then, here, for instance, we might type in Brenda, our friend in New Zealand, and now we have a program that's very dynamic. But suppose that we're not such fans of Colton, and we don't want him to be able to participate in this, and we don't want to say hello, Colton if Colton is tuning in. So how might we do this? Well, let's go ahead and go back into the program. Again, that program is called vi or Vim, and here's the program at hand.
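A minimal sketch of the program at this point in the stream. The helper name greet, the prompt text, and the exact greeting format are assumptions made for illustration; on stream the name comes straight from input(), which is left as a comment here so the sketch runs without waiting on the keyboard.

```python
def greet(name):
    """Build the greeting that hello.py prints (format is an assumption)."""
    return f"hello, {name}"


# On stream, the name is typed at a prompt, roughly:
#     name = input("Name: ")
# Fixed sample names stand in for keyboard input here.
greetings = [greet(name) for name in ["Colton", "David", "Brenda"]]

for line in greetings:
    print(line)
```

Each run of the real program would produce just one of these lines, depending on what the user types.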
And let me start to add some conditional logic. So for instance, I might say something like, well, if name equals equals 'Colton', well why don't we kind of mess around here and say something like good bye instead of hello. Otherwise, we can go ahead and print out the name as we intend. So still no regular expressions, just using string comparisons now with Python's equality operator, equals equals. And now let me go ahead and run Python 3 of hello.py, and now I'll go ahead and run that. David will go ahead and play along. Very nice and polite. Brenda, very nice and polite. Now we go ahead and type in Colton, and ooh, goodbye, Colton. So not all that polite anymore, but we've just checked for the presence of Colton. So this is all fine and good, but suppose I do this. Huh, I did that quickly, but it actually seemed to work this time. I went ahead and just typed in Colton. Now you can perhaps see it a bit more. Notice here that I'm still greeting you, even though you are Colton. COLTON OGDEN: That very first one did you put a space as well? DAVID MALAN: I did. I secretly put it at the end of the string. COLTON OGDEN: Ah, OK. I missed that part, OK. DAVID MALAN: Indeed, so I did that really fast, but you'll notice that unless Colton's name is exactly C-o-l-t-o-n, it's not actually going to match. So how can we tolerate this? Well, Python actually allows us some ways to do this. If I go into Python, into hello.py again, I could be a little more dynamic, and I could say something like if Colton in name, which will search for a substring of the original string-- look for Colton as a substring of the variable name. Now let me go ahead and run this. So let me go ahead and run Python 3 of hello.py, and I'll go ahead now and type in Colton. Still works. Space space space space Colton, still works. More subtle, Colton space still works. But better yet, Colton Ogden also still works. COLTON OGDEN: OK, so you catch all instances of me. 
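The difference between the two checks can be sketched as follows. The function names and the sample inputs are hypothetical; the two conditions, `name == "Colton"` and `"Colton" in name`, are the ones described above.

```python
def greet_exact(name):
    # Equality: only the exact string "Colton" gets the goodbye.
    if name == "Colton":
        return "goodbye"
    return f"hello, {name}"


def greet_substring(name):
    # Substring test: any input that contains "Colton" gets the goodbye.
    if "Colton" in name:
        return "goodbye"
    return f"hello, {name}"


samples = ["Colton", "Colton ", "    Colton", "Colton Ogden", "David"]
for sample in samples:
    # repr() makes leading and trailing spaces visible in the output.
    print(repr(sample), "|", greet_exact(sample), "|", greet_substring(sample))
```

With a trailing space, the equality check still greets Colton, while the substring check catches every variation.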
DAVID MALAN: I can catch all instances of Colton. Of course, it does-- I don't know, Coltonscopy is Colton's favorite username here, but it will also catch that. COLTON OGDEN: Spelling it wrong, by the way. DAVID MALAN: Coltonscopy. OK, well let's be precise with your name, Coltonoscopy. That too is going to get caught. Why? COLTON OGDEN: Because it has my name in it. It's just a substring of the total name that you have there. So just those first six characters. DAVID MALAN: Yeah, so it's getting a little constrained as to how it's behaving, and we might want to start expressing ourselves a bit more explicitly. Or what if we want to do something else altogether? What if I go ahead and type in my full name, David Malan, and I just want it to print out, hello David? COLTON OGDEN: Right, just the first name. DAVID MALAN: Yeah, so now things are getting a little more interesting, and here's an opportunity to use regular expressions. Now we don't have to, and let's make sure we make clear the different ways in which you can solve problems. So if a human types in David space Malan, and all you want to say to them is hello, David, how do you think about solving this problem? COLTON OGDEN: Well, I think a somewhat naive approach would be look for the first space. DAVID MALAN: OK, so we could look for the first space. So let's try this. Let me go into hello.py again, and let's go ahead and get rid of this and just start the story from when we've gotten the user's name. So if we want to split on the space, how could we do this? Like in C, as you alluded to earlier, oh my god, you could do this so tediously. Iterate over every character in the string, and then actually look for the space and print it out. So let's do that: for c in name, if c equals equals a space, then we can go ahead and break perhaps. Else, we can go ahead and print out, for instance, the letter c.
Now this is going to be a little broken for the moment, but let's see what happens first. Let me go ahead and save this. And I'm going to open a second tab, so that we don't have to keep quitting and opening the program again. So let's go ahead and run hello.py. Let's go ahead and type in Colton Ogden's name. And OK, so we're kind of one step closer to doing this. COLTON OGDEN: You can see that at least the iteration's working. DAVID MALAN: Yeah, it seems to be working, and so that's nice progress. And what's your middle name? COLTON OGDEN: Taylor, T-A-Y-L-O-R. DAVID MALAN: So here, too, it should still work if all we care about is your first name. Now we're opening a can of worms if we want to get your middle name, too. We'll have to come back to that, but let's go ahead now and focus on cleaning this up a little bit. It's printing one character per line. Do you want to propose for folks why that is happening? COLTON OGDEN: Because all we're doing is for every single c in name, which is going to be every character in the name, it's going to make two checks-- well, it's going to make one check. Two checks, possibly one check, but it's going to say if the character's equal to space, break, else print whatever the correct character is. And print in Python, by default, will print out a new line character, unless you specify a separator. DAVID MALAN: Exactly, and so this is-- COLTON OGDEN: Or end of line character. DAVID MALAN: Exactly, and this is a little ugly looking in Python, but you very verbosely have to say the end of my string should not be the default, which is \n, but rather it's just, for instance, the empty string, thereby overriding it. Looks atrocious, but unfortunately this is the way it is. And it kind of goes both ways. In C, by contrast, you don't get the new line for free. You actually have to put \n almost everywhere, unless you don't want it. So Python optimized for, presumably, the common case.
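A sketch of the character-by-character loop with the end="" override. The function name print_first_word is hypothetical, and the output is captured into a string only so the result can be inspected; the loop body follows the steps described above. The bare print() at the end is one way to supply the single newline that the loop no longer emits.

```python
import io
from contextlib import redirect_stdout


def print_first_word(name):
    for c in name:
        if c == " ":
            # Stop at the first space: everything before it is the first name.
            break
        # end="" overrides the default "\n", so letters stay on one line.
        print(c, end="")
    # One newline after the loop finishes.
    print()


# Capture what the function prints so it can be examined as a string.
buffer = io.StringIO()
with redirect_stdout(buffer):
    print_first_word("Colton Taylor Ogden")
captured = buffer.getvalue()

print(repr(captured))
```

Dropping end="" from the inner print call reproduces the one-character-per-line output seen on stream.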
COLTON OGDEN: We have a few comments and I thought that maybe we'd go through some of those and get us up to speed here. Last I remembered seeing was the hi from Denmark, so unfamiliar4 says bang uptime. I'm not sure-- do you you know that reference to. DAVID MALAN: Let me see. How do we spell that? COLTON OGDEN: Bang uptime. DAVID MALAN: Well, let's try it. So bang up time. I'm not sure if that's what you meant, but-- COLTON OGDEN: Not sure what that is, but thank you. I'm hoping maybe that just means, hey, we're up. We're streaming, it's the time. DAVID MALAN: We are up, yes. COLTON OGDEN: Minter27 just said that that was what he was referring to was the black background. He thought that was the green screen. DAVID MALAN: Oh no. That's our black screen. COLTON OGDEN: Hybridpenguin. Hello David and Colton. I'm a CS graduate from Lunds University in Sweden and love your stream. It's really fun watching these streams on YouTube before work, but today I could finally join the stream. DAVID MALAN: Very nice. Glad to see you live as well. COLTON OGDEN: So thank you very much, Hybridpenguin. Asley says, did you know that Python is named after Monty Python? DAVID MALAN: I did actually. COLTON OGDEN: And that's an homage to yesterday's stream with Veronica. That was one of the things that she brought up first us. DAVID MALAN: Yes, I did. I saw part of that yesterday, too. And if you haven't, you can go on CS50's Twitch channel, look at the last few streams, in fact, as well as on YouTube.com/CS50. COLTON OGDEN: Yeah, yesterday's kind of ties into today because Veronica talked about a lot of Python stuff. So super, super, awesome stream. Blowintothecartridge says hi, David and Colton. Watching you guys from Switzerland. DAVID MALAN: Hello from Cambridge. COLTON OGDEN: Keep these streams up. Very, very educational. Thanks for all the great content. Thank you blowintothecartridge. We have seen you before on stream. fdc227, hello from the University of Bristol. 
Thanks for making knowledge available around the world. We have a lot of coverage today. DAVID MALAN: Indeed all over the world. Welcome from Harvard University. COLTON OGDEN: Cloudxyzc says hi. Hello, Cloud. Forsunlight, who is Fatima on Facebook, Hello Colton and David, regulars, and everybody. Thank you, Fatma, for joining us. And Cloud remarks seems Europe is well represented here. DAVID MALAN: Indeed. COLTON OGDEN: Lots of hellos. Lots of hellos. And then Bhavik Knight says name.split. DAVID MALAN: Oh, spoiler! But good segue, too. COLTON OGDEN: Yeah, we can take a look at that maybe next. I'm assuming that's probably where we're going. DAVID MALAN: Yeah, indeed. So let me turn my attention back to the code where we just left off by adding Colton's fix, where we changed the end of line to quote unquote. Let me go ahead and rerun this now with just Colton Ogden, since his middle name doesn't really add much to the demonstration. And, oh, so close. Now it just looks stupid. COLTON OGDEN: Now we need that new line character. DAVID MALAN: We're going to need it somewhere. So you know what? Let me just go ahead and put it at the very end of the program. We get one for free. We can just call print like this, and now we don't have to worry about being inside the loop anymore. All right so let's go ahead and run this instead. Python of hello.py, Colton, voila. Now we've printed this all out. COLTON OGDEN: It looks great. DAVID MALAN: So we're on our way here, right? We've done a nice heuristic by just looking for the space, but honestly this is pretty tedious, and it feels very C like to iterate over the entire string, looking for some special character. It's not wrong. It's perhaps not well designed because we could abstract this away, and as Bhavik Knight proposes, we can actually use built in functionality. So let me go ahead and do that instead. Let me go ahead and say something like this. 
Components gets name.split, and split on something like, quote unquote, with a space in the middle. So split, if you're unfamiliar, let's go ahead pull up the Python documentation here. Python 3 str split. str of course implying, a string in Python, the data type. Let's go-- COLTON OGDEN: We actually made a reference to how you're not a huge fan of the Python documentation, yesterday. DAVID MALAN: No, I already have misgivings about pulling this up. COLTON OGDEN: And Veronica was saying how much she is a fan of the Python documentation. DAVID MALAN: No, we hereby retract all of yesterday's claims to the contrary. Python's documentation is not very good, if only because it's very arcane. It's incomplete. It leaves too much to the reader's imagination. So here we have str.split. Notice that it takes in two arguments, both named arguments potentially, the separator, as implied by sep which I specified as quote unquote with a space in the middle, and then max split which tells you how many maximal substrings you want to get back in case you care. Negative one the default of no limit whatsoever. So let's just take a look here. These little screenshots in the documentation, if I zoom in here in green, are what are using Python's interactive interpreter. So some human in making the documentation typed this into their Mac or PC and pretty much just copied and pasted the output and put a green box around it. So for instance, if you had a string that had 1,2,3 and you call split on it, passing in comma as the split separator, well, you're going to get back this, a data structure in Python of type list, which is like a dynamic array, which has 1 and 2 and 3, which are not numbers. They themselves are strings or substrings, specifically. Here too notice we can max out the number of return values that we actually get in that list. Here we're getting 1 and then 2,3 all as one substring because max split was specified as only split it once for us. 
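The documentation examples just described, plus the name split from hello.py, can be reproduced in a few lines. Variable names other than components are assumptions; str.split and its sep and maxsplit parameters are standard Python.

```python
line = "1,2,3"

# Split on every comma: three substrings (strings, not numbers).
all_parts = line.split(",")

# maxsplit=1 splits only once, leaving the remainder as one substring.
one_split = line.split(",", maxsplit=1)

# The stream's own case: split a full name on a single space.
name = "Colton Taylor Ogden"
components = name.split(" ")

print(all_parts)
print(one_split)
print(components)
```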
And then here, we're getting back everything, including an empty string because we now have two commas in a row. So just one tool in your toolkit if you've never used split before. It's just a useful way of literally splitting a string. COLTON OGDEN: And I think you can actually not specify the space and it will still default to spaces, right? DAVID MALAN: Well, let's take a look. When in doubt consult the documentation, except when the documentation doesn't say. So using sep as the default delimiter string, that's it, OK. So if sep is given, consecutive delimiters are not grouped together, dot, dot, dot, duh, duh, duh. Splitting an empty string with specified separator. COLTON OGDEN: Oh, here it is. It's right here. DAVID MALAN: There we go. So if sep is not specified, as you propose, or is None, which is the default per the signature up there, a different splitting algorithm is applied. Runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace. Consequently, splitting an empty string or a string consisting of just whitespace with a None separator returns the empty list. COLTON OGDEN: It's kind of like a combination of strip and split. DAVID MALAN: It is. So it normalizes the space for you. So if you've got multiple spaces in between Colton and Ogden, you're going to split on that. So actually, this is going to be a nice setup for what are regular expressions because we can split on exactly that. COLTON OGDEN: Right, nice. DAVID MALAN: All right. So let's go back to the code here and see what we get back. I called the return value components, and let's just go ahead and, for the moment, print out components to see what's going on inside there. Now I'm going to go over here. I'm going to go ahead and run Python 3 of hello.py.
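To make the two splitting behaviors concrete, here is a small sketch. The sample strings and variable names are assumptions chosen for illustration; the behavior shown is that of Python's str.split as read from the documentation above.

```python
messy = "  Colton   Ogden "

# With an explicit separator, every single space splits, so runs of
# spaces and leading/trailing spaces produce empty strings.
explicit = messy.split(" ")

# With no separator, runs of whitespace count as one separator and
# leading/trailing whitespace is ignored.
default = messy.split()

# Two commas in a row (and a trailing comma) also yield empty strings.
with_gaps = "1,2,,3,".split(",")

# A string of only whitespace splits to an empty list by default.
only_spaces = "   ".split()

print(explicit)
print(default)
print(with_gaps)
print(only_spaces)
```

This is why Colton describes the no-argument form as a combination of strip and split.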
Let's go ahead and type in your full name this time so we get back as many components as possible, and you'll see, just like the documentation, we got back 1, 2, and 3, or Colton, Taylor, Ogden. So how do we go about getting just the first name? If we're assuming a name-like structure with first name, last name, how do we get the first, would you say? COLTON OGDEN: Well, with lists in Python, you can index into them just like arrays. Unlike Lua, Python is indexed at 0 by default, which is what most programming languages are indexed at. So you could just say, if you're getting the first element, components index 0, components square brackets 0. DAVID MALAN: OK, good. So we can do exactly that. So instead of printing components, let's print out components 0, go back to my code here, rerun hello.py, and run Colton Taylor Ogden, and voila, we're back where we started. COLTON OGDEN: Easy. DAVID MALAN: All right, so easy peasy. Has nothing to do with regular expressions yet, and that's because we've deliberately confined ourselves to pretty simple inputs. So let's make it a little more complicated and, instead of actually getting, say, just an individual name out of it, let's suppose that your input isn't your name, but your email address. Now it's getting a little more interesting, and we're not going to care so much who you are, but that you've given us a valid email address. So let's undo all this, and let's change our variable to be called email. Let's change our prompt to say email. And now let's just say something like, if email, let's just do something like 'Thanks for the email'. Else, let's just go ahead and say, where is your email? So this is about as simple as validation can get, and you might recall doing this in the context of web programming. Just checking if a string is actually present or not. So let's now go to the program here, run hello.py, and type in, oh forget it.
I'm not going to type anything. Huh, well where is your email? So that's a nice little sanity check. All right, let me go ahead and type in my email address. Thanks for the email, but let's just say, ooh, and just type in anything. COLTON OGDEN: Oh, thanks for the email. DAVID MALAN: Unfortunately, now we have a validation problem. It'd be really nice to ensure that, no, you've got to cooperate and give us a valid email address. So how do we do this with split or equals equals? COLTON OGDEN: Well, the first thing that we can do most-- well, I would say all emails do need to have some sort of symbol to specify the-- I don't know what the technical name for it is, but the name and then the domain and subdomain. So we could just check to see whether there's an at symbol in the string as the very first step. DAVID MALAN: OK, so let's do that. So let's check for an at symbol, and we did something like this before when we checked for a Colton. Let's just check for an at symbol. So if at sign in email, let's go ahead and say 'Thanks for the email', else, we can go ahead here and say print 'Where is your email?' again. All right, so you can probably see where this is going. It's not going to be a perfect program, but here we go. All right, malan@harvard.edu. Nice. cogden@cs50.harvard.edu, nice. Colton is a, ah, hmm, interesting. So here in the US, if you just put random punctuation in a sentence, it often means an expletive. Unfortunately-- COLTON OGDEN: A lot of unflattering Colton related topics today. DAVID MALAN: Yeah, well I'm just venting, really, today here on the internet. But so it, of course, had an at sign in it, which was at the start of a subject-- I mean, which just had an at sign somewhere, and unfortunately that's the only question we're asking. So we have to be more precise. It can't just contain an at sign.
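Both validation attempts so far, sketched as functions. The function names and the punctuation-laden sample string are hypothetical stand-ins for what was typed on stream; the two conditions, `if email` and `"@" in email`, are the ones described above.

```python
def check_present(email):
    # First attempt: any non-empty string passes.
    if email:
        return "Thanks for the email"
    return "Where is your email?"


def check_at_sign(email):
    # Second attempt: the string merely has to contain an at sign.
    if "@" in email:
        return "Thanks for the email"
    return "Where is your email?"


samples = ["", "ooh", "malan@harvard.edu", "Colton is a @#$%!"]
for sample in samples:
    print(repr(sample), "|", check_present(sample), "|", check_at_sign(sample))
```

The last sample passes the at-sign check even though it is plainly not an address, which is the gap the rest of the stream sets out to close.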
COLTON OGDEN: So then we need to say, basically, make sure the at sign, first of all, can't be the start or the end, because the at is the specifier for, you have some name, some user, and then they belong to some domain dot whatever. So it can't be at the beginning or the end, and thanks DragonQuestSlime for following. We have some chat to catch up on, but we can maybe do that after the next example. DAVID MALAN: Now Bhavik Knight proposes splitting on the at sign here. COLTON OGDEN: We could do that. DAVID MALAN: Unfortunately, that's going to still be vulnerable to a different sort of threat. If we have multiple at signs in the email address, even though we don't want those, we're going to get multiple parts. And we could check for that, to be fair, but it's not going to be quite as clean. It'd be a lot nicer if I can say, this is the format that I expect. Does the user's input actually match this, so to speak. COLTON OGDEN: The more of these sort of if statements, I think, that we can avoid is ultimately the goal. DAVID MALAN: Absolutely. So let's start to do this a little more sophisticatedly, if you will, and instead of just checking sort of loosely for the presence of an at sign, let's see if someone's email input is user name at domain. And we'll define it only at that super high level for now. So how might I go about doing this? Well, it turns out we can use the regular expression library. So a regular expression is a string that, as you said as we began today, is a pattern of symbols, numbers, letters, punctuation, and included in many languages is support for matching regular expressions and checking whether the user's input matches, indeed, some pattern you intend. So RE stands for regular expression. You might verbally abbreviate regular expression as RegEx, for regular expression. And so if I import this library, I'm going to have access to a whole bunch of Python functionality that comes related to regular expressions.
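For comparison, this is roughly what the split-based check would look like if it were pursued. It is a sketch of the alternative being set aside, not the approach the stream goes on to use; the function name and sample inputs are assumptions.

```python
def split_check(email):
    parts = email.split("@")
    # Exactly one at sign yields exactly two parts; all(parts) then
    # rejects an empty user name or an empty domain.
    return len(parts) == 2 and all(parts)


samples = [
    "malan@harvard.edu",   # one at sign, text on both sides
    "@harvard.edu",        # at sign at the start
    "malan@",              # at sign at the end
    "a@b@c",               # multiple at signs give three parts
    "no at sign here",     # no at sign gives a single part
]
for sample in samples:
    print(repr(sample), split_check(sample))
```

Every extra rule means another condition bolted on, which is the pile of if statements Colton would like to avoid.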
COLTON OGDEN: Do you think there's a RegEx versus RegEx war like there is with gif and jif? DAVID MALAN: Probably. RegEx, I say RegEx. What do you say? COLTON OGDEN: I say RegEx as well because you taught me RegEx. DAVID MALAN: Well, you learned well. RegEx, I mean that's fair because it's regular expression, and yet here I am saying RegEx. I just feel like it flows more-- COLTON OGDEN: Well, we say char as well, but it should probably be care. DAVID MALAN: Yo, that's horrible. No one should ever say care for character. COLTON OGDEN: Oh, man. DAVID MALAN: Anyhow, English is messy. As are all our languages. Oh, wait a minute, MKloppenburg, a little spoiler here. Yes, there are even more sophisticated ways of doing this. But we'll get there. We'll get there real soon. COLTON OGDEN: Andre had a funny thing he said earlier when we were talking about the Python documentation. DAVID MALAN: OK. COLTON OGDEN: Oh man, how far up was it? DAVID MALAN: Forsunlight, was it? COLTON OGDEN: No, Andre said-- oh, where was it? Oh yeah, Forsunlight, which is Fatna, said, does CS50 have any plan to improve the Python documentation? Then Andre says, with a flamethrower. DAVID MALAN: [LAUGHS] That would not be inappropriate. I say this mostly with some historical context. For many years, CS50 actually introduced students to PHP at the end of the semester, instead of Python, the upside of which was PHP is even closer to C's syntax. It's pretty much C syntax with dollar signs in front of variables and a few other changes. But its documentation is outstanding, honestly, especially for newbies to programming. It always has nice examples. It's standardized in how it presents its arguments to functions, return values to functions. There is often threaded discussion that's filtered out so that you have really good questions about the function or the library.
So we gave that up when we switched to Python, which really assumes, I think, a more comfortable demographic, and also an audience that is OK with incompleteness. So unfortunately, we are now among them. COLTON OGDEN: I think the thread idea would be great for the Python documentation because that's actually really smart. DAVID MALAN: The thread, what do you mean? COLTON OGDEN: Having threaded discussion that gets filtered. DAVID MALAN: Oh yeah. COLTON OGDEN: Like Reddit, for example. I think that's a great idea. DAVID MALAN: Indeed. COLTON OGDEN: Did we miss any other questions? I'm going to make sure we didn't. We have people who suggest-- so Cloudxyzc has been suggesting a bunch of different things, checking for the at symbol. He's been following along, and David's real feelings come out, he says in reference to the Colton curse joke. DAVID MALAN: Yeah, indeed. COLTON OGDEN: And let's not get started with the gif pronunciation. DAVID MALAN: And you mean the jif? COLTON OGDEN: I say jif. I think because-- DAVID MALAN: I grew up saying GIF, but then I decided to go to the source. And if you actually look at the author who created GIFs, he has asked that we call it jif. I think he's the only one with a say in this situation. COLTON OGDEN: Yeah, yeah, yeah, and then Nikolai says Python doc's littered with inconsistencies. DAVID MALAN: Yes, no. So anyhow, without getting too far off track. We've only got two lines of code here. We've got to take this home. So we've just gotten the user's input, stored it in a variable called email, and let's now ask a more precise question, whether it looks like an email. And there are some more spoilers in the chat window here, and we'll come back to those. Those are indeed good next steps, but let's just start to ask this question. So how might we do this? Well, it turns out you could say something like this. If re dot, hmm, how do we do this?
Well, I'm going to call it search, and then I'm going to go ahead and specify a regular expression. Now what's a regular expression? It's going to be something at something else, ultimately. And then I'm going to go ahead, and say print out 'Thanks for the email'. And if it doesn't match that, I'm going to go ahead and say the familiar before, 'Where is your email?'. COLTON OGDEN: And to be clear, so this string that you're putting in re.search, the function-- [PHONE RINGING] Oh, I apologize for that. The function that's in re-- or the string that's the argument to re.search-- DAVID MALAN: Are we going to call it re and not r-e now? COLTON OGDEN: r-e I'm sorry. It's just little easier for me. DAVID MALAN: I'm going to say r-e, but that's fine. COLTON OGDEN: So r-e dot search. That string is basically sort of an abstract representation of what you're looking for. DAVID MALAN: Exactly, indeed. So something is not what we're actually looking for, but let's get there. Let's just take baby steps. Let me save this. Let me go ahead and rerun the program. And let me type in my email address. It's malan@harvard.edu-- not going to validate, because that is literally not something at something. And, in fact, I screwed up entirely, missing one required positional argument string. COLTON OGDEN: OK. DAVID MALAN: So it turns out I'm typing too quickly. I actually have to search a specific string. What is it that I actually want to search needs to be the actual user's input. COLTON OGDEN: Right. DAVID MALAN: So let me actually search the email variable for something at something. COLTON OGDEN: UsmanJafri, thank you very much for joining us. Oh, and Eleevas at the same time, thank you for joining us both. DAVID MALAN: Welcome aboard. So now let's go ahead and run this again. Let me go ahead and type in my actual email address. And of course it doesn't validate because it is not something at something. Let's go ahead and do that, something at something, Enter. OK. 
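The program at this point in the stream might look roughly like the sketch below. The function name respond is an assumption made for illustration (the stream reads input interactively instead), while the two messages and the literal pattern come from the discussion above.

```python
import re

def respond(email):
    # re.search takes the pattern first and the string to search second.
    # Passing only the pattern raises a TypeError about a missing
    # positional argument, which is the mistake made on stream.
    if re.search("something@something", email):
        return "Thanks for the email"
    return "Where is your email?"

print(respond("malan@harvard.edu"))
print(respond("something@something"))
```

Because the pattern contains no special characters yet, only input that literally contains something@something is accepted.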
So baby step, it's not the end goal, but at least we're one step closer. We now need to generalize what the first something is and the second something. COLTON OGDEN: Yeah, because right now it looks like it's just literally looking for something and something. DAVID MALAN: Yeah. So it turns out, let's start small here and just search for harvard.edu specifically. COLTON OGDEN: OK. DAVID MALAN: So now this still isn't quite correct, but now if I go back to my code and do something at something, that's not going to work anymore because it has to be Harvard.edu. And indeed, something at harvard.edu could now actually work. COLTON OGDEN: OK. DAVID MALAN: Now it turns out that if we go ahead and do malan@harvard.edu that too is not going to work. But if I go in here and stop expecting literally something, and just look for at harvard.edu-- let's go ahead and save this, go back to my program, and now go ahead and run malan@harvard.edu. Now we're getting somewhere. And I dare say, if we go and search for say, dmalan@harvard.edu-- slightly different user name-- that seems to be working. But, but, but if we search for cogden@cs50.harvard.edu, what you think? COLTON OGDEN: It's not going to work because there's the CS50 subdomain in front of it. DAVID MALAN: Exactly. So where is your email? It's not recognizing you, even though clearly that's a good looking email address. COLTON OGDEN: Can we bring up the source code one more time? I think I missed the exact step. DAVID MALAN: Sure. COLTON OGDEN: re.search@harvard-- oh, OK. Gotcha. Because it's looking basically just for that substring. DAVID MALAN: Exactly. COLTON OGDEN: Make it true, OK. DAVID MALAN: Now we could get rid of the at sign and say, OK, well this will now detect Colton's email address too, but now he could be at not Harvard.edu and that would still match. And so it's getting a little tricky to express precisely, yet generously, exactly what kind of string we're looking for. 
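A small sketch of the trade-off just described, using the two literal patterns from the stream. The address colton@notharvard.edu is a made-up example, not one typed on stream.

```python
import re

emails = ["malan@harvard.edu", "dmalan@harvard.edu", "cogden@cs50.harvard.edu"]

# With the at sign, the pattern is too strict: the cs50 subdomain is rejected.
with_at = {e: bool(re.search("@harvard.edu", e)) for e in emails}

# Without the at sign, the pattern is too generous: any text that merely
# contains "harvard.edu" is accepted.
candidates = emails + ["colton@notharvard.edu"]
without_at = {e: bool(re.search("harvard.edu", e)) for e in candidates}

print(with_at)
print(without_at)
```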
So it turns out that re.search takes as its first argument, not just a string, which is what I've been using, but a more general regular expression. And a regular expression is, again, a pattern. And in that pattern you can use special place-holders for strings or for substrings and characters. So in particular, if I want to say something, so to speak, conceptually, I can actually say, put any character there, and then expect the at sign. And if I want to expect two characters, I can do this-- three characters, four characters, five characters or so forth. Or if I'm not sure how many characters, I can say zero or more characters. COLTON OGDEN: OK. DAVID MALAN: Or that's a little weird for an email address, because I do want a user name there. So I can actually say, one or more characters. COLTON OGDEN: So the star is, it could be 0. It could be nothing. DAVID MALAN: 0 or more. COLTON OGDEN: Plus means any positive number of characters. DAVID MALAN: One or more, exactly. COLTON OGDEN: And then the dot is just a wild card. DAVID MALAN: A wild card that, for the most part, signifies any possible character-- small white lie. We have implications with whitespace and other special characters. But for the most part, it means any letter, number, or punctuation symbol. COLTON OGDEN: OK. It looks a lot more flexible now. DAVID MALAN: All right. So it's still not going to handle you just yet, but it is going to handle me, it would seem. So let me go ahead and save this. Let me go back to my program, clear the screen and start fresh. Let's go ahead and search for malan@harvard.edu, still working. dmalan@harvard.edu, still working. @harvard.edu, not working. COLTON OGDEN: Right, because you've specified. It has to be at least one character-- DAVID MALAN: At least one character-- COLTON OGDEN: Before. And is that technically true? Do emails have to have-- can emails be one character long as the first character of the subject?
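The difference between star and plus can be checked with a short sketch. The helper name matches is an assumption for illustration. The last call shows the "small white lie": by default the dot does not match a newline.

```python
import re

def matches(pattern, text):
    # True if the pattern is found anywhere in the text.
    return re.search(pattern, text) is not None

star = ".*@harvard.edu"  # zero or more characters before the at sign
plus = ".+@harvard.edu"  # one or more characters before the at sign

for text in ["malan@harvard.edu", "dmalan@harvard.edu", "@harvard.edu"]:
    print(text, matches(star, text), matches(plus, text))

# A newline does not count as "any character" for the dot by default.
print(matches(plus, "\n@harvard.edu"))
```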
I've seen, like, g.harvard.edu-- DAVID MALAN: One is fine. COLTON OGDEN: --as a subdomain. DAVID MALAN: I am pretty sure you need at least one though. The email spec is actually super complicated. And someone proposed earlier that we use a library. That is going to be the best solution in the end, because the format of an email address, even though most of us have pretty normal looking email addresses, there can be some funkiness in there. But I'm pretty sure you need at least one character. So the plus is appropriate. But there's a way around this. Suppose that you forgot about the plus operator-- you could say, well, give me one character, and then give me zero or more of another. But plus exists just to express that same syntax, dot dot star a little more succinctly. COLTON OGDEN: Gotcha. It makes sense. DAVID MALAN: All right. So unfortunately-- let me go back to the previous version using the actual plus. It'd be nice if we could actually support my email address and your email address. COLTON OGDEN: Right. DAVID MALAN: So how do we go about expressing that? We kind of want to support something dot harvard.edu, but also no such something. COLTON OGDEN: So then maybe the zero or more thing we looked at earlier. DAVID MALAN: Yeah. So we need a way to kind of express conditionally, maybe it's there, maybe it's not. But if it is there, there's only one of those things. So let me go ahead and say, well maybe we'll support CS50 email addresses dot, but you know what, let's kind of make this optional. And so I can use some special syntax. I can use a parenthesis around-- whoops. I can use a parenthesis to the left and to the right of what I want to make optional. COLTON OGDEN: OK. DAVID MALAN: And then how do I say optional here? I don't want zero or more, because I don't want it to CS50, CS50, CS50, CS50 dot harvard.edu. COLTON OGDEN: So I'm thinking like, something or something else. DAVID MALAN: Or could potentially work. 
So you could actually express or, and you could literally say, a vertical bar, which means, or this. And of course there's nothing there, because my thought has ended with the parentheses. That looks a little weird. So I probably wouldn't typically do that. How else might we do this? COLTON OGDEN: Besides the or, I'm not-- I'm not sure, because the or's my first thought. DAVID MALAN: It's the right instinct. But there's just often multiple ways to express this. And so we've looked at star, which is zero or more, plus, which is one or more. There also is question mark, which is 0 or 1. COLTON OGDEN: Oh, OK. 0. OK. So it's limited between just 0 or 1. DAVID MALAN: 0 or 1. So it's there or it's not. COLTON OGDEN: We can't do CS50 dot CS50 in this case. DAVID MALAN: Exactly. COLTON OGDEN: OK. That makes sense. DAVID MALAN: All right. So let me go ahead now and save this. We're saving hello.py. Let me go back here, clear the screen, just so we can start fresh, and go ahead and type in malan@harvard.edu. Thanks for the email. And here's the test of dmalan@harvard.edu. Looking good. @harvard.edu, not looking good. That's expected. COLTON OGDEN: Right. DAVID MALAN: And here we go, cogden@cs50.harvard.edu? Thanks for the email. COLTON OGDEN: Nice. It worked. DAVID MALAN: So now we're detecting both of our strings here. COLTON OGDEN: Awesome. It's become a lot more robust. DAVID MALAN: Yeah. Now of course CS50 is offered at Harvard and Yale. So some of our staff have cs50.yale.edu email addresses. How can we go about expressing this then? COLTON OGDEN: Maybe for that we could use the or possibly, CS50 or Yale, to limit the two, right? And we could maybe do it within the parentheses where the CS50 dot is? DAVID MALAN: Sure. So we could say, cs50.harvard or cs50.yale. COLTON OGDEN: And then you can get rid of-- oh. DAVID MALAN: We'd have to get rid of this, out here. COLTON OGDEN: OK. DAVID MALAN: But I think we can shrink this a little bit. There's a little redundancy here.
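The three ways of treating the cs50 group discussed above can be compared side by side. This is only a sketch: the variable names and the address x@cs50.cs50.harvard.edu are assumptions for illustration, and the dots are still unescaped, as they were at this point in the stream.

```python
import re

def matches(pattern, text):
    return re.search(pattern, text) is not None

optional = ".+@(cs50.)?harvard.edu"  # question mark: group appears 0 or 1 times
repeated = ".+@(cs50.)*harvard.edu"  # star: group may repeat any number of times
either = ".+@(cs50.|)harvard.edu"    # vertical bar with nothing on the right

for text in ["malan@harvard.edu", "cogden@cs50.harvard.edu",
             "@harvard.edu", "x@cs50.cs50.harvard.edu"]:
    print(text, matches(optional, text), matches(repeated, text), matches(either, text))
```

The question mark and the empty alternative accept the same inputs here; only the star lets the subdomain repeat.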
COLTON OGDEN: You can get rid of the CS50 or take it outside of the-- DAVID MALAN: Yeah. COLTON OGDEN: --harvard or yale. DAVID MALAN: So let's unwind here. So this is where we started. COLTON OGDEN: Right. DAVID MALAN: If I know that the domain name now is going to be-- whoops-- is going to be harvard or yale, I can literally express exactly that-- COLTON OGDEN: Right, OK. DAVID MALAN: --and just say this. And the vertical bar, much like a bitwise or in C or other languages, just means harvard or yale. And notice, you might be inclined to be nice and stylistically pretty and do something like this, like you might in actual programming languages. This is not good though, here, because you are now literally saying, give me a space, then harvard then a space, or give me a space then yale then another space. COLTON OGDEN: Right. DAVID MALAN: So don't try to over engineer your style. Just say exactly and only what you mean. COLTON OGDEN: It's very much whitespace sensitive. DAVID MALAN: Indeed. Now some folks might be inclined here to put a question mark here. Do I want to do that though? COLTON OGDEN: Well no because you do want at least some domain, right? DAVID MALAN: Exactly. We want some domain there. So harvard or yale. So we want one and only one, which is just implied by just typing it out. COLTON OGDEN: Right. DAVID MALAN: All right. So let's try this. So let's go ahead and save this. Let's go over here. Let's go ahead and run on malan@harvard.edu. Let's go ahead and run it on cogden@cs50.harvard.edu. Let's go ahead and run it on malan@cs50.yale.edu. COLTON OGDEN: Wow. DAVID MALAN: We're looking pretty good. It's pretty versatile. COLTON OGDEN: Great. We're limited to the world of Harvard and Yale, but still-- DAVID MALAN: At the moment. Yes, indeed, at the moment. If you have a long list of schools, it's going to get messy quickly. But notice this general principle.
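A sketch of the harvard-or-yale pattern, alongside the "prettier" spaced version that David warns against. The variable names and the address malan@mit.edu are assumptions for illustration.

```python
import re

def matches(pattern, text):
    return re.search(pattern, text) is not None

plain = ".+@(cs50.)?(harvard|yale).edu"
spaced = ".+@(cs50.)?( harvard | yale ).edu"  # the spaces are part of the pattern

for text in ["malan@harvard.edu", "cogden@cs50.harvard.edu",
             "malan@cs50.yale.edu", "malan@mit.edu"]:
    print(text, matches(plain, text), matches(spaced, text))

# The spaced pattern only accepts input that literally contains the spaces.
print(matches(spaced, "malan@ harvard .edu"))
```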
Like, honestly, if you were to look at this string, especially being new to regular expressions, I have no idea what this means. But notice how we built it up incrementally. And even to this day, 20-some odd years after learning regular expressions, I do this too. I start matching on the simplest thing possible, test it. Add a little piece, test it. Add another little piece, test it, so that you actually understand everything that's going on. COLTON OGDEN: Forget a little syntax, Google it. DAVID MALAN: Yes. COLTON OGDEN: Forget syntax, Google it. DAVID MALAN: Sure. Indeed. Because it looks very cryptic otherwise at first glance. COLTON OGDEN: I was thinking maybe we can take a few questions. DAVID MALAN: Sure. Let's take a look. COLTON OGDEN: We can take a few comments. I'm going to scroll up here's and see where we stopped at. Oh, Nikolai was referring to the PHP code. It's Inconsistent documentation. It's been forever since I looked at the PHP documentation. Japhics interchange format, says MKloppenburg, in reference to JIF. DAVID MALAN: Oh, yep. OK. Indeed. COLTON OGDEN: Oh, Nikolai with some inappropriate-- so from yesterday as well. So if we could keep it PG, keep it kid-friendly, that would be much appreciated. Do not want to ban anybody from the chat. But we cannot have any of that sort of thing continuing to go on. Appreciate the enthusiasm though. Nick Napoli-- is that the running zombie from the game, Dead Ahead? I'm actually not sure. It's a very ubiquitously seen Twitch widget, I think. I'll have to Google that actually, referring to the follow zombie I think. A lot of comments about the profanity. TwitchHelloWorld wrote, you mean plus, not star, right? I think he's referring to-- well we covered both of them. So I think as soon as he wrote that, you probably covered the both of them, the star and the plus? DAVID MALAN: Indeed. Yep, yep. COLTON OGDEN: OK, just making sure. DAVID MALAN: Any Twitch, hello world? COLTON OGDEN: Twitch, hello world. 
So here plus prototype type plus, though I thought Colton said star. I'm not sure. Did I say the wrong one? DAVID MALAN: I don't recall, to be honest. But let's scroll down further. I think Twitch hello world has to-- COLTON OGDEN: Oh, right. It was just explaining afterwards and he got mixed up. DAVID MALAN: Yeah. OK. Down here. COLTON OGDEN: Oh, gotcha. OK. Are plus, star, and question mark defined in the function re.search or in the general Python documentation? Thanks. DAVID MALAN: Technically the general documentation, because there are other functions that support regular expressions in Python. There's another function called match, instead of search, which is almost the same, except it starts searching at the beginning of a string. Frankly it's not all that much more compelling, though there might be an optimization gain there. But the syntax is actually derived from earlier languages. Python has simply incorporated them into its own syntax. So the argument to re.search and the argument to re.match and other functions too potentially use that standard regular expression syntax. COLTON OGDEN: And I believe you can type an r in front of the string. And in most text editors it will syntax highlight the regular expression, right? DAVID MALAN: Not my version of them here. COLTON OGDEN: Oh, OK. DAVID MALAN: Yeah. And that actually stands for a raw string, which just tends to be used for regular expressions to escape certain characters. COLTON OGDEN: Gotcha. Gotcha. Blah, blah, blah. Oh, it says MKloppenburg tossed the R-E. I'll say, R-E not RE, the R-E documentation link in chat. How does it differentiate between a wild card and a literal dot? DAVID MALAN: Woo, really good question. At the moment, it doesn't. And in fact I've been a little sloppy here because it turns out-- let's see if I can do this. Let's go ahead and run this program with malan@harvardxedu. COLTON OGDEN: Ugh. DAVID MALAN: Ooh.
COLTON OGDEN: Because the dot is just saying, any character you want here, whether it's a period or whether it's something else. DAVID MALAN: Indeed. So even though I've specified a dot and it looks perfectly sensible, harvard.edu, yale.edu, dot indeed means any character. So you'd have to escape it. And this is true in a lot of languages, Python among them. Anytime you want to say, no, I mean a literal dot, often the answer is just to escape it, the convention for which is a single backslash. So this now means, not a special place-holder for any character, but literally a dot. COLTON OGDEN: Much like you would see for most-- what's it called? Escape characters in C. DAVID MALAN: Yep, exactly. Backslash N, backslash T, backslash A, any number of other ones as well. So let's go ahead and rerun this now, malan@harvardxedu. That does not work now. But if I go ahead and type in malan@harvard.edu, now in fact it works. COLTON OGDEN: Nice. DAVID MALAN: So I should, for good measure, go back in and even fix this here, because I do want, literally, CS50 dot, if it's present, but I don't want to put it here-- COLTON OGDEN: Right. DAVID MALAN: --because then your user name would have to be dot, literally. COLTON OGDEN: Or more than one dot, right, because you have the plus? DAVID MALAN: One or more dot, indeed. Exactly. COLTON OGDEN: I'll make sure we didn't miss anyone. EMAAAAAN says hello. I'm Emaan. So thank you very much for joining us. I've finished CS50 but can't verify my account to get a certificate because I can't have an ID. What should I do? DAVID MALAN: As always, Minter27, email certificates@cs50.harvard.edu, which wonderfully is a valid email address for today. COLTON OGDEN: Nice. Unfamiliar-- the art of regular expressions-- capturing what you want and only what you want. DAVID MALAN: That's very beautiful, Unfamiliar. Thank you. COLTON OGDEN: Internet down, lunch break, says PresidentOfMars. Sorry to hear about that. Fatma says, thank you, Colton.
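The effect of escaping the dot can be sketched as below. The variable names are assumptions, and the test input has an x where the dot before edu should be, since that is the kind of input an unescaped dot lets through. The strict pattern is written as a raw string so that the backslashes reach the regular expression engine unchanged.

```python
import re

def matches(pattern, text):
    return re.search(pattern, text) is not None

loose = ".+@(cs50.)?(harvard|yale).edu"       # every dot means "any character"
strict = r".+@(cs50\.)?(harvard|yale)\.edu"   # escaped dots are literal dots

# The leading .+ stays unescaped on purpose: the user name can be anything.
for text in ["malan@harvardxedu", "malan@harvard.edu", "cogden@cs50.harvard.edu"]:
    print(text, matches(loose, text), matches(strict, text))
```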
There are a lot of content appropriate-- we don't need that here. Asley, Colton is nice, even when he is scolding people. I thought my-- blah, blah, blah. OK. Thank you very much, everybody, for the kind words. So where to go from here? DAVID MALAN: All right. So it turns out, it's still buggy, and no one seems to have pointed this out yet. Let me go ahead and claim that, you know what, my email address is malan@harvard.education, which frankly, these days might actually be a valid TLD. COLTON OGDEN: No, that's true. Yeah, yeah, yeah. DAVID MALAN: Educational-- that's not. COLTON OGDEN: TLD, top level domain, meaning the things you put at the end of-- in a website or an email address. DAVID MALAN: Exactly. So let's go ahead and type this in, and dang it, that is not harvard.edu. So where's the bug? COLTON OGDEN: Well you're just searching for it in the string, right? And like you said, match will identify-- does match-- OK, so you said match starts at the beginning of the string. Does that mean that it will keep searching on after that? So it'd basically search with just, starting at the beginning? DAVID MALAN: Correct. Only at the beginning. COLTON OGDEN: And not just matching the contents of the string itself? DAVID MALAN: Correct. COLTON OGDEN: OK. So in that case, search and match would have the same bug. You're just doing a search. You're just iterating through it, and so as it finds the edu part, it doesn't care whether there's nothing after it-- DAVID MALAN: Exactly. COLTON OGDEN: Or whether there's another [INAUDIBLE] or something after it. DAVID MALAN: Indeed. So we need to kind of specify that we want to search to the end of the string, and the very last character has to be edu, end of thought. And so it's kind of not obvious how to express this, right? Because you want to say, no more characters, but how do you type no more characters? Well, the authors of regular expressions years ago had to just decide on an arbitrary symbol that denoted end of string. 
And so the character they chose is, weirdly enough, the dollar sign. COLTON OGDEN: Interesting. DAVID MALAN: So that's not a literal dollar sign. That means edu has to be the last three characters of the string, otherwise we're not going to get a match. COLTON OGDEN: If we wanted a literal dollar sign, would it be backslash dollar sign? DAVID MALAN: Indeed. If you want edu, money, then, yes, escape the dollar sign. But here we don't want a literal dollar sign. COLTON OGDEN: OK. DAVID MALAN: So now let's go back here, run it again, malan@harvard.educational. Now it's no longer a valid email address. But malan@harvard.edu is actually a valid email address. COLTON OGDEN: OK. DAVID MALAN: All right. So it's not that material that we're matching at the beginning of the string here, but maybe. Let me go ahead and try this again. So david j malan@harvard.edu. That's not actually my email address, but it seems to match the pattern. COLTON OGDEN: Yeah, and it wouldn't fly for a normal email because you have spaces. DAVID MALAN: Yeah, that's not going to work. But I still think there's a bug. So honestly, after all these minutes of, like, building up this regular expression, we're still not done. But we're almost there. COLTON OGDEN: But this would be a million times worse if this were a series of if statements. DAVID MALAN: Yes. Yeah, checking for all possible valid email addresses is not going to be very fun either. So what do we want to express here, perhaps? COLTON OGDEN: No whitespaces allowed, essentially. DAVID MALAN: Yeah. So how would we express that? So it turns out we can approach this in a few different ways. Your first instinct might be to say, well, email addresses should only have, let's say, alphabetical letters. COLTON OGDEN: Sure. DAVID MALAN: So how might we express that?
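A sketch of what the dollar sign changes, and of the remaining bug with spaces. The variable names are assumptions for illustration.

```python
import re

def matches(pattern, text):
    return re.search(pattern, text) is not None

unanchored = r".+@(cs50\.)?(harvard|yale)\.edu"
anchored = r".+@(cs50\.)?(harvard|yale)\.edu$"  # $ means end of string here
literal_dollar = r"edu\$"                       # escaped: an actual dollar sign

print(matches(unanchored, "malan@harvard.educational"))
print(matches(anchored, "malan@harvard.educational"))
print(matches(anchored, "malan@harvard.edu"))

# Still accepted, because .+ happily matches the spaces in the user name.
print(matches(anchored, "david j malan@harvard.edu"))
```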
Well it turns out, you can have what are called character classes in a regular expression, whereby you literally type, square brackets, and then you type out all of the characters you want to allow. So for instance, ABCDEF, ABCDEFGHIJKLMNOPQRSTUVWXYZ. I mean, this is not going to scale very well, because I haven't even typed in the lowercase letters. So thankfully, character classes also support ASCII or Unicode ranges. A through Z-- COLTON OGDEN: OK. DAVID MALAN: --is valid. And it means the exact same thing as typing out all 26 letters. And if we want to say, lowercase, we can say little a through little z. That will also work. And it does not matter that the big Z is next to the little a. Each of these is being treated as a single character, except for the hyphen, which means a range of characters. COLTON OGDEN: And it's specifically within the context of these square brackets. Because if you did this outside of the square brackets-- DAVID MALAN: Yes. COLTON OGDEN: --that would be looking for the literal string, A-Z. DAVID MALAN: Literally. Literally. The square brackets are so important here. And if I want to do numbers, that will work too. At least for decimal I can say 0 through 9. COLTON OGDEN: OK. DAVID MALAN: So now I've got a lot of possible user names now. I'm skipping some characters, but we'll come back to that in a moment. Now that would seem to be a better way of expressing this and omitting spaces. COLTON OGDEN: Right. DAVID MALAN: So let's go ahead here, go back and try this again. david j malan@harvard.edu. And interesting, it's still actually matching. COLTON OGDEN: Tuanvu9884, thank you very much for the follow. Because it's doing a search, and it's finding-- it's still finding malan@harvard.edu. DAVID MALAN: Ooh. Yeah, you're good. You're good. COLTON OGDEN: I was taught-- yeah, I was taught by the best. DAVID MALAN: Hey. Yeah, wink, wink. So we kind of want to express that there cannot be anything to the left of these characters.
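Colton's explanation can be confirmed by asking the match object which part of the input it found. This is a sketch; the variable names are assumptions.

```python
import re

pattern = r"[A-Za-z0-9]+@(cs50\.)?(harvard|yale)\.edu$"

# re.search scans forward until the pattern fits somewhere, so it skips
# "david j " and matches starting at "malan".
m = re.search(pattern, "david j malan@harvard.edu")
found = m.group(0) if m else None
print(found)

# Outside square brackets the hyphen is just a hyphen.
print(re.search("A-Z", "M") is None)       # looks for the literal text A-Z
print(re.search("[A-Z]", "M") is not None)  # looks for one uppercase letter
```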
So just to be super clear, Colton has indeed identified the fact that this character class, which is saying, give me one or more of these preceding characters, A through Z, a through z, or 0 through 9, that matches. Because M-A-L-A-N matches exactly that. There's no spaces in malan, but there is a space before it and then there's the period and the j and the space and the david before it. But the catch is that all of that stuff can happen kind of before this character class here. So right in this space, theoretically, is there room for david, j, or any other number of strings that have spaces, because this character class is only going to match what it can, which is malan. COLTON OGDEN: So in this case would we want to switch to the re.match and start from the beginning? DAVID MALAN: We could. So let's try that. So if we go to re.match, re being the library-- re.match-- I actually don't know what most people say. re.match will start, by definition, from the start of the string. So let's go ahead and save this. Let's go back here, try one more time. david j malan with those two spaces, and now it seems to be catching the mistake now. Now let's go ahead and do malan@harvard.edu. That's working as intended, too. But we don't have to do this. And honestly I find this annoying in Python, that you have to vaguely remember whether it's match or search. And honestly I get them backwards all the time. Literally before we started, I Googled to make sure I got it right. I just tend to use search. You might pay, theoretically I suppose, if we read closely in the docs, a slight performance penalty. Because by saying re.match, you're giving the runtime an advantage by just starting literally at the start of the string. But frankly that tends to be an over-optimization. So I would actually just use the opposite of the dollar sign, which completely confusingly is the caret symbol, which is over one of the numbers on your keyboards typically, depending on the country you're from.
And that would mean, start from the start of the string. Dollar sign means end of the string. And now we have a complete thought from start to end. COLTON OGDEN: I like that. I like that better. DAVID MALAN: I think it's just kind of cleaner, even though you could certainly make an argument for using re.match instead. All right. So let's try this. Save this, go back to our program. Type in our david space j space malan@harvard.edu. Nope, not allowed. COLTON OGDEN: Right. DAVID MALAN: malan@harvard.edu is indeed now allowed. And just for good measure, make sure we didn't have a regression. colton odgen-- COLTON OGDEN: Regression testing. DAVID MALAN: --@cs50.harvard.edu do is also now working. COLTON OGDEN: Nice. DAVID MALAN: So we're in pretty good shape now. Why don't we pop off a few questions that I saw coming in related to emails? COLTON OGDEN: Yeah. Let me go up to where we stopped. So Bhavik says-- oh yeah, raw string, which is what you said. Very easy tutorial in RegEx, says Cloudxyzc. So that's what-- you're saying that you're doing an awesome job. DAVID MALAN: Oh, thank you very much. COLTON OGDEN: Some of us are new to it, like me, says Asley. Yeah, no, this is great. This is awesome. AkshayMandhan says hello. Good to see you. Thank you for joining us today. Brenda's plucking off, as well, certificates@CS50.harvard.edu for cert questions. Thank you, Brenda. Mitch27 says, thank you. Crabs01, I need a course on statistics. I think we're going to have a-- sorry, I'm blanking-- a stream on R with Andy Chan in a couple of weeks, which will kind of go into biostatistics. So tune in for that one. Dollar sign, then shortcut to go to the last character of the line, is that the case? DAVID MALAN: Oh yeah. This is now unrelated to regular expressions. Notice my cursor is currently here on the screen. If I hit the dollar sign, I can coincidentally go all the way to the end of the line as well. COLTON OGDEN: It almost seems like it is related. 
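The two approaches compared above, re.match versus re.search with a leading caret, can be sketched as follows. The function names are assumptions for illustration.

```python
import re

body = r"[A-Za-z0-9]+@(cs50\.)?(harvard|yale)\.edu$"

def valid_with_match(email):
    # re.match only tries the pattern at the start of the string.
    return re.match(body, email) is not None

def valid_with_caret(email):
    # re.search with ^ in the pattern has the same effect here.
    return re.search("^" + body, email) is not None

for email in ["david j malan@harvard.edu", "malan@harvard.edu", "cogden@cs50.harvard.edu"]:
    print(email, valid_with_match(email), valid_with_caret(email))
```

Both functions agree on these inputs; the caret version keeps the whole rule, start to end, visible in the pattern itself.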
It seems like that's probably deliberate. DAVID MALAN: Oh, I suppose it is, actually. COLTON OGDEN: Yeah. DAVID MALAN: Yes. Dollar sign is, indeed, deliberate. And I can hit the caret symbol to go to the beginning. COLTON OGDEN: Yeah. Yeah, they have to design it that way. That's cool. Vim tutorial-- vim tutorial would be cool. DAVID MALAN: There you go. Well you're going to have to get someone better at Vim than me though. COLTON OGDEN: Irene, would edu dollar accept malan@harvard.educational.edu, or would it only match edu only at the end? DAVID MALAN: If you literally typed dot edu dollar sign, it would only match dot edu at the end. COLTON OGDEN: Right. DAVID MALAN: If you wanted to support educational, we can go down this road. Notice that we could do, edu(cational), put that in parentheses, add a question mark and make it optional-- COLTON OGDEN: Nice. DAVID MALAN: --so that it's there or not there. COLTON OGDEN: OK. And I think she was saying also, educational-edu, like have edu kind of by itself after another string. Like educational edu. DAVID MALAN: No. So if you have dot edu, dollar sign, you will literally match only those four characters-- dot edu dollar sign. Anything more expressive than that, you're going to need to lengthen the string. COLTON OGDEN: Right. We can have gmail, yahoo, et cetera. How to check that after the at symbol-- we have to have one dot character. DAVID MALAN: Say that once again? COLTON OGDEN: We can have Gmail or Yahoo, et cetera. How do we check after the at symbol to have only one period character? DAVID MALAN: Oh. So right now, we would have to jettison our subdomain here. So right now we are allowing for cs50.harvard.edu. Sorry. We're allowing for CS50 dot to either be there or not be there. That's, of course, where potentially our second dot is coming from. So we could certainly get rid of that second dot by just no longer supporting subdomains within harvard.edu.
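Irene's question and the edu(cational) idea can be checked with a short sketch. The variable names are assumptions, and the full anchored pattern is assumed to be the one built up so far.

```python
import re

def matches(pattern, text):
    return re.search(pattern, text) is not None

strict = r"^[A-Za-z0-9]+@(cs50\.)?(harvard|yale)\.edu$"
relaxed = r"^[A-Za-z0-9]+@(cs50\.)?(harvard|yale)\.edu(cational)?$"

for text in ["malan@harvard.edu", "malan@harvard.educational",
             "malan@harvard.educational.edu", "malan@harvard.education"]:
    print(text, matches(strict, text), matches(relaxed, text))
```

With the full pattern, malan@harvard.educational.edu is rejected by both versions, because harvard must be followed directly by the ending.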
And what's nice about this now is that we could support even other universities. So for instance, we could add Stanford in there or MIT or any number of others without worrying about the subdomain, so long as they all end in dot edu. Or-- let me just free up some space-- I could even be a little crazier, and it's going to look a little ugly, but I could do this, in parentheses, and then I could say something like, or gmail.com and actually support either harvard.edu or yale.edu or gmail.com, so long as you build these up in a nested fashion. So this is kind of like arithmetic with parentheses. Growing up, if you did lots of additions and subtractions and multiplications, divisions inside parentheses, order of operations matters most. So when reading these things, you're going to want to look for the most deeply nested parentheses, for instance, harvard.edu, then work your way out from those, thereby looking at this. Then you could notice, oh, here's a vertical bar. So that means this thing to the left or this thing to the right is what's going to have to match. COLTON OGDEN: Nice. And Bollco87, thank you so much for following us. Let me make sure that we didn't miss any other questions. Wait, is this the professor from CS50 at Harvard? says HomeLine. DAVID MALAN: I think so. COLTON OGDEN: Yes. This is David Malan, the Jedi master, the Master Yoda. DAVID MALAN: Named by someone else. COLTON OGDEN: And, yep. And they're saying that a caret is the Vim shortcut to beginning of the line. DAVID MALAN: So this is turning into a Vim chat. COLTON OGDEN: A little bit, yeah. That's kind of the direction a lot of streams go. ShaneHughes1972, is it advantageous to wait for the next calendar year to start the course? DAVID MALAN: No. If you have time now, start now. There's always going to be something new on the horizon. Companies release new hardware every year. So I think the same logic you might apply to buying a laptop or a computer or a game console or whatnot applies here.
Yes, you could wait for the next one, but you're then missing out on the next few weeks, months, or whatever that duration is. So if you want to start something, whether it's CS50 or something hardware or some other course, start when you have the time. COLTON OGDEN: That being said, we are currently in the process of getting our January 1 release for CS50 on edX 2019, which uses the 2018 material, up and online if folks want to put that in their calendar. But folks can go on YouTube just to see CS50 2018's lectures right now if you want to get a head start on all the material, and then sort of do the work and then submit your content, submit your work for the 2018 content to start the calendar year. DAVID MALAN: Absolutely. You're certainly welcome to wait. Brenda, yes we can see you. So for all the extensions, we have to hard code the subdomain. Short answer, yes. You could generalize this with a function. You could build up your string using even rows from a database. But short answer, yes. In the simplest form, you just or them all together using the vertical bar. Of course at some point it becomes less readable. So frankly, from a design perspective of my code, if I want to handle both cs50.harvard.edu and yale.edu, I might leave this regular expression now alone. And if I want to actually support another domain, I might do something like this and say, you know what, I'm also going to support gmail.com or, you know what, let's go ahead and support gmail.com or outlook.com to support two dot coms. And you could start to bucketize your conditions into, these are the edus, these are the dot coms. It's adding some redundancy and it's indeed being a little more wasteful, because you might be checking the strings more times than you need to, but you're probably over optimizing. If you care about that, you're running this code in a loop that's just executing so many times that those milliseconds add up.
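The two designs discussed here, one nested pattern versus separate "buckets" per kind of domain, might be sketched like this. All names (USER, EDU, COM, NESTED, supported, supported_nested) and the address someone@example.org are assumptions for illustration, not the code typed on stream.

```python
import re

USER = r"^[A-Za-z0-9]+@"

# Bucketed: one pattern for the edus, one for the dot coms.
EDU = USER + r"(cs50\.)?(harvard|yale)\.edu$"
COM = USER + r"(gmail|outlook)\.com$"

# Nested: everything or'd together inside one outer pair of parentheses.
NESTED = USER + r"((cs50\.)?(harvard|yale)\.edu|(gmail|outlook)\.com)$"

def supported(email):
    # May scan the string twice, which rarely matters in practice.
    return bool(re.search(EDU, email) or re.search(COM, email))

def supported_nested(email):
    return bool(re.search(NESTED, email))

for email in ["cogden@cs50.harvard.edu", "malan@yale.edu",
              "malan@gmail.com", "malan@outlook.com", "someone@example.org"]:
    print(email, supported(email), supported_nested(email))
```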
Frankly, I would find something like this probably more maintainable, more readable, even if you're paying a minor performance price. COLTON OGDEN: This would definitely be, if you have some sort of semantic value associated with different domains-- DAVID MALAN: Yeah. COLTON OGDEN: --but a lot of websites probably just have generic-- you can have any valid email from any given website or domain-- DAVID MALAN: Indeed. COLTON OGDEN: --and it will probably, I'm guessing, just use the same pattern that we showed at the very beginning of the A to Z. DAVID MALAN: Essentially. You can actually be more fancy than that. And in fact, let me undo this so that we can build this out. Many of you online probably have email addresses that have, for instance, dots in them, dashes in them, underscores in them. It turns out character classes are wonderfully receptive to that. You can literally just put in an underscore. You can put an escaped dot, and you can put an escaped dash. The escaped dash is super important now because, in the context of the square brackets, an unescaped dash denotes a range of characters. So now that's even more expressive than it was before. COLTON OGDEN: Vullem, thank you for joining us. He says, I can listen to Mr. Malan for hours-- great teacher. DAVID MALAN: Thank you. COLTON OGDEN: If you want to do so, again, all of our lecture videos are on CS50's YouTube channel, which you might be watching right now if you're watching this Twitch video on YouTube. DAVID MALAN: Oh, I like what Bhavik Knight has proposed here. Your regular expression's a little fancier. You're supporting not only dot edu and dot com and dot org, but this backslash W is kind of interesting. COLTON OGDEN: Yeah, at the start of the string, which will allow us-- I'm assuming it would allow us to put some spaces beforehand, before the email? So that way if the user accidentally hits space or whatnot in the field, it won't say that there's an error.
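The character class David builds up here (letters plus an underscore, an escaped dot, and an escaped dash) might look roughly like the sketch below. The exact pattern on screen is not in the transcript, so the digits, the fixed harvard.edu domain, and the helper name looks_like_harvard are assumptions for illustration only.

```python
import re

# Hypothetical pattern: one or more lowercase letters, digits, underscores,
# dots, or dashes, then an at sign and a fixed domain.
# Inside square brackets, \. is a literal dot and \- is a literal dash
# (an unescaped dash between two characters would denote a range).
USERNAME_PATTERN = re.compile(r"^[a-z0-9_\.\-]+@harvard\.edu$")


def looks_like_harvard(email):
    """Return True if the email fits the hypothetical pattern above."""
    return USERNAME_PATTERN.search(email) is not None
```

With this sketch, an address such as david_malan@harvard.edu passes, while one containing a space in the username does not.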
DAVID MALAN: Good hypothesis, but not quite, if I can clarify. COLTON OGDEN: If I'm wrong, I apologize. DAVID MALAN: That's OK. I'm actually going to pull up the documentation here, because I think it might help to see a more thorough listing of the various symbols that are allowed. So let me go ahead and search Python-- COLTON OGDEN: It's any non-white space character. Is that correct? DAVID MALAN: There you go. It's literally the opposite of what you are saying. COLTON OGDEN: I forgot about the slash. By the way, missed, TulioNoguera, thank you very much for the follow. And BJeff, I'm not sure if I caught that one. Thank you very much for the follow as well. DAVID MALAN: So here I am on Python's regular expression operations. So you can see this on docs dot Python dot-- whoops. Docs.Python.org/3/library/re.html. So here you'll see Python documentation for regular expressions. And it's way more verbose than we need to get into just now. But let me start to scroll down to regular expression syntax. You'll see some nice introductory explanations of what these things are, though learning from the Python docs is probably to be easier said than done. But here's a list of the special characters. So dot, we already discussed, meaning any character, except for a new line. So I did say there's some corner cases with the whitespace, and that's indeed one of them. Carat, which matches the start of the string. Dollar sign, which matches the end of the string. Star and plus and question mark-- man, we've actually bit off a lot of these for now. COLTON OGDEN: Right. Yeah. DAVID MALAN: Let's keep scrolling further. I'm going to wave my hands at some of these, because there's some fanciness you can get into that, honestly in my life, I've not had terribly many occasions to need greedy matches or-- sometimes greedy matches, but you can also do look ahead and some other fancier features. And I don't think we'll get too into depth on that. 
But rest assured that if you ever encounter a problem that you're struggling with regular expressions, odds are you can solve it. So dive back into the documentation to find some additional feature that they might have. We didn't look at some of these. Curly braces actually have special significance. So let's actually come back to the code we were writing earlier. And suppose, like, someone was proposing we support edu, com, and org, like Bhavik Knight was proposing. Suppose we just kind of generalize that. Well I could say, com or edu or org. Or, you know what, those are three letters. We could maybe, a little lazily, just say, you know what, go ahead and just support three letters, dot, dot, dot. Very lazy. It's going to allow for weird domains that don't actually exist, but so be it. But you could also express that by doing this. So now things are getting really cryptic. But if we parse this, you see harvard or yale. Then you have a literal dot, because it's escaped. Then you have a wild card, any character, and then three copies of any character. Now this doesn't have to be the same character. It's just any character, any character, any character. COLTON OGDEN: And is this 3 referring to the dot that you put before those brackets? Or is it by default brackets? DAVID MALAN: No. It's literally referring to the dot beforehand. So if you wanted to refer to the same letter again and again and again, then it has to be here. So if you wanted to have the letter A, it would be A, A, A. Or dot would be something, something, something, but different something's. COLTON OGDEN: Makes sense. DAVID MALAN: And, if for whatever reason, you want to support, say, two characters or three, which you might with domain names-- like country codes are two letters or more traditional TLDs are 3, you could do 2 comma 3 and do a range. So syntax is getting really crazy now. But again, if you just focus on what the basic definition is, it all works out pretty cleanly. 
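The curly-brace quantifiers just described can be tried out with a short sketch. The username part ([a-z]+) and the variable names are assumptions, since the transcript does not show the full pattern on screen; only the `{3}` and `{2,3}` pieces come from the discussion.

```python
import re

# Hypothetical patterns: a literal (escaped) dot followed by exactly three
# of any character, or by two to three of any character.
EXACTLY_THREE = re.compile(r"^[a-z]+@(harvard|yale)\..{3}$")
TWO_OR_THREE = re.compile(r"^[a-z]+@(harvard|yale)\..{2,3}$")

# The quantifier applies to whatever immediately precedes it:
# a{3} means the same letter three times, .{3} means any three characters.
SAME_LETTER = re.compile(r"^a{3}$")

print(bool(EXACTLY_THREE.search("malan@harvard.edu")))  # three-letter ending
print(bool(EXACTLY_THREE.search("malan@yale.uk")))      # only two letters
print(bool(TWO_OR_THREE.search("malan@yale.uk")))       # range allows two
print(bool(SAME_LETTER.search("aaa")))
```

As David notes, the wildcard version is lazy: it would also accept endings that are not real top-level domains.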
COLTON OGDEN: Code Beastie and Stimpy, thank you very much for the follows. DAVID MALAN: Oh, nice to see you as well. And let's finish this thought with backslash W. So let me go back to the documentation. And now I'm just curious. Let's just start going and going and going. And let's see. Here we go. So in the discussion of character classes, as denoted by square brackets, you see a whole bunch of bullets here explaining things. I'm going to fast forward to this one. Character classes, such as slash W or slash S, capital S, are also accepted inside a set. So let's find the documentation. It says, defined below. So let's keep going. Excuse me. Let's keep going. There's a lot of features of regular expressions, but you'll use these less frequently, some of them. OK. Here we go. Now we're getting to the special characters. And here, that's our slash W. So for Unicode or str patterns it matches Unicode word characters. This includes most characters that can be part of a word in any language, as well as numbers and the underscore. And you can see, if you actually use ASCII, which is a subset of Unicode, just fewer characters that have been around since the beginning of computers, notice that it's using almost the same pattern that we were using, except that I added in dots and dashes to support things like gmail. COLTON OGDEN: When I started the first time I thought it was a W for whitespace, which is why-- DAVID MALAN: No. Totally reasonable. COLTON OGDEN: --I had that instinct. DAVID MALAN: But we got something for you too. COLTON OGDEN: Uh oh. DAVID MALAN: Backslash s. Lowercase s does indeed match any space, which includes a literal space here, a tab character, a new line, carriage return, form feed, and vertical tab as well. COLTON OGDEN: I don't actually know what a form feed is. What is a form feed? DAVID MALAN: A form feed is sort of old school, where it moves down to the next line, I think, from typewriter days, essentially. COLTON OGDEN: OK. It makes sense.
It makes sense. DAVID MALAN: I think that's what it is. It's been some time since I needed it to work. But notice-- let me point out one other thing. If we keep scrolling, you'll see kind of the opposite, backslash capital S, which matches the opposite of backslash s. So if you want, not whitespace, but anything other than whitespace with some of these character symbols, you can actually just capitalize it. The same thing for backslash w. If you don't want a word character, you want everything else, all the funky characters on the keyboard, then you can do backslash capital W, all within those character classes, or even outside, if you just want to match one such thing. COLTON OGDEN: So a lot of learning regular expressions is kind of dialing into this documentation and memorizing what all these, sort of symbols mean. But the logic's for it is actually very simple. DAVID MALAN: Yeah. And in fact, unfortunately regular expressions syntax is kind of hard to google, because you're typing in crazy sequences of symbols. So just express it in English or any language you speak to see if you can find your way to Stack Overflow or someone's explanation. COLTON OGDEN: Like regular expressions, avoid whitespace. DAVID MALAN: Yeah, exactly. That's a good one. Should we take a look at the chat again? COLTON OGDEN: Yeah. TwitchHelloWorld says, do some people who have or rent their own server also create email addresses using these? Is there a point at which one might simply codes some wild cards like Colton is saying, then have a program that simply tests that the email goes through without receiving an error message? DAVID MALAN: That's a really good question. And I would make a distinction between syntactically valid and actually valid, where actually valid in my mind would mean it's a real email address that belongs to a real human, that's ideally checking that email account. We're just talking about syntax today. 
Regular expressions cannot tell you if cogden@cs50.harvard.edu actually exists or if malan@harvard.edu actually exists. All it can tell you is that yes or no, this email address is structured in a way consistent with the formal definition of an email address. COLTON OGDEN: Makes sense. OK. DAVID MALAN: So yes, you would need to use, like a cloud-based service or your own server to actually send a verification email to the human, like all of us are in the habit of receiving when we sign up for new accounts on websites, to actually see if the human responds and confirms the existence. COLTON OGDEN: Got it. Bhavik Knight says, do we need to escape in a group? I think it doesn't need to escape in a group. DAVID MALAN: In a group-- in a capture group, yes, you would still need to escape, if that's what you mean. COLTON OGDEN: How difficult would it be to create a RegEx parser from scratch? DAVID MALAN: That's a good question. How to create a regular expression parser from scratch? It depends on how many features you want to support. To be honest, most of the features that we have just discussed can be implemented relatively simply. And in fact, if we can get all academic on you-- can I pull the whiteboard out for a moment? COLTON OGDEN: Yeah, absolutely. DAVID MALAN: So if you've never-- COLTON OGDEN: I'll step out of your way. DAVID MALAN: Sure. So here we have an actual whiteboard, no technology here. I've pulled it onto the screen and I've got my black marker here. So it turns out that regular expressions map to, academically, something called the class of regular languages that can be expressed in special syntax. That is regular expression syntax. But it turns out they map directly to what are called DFAs or deterministic finite automata, which are very simple machines that you can implement on Mac or PC or even on a whiteboard, that represent that particular language. 
So for instance, the way you would typically draw a DFA or deterministic finite automaton is with states. So I might draw a circle here. And hopefully everyone can see this from afar. That circle-- I'm just going to put a little caret symbol there to imply that this is the first state. And if I want to ultimately draw a picture that represents an email address, I'm going to essentially do something like this. I'm going to think of the email address as having three parts-- the beginning, the middle, and the end. And the end will be my final state here, just denoted with a slightly different symbol. And what I want each of these states to represent is something. So I might here think of this state as representing the at sign. At that point I've read in an at sign. So before that is the user name. And after that is the end of my expression. Now how am I actually going to do this? So here I might draw a picture that says something like this. In order to start from this state and move from this state, I have to consume some number of letters. And let's keep it simple and let's just say that it's alphabetical letters for now. So I have to consume, either an a through a z, for simplicity in all lowercase. This though transition means that you would only consume one letter at a time. So at the moment, this picture represents an email address that only has a single letter in it. And in fact I'm going to have to draw another dot here to follow this pattern, so that I have two states here. This is before the at sign. This is after the at sign. So if I draw another transition or edge here, that represents the at sign. And then the end of my email address, let's say, has another alphabetical letter, which is a through z. So short of it is now, I have built a machine, or rather a picture of a machine that says, you can have any character that's alphabetical, then you have an at sign, then you have another letter as your domain. 
This is obviously incomplete, because A@A is not an email address, at least as we've defined it thus far. So we would need to start to enhance this picture a bit more. So what would that actually mean? Well if I want to support one or more a to z's, I need to enhance this picture. I need to add another state to my machine. And so I'm going to move the start of the machine over here, which you can still now see over here. I'm going to go ahead and have an edge going to this state, which is a through z. But then I'm going to have another transition that allows me to, for instance, go back and forth on a to z as well. And notice, this is a deliberate loop. I can consume an a or z for my string. And then if it's two letters, I can do it again. If there's three letters, I can do it again. Four letters, I can do it again. And when I'm ready to read the at sign, I can then-- whoops. Oh, whoops, whoops, whoops. David messed up. Sorry. This is why we don't do things on the fly. Here we have-- sorry. We have the at sign there. So here we have-- here we go. I drew it in the wrong place. My apologies. I can consume a to z here. And now let me go ahead and draw the original dot here, a through z here. My apologies. So I consume the first letter. Then I can immediately consume the at sign from the user's input, and then another letter, thereby putting me from user name, at sign to domain. Or I can consume one letter for the user name, consume another, another, another. And now I have a user name that is one or more characters. And so you see this beautiful mapping here. This now represents, essentially, the block that we described as dot plus earlier, if again, we're keeping it simple with just letters of the alphabet. So you have this direct mapping now between the syntax we've been talking about and the machine, at least pictorially that you might build. And so implementing a parser for a regular expression really amounts to implementing code that does this. 
And you'll see some familiar constructs. Obviously if you're doing something again and again, this connotes a loop. And all of you know how to implement a loop probably, using a for loop, a while loop, or maybe even recursion to do something again and again. So you could imagine writing code that just has different states or constants that represent each of these states, where one of your variables might mean, I am reading the user name. Then another state that means, I have read the at sign. Then a final state that means, I have read the domain name. And as soon as you end up with a value in that variable that represents the so-called end state, you have parsed an email address. So that's like a whole, let's say week in cs theory. But yes, implementing a parser with a regular expression really boils down to just thinking about how you model that regular expression using a certain syntax, map it to a picture of a machine, and then implement that machine in software. COLTON OGDEN: Let us know if you want David to teach a theory course, because I think that'd be pretty cool. But yeah, that was a cool-- DAVID MALAN: I hope that wasn't too much of a tangent there. But that stuff is really quite fun, and it really does bridge the theory and the practical world. COLTON OGDEN: No, no, that was great. Some people had some comments. They said, can you join the first and the second circles? Instead of two just make one? I think maybe make the first one a loop? Or does that first need to be a separate node? DAVID MALAN: Really good question. And that's why I was struggling under pressure. The first state needs to lead to another state, because you have to consume at least one symbol. And if we had put the loop on that first state, we could accidentally never go around that loop, immediately start with the at sign, and that's going to give us an invalid email address. COLTON OGDEN: Because it's looking for those, sort of, what are they called? Transitions. 
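The machine on the whiteboard can be sketched in code along the lines David describes: one variable holding the current state, constants for each state, and a loop over the input. This is a minimal sketch of the picture as drawn so far (lowercase letters only, a single-letter domain), not a general regular expression parser; the state names and the functions step and accepts are made up for illustration.

```python
import string

# Hypothetical state constants, one per circle on the whiteboard.
START, USERNAME, AT_SIGN, DOMAIN = range(4)

# The double-circled "end" state.
ACCEPTING = {DOMAIN}


def step(state, ch):
    """Follow one transition; return None if no edge exists for ch."""
    is_letter = ch in string.ascii_lowercase
    if state == START and is_letter:
        return USERNAME   # must consume at least one letter
    if state == USERNAME and is_letter:
        return USERNAME   # the deliberate loop: one or more letters
    if state == USERNAME and ch == "@":
        return AT_SIGN    # have read the at sign
    if state == AT_SIGN and is_letter:
        return DOMAIN     # have read a one-letter domain
    return None


def accepts(text):
    """Return True if the machine ends in an accepting state."""
    state = START
    for ch in text:
        state = step(state, ch)
        if state is None:
            return False
    return state in ACCEPTING
```

Because the loop sits on the second state rather than the first, an input that starts with the at sign has no transition to follow and is rejected.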
DAVID MALAN: Exactly. COLTON OGDEN: And the transition-- it can take those as paths-- DAVID MALAN: Exactly. COLTON OGDEN: --and execute on them. DAVID MALAN: Exactly. COLTON OGDEN: We can continue the same mapping after the at state also to get more than one a to z. Yes. And that's because for brevity, you drew it. But you can have another loop at the end, after the at. DAVID MALAN: Yes. I didn't finish the story. We'd need an actual loop or cycle to do multiple letters. And, frankly, we'll want additional states if we want to have a dot and then a TLD like dot edu or dot org or whatever. COLTON OGDEN: Yeah, that's super cool. It reminds me of-- is this a finite state machine, I think? Yeah, because we use that, like in the games-- DAVID MALAN: Games, yeah, absolutely, all the time. And honestly, if you're familiar-- they're getting a little dated these days, but soda machines. If you've walked up to a soda machine and put in coins, no matter the country you're in, a soda machine is a finite state machine or a deterministic finite automaton. Deterministic in the sense that no matter how many times you put in the right amount of money, it will behave exactly the same way, assuming there's still soda left. And you can think about a soda machine as being similar. Just as with regular expressions, each of those states represented where you are in the string. I have read a character. I have read multiple characters. I have read an at sign. I have read the domain name. Each of those circles on the board represented something conceptually. A soda machine, in the US, for instance, is going to have different states as well. You can insert a dime, a nickel, and a quarter, but not pennies, for instance, typically.
So there is probably a $0.05 state, there's a $0.10 state, there's a $0.15 state, there's a $0.25 state, a $0.30 state, but there's not a $0.31 or a $0.32 or a $0.33 state, because every time you drop a coin in the machine, it's as though the soda machine is following a transition, following a transition. And as soon as you get to the dollar state, or however expensive the soda is, then the soda pops out. COLTON OGDEN: Can you describe computer programs therefore as being deterministic finite automata? DAVID MALAN: Some programs, yes, if they are indeed behaving completely deterministically. If they're behaving non-deterministically-- COLTON OGDEN: That's true. DAVID MALAN: --you might have some randomness. But even randomness in computers is deterministic, at the end of the day. So, short answer, yes. COLTON OGDEN: OK. That makes sense. That makes sense. Let's make sure we're OK-- keep up with the chat here. By the way, thank you to axXelus for following us. I saw that pop up during the whiteboard session. DAVID MALAN: Oh, I think we just signed ourselves up for a course on finite state machines. COLTON OGDEN: I thought that was actually really cool. I liked the whiteboard. Next time we'll try to get the draw 50 integration into the setup. DAVID MALAN: Indeed. If you saw one of our previous streams with Colton and Dan Coffey, we have the beautiful screen and web-based software that Dan wrote, via which we can draw pictures as well. Much better than old school. COLTON OGDEN: And David has an amazing new-- what is this thing called? DAVID MALAN: Oh, well we'll see here. Just off screen is a beautiful new tablet that we can draw on, which will allow us to draw pictures and diagrams much more easily. COLTON OGDEN: Yeah. We'll try to get that set up for our next stream together. Let me see where we are. Yes, a theory course, hearts, say AllProgrammers. Is there a term for this type of diagram?
Yes, a theory course by him would be fun because he's always so practical regarding applications too and scaffolds nicely. And I think you did say, a finite state machine or a deterministic finite automata was the name for that. DAVID MALAN: Yep. COLTON OGDEN: Yeah, we need that course, says Osman. Anything taught with passion will be interesting, so keep the lessons coming, says PresidentOfMars. David is very good at that. Wow, that's an old school way of doing it, says Bhavik Knight. DAVID MALAN: Thank you. COLTON OGDEN: Old school, and when we bring the tablet, we'll bring the old school with the new school. DAVID MALAN: True. Though DFAs have been around for a long time too, so maybe that's the old school way of doing it. COLTON OGDEN: When did it first come out? You think like the 50s or 40s? DAVID MALAN: Yeah, around there. It derives from math and discrete math. COLTON OGDEN: Turing, yeah? DAVID MALAN: Yep. Mm hm. COLTON OGDEN: In terms of design, since you have to program the verification code too, to check the email address is a real email, is there any reason not to be somewhat loose and broad in this RegEx code, such as using wild cards and hard coding each of the dot com, dot TV, et cetera? DAVID MALAN: Short answer, yes. Your odds are these days, you're not going to hard code all of the TLDs because there's an atrocious number of them-- hundreds, probably, maybe approaching a thousand or something crazy. So yes, you're probably to focus more on syntax and not on the validity of those top level domains. Because as someone alluded to earlier, you're probably-- and actually I think it might have been Twitch Hello World, you yourself-- you're probably going to send the user a confirmation email. And if the email bounces, because the domain doesn't exist, you've got your answer, and you don't have to infinitely, exhaustively check whether or not the email address itself was a valid domain. 
And if you get the user to click a link, confirming the existence, then you're OK. So, yes. So you'd probably want to do a first pass at the email address, using a regular expression like we are. Or better yet-- and we'll end on this note too-- using a library that comes with Python or any number of other languages, just to do that initial validation. Because humans-- at least here on campus, among Harvard University students, we can tell you that about 10% of them mistype their email address, if we just ask them on a Google form to type it in, unless we pre-populate it, which we do instead. COLTON OGDEN: Makes sense. And I think if we did have a, sort of a limited list of TLDs to choose from-- for example, whatever happened back in, I think it was 2013 when they added a bunch of new ones-- if that were to happen again, well, then the code would break for-- DAVID MALAN: Exactly. COLTON OGDEN: --new registrants that use those TLDs. DAVID MALAN: But there's a lot of websites out there, especially if you're a student, where you're in the habit of signing up for free stuff because you have a dot edu account. So if you're at a university or a high school that gives you a dot edu address or something more local to your own country, you might use a regular expression just to make sure that it's an actual student eligible for free software or whatever. So they still have their value. COLTON OGDEN: Right. DAVID MALAN: Certainly. COLTON OGDEN: I can make my own Colton Ogden dot edu and get some free stuff. DAVID MALAN: There you go. COLTON OGDEN: AllProgrammers asks, can you possibly add DFAs and FSMs into CS50? Maybe just an introduction? There's probably not enough room in the course to integrate those, you think, right? DAVID MALAN: Realistically, no. But that's why we have these live streams and other forms of seminars and such. COLTON OGDEN: I've personally wanted a follow on for a long time.
This would be a great-- I think RegEx and these together would make a great lecture though. DAVID MALAN: Oh, thank you. COLTON OGDEN: Yeah. We should think about that a little bit maybe. Can draw 50 print a PDF and add pages to it? From what I understand, not to PDF, but to PNG files, yes. DAVID MALAN: They can, yeah. COLTON OGDEN: Oh, does PDF work? DAVID MALAN: Well, I mean, draw 50 is just a web-based application. So you could literally go to your own browser's file print menu and generate a PDF of it, which would actually work for you. And speak of the devil! CS50's own Dan Coffey from last week's stream is here to take your feature request. Dan, come on screen. DAN COFFEY: Obviously, Dan Coffey. DAVID MALAN: So, Dan, we were just asked, can draw 50 print to PDF and add pages to it? And if not, how quickly could you add that feature? DAN COFFEY: So we could download a PNG at the moment. We can easily do an SVG. DAVID MALAN: Ooh. DAN COFFEY: Scalable Vector Graphic. DAVID MALAN: Thank you. DAN COFFEY: I don't think it would be-- well. I don't know. Do you need a server side to convert to PDF? DAVID MALAN: Yes, to generate the download trigger, yeah. DAN COFFEY: 'Cause to generate the PNG download, we just change the headers. DAVID MALAN: Yeah. PDFs are annoying. We might be able to do it. But honestly, using the browser's built-in mechanism is probably the simplest way. COLTON OGDEN: Is something broken? Is that why you came in? DAN COFFEY: I heard that the tablet wasn't working for drawing. DAVID MALAN: Oh, we just haven't connected it. COLTON OGDEN: It's disconnected. DAVID MALAN: That's OK. COLTON OGDEN: I disconnected it to get the stream set up. It had all the other stuff hooked into it. Functional, I'm sure, but not plugged in. DAVID MALAN: We're just here talking about regular expressions, if you'd like to talk about your favorite features of regular expressions-- character classes or? DAN COFFEY: Reverse search is always fun. DAVID MALAN: Oh, reverse search.
Nice. Yeah. COLTON OGDEN: I'm actually not too familiar with that one. DAN COFFEY: I love using just multiple capture groups. It's like the most-- to get everything you need in one search. DAVID MALAN: What a perfect segue to capture groups. COLTON OGDEN: Doing a segue for that. DAVID MALAN: So yeah, it turns out that we've been using these patterns thus far to just check whether or not the string matches a pattern. But sometimes you want to extract information from strings. You could use split, as we started. You could use substring. But you could also use what Dan described just a moment ago as capture groups. And a little confusingly, they too tend to use parentheses, but we'll distinguish exactly what's going on here as follows. So how do we go about doing this? So suppose that we wanted to ask the question, are you from Harvard or are you from Yale, and did you type in your email address? Well let me rewind to a simpler RegEx, which is where we were before. It turns out that every time we use parentheses in this way, we are using what's called a capture group, where we are telling the library, the re library in this case, go ahead and capture those substrings for us. So don't just match on them, but allow me to do something interesting with them. So we can't quite see this now, because we are treating the return value of re search as being a Boolean, which it's technically not. It's actually going to return to us None if there's no match, or a match object if there is a match. So let me go ahead and do this, and say, matches gets re search. And then the equivalent code would just be, if matches. So I've not done anything new or interesting just yet. Let me get rid of the colon at the end there. But now I'm actually storing the return value. Let's just poke around and see what Dan was alluding to by printing out those actual matches, actually only in the case of it being non-null. COLTON OGDEN: That they exist, yeah. DAVID MALAN: Exactly.
So we're going to see, thanks for the email, and then the actual contents of the return value of re search. Let me go ahead and save that, go over to our other terminal window, and let's go ahead and do malan@harvard.edu, Enter. And you see some interesting fanciness here. And it's not obvious what's inside of that, because it's actually a whole object, an object belonging to a certain class called re match. But we can actually check this. Let's go to Python-- Python 3, re search. And let's see if we can't find ourselves to the capture groups. Let me search for a capturing group. Let's see, not on that page. Let's do re capture group to find the right documentation, just so folks can consult it later. We use search because that searches any part of the string. Oh, you don't see him, but Dan's still over here, everyone. COLTON OGDEN: Also, shout out to Kareem in the chat. Kareem Zidane. DAVID MALAN: Nice to see you, Kareem. So you'll see here a discussion in this link here, which is on docs.Python.org/3/howto/regex.html, which is more of a discussion of how to use regular expressions. That parentheses also indicate capture groups. And we can actually use some functions that come back as methods inside of that re match object that's returned. So what does that actually mean? I'm going to focus on using group as follows. I'm going to go into my code again, and instead of just printing that out, I'm going to go ahead and say, you know what, let's go ahead and print out the first group that matches, one-indexed. And let's see what happens. I'm going to go ahead and save that, reload my program, type in malan@harvard.edu, Enter. And we see None came back there, which is interesting. But that's OK. Let's poke around a little further. Let's look at the second group, not one but two. Let's go ahead and run this again, malan@harvard.edu. Interesting. COLTON OGDEN: OK. DAVID MALAN: So why do you think we've captured Harvard the second time but nothing the first time?
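The experiment just run can be reproduced with a short sketch. The exact pattern on screen is not in the transcript, so the one below is an assumption: a lowercase username, an optional cs50. subdomain in the first pair of parentheses, and harvard or yale in the second. The helper name groups_for is also made up for illustration.

```python
import re

# Hypothetical pattern with two capture groups:
#   group 1: (cs50\.)?       -> "cs50." or None if absent
#   group 2: (harvard|yale)  -> which school matched
PATTERN = r"^[a-z]+@(cs50\.)?(harvard|yale)\.edu$"


def groups_for(email):
    """Return (group 1, group 2) for a match, or None if there is no match."""
    matches = re.search(PATTERN, email)
    if matches is None:
        return None
    return matches.group(1), matches.group(2)


print(groups_for("malan@harvard.edu"))
print(groups_for("cogden@cs50.harvard.edu"))
```

re.search returns a match object when the pattern is found and None otherwise, which is why the result can be tested in an if before calling group on it.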
COLTON OGDEN: What's the capture group over? It looks like the CS50, right? And it was a harvard.edu without a subdomain. DAVID MALAN: Exactly. COLTON OGDEN: So there was no subdomain. DAVID MALAN: Right. So because, in my regular expression I have two sets of parentheses, a.k.a. capture groups as Dan called them, this one in parentheses actually captures, either CS50 dot or nothing at all, because the question mark can mean 0 or 1. The second group of parentheses here captures Harvard or Yale. Because both of the parentheses are there, I'm going to get that group one and group two, it's just one of them might actually be none if the CS50 dot is not actually present. So I could now do something more conditionally. I could do something like this. If matches.group(2) equals equals harvard, I could say something more precisely like, thanks for the Harvard email. Else I could say something more like, print, thanks for the Yale email, just thereby distinguishing the type of email address I got back. COLTON OGDEN: And we can appreciate just how much complexity, in terms of the iterative logic or the imperative logic that we would have had to incur to get to this point here. DAVID MALAN: Absolutely. Indeed. So let's go ahead and do this. So let me go ahead and rerun this program once more, malan@harvard.edu. Oh, thanks for the Harvard email. But, notice, we're still supporting CS50. Thanks for the harvard.edu email, but if I go to Yale's email address, now I've distinguished these two. So the capture group, as Dan referred to it is a perfect name, because you're capturing some part of the substring. You're being handed it back. And in Python you can get at that value by using the group method. COLTON OGDEN: And 0 is the whole string, right? DAVID MALAN: 0's going to be the whole string, which is why I deliberately started at one index. It tends not to be that useful. But it does ensure that if there's a match, the list is going to be non-empty-- COLTON OGDEN: Sure. 
DAVID MALAN: --which is handy. So what if I didn't care about the CS50? I was using the parentheses because I wanted them, but I actually don't want to capture those specifically. It turns out that we can actually tell Python, use these parentheses for grouping, and to actually have or not have CS50 dot there, but I don't necessarily have to specify if they're going to be in the capture group. COLTON OGDEN: Is this syntax specific to Python RegEx in this case? DAVID MALAN: It is. So you can think of this as the ternary operator, where in C and in other languages you can use a question mark, then a colon to say if or else. You can also use that as syntax-- it's crazy, ugly looking here, but what this is going to do is as follows. Let me go ahead and save this. Now let me change the group to 1, because it's now going to allow me to use parentheses to say 0 or 1 instances of CS50 dot. But it's not going to return them as a capture group. So I'm using them syntactically, but not to capture as Dan proposed earlier. So let me go ahead and rerun this, malan@harvard.edu. And voila, still detecting. And we're just not unnecessarily capturing stuff we don't want. But again, I can't emphasize enough-- I mean, this looks like a train wreck of syntax now. It's just so confusing, certainly if you're new to regular expressions. But the key is that we started, what, 90 minutes ago building up with just looking for the at sign, then looking for the user name, then looking for the TLD. Really take these baby steps and make your use of RegExes really incremental. COLTON OGDEN: Yeah. And I think once you've looked at it a few times, [INAUDIBLE] a few of them. Like this kind of stuff no longer really seems too intimidating. DAVID MALAN: Yeah, absolutely. COLTON OGDEN: Definitely compared to the monstrous block of code you would need to do the same thing, right-- DAVID MALAN: Indeed. COLTON OGDEN: --without it. 
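The capture-group behavior discussed above can be sketched roughly as follows. The exact regular expression on David's screen is not reproduced in the transcript, so both patterns here are assumptions, as is the helper name describe. The first pattern captures the optional cs50. subdomain as group 1; the second uses (?:...) so the same parentheses group without capturing, which moves the school name from group 2 to group 1.

```python
import re

# Assumed patterns: the exact expression from the stream is not shown in the transcript.
CAPTURING = r"^[a-z0-9]+@(cs50\.)?(harvard|yale)\.edu$"
NON_CAPTURING = r"^[a-z0-9]+@(?:cs50\.)?(harvard|yale)\.edu$"


def describe(email):
    """Hypothetical helper: return a thank-you message, or None if there is no match."""
    matches = re.search(NON_CAPTURING, email)
    if matches is None:
        return None
    # With the non-capturing group, the school is now group 1 rather than group 2.
    if matches.group(1) == "harvard":
        return "Thanks for the Harvard email"
    return "Thanks for the Yale email"


m = re.search(CAPTURING, "malan@harvard.edu")
print(m.group(0))  # the whole matched string
print(m.group(1))  # None, because the optional cs50. group did not participate
print(m.group(2))  # the school name

print(describe("malan@harvard.edu"))
print(describe("malan@cs50.yale.edu"))
```

Keeping the optional subdomain non-capturing means the group numbers only count the parts the program actually reads, so the conditional does not need to change if more optional pieces are added later.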
DAVID MALAN: Now it turns out, we can do this way more simply by not doing any of this at all. And if I Google, Python validate email address, you'll see, as someone mentioned in the documentation a bit ago, there's actually libraries that will allow you to do this quite simply. So if you actually use pip or pip3 to install validate_email, you can really simplify your life by just saying this. So you can ignore this entire conversation about validating email addresses, for instance, and just use a library. But you're assuming that that library is correct, and hopefully it is, if it's open sourced and lots of people have commented on it and provided feedback and pull requests to the repo. But this is generally the way to do things, not to reinvent the wheel yourself. COLTON OGDEN: So the TLDR for today's stream is, download a library for it? DAVID MALAN: For email addresses, yes. But regular expressions are so much more powerful, right. Because suppose you just have a messy data set, right. Humans are in the habit of typing their mailing addresses differently, their phone numbers differently. You can actually use the symbology that we introduced today in regular expressions to get rid of, maybe all of the parentheses, all of the dashes in a phone number, so that you're left with just the decimal digits. You can do this to clean up street addresses. If someone typed in, 33 Oxford Street, Cambridge, Mass, 02138 all on one line, you could use regular expressions to extract the state, maybe, then hopefully the city, maybe the street address, and with high probability maybe clean it up, ultimately too. COLTON OGDEN: That one sounds like it'd be a little rough. DAVID MALAN: It is. It is. It's better to ask the user from the get-go, what's your street address, what's your city and state? COLTON OGDEN: That's probably why they do it in separate fields in most forms. DAVID MALAN: Absolutely. 
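As a rough illustration of the cleanup idea just described, here is a minimal sketch using Python's standard re module. The function names digits_only and zip_code and the sample phone numbers are made up for this example; only the Oxford Street address comes from the conversation.

```python
import re


def digits_only(phone):
    """Remove every character that is not a decimal digit."""
    return re.sub(r"\D", "", phone)


def zip_code(address):
    """Return a trailing five-digit ZIP code if the line ends with one, else None."""
    match = re.search(r"\b(\d{5})$", address.strip())
    return match.group(1) if match else None


# Made-up phone numbers typed three different ways.
for sample in ["(617) 555-0100", "617-555-0100", "617.555.0100"]:
    print(digits_only(sample))

print(zip_code("33 Oxford Street, Cambridge, Mass, 02138"))
```

Stripping to digits is the easy case. Pulling a city or state out of a free-form address is much less reliable, which is the point made in the exchange above about asking for separate fields.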
But these days too, if you need to clean up data, which is not uncommon, if you're inheriting a data set, if you're doing something data science-y, if you've just got a messy data set from another company or colleague, you can clean it up using regular expressions, by just matching or massaging the data the way you want it to be. COLTON OGDEN: Awesome. DAVID MALAN: Should we see if there's any other questions? COLTON OGDEN: Yeah. Yeah, yeah, yeah. That was awesome. Also, shout out to Brian Rodriguez for the follow. Thank you very much. DAVID MALAN: What is it? Sure. Yeah. COLTON OGDEN: I think All Programmers is asking, can you talk a bit about the complexity and average time complexity of using the RegExes? DAVID MALAN: Oh, the time complexity-- you can come up with very perverse regular expressions that are incredibly expensive to use. I am not savvy enough to be able to cite a few such examples offhand, to be honest, because it's been some time since I had to think about this. For the most part you don't have to worry about this, at least for a reasonable length of regular expressions like the ones we have been doing here. Honestly, rule of thumb is, if it kind of fits on the screen-- and my font size is pretty big-- you're probably fine. It's only when you start using lots of capture groups, lots of lookahead, which is not a topic we've looked at today, where things can get computationally more expensive, because at some point you introduce a bit of non-determinism and you need to figure out what transition to follow, because the state machine you've implied with your regular expression just becomes a lot harder to execute. COLTON OGDEN: Awesome. Parset was saying, you can use python3 -i, which will leave the Python process live so you can continue testing. DAVID MALAN: True. COLTON OGDEN: Group one's a whole group. Group two is-- the open parentheses, says Bhavik Knight. Yeah, I think that's what we've mentioned earlier. 
Brian Rodriguez says, a question for both Colton and Professor Malan. I'm in the process of making a video course for people at work to take independently. Since both of you have done this with great success, do you have any tips or advice? DAVID MALAN: Hmm. Tips or advice-- I think you want to make sure you know your audience. And you don't want to teach or introduce the material at such a high level that your more technical colleagues are kind of bored with it. I think you want to be careful not to speak at too sophisticated a level technically that your less comfortable colleagues are sort of lost by it. So I would try to find that balance. And a technique we have adopted, at least here on campus, is to have material and problems and questions for those less comfortable and more comfortable. So you introduce the sort of standard set of material, but you allow the more comfortable colleagues to dive in deeper, and the less comfortable people to remain comfortable with whatever questions or exercises you're actually challenging them with. COLTON OGDEN: The scaffolding that someone alluded to previously in the chat. DAVID MALAN: So that's another good one too. And hopefully this came across with what we were doing today. We've got a fairly sophisticated regular expression on the screen now, a bunch of conditional logic. We didn't start with that. The very first lines we wrote today were calling input, storing it in a variable, and just printing it out. And so that was an example, albeit a short one, of scaffolding. Start here and then go here and here and here and here. And hopefully if your audience is following along the way, they end up on the top floor, even though you started with them at the base. COLTON OGDEN: Exactly. Exactly. Totally agree. Lots of hair gel, says Asley. He's going to start an extreme. OK. Fatma, would RegEx be used for interpreting regular messages? How Google scans our emails, et cetera? 
So I guess, for parsing email-- the bodies of emails? DAVID MALAN: Yeah, absolutely. I mean it depends what you mean by this, but if Google or other companies are searching for keywords, they could be using regular expressions. They might just be using simple string matching, but it's probably implemented in regular expressions, though they do it on such volume that they might need to be fancier than just using, say, Python RegExes for performance's sake. COLTON OGDEN: Spam filtering type of thing. DAVID MALAN: Spam, yeah, that's another good one. Yeah. COLTON OGDEN: I was making a simple compiler for a course of mine and I used RegExes for removing the comments. DAVID MALAN: Yeah, that's a good one too. And actually we had some code years ago where, in CS50 I tend to write examples for lecture that have comments. Unfortunately if I show those examples in class, it kind of spoils the questions I'm asking. Because if I ask students, all right, what does this line of code do, the problem is, if the comment is right there, they don't have to think too hard about it. So I used to run a command that used a regular expression to get rid of all of the comments, just as you proposed, right before class. COLTON OGDEN: I did that on accident for a lecture for games during the summer. DAVID MALAN: The day before you lost all of your comments? COLTON OGDEN: It was kind of a point of fact-- yeah, well, thereby I was a little bit more careful with how clear my comments were. DAVID MALAN: Good thing there's version control, which brings us to the Git stream too. COLTON OGDEN: Oh, yeah, yeah. Kareem Zidane and I did-- Kareem hosted or led the GitHub stream that we did. And also, I missed it, Brutus Harvenius, thank you for the follow. Is there a way of capturing multiple occurrences of substrings that match a string? 
DAVID MALAN: Multiple occurrences of substrings-- COLTON OGDEN: I'm guessing if you have a line of text that's like, hello, hello, hello, hello, hello, capturing the same one multiple times? DAVID MALAN: Yes. Usually you have to use another function. And I don't know offhand, so I'm checking Stack Overflow here. COLTON OGDEN: Here we go. This is how real programming is done, everybody. DAVID MALAN: So, yeah. It looks like the re library has findall, which does exactly as I think you're describing. COLTON OGDEN: I've had to use that function for something before. DAVID MALAN: Yeah. Now that I see it, I think I have too. But when in doubt, Google and see what comes back. But now that you have the right mental model for what these are, you'll find that the syntax is going to be pretty much on point to today's discussion. COLTON OGDEN: Yeah, exactly. Yale rhymes with email. Harvard does not. Further proof that Yale is better than Harvard, says Blue Booger. DAVID MALAN: OK, we'll let them know. COLTON OGDEN: I think you might have seen this earlier. You mentioned randomness, and I inquired if it is accurate that true short randomness can be incorporated in CS by detection of the noise at the time? DAVID MALAN: So short answer, yes. This is the closest approximation that computers tend to have these days for true randomness, where you take an ambient sound, temperature, movement-- things that really are not tied to something very deterministic like a computer's clock. Even then there's certainly periodicity in, like, the wind, I presume. I am no expert on wind, but things like that. So just taking ambient noise and environmental data might not necessarily be truly random. The best we typically can do with computers is find a distribution of information that appears to be random. But physical inputs are the closest we can get. COLTON OGDEN: Cool. Makes sense. HomeLine says, that is still a seed to the PRNG, the pseudo random number generator. DAVID MALAN: Mm hm. 
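For the earlier question about capturing multiple occurrences, the findall function that David found in the re library can be sketched like this. The sample strings and addresses are made up for illustration, and the email pattern is a simplified assumption rather than the one built on the stream.

```python
import re

# Every non-overlapping occurrence of the pattern comes back in a list.
greeting = "hello, hello, hello, hello, hello"
print(re.findall(r"hello", greeting))

# Made-up addresses for illustration.
text = "Write to alice@harvard.edu or bob@yale.edu for details."

# With no capture group, findall returns each full match.
print(re.findall(r"[a-z]+@(?:harvard|yale)\.edu", text))

# With exactly one capture group, findall returns just that group for each match.
print(re.findall(r"[a-z]+@(harvard|yale)\.edu", text))
```

Whether findall returns whole matches or group contents depends on the parentheses in the pattern, so the capturing versus non-capturing distinction matters here too.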
Yep. COLTON OGDEN: Never comment in your code-- problem solved. DAVID MALAN: There you go. COLTON OGDEN: All right. I think that's the end of the comments. I don't know if you'll be able to stick around for a couple of minutes, just to get some last questions. But that was an awesome, awesome tutorial on regular expressions. DAVID MALAN: Yeah, I think Dan-- CS50's own Dan Coffey is about to pop back in with some tips for next steps. So that if you'd like to practice with this and learn more, you have some tools to use. DAN COFFEY: I just wanted to share one tool that I found game-changing when I was exploring regular expressions. And if you want to just Google, RegEx tester? I think it's RegEx pal or RegEx 101. COLTON OGDEN: It's RegEx pal, right. DAN COFFEY: Yep. Either one. The first one or either one is great. And so if you want to copy-paste your regular expression into here. COLTON OGDEN: The return one might not work if this is Python-specific. DAVID MALAN: No, it's not. COLTON OGDEN: Oh, OK. DAN COFFEY: And then you can put all the cases you want to try to test against down here. So you can do a bunch of emails. COLTON OGDEN: Ooh. And it shows you the capture group. DAN COFFEY: It shows you the capture group it's being captured in-- COLTON OGDEN: That's amazing. DAN COFFEY: --the match. And it also will explain on the right, what the actual breakdown in the top right here explanation-- DAVID MALAN: That is awesome. DAN COFFEY: --which is under the chat. COLTON OGDEN: Why is the second one-- oh, is it just a different color every other line? The blue? DAN COFFEY: Yeah. COLTON OGDEN: OK. DAN COFFEY: Because you've got the global modifiers on at the moment. So. DAVID MALAN: This is awesome. So here, let me zoom in on the address for everyone online. This is regex101.com, brought to you today by CS50. DAN COFFEY: But it's very helpful if, like, instead of having to constantly keep testing in your terminal window-- DAVID MALAN: Yeah. 
DAN COFFEY: --but quickly to see what matches. This was helpful for me. DAVID MALAN: Yeah. And you'll see here, we're actually using the PHP flavor of this. We can switch to Python, though you shouldn't find really any differences versus what we did. You do see here subtly, the raw string that we alluded to earlier, that just ensures that certain characters don't trip you up when they're not escaped. And you can see JavaScript and Go also have implementations here too. COLTON OGDEN: That's awesome. That's a great tool. DAVID MALAN: Yeah. Thank you so much, Dan Coffey. DAN COFFEY: No problem. DAVID MALAN: Any final questions in the stream here? COLTON OGDEN: We got-- Irene says, thank you, David. RegExes are brilliant but always daunting. Building them up little by little makes a lot of sense and makes them much clearer. DAVID MALAN: Absolutely. I think that's by far the biggest takeaway. We only scratched the surface of some of the functionality. Though frankly I think we probably hit some of the most useful features, most commonly used. So that should be a pretty powerful technique. COLTON OGDEN: Dan's the hero. DAVID MALAN: Thanks, Brenda, for all the effort we put into it here. COLTON OGDEN: I used RegEx for my CS50 final project, but now I finally understand what I Googled up from Stack Overflow. DAVID MALAN: Nice. Glad to hear it from MKloppenburg. COLTON OGDEN: Yeah, that was an awesome tutorial. Twitch Hello World-- since you're teaching an HLS course, is it just me or is RegEx really similar to the old school Lexis and Westlaw search terms using star, bang, et cetera, and how you think of the terms to search? DAVID MALAN: You know, I don't know if there's a connection between tools like that. Something tells me, no, maybe, though star has historically tended to mean the wild card character. Exclamation point has less of a history, I think. So I don't know. 
I would honestly pull up the Wikipedia article myself on both of those to see what their etymology is of their syntax. COLTON OGDEN: Fetzenrndy has followed. Thank you very much. David, would you use Python for RegEx nowadays or would you ever go to Perl nowadays, says Andre. DAVID MALAN: Personally, no. I mean, PHP and Python essentially inherited Perl syntax for regular expressions, I believe. Don't quote me on that, but I'm pretty sure that's some of the etymology there. And PC-- I think that's even PCRE. What does that stand for again? Perl compatible regular expressions, and I think Python too essentially adopted the same syntax maybe with slight differences. Perl was actually the first interpreted language that I learned years ago. It's the first language I had to learn web programming. Frankly, it's fallen out of favor. People still use it. Scripts still exist in it. It's not a language I would typically reach for. Frankly I think it's very easy in Perl to write code that you yourself don't understand the next day or weeks later. I think PHP and Ruby and-- well, PHP and Python have done a better job at readability. Ruby is perhaps a little reminiscent in my mind of Perl in its syntax. So personally, nah. I wouldn't really pick up Perl. You can use its Perl compatible regular expressions in bunches of languages. COLTON OGDEN: Are RegExes still considered slow as in performance, says HomeLine? DAVID MALAN: It depends on how complicated they are. Someone in the chat alluded to lookahead earlier. There are ways to over-engineer them, such that they are so complicated that you do essentially introduce non-determinism. The computer has to try this branch or this branch. However, any non-deterministic machine can be converted to a deterministic machine. The problem is, you might get exponential blowup in just the complexity of it, and therefore the runtime. So short answer, yes. 
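To make the "try this branch or this branch" cost concrete, here is a small sketch. It is not from the stream: the nested-quantifier pattern below is a commonly cited pathological case, chosen here as an assumption, and the helper name time_search is made up. Python's re module searches by backtracking, so when the input almost matches, the nested version retries many ways of splitting the run of a's before giving up, while the flat version fails quickly.

```python
import re
import time

NESTED = r"^(a+)+$"  # nested quantifiers: many ways to split a run of a's
FLAT = r"^a+$"       # accepts exactly the same strings with one quantifier


def time_search(pattern, text):
    """Hypothetical helper: return (match or None, elapsed seconds) for one search."""
    start = time.perf_counter()
    result = re.search(pattern, text)
    return result, time.perf_counter() - start


# The trailing "b" forces a failure only after the alternatives are exhausted.
# Sizes are kept small on purpose; the nested time grows quickly with n.
for n in (10, 14, 18):
    text = "a" * n + "b"
    _, nested_time = time_search(NESTED, text)
    _, flat_time = time_search(FLAT, text)
    print(f"n={n}: nested {nested_time:.4f}s, flat {flat_time:.6f}s")
```

Timings vary by machine, so no specific numbers are claimed here. The point is only the trend as n grows, and that rewriting a pattern can remove the problem without changing what it matches.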
But honestly, unless you are using regular expressions to manipulate or pattern match against huge data sets or some data set, again and again and again in a loop, or many, many, many, many times, don't worry about it. Use the regular expression until you find it to be an actual performance problem. COLTON OGDEN: Xomoo, thank you for the follow. Let's go back up here in the chat. It looks like it's not visible. Thank you for the stream, with a happy crying face says All Programmers, like a crying face with joy. DAVID MALAN: Nice. COLTON OGDEN: Sort of. We got another-- oh, that was Xomoo on there. Bella says, thank you. Thank you, Bella. Asley says, Dan's appearance. Thank you so much, David and Colton-- Dan. This was very informative and easy to follow. DAVID MALAN: Nice. COLTON OGDEN: Thank you for joining us. Very good intro to RegEx, says Bhavik. Thanks, David and Colton. Thank you. Brenda-- this has been great. My RegEx knowledge has now grown a lot. Thanks, David and Colton. DAVID MALAN: Nice. COLTON OGDEN: Osman, thank you so much. I took CS50x in 2013. Until now I can't get enough of CS50. DAVID MALAN: Nice. COLTON OGDEN: And I think Brenda also said she took CS50x in 2012. She says, we're oldies. I think I first looked at it in 2010. So-- DAVID MALAN: Oh. Brenda, he just one-upped you. COLTON OGDEN: That puts me up there. DAVID MALAN: I think Dan-- Dan Coffey also took it in 2012? DAN COFFEY: 2010. DAVID MALAN: '10, damn it. That's right. Dan was my colleague in 2012. COLTON OGDEN: If someone very experienced in AI suggested that I not wait to learn it and instead seek and take funding for an app, I think you could eventually code a simple MVP, although not the full iteration. I don't know how to code. Yeah, do you agree with Paul Graham that it is not good-- do you agree with Paul Graham that it is not a good idea to start a tech startup if you don't code well enough to select and know if your tech programmers are coding well? 
DAVID MALAN: I think there are many examples of folks who have started companies who don't necessarily know how to code well themselves. COLTON OGDEN: Apple. DAVID MALAN: I mean, Apple and even Bill Gates very quickly stopped writing code, shortly after founding Microsoft is my understanding. So while I do think there is some guidance to be taken from comments and sentiments like that, where the reality is, you will have a leg up if you could just better understand what your team members are doing and what your colleagues are doing. You can hold your own in a conversation. You can participate in the conversation. You can provide inputs and provide better direction. I think different people have different skills, and you certainly shouldn't not do something, just because you think you're not as strong as someone else. COLTON OGDEN: No hard, fast rules. Just be sensible, basically? DAVID MALAN: Yeah. Indeed. COLTON OGDEN: OK. Now it looks like we've caught up with all the comments. DAVID MALAN: Thank you so much, everyone. COLTON OGDEN: Thanks so much, everybody. Thanks to David for his awesome RegEx tutorial. DAVID MALAN: Thanks to Dan for his pop-ins today. COLTON OGDEN: Thank you, Dan, yeah, for his first contribution. Tomorrow we have a super secret stream that we're not going to spoil. DAVID MALAN: Oh, yeah. I hear good things about this one. COLTON OGDEN: This one's going to be great. Tune in for that one. David and I will be here for that one tomorrow at 1:00 PM. DAVID MALAN: 1:00 PM Eastern time. COLTON OGDEN: So, yeah, no spoilers. 1:00 PM Eastern Standard time. So thanks again, everybody, so much. Final word, closing word? DAVID MALAN: This was CS50 on Twitch.
REGULAR EXPRESSIONS TUTORIAL - CS50 on Twitch, EP. 15. Published by 林宜悉 on January 14, 2021.