  • [MUSIC PLAYING]

  • BRIAN YU: OK, let's get started.

  • Welcome, everyone, to the final day of CS50 Beyond.

  • And the goal for today is going to be to take a look at things

  • at a bit of a higher level.

  • There is going to be less code in today's lecture.

  • The focus of today is on two main topics--

  • security and scalability-- which are both important as you

  • begin to think about, you're writing all this code for your web application.

  • You're ready to deploy it so that people can actually use it.

  • What are the sorts of considerations you need to bear in mind?

  • What are the security considerations in making

  • sure that wherever you're hosting the application, you and the application

  • itself is secure and that your users are secure from potential vulnerabilities

  • or potential threats?

  • And also, from a scalability perspective,

  • we've been designing applications that so far probably only you

  • or a couple other people have been using.

  • But what sorts of things do you need to think about

  • as your applications begin to scale, as more and more people begin to use it,

  • and you have to begin to think about this idea of multiple people trying

  • to use the same application at the same time?

  • So a number of different considerations come about there.

  • We'll show a couple of code examples.

  • But the main idea of this is going to be high level, just thinking abstractly,

  • sort of trying to design the product, trying to design the project,

  • trying to figure out how exactly we need to be adjusting our application

  • to make sure that it's secure and to make sure that it's scalable.

  • So we'll go ahead and start with security.

  • And on the topic of security, we're going

  • to look at a number of different security considerations

  • as we move all throughout the week, from the beginning of the week

  • until the end of the week, thinking about the types of security

  • implications that come about.

  • And so one of the first things we introduced in the class was Git,

  • the version control tool that we were using

  • to keep track of different versions of our code

  • in order to manage different branches of our code, so on and so forth.

  • And so there are a couple of important security considerations to be aware of

  • with regard to Git.

  • You all probably created GitHub repositories

  • over the course of this week, maybe for the first time.

  • And GitHub repositories by default are public.

  • And this is in the spirit of the idea of open source software, the idea

  • that anyone can see the code.

  • Anyone can contribute to the code.

  • And that, of course, comes with its trade-offs.

  • On one hand, everyone being able to see the code certainly

  • means that anyone can help you to find bugs and identify bugs.

  • But it also means that anyone on the internet can see the code,

  • look for potential vulnerabilities, and then

  • potentially take advantage of those vulnerabilities.

  • So definitely, trade-offs, costs, and benefits that

  • come along with open source software.

  • And another thing just to be aware of, we mentioned this earlier in the week,

  • but your Git commit history is going to store the entire history of any

  • of the commits that you have made, as the name might imply.

  • And so if you make a commit and you do something

  • you shouldn't have done, for instance-- you make a commit that accidentally

  • includes database credentials inside of the commit somewhere

  • or includes a password inside of the commit

  • somewhere-- you can later on remove those credentials

  • and make another commit and remove the credentials.

  • But the credentials are still there inside of the history.

  • If you go back, you could still find the credentials

  • if you had access to the entire Git repository

  • and could go back and find that point in Git's history.

  • So what are the potential solutions for if you do something like this,

  • accidentally expose credentials at some point in the repository

  • and then remove them?

  • What could you do?

  • Yeah?

  • AUDIENCE: Change the credentials.

  • BRIAN YU: Certainly.

  • Changing the credentials, something you should almost definitely do.

  • Change the password.

  • It's not enough just to remove them and make another commit.

  • And there's also something you can do, often described as purging the Git history, where

  • you can effectively rewrite the commit history with a tool such as git filter-repo, sort of overwrite history,

  • so to speak, in order to replace that, as well.

  • But even that, if it's been online on GitHub,

  • who knows who may have been able to access the credentials?

  • So definitely always a good idea to remove those, as well.

  • On the first day, we also took a look at HTML.

  • We were designing basic HTML pages.

  • And there are a number of security vulnerabilities

  • you could create just with HTML alone.

  • Perhaps one of the most basic is just the idea that the contents of a link

  • can differ from where the link takes you to.

  • That's probably a pretty obvious point, since you often

  • have text that links you to a particular page.

  • But this can often be misleading and is commonly

  • used in phishing email attacks, for instance,

  • whereby you have a link that takes you to URL one,

  • but by default, it shows you URL two, which can be misleading, for sure.

  • Or I can have situations where I could--

  • let's go into link.html--

  • I have a link that presumably takes me to google.com.

  • But if I click on google.com, it could take me anywhere else--

  • to some other site, for instance.

  • And the way that it does that is quite simply by just

  • having a link that takes you to a URL, but the contents of that link

  • are something different or something else entirely.

  • And so that alone is something to be aware of.
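
The mismatch between what a link displays and where it points can be checked mechanically. The following is a minimal Python sketch, not something shown in the lecture: the class name `LinkAuditor`, the function `misleading_links`, and the `example.com` destination are all hypothetical, and the host comparison is deliberately naive (it would also flag `www.google.com` against `google.com`).

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkAuditor(HTMLParser):
    """Collect (visible text, href) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (text, href) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def misleading_links(html):
    """Return links whose visible text names a different host than the href."""
    auditor = LinkAuditor()
    auditor.feed(html)
    flagged = []
    for text, href in auditor.links:
        # Treat the visible text as a URL so its host can be extracted.
        shown = urlparse(text if "//" in text else "//" + text).hostname
        actual = urlparse(href).hostname
        # Only flag text that looks like a domain (contains a dot).
        if shown and actual and "." in shown and shown != actual:
            flagged.append((text, href))
    return flagged


# Hypothetical page: the text says google.com, the href goes elsewhere.
page = '<a href="https://example.com/login">google.com</a>'
print(misleading_links(page))
```

Link text such as "Sign In" is ignored here because it does not look like a domain, which is exactly why copied pages with altered links are hard to spot by eye.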

  • But that problem is compounded when you consider the idea

  • that even though your server-side code-- application code

  • you write in Python and Flask, for instance--

  • you can keep secret from your users, HTML code is not

  • kept secret from users.

  • Any users can see HTML and do whatever they want with it.

  • And so on the first day, you may have been

  • trying to take a look at an HTML page and try and replicate it

  • using your own HTML and CSS, for example.

  • The simplest way to do something like that

  • would just be to copy the source code.

  • So I could go to bankofamerica.com, for instance, Control-Click on the page,

  • view the page source, and all right.

  • Here's all the HTML on Bank of America's home page.

  • I could copy that, create a new file, and call it bank.html.

  • Paste the contents of it in here.

  • Go ahead and save that.

  • And now, open up bank.html.

  • And now, I've got a page that basically looks like Bank of America's website.

  • And now, I could go in.

  • I could modify the links, change where Sign In takes you to,

  • make it take you to somewhere else entirely.

  • And so these are potential threats, vulnerabilities,

  • to be aware of on the internet that are quite easy to actually do.

  • So this is less about when you're designing your own web applications

  • but, when you're using web applications, the types of security

  • concerns to definitely be aware of.

  • So let's keep moving forward in the week-- yeah, question?

  • AUDIENCE: Can you copy JavaScript source code in the same way?

  • BRIAN YU: Yes.

  • Any JavaScript code that is on the client, you can access

  • and you can modify.

  • You can change variables and so on and so forth.

  • And this is actually a pretty easy thing to do.

  • So if I go to like, I don't know, The New York Times website, for instance,

  • and I look at the source code there--

  • let me go ahead and inspect the element, and I'll

  • try and hover over a main headline.

  • OK.

  • This is the name of a CSS class.

  • You could access any JavaScript.

  • You can also run any JavaScript in the console arbitrarily.

  • So I could say, all right, document.querySelectorAll. Let's

  • get everything with that CSS class.

  • Or maybe it's just the first one, because it's two CSS classes.

  • All right.

  • Great.

  • I'll take the first one, set its inner HTML to be,

  • like, welcome to CS50 Beyond.

  • And you can play around with websites in order to mess around, change them.

  • So all of the JavaScript, the CSS classes, all of that,

  • is accessible to anyone who is using the page, for example.

  • Other questions before I go on?

  • Yeah.

  • AUDIENCE: Any thoughts on JavaScript obfuscation?

  • BRIAN YU: JavaScript obfuscation-- certainly something you can do.

  • So since JavaScript is available to anyone who has access to the web page,

  • there are programs called JavaScript obfuscators

  • that basically take plain old looking JavaScript

  • and convert it into something that's still JavaScript

  • but that's very difficult for any human to decipher.

  • It changes variable names and does a bunch of tricks in JavaScript

  • to still execute the exact same way but that looks quite obscure.

  • Definitely something you can do.

  • Still not totally foolproof, because there are ways

  • of trying to deobfuscate JavaScript code, at least to some extent.

  • So it's not perfect, but definitely something that you can do.

  • Other things?

  • All right.

  • Let's take a look at--

  • OK, when we were writing Flask applications,

  • we were writing web servers.

  • And so one thing that's just good to know from a security perspective

  • is the difference between HTTP, the Hypertext Transfer Protocol,

  • and the secure version of it, HTTPS.

  • And that has to do with the idea that on the internet,

  • we have computer servers that are trying to communicate

  • with each other that are trying to send information back and forth.

  • And when these computers are trying to send information back and forth,

  • we would like for that to happen securely,

  • that when one computer is sending information to another computer,

  • that information is going through a number of different routers.

  • And each of those routers could hypothetically

  • have information that's intercepted.

  • Someone could try and intercept a packet on its way from computer number

  • one to computer number two.

  • So how do we securely try and transfer information from one location

  • to the other?

  • And this has to do with the entire field of cryptography,

  • which is a huge field that we're only going to be

  • able to barely scratch the surface of.

  • But the basic idea here is that we would like some way

  • to encrypt our information, that if I have some plain text that I would like

  • to send from my computer to someone else's computer,

  • I would like to encrypt that plain text, send it across in some encrypted way,

  • such that the person on the other end could decrypt it.

  • And so this is perhaps a more sophisticated version

  • of what you might have done in CS50's problem set two

  • when you were using the Caesar or the Vigenere cipher

  • in order to encrypt something.

  • The ciphers that are used in computing on the internet, for instance,

  • are just much more secure, for example.

  • But they follow a similar principle.

  • And so one form of cryptography is called secret-key cryptography,

  • where the idea is that if I am a computer up here

  • and I have some plain text that I want to encrypt,

  • I also have some key that only I know.

  • And I can take the plain text, and I can take that key

  • and run an algorithm on it.

  • And that generates some ciphertext, some encrypted version of the plain text

  • that was encrypted using the key.

  • I can then send that ciphertext along to the other person.

  • And so long as the other person has both the ciphertext and the key

  • used to encrypt it, they can run the same process in reverse

  • and just decrypt it, generating the plain text from it.

  • That way, the ciphertext is transferred, not the plain text,

  • from one side to the other side of this communication.

  • And so long as both parties in this instance have access to the same key,

  • they can encrypt and decrypt messages at will.
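
The shared-key round trip described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the cipher used on the internet: the function name `xor_cipher` and the key value are hypothetical, and a repeating-key XOR is easy to break in practice.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the repeating key.

    Applying the function twice with the same key restores the input,
    so one function serves as both encryption and decryption.
    """
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


shared_key = b"only-we-know-this"      # hypothetical secret both sides hold
plaintext = b"Hello from CS50 Beyond"

ciphertext = xor_cipher(plaintext, shared_key)  # sender encrypts
recovered = xor_cipher(ciphertext, shared_key)  # receiver decrypts with the same key

print(ciphertext != plaintext)  # the message in transit is not the plain text
print(recovered == plaintext)   # the same key recovers the original
```

Only `ciphertext` crosses the network in this sketch; both ends are assumed to already hold `shared_key`.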

  • Why doesn't this quite work on the internet, though?

  • What is the problem with this model?

  • Yeah?

  • AUDIENCE: If you're sending the key as well as the ciphertext,

  • then it's just as revealing as sending the plain text.

  • BRIAN YU: Exactly.

  • When we transfer the ciphertext across, the other person

  • also needs access to the key.

  • We need to transfer the key across the internet,

  • as well, to give it to the other person.

  • And so anyone who is intercepting the ciphertext

  • could also have intercepted the key and therefore could

  • have decrypted the information and gotten the plain text

  • as a result of it.
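
To make the interception problem concrete, here is a small Python sketch in the spirit of the Caesar cipher mentioned earlier. The `network` list standing in for the internet, the `caesar` helper, and the message are all hypothetical; the point is only that whoever sees both the key and the ciphertext can decrypt.

```python
def caesar(text: str, shift: int) -> str:
    """Shift ASCII letters by `shift` positions, leaving other characters alone."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)


# Everything appended here is assumed visible to an eavesdropper.
network = []

key = 3
network.append(("key", key))                                # the key is sent in the clear
network.append(("ciphertext", caesar("meet at noon", key)))  # then the encrypted message

# The eavesdropper captured both items and simply reverses the shift.
captured = dict(network)
print(caesar(captured["ciphertext"], -captured["key"]))  # prints "meet at noon"
```

Encrypting the key before sending it would only move the problem, since the key used for that encryption would have to travel over the same network.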

  • So this secret-key cryptography, ultimately, it

  • doesn't work in the context of the internet

  • if it needs to be the case that the key is just

  • transferred across the internet.

  • Now, you could try encrypting the key, for example.

  • But then whatever key you used to encrypt the key,

  • that also needs to be sent across the internet,

  • and you end up with this problem where you can never figure out a way in order