  • KENNY YU: So, hi, everyone.

  • I'm Kenny.

  • I'm a software engineer at Facebook.

  • And today, I'm going to talk about this one problem--

  • how do you deploy a service at scale?

  • And if you have any questions, we'll save them for the end.

  • So how do you deploy a service at scale?

  • So you may be wondering how hard can this actually be?

  • It works on my laptop, how hard can it be to deploy it?

  • So I thought this too four years ago when I had just

  • graduated from Harvard as well.

  • And in this talk, I'll talk about why it's hard.

  • And we'll go on a journey to explore many

  • of the challenges you'll hit when you actually

  • start to run a service at scale.

  • And I'll talk about how Facebook has approached some of these challenges

  • along the way.

  • So a bit about me--

  • I graduated from Harvard in the class of 2014, concentrating in computer science.

  • I took CS50 in the fall of 2010 and then TF'd it the following year.

  • And I TF'd other classes at Harvard as well.

  • And I think one of my favorite experiences at Harvard was TFing.

  • So if you have the opportunity, I highly recommend it.

  • And after I graduated, I went to Facebook,

  • and I've been on this one team for the past four years.

  • And I really like this team.

  • My team is Tupperware, and Tupperware is Facebook's cluster management

  • system and container platform.

  • So those are a lot of big words, and my goal

  • is that by the end of this talk you'll

  • have a good overview of the challenges we face in cluster management

  • and how Facebook is tackling some of these challenges

  • and then once you understand these, how this relates

  • to how we deploy services at scale.

  • So our goal is to deploy a service in production at scale.

  • But first, what is a service?

  • So let's first define what a service is.

  • So a service can have one or more replicas,

  • and it's a long-running program.

  • It's not meant to terminate.

  • It responds to requests and gives a response back.

  • And as an example, you can think of a web server.

  • So if you're running Python, maybe it's a Python web server,

  • or if you're running PHP, an Apache web server.

  • It responds to requests, and you get a response back.

  • And you might have multiple of these, and multiple of these

  • together compose the service that you want to provide.
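
As a concrete illustration, here is a minimal sketch of one such replica, written with Python's standard library. The port number and response body are arbitrary choices for the example, not anything from the talk.

```python
# Minimal sketch of a single service replica: a long-running program
# that answers requests and sends responses back.  The port and the
# response body are arbitrary, illustrative choices.
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"hello from one replica\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # Not meant to terminate: it serves requests until it crashes
    # or is killed.
    HTTPServer(("0.0.0.0", 8080), HelloHandler).serve_forever()
```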

  • So as an example, Facebook uses Thrift for most of its back-end services.

  • Thrift is open source, and it makes it easy to do something

  • called Remote Procedure Calls or RPCs.

  • And it makes it easy for one service to talk to another.

  • So as an example of a service at Facebook, let's take the website.

  • So for those of you that don't know, the entire website

  • is pushed as one monolithic unit every hour.

  • And the thing that actually runs the website is HHVM.

  • It runs our version of PHP, called Hack, which is a type-safe language.

  • And both of these are open source.

  • And the way the website is deployed is that there

  • are many, many instances of this web server running in the world.

  • This service might call other services in order to fulfill your request.

  • So let's say I hit the home page for Facebook.

  • It might need to get my profile and render some ads.

  • So the service will call maybe the profile service or the ad service.

  • Anyhow, the website is updated every hour.

  • And more importantly, as a Facebook user, you don't even notice this.

  • So here's a picture of what this all looks like.

  • First, we have the Facebook web service.

  • We have many copies of our web server, HHVM, running.

  • Requests from Facebook users-- so either from your browser or from your phone.

  • They go to these replicas.

  • And in order to fulfill the responses for these requests,

  • it might have to talk to other services-- so the profile service or the ad

  • service.

  • And once it's gotten all the data it needs,

  • it will return the response back.

  • So how did we get there?

  • So we have something that works on our local laptop.

  • Let's say you're starting a new web app.

  • You have something working-- a prototype working-- on your laptop.

  • Now you actually want to run it in production.

  • So there are some challenges there to get that first instance running

  • in production.

  • And now let's say your app takes off.

  • You get a lot of users.

  • A lot of requests start coming to your app.

  • And now that single instance you're running

  • can no longer handle all the load.

  • So now you'd have multiple instances in production.

  • And now let's say your app-- you start to add more features.

  • You add more products.

  • The complexity of your application grows.

  • In order to simplify that, you might want

  • to extract some of the responsibilities into separate components.

  • And now instead of just having one service in production,

  • you have multiple services in production.

  • So each of these transitions involves lots of challenges,

  • and I'll go over each of these challenges along the way.

  • First, let's focus on the first one.

  • From your laptop to that first instance in production,

  • what does this look like?

  • So the first challenge you might hit when you

  • want to start that first copy in production

  • is reproducing the same environment as your laptop.

  • So here's one of the challenges you might hit: let's

  • say you're running a Python web app.

  • You might have various packages of Python libraries

  • or Python versions installed on your laptop,

  • and now you need to reproduce the same exact versions and libraries

  • in that production environment.

  • So versions and libraries-- you have to make sure

  • they're installed on the production environment.

  • And then also, your app might make assumptions

  • about where certain files are located.

  • So let's say my web app needs some configuration file.

  • It might be stored in one place on my laptop,

  • and it might not even exist in a production environment.

  • Or it may exist in a different location.

  • So the first challenge here is you need to reproduce

  • this environment that you have on your laptop on the production machine.

  • This includes all the files and the binaries that you need to run.
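
One common way to handle the Python side of this, assuming you use pip, is to record the exact library versions from the laptop and install the same list on the production machine. The snippet below is just a sketch of that idea; the file name is the usual convention, not anything specific to Facebook.

```python
# Sketch: capture the exact package versions installed on the laptop
# so the production machine can install the same ones.  Assumes pip
# is available; "requirements.txt" is just the conventional file name.
import subprocess
import sys

frozen = subprocess.run(
    [sys.executable, "-m", "pip", "freeze"],
    check=True, capture_output=True, text=True,
).stdout

with open("requirements.txt", "w") as f:
    f.write(frozen)

# On the production machine, install the same versions with:
#   python -m pip install -r requirements.txt
```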

  • The next challenge is how do you make sure that stuff on the machine

  • doesn't interfere with my work and vice versa?

  • Let's say there's something more important running on the machine,

  • and I want to make sure my dummy web app doesn't interfere with that work.

  • So as an example, let's say my service--

  • the dotted red box--

  • it should use four gigabytes of memory, maybe two cores.

  • And something else on the machine wants to use

  • two gigabytes of memory and one core.

  • I want to make sure that that other service doesn't take more memory

  • and start using some of my service's memory

  • and then cause my service to crash or slow down and vice versa.

  • I don't want to interfere with the resources used by that other service.

  • So this is a resource isolation problem.

  • You want to ensure that no other workload on the machine

  • interferes with my workload and vice versa.

  • Another problem with interference is protection.

  • Let's say I have my workload in a red dotted box,

  • and something else running on the machine, the purple dotted box.

  • One thing I want to ensure is that that other thing doesn't somehow

  • kill or restart or terminate my program accidentally.

  • Let's say there's a bug in the other program that goes haywire.

  • The effects of that other service should be contained in its own environment,

  • and also that other thing shouldn't be touching important

  • files that I need for my service.

  • So let's say my service needs some configuration file.

  • I would really like it if something else doesn't touch that

  • file that I need to run my service.

  • So I want to isolate the environment of these different workloads.

  • The next problem you might have is how do you ensure that a service is alive?

  • Let's say you have your service up.

  • There's some bug, and it crashes.

  • If it crashes, this means users will not be able to use your service.

  • So imagine if Facebook went down and users are unable to use Facebook.

  • That's a terrible experience for everyone.

  • Or let's say it doesn't crash.

  • It's just misbehaving or slowing down, and restarting it

  • might help mitigate the issue temporarily.

  • So what I'd really like is if my service has an issue,

  • it gets restarted automatically so that user impact is kept to a minimum.

  • And one way you might be able to do this is to ask the service, hey,

  • are you alive?

  • Yes.

  • Are you alive?

  • No response.

  • And then after a few seconds of that, if there's still no response,

  • restart the service.

  • So the goal is the service should always be up and running.
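
A toy version of that keep-alive loop might look like the sketch below. The health-check URL, the timing, and the restart command are made-up placeholders for illustration; this is not how Tupperware actually does it.

```python
# Toy health-check loop: repeatedly ask the service "are you alive?"
# and restart it if it stops answering.  The URL, intervals, and
# restart command are made-up placeholders.
import subprocess
import time
import urllib.request

CHECK_URL = "http://localhost:8080/health"   # hypothetical endpoint
SERVICE_CMD = ["python", "my_service.py"]    # hypothetical service

def is_alive():
    try:
        with urllib.request.urlopen(CHECK_URL, timeout=2) as resp:
            return resp.status == 200
    except OSError:
        return False

proc = subprocess.Popen(SERVICE_CMD)
failures = 0
while True:
    time.sleep(5)
    failures = 0 if is_alive() else failures + 1
    if failures >= 3:
        # Still no response after several checks: restart the service.
        proc.kill()
        proc.wait()
        proc = subprocess.Popen(SERVICE_CMD)
        failures = 0
```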

  • So here's a summary of challenges to go from your laptop

  • to one copy in production.

  • How do you reproduce the same environment as your laptop?

  • How do you make sure that once you're running on a production machine,

  • no other workload is affecting my service,

  • and my service isn't affecting anything critical on that machine?

  • And then how do I make sure that my service is always up and running?

  • Because the goal is to have users be able to use your service all the time.

  • So there are multiple ways to tackle this issue.

  • Two typical ways that companies have approached this problem

  • are virtual machines and containers.

  • So for virtual machines, the way that I think about it is you

  • have your application.

  • It's running on top of an operating system

  • and that operating system is running on top of another operating system.

  • So if you've ever run Windows inside your Mac,

  • one operating system running inside another-- that's a very similar idea.

  • There are some issues with this.

  • It's usually slower to create a virtual machine,

  • and there is also an efficiency cost in terms of CPU.

  • Another approach that companies take is to create containers.

  • So we can run our application in some isolated environment that

  • provides all the same guarantees as before and run it directly

  • on the machine's operating system.

  • We can avoid the overhead of a virtual machine.

  • And this tends to be faster to create and more efficient.

  • And here's a diagram that shows how these relate to each other.

  • On the left, you have my service--

  • the blue box-- running on top of a guest operating

  • system, which itself is running on top of another operating system.

  • And there's some overhead because you're running two operating systems

  • at the same time versus the container--

  • we eliminate that extra overhead of that middle operating system

  • and run our application directly on the machine with some protection around it.

  • So the way Facebook has approached these problems is to use containers.

  • For us, the overhead of using virtual machines

  • is too much, and so that's why we use containers.

  • And to do this, we have a program called the Tupperware agent

  • running on every machine at Facebook, and it's

  • responsible for creating containers.

  • And to reproduce the environment, we use container images.

  • And our way of using container images is based on btrfs snapshots.

  • Btrfs is a file system that makes it very fast to create copies

  • of entire subtrees of a file system, and this makes it very fast

  • for us to create new containers.
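
To give a feel for why snapshots make this fast, here is a rough sketch of the underlying idea using the btrfs command-line tool. The paths are made up, it needs root and a btrfs filesystem, and it is not Tupperware's actual code.

```python
# Sketch of the idea behind btrfs-based container images: a snapshot
# is a copy-on-write copy of a whole subtree, so stamping out a new
# container root from a base image is nearly instant.  The paths are
# made-up examples; requires root and a btrfs filesystem.
import subprocess

BASE_IMAGE = "/btrfs/images/my-web-app"       # hypothetical base image subvolume
CONTAINER_ROOT = "/btrfs/containers/web-0"    # hypothetical container rootfs

subprocess.run(
    ["btrfs", "subvolume", "snapshot", BASE_IMAGE, CONTAINER_ROOT],
    check=True,
)
```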

  • And then for resource isolation, we use a feature of Linux

  • called control groups that allows us to say, for this workload,

  • you're allowed to use this much memory, CPU, whatever resources and no more.

  • If you try to use more than that, we'll throttle you,

  • or we'll kill your workload to keep you from harming

  • the other workloads on the machine.
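
As a rough illustration of the mechanism (not Tupperware's implementation), with cgroup v2 you can express a per-workload limit by writing to a few files. The group name and the limits below are made up, and this needs root on a machine with cgroup v2 mounted.

```python
# Rough illustration of cgroup v2 resource limits (not Tupperware's
# code).  Creates a group, caps it at 4 GB of memory and 2 CPUs, and
# moves the current process into it.  The group name and limits are
# made up; needs root and cgroup v2 mounted at /sys/fs/cgroup.
import os
from pathlib import Path

cg = Path("/sys/fs/cgroup/my-web-app")   # hypothetical group name
cg.mkdir(exist_ok=True)

(cg / "memory.max").write_text(str(4 * 1024**3))    # 4 GB memory limit
(cg / "cpu.max").write_text("200000 100000")        # 2 CPUs: 200ms of CPU per 100ms period
(cg / "cgroup.procs").write_text(str(os.getpid()))  # move this process into the group
```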

  • And for protection, we use various Linux namespaces.

  • So I'm not going to go into too much detail here.

  • There's a lot of jargon here.

  • If you want more detailed information, we

  • have a public talk from our Systems @Scale conference in July 2018

  • that talks about this in more depth.

  • But here's a picture that summarizes how this all fits together.

  • So on the left, you have the Tupperware agent.

  • This is a program that's running on every machine at Facebook that

  • creates containers and ensures that they're all running and healthy.

  • And then to actually create the environment for your container,

  • we use container images, and that's based on btrfs snapshots.

  • And then the protection layer we put around

  • the container includes multiple things.

  • This includes control groups to control resources and various namespaces

  • to ensure that the environments of two containers

  • are sort of invisible to each other, and they can't affect each other.
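
To give a feel for what those namespaces do, here is a hedged sketch using the standard util-linux unshare tool rather than Tupperware's own machinery. It starts a command with its own PID namespace and its own /proc, so inside it can only see (and therefore only signal) its own processes; it needs root.

```python
# Sketch: run a command in its own PID, mount, UTS, and IPC namespaces
# using the util-linux "unshare" tool (needs root; not Tupperware's
# actual code).  Inside, "ps aux" sees only the new process tree, so
# the command cannot see or kill processes outside its box.
import subprocess

subprocess.run([
    "unshare",
    "--pid", "--fork", "--mount-proc",   # new PID namespace with a fresh /proc
    "--uts", "--ipc",                    # isolated hostname and IPC objects
    "/bin/sh", "-c", "ps aux",
], check=True)
```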

  • So now that we have one instance of the service in production,

  • how can we get many instances of the