  • [? RAIF ?] LEVINE: All right.

  • Thanks everybody for coming today.

  • I'm [? Raif ?] Levine of the Android UI Toolkit Team.

  • And it is my pleasure to introduce today Alex Crichton

  • of Mozilla Research.

  • And he is a member of the Rust core team.

  • And he is here to tell us about Rust,

  • one of the more exciting and interesting languages,

  • I think, in the past few years.

  • ALEX CRICHTON: Thank you, [? Raif ?].

  • So I'm going to give just a whirlwind tour of what Rust

  • is, why you might feel like using it,

  • and what are the unique aspects of it.

  • So the first thing that I normally

  • get if I ever talk about a programming language is

  • why do we have yet another one?

  • So if you take a look around, we have this whole landscape

  • of programming languages today, and they all fill various niches.

  • They solve lots of problems.

  • But it turns out that you can organize these languages

  • along a spectrum.

  • And the spectrum is this trade-off

  • between control and safety.

  • So on one end of the spectrum, we have C and C++,

  • which give us lots of control.

  • We know exactly what's going to run on our machine.

  • We have lots of control over memory layout.

  • We don't have a lot of safety.

  • We have segfaults, buffer overruns, very common bugs

  • like that.

  • Whereas on the other side, we have JavaScript.

  • We have Ruby.

  • We have Python.

  • They're very safe languages, but you don't quite

  • know what's going to happen at runtime.

  • So in JavaScript, we have JITs behind the scenes,

  • so you really don't know what's going to happen.

  • Because it'll change as the program is running.

  • So what Rust is doing is completely

  • stepping off this line.

  • Rust is saying, we are not going to give you

  • a trade-off between control and safety.

  • But rather, we're going to give you both.

  • So Rust is a systems programming language which is kind

  • of filling this niche that hasn't been filled by many

  • of the languages today, where you get both the very low-level

  • control of C and C++, along with the high-level safety

  • and constructs that you would expect from Ruby,

  • and JavaScript, and Python.

  • So that might raise a question.

  • We have all these languages.

  • What kind of niches will benefit from this safety

  • and this control?

  • So, suppose you're building a web

  • browser, for example, Servo.

  • Servo is a project in Mozilla Research

  • to write a parallel layout engine in Rust.

  • So it's entirely written in Rust today.

  • And it benefits from this control, this very high level

  • of control.

  • Because browsers are very competitive in performance,

  • as we all know very well.

  • But at the same time, all major browsers today are written

  • in C++, so they're not getting this great level of safety.

  • They have a lot of buffer overruns.

  • They have a lot of segfaults, memory vulnerabilities.

  • But by writing in Rust, we're able to totally eliminate

  • all these at compile time.

  • And the other great thing about Servo

  • is this parallelism aspect.

  • If you try and retrofit parallelism onto, for example,

  • Gecko, which is millions of lines of C++,

  • it's just not going to end well.

  • So by using a language which from the ground up

  • will not allow this memory unsafety,

  • we're able to do very ambitious things like parallelizing layout.

  • And on the other end of the spectrum,

  • let's say you're not a C++ hacker.

  • You're not a browser hacker.

  • You're writing a Ruby gem.

  • Skylight is a great example of this.

  • It's a product of Tilde, where what they did

  • is they have a component that runs inside of the customer's

  • Rails apps that will just kind of monitor how long it takes

  • to talk to the database, how long it takes for the HTTP

  • request, general analytics and monitoring about that.

  • But the key aspect here is that they have very tight resource

  • constraints.

  • They're a component running in their client's application.

  • So they can't use too much memory,

  • or they can't take too long to run.

  • So they were running into problems,

  • and they decided to rewrite their gem in Rust.

  • And Rust is great for this use case because with the low level

  • of control that you get in C and C++,

  • they were able to satisfy these very tight memory constraints,

  • the very tight runtime constraints.

  • But also, they were able to not compromise the safety

  • that they get from Ruby.

  • So this is an example where they would have written their gem

  • in C or C++.

  • But they were very hesitant to do so.

  • They're a Ruby shop.

  • They haven't done a lot of systems programming before.

  • So it's kind of tough, kind of that first breach into systems

  • programming.

  • And this is where Rust really helps out.

  • So I want to talk a little bit about what I mean by control

  • and what I mean by safety.

  • So this is a small example in C++.

  • Well, the first thing we're going to do

  • is make a vector of strings on the stack.

  • And then we're going to walk through and take

  • a pointer into that.
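
The slide code is not part of the transcript, so what follows is only a minimal sketch of the kind of example being described, assuming a `std::vector<std::string>` and a raw pointer to its first element. The function name `first_via_pointer` is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not from the talk): builds a vector of strings on the
// stack, takes a pointer to its first element, and reads through it.
std::string first_via_pointer() {
    std::vector<std::string> v;       // the vector object itself lives in this stack frame
    v.push_back("Hello");             // the string object lives in the vector's heap buffer
    const std::string* elem = &v[0];  // a plain address into that buffer, nothing more
    assert(elem == v.data());         // same address as the vector's own data pointer
    return *elem;                     // copy the string out before v is destroyed
}
```

Calling `first_via_pointer()` returns a copy of the first string; the pointer itself is never used after the vector goes away.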

  • So the first thing that we'll realize

  • is all of this is laid out inline, on the stack,

  • and on the heap.

  • So, for example, this vector is comprised

  • of three separate fields, the data, length,

  • and capacity, which are stored directly inline on the stack.

  • There's no extra indirection here.

  • And then on the heap itself, we have

  • some strings which are themselves

  • just wrapping a vector.

  • So if we take a look at that, we'll

  • see that it itself also has inline data.

  • So this first element in the array on the heap

  • is just another data, length, and capacity,

  • which is pointing to more data for the string.

  • So the key here is that there's not these extra layers

  • of indirection.

  • It's only when we explicitly go onto the heap

  • that we're actually buying into this.

  • And then the second part about control in C++ is you have

  • these very lightweight references.

  • So this reference into the vector,

  • the first element of the vector is just this raw pointer

  • straight into memory.

  • There's no extra metadata tracking it.

  • There's no extra fanciness going on here.

  • It's just a value pointing straight into memory.

  • A little dangerous, as we'll see in a second,

  • but it's this high level of control.

  • We know exactly what's going on.

  • And then the final aspect of this

  • is we have deterministic destruction.

  • What this means is that for this vector of strings,

  • we know precisely when it's going to be deallocated.

  • When this function returns is the exact moment at which

  • this destructor will run.

  • And it will destroy all the components

  • of the vector itself.

  • So this is where we have very fine-grained control

  • over the lifetime of the resources

  • that we have control of, either on the stack

  • or within all the containers themselves.
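
One way to observe this deterministic destruction is to count live objects. The sketch below is illustrative only: `Counted` and `live_inside_and_after_scope` are hypothetical names, not code from the talk.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical tracer type: bumps a counter on construction and copy,
// and decrements it on destruction.
struct Counted {
    int* live;
    explicit Counted(int* l) : live(l) { ++*live; }
    Counted(const Counted& other) : live(other.live) { ++*live; }
    ~Counted() { --*live; }
};

// Returns {objects alive inside the scope, objects alive after the scope}.
std::pair<int, int> live_inside_and_after_scope() {
    int live = 0;
    int inside = 0;
    {
        std::vector<Counted> v;
        for (int i = 0; i < 3; ++i) v.emplace_back(&live);
        inside = live;  // three elements are alive here
    }                   // v's destructor runs here and destroys every element
    return {inside, live};
}
```

The closing brace of the inner scope is the exact point at which the vector and all of its elements are destroyed, so the second count is back to zero without any explicit cleanup call.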

  • And what this mostly boils down to

  • is something that we call zero-cost abstractions, where

  • this basically means that it's something that at compile time,

  • you can have this very nice interface, very easy to use.

  • It's very fluent to use.

  • But it all optimizes away to nothing.

  • So once you push it through the compiler,

  • it's basically a shim.

  • And it'll go to exactly what you would

  • have written if you did the very low-level operations yourself.
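
As a rough illustration of the idea, the two functions below compute the same sum, one through a standard-library abstraction and one through a hand-written pointer loop. The function names are hypothetical, and the claim that both compile to comparable machine code is an expectation about typical optimizing compilers, not something the language guarantees.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// High-level version: iterators plus a standard algorithm.
int sum_abstract(const std::vector<int>& xs) {
    return std::accumulate(xs.begin(), xs.end(), 0);
}

// Low-level version: the raw pointer loop one might write by hand.
int sum_manual(const std::vector<int>& xs) {
    int total = 0;
    const int* p = xs.data();
    const int* end = p + xs.size();
    for (; p != end; ++p) total += *p;
    return total;
}
```

Comparing the optimized assembly of the two (for example with `-O2`) is the usual way to check whether the abstraction really did cost nothing in a given build.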

  • And on the other side of this,

  • let's take a look at Java.

  • So if we take our previous example of a vector of strings,

  • then what's actually happening here

  • is the vector on the stack is a pointer to some data,

  • the length, and some capacity, which itself

  • is a pointer to some more data.

  • But in there, we have yet another pointer

  • to the actual string value which has data, length, and capacity.

  • We keep going with these extra layers of indirection.

  • And this is something that's imposed on us

  • by the Java language itself.

  • There's no way that we can get around this, these extra layers

  • of indirection.

  • It's something that we just don't have control over,

  • that you have to buy into right up front.

  • Unlike in C++, where we can eliminate these extra layers

  • and flatten it all down.
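
To make the contrast concrete in one language, the sketch below imitates the Java-style layout in C++ by boxing everything behind owning pointers and sets it next to the flat layout. It is an analogy under stated assumptions, not a description of how any JVM lays out objects; the aliases `Flat` and `Boxed` and the helper names are hypothetical. It assumes C++14 or later for `std::make_unique`.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Flat layout: string objects sit directly in the vector's buffer.
using Flat = std::vector<std::string>;

// Boxed layout: a pointer to a vector of pointers to strings.
using Boxed = std::unique_ptr<std::vector<std::unique_ptr<std::string>>>;

Boxed make_boxed(const std::string& s) {
    auto v = std::make_unique<std::vector<std::unique_ptr<std::string>>>();
    v->push_back(std::make_unique<std::string>(s));
    return v;
}

// One hop from the vector to the string object.
std::string first_flat(const Flat& v) { return v[0]; }

// Handle, then vector, then slot, then string object.
std::string first_boxed(const Boxed& v) { return *(*v)[0]; }
```

Both accessors return the same value; the difference is how many pointers have to be followed to reach it, and in C++ the boxed form is something you opt into rather than something imposed.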

  • And when I'm talking about zero-cost abstractions,

  • it's not just memory layout.

  • It's also static dispatch.

  • It's the ability to know that a function call at runtime

  • is either going to be statically resolved or dynamically

  • resolved at runtime itself.
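
A small sketch of that choice in C++, with hypothetical type and function names: the template version is resolved at compile time for each concrete type, while the version taking a base-class reference goes through a virtual call at runtime.

```cpp
#include <cassert>

// Plain types with no virtual functions.
struct Triangle { int sides() const { return 3; } };
struct Square   { int sides() const { return 4; } };

// Static dispatch: T is known at compile time, so the call is resolved
// (and can typically be inlined) per instantiation.
template <typename T>
int sides_static(const T& shape) { return shape.sides(); }

// Dynamic dispatch: an interface plus an adapter that wraps any plain type.
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};

template <typename T>
struct ShapeOf final : Shape {
    T inner;
    int sides() const override { return inner.sides(); }
};

// The concrete type behind the reference is only known at runtime.
int sides_dynamic(const Shape& shape) { return shape.sides(); }
```

The point is that the programmer picks which of the two they are paying for at each call site.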

  • This is a very powerful trade-off

  • where you want to make sure you know what's going on.

  • And the same idea happens with template expansion,

  • which is generics in Java and C++,

  • where what it boils down to is that if I have a vector

  • of integers and a vector of strings,

  • those should probably be optimized very,

  • very differently.

  • And it means that every time you instantiate those type

  • parameters, you get very specialized copies of code.

  • So it's as if you wrote the most specialized

  • vector of integers for the vector of integers itself.
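
In C++ terms this is template instantiation. The sketch below uses a hypothetical generic function `largest`; the two explicit instantiation lines ask the compiler to emit one separate copy of the code for `int` and one for `std::string`, each compiled against that type's own comparison and copy operations.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Generic function: returns a copy of the largest element.
// Precondition: xs is not empty.
template <typename T>
T largest(const std::vector<T>& xs) {
    assert(!xs.empty());
    return *std::max_element(xs.begin(), xs.end());
}

// Explicit instantiations: two independent, type-specific copies of the code.
template int largest<int>(const std::vector<int>&);
template std::string largest<std::string>(const std::vector<std::string>&);
```

In everyday code the instantiations happen implicitly at the first use with each type; they are spelled out here only to make the separate copies visible.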

  • And so, let's take a look at the safety aspect.

  • That's an example of what I mean by control.

  • But the safety comes into play especially in C++.

  • So this is a classical example of where

  • something is going to go wrong.

  • So the first thing that we do is we have our previous example

  • with the vector of strings.

  • And we take a pointer into the first element.

  • But then we come along, and we try and mutate the vector.

  • And some of you familiar with vectors in C++,

  • you'll know that when you've exceeded the capacity

  • of the vector, you probably have to reallocate it, copy some data,

  • and then push the new data onto it.

  • So let's say, in this case, we have to do that.

  • We copy our data elsewhere, copy our first element

  • that's in our vector, push on some new data.

  • And then the key aspect is we deallocate the contents

  • of the previous data pointer.

  • And what this means is that this pointer,

  • our element pointer is now a dangling pointer

  • into the freed memory.

  • And that basically implies that when we come around here

  • to print it out onto the standard output,

  • we're going to either get garbage, a segfault,

  • or, in general, what C++ calls undefined behavior.

  • This is the crux of what we're trying to avoid.

  • This is where the unsafety stems from.
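
The failure described here can be observed without actually reading through the dangling pointer. The sketch below, with hypothetical function names, records the address of the first element as an integer, pushes until the vector's capacity changes, and checks that the element now lives somewhere else. It assumes the default allocator, which obtains the new buffer before releasing the old one. The second function shows one common workaround, holding an index rather than a pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Returns true if growing the vector moved its first element in memory.
bool push_back_moved_first_element() {
    std::vector<std::string> v{"Hello"};
    // Keep the old address as an integer so the stale pointer is never used.
    const auto before = reinterpret_cast<std::uintptr_t>(v.data());
    const auto old_capacity = v.capacity();
    // Push until the capacity changes, which means a reallocation happened.
    while (v.capacity() == old_capacity) v.push_back("world");
    const auto after = reinterpret_cast<std::uintptr_t>(v.data());
    return before != after;
}

// An index stays meaningful across reallocation; a raw pointer would not.
std::string first_after_growth() {
    std::vector<std::string> v{"Hello"};
    const std::size_t index = 0;
    for (int i = 0; i < 100; ++i) v.push_back("world");
    return v[index];
}
```

Any pointer or reference taken to `v[0]` before the loop would be dangling afterwards, and dereferencing it would be the undefined behavior described in the talk.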

  • So it's good to examine examples like this.

  • But what we're really interested in

  • is these fundamental ingredients for what's going wrong here.

  • Because this kind of example, you

  • can probably pattern match and figure it out.

  • But it's very difficult to find this in a much larger program

  • among many, many function calls deep.

  • So the first thing we'll notice is there's

  • some aliasing going on here.

  • This data pointer of the vector and the element

  • pointer on the stack are pointing to the same memory

  • location.

  • Aliasing is basically where you have two pointers pointing

  • at the same location, but they don't know anything

  • about one another.

  • So then we come in and mix in mutation.

  • What we're doing is we're mutating

  • the data pointer of the vector.

  • But we're not mutating the aliased reference

  • of the element itself.

  • So it's these two fundamental ingredients in combination--

  • aliasing and mutation.

  • It's OK to have either of them independently.

  • But when we have them simultaneously

  • is when these memory-safety bugs start to come up.
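
To illustrate that either ingredient is harmless in isolation, here is a minimal sketch with hypothetical function names: one function aliases an element without ever mutating the vector, and the other mutates the vector before any reference into it exists.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Aliasing without mutation: two references to the same element, both read-only.
std::size_t aliasing_only() {
    const std::vector<std::string> v{"Hello", "world"};
    const std::string& a = v[0];
    const std::string& b = v[0];
    return a.size() + b.size();
}

// Mutation without aliasing: grow the vector first, take the reference afterwards.
std::string mutation_only() {
    std::vector<std::string> v{"Hello"};
    v.push_back("world");                // no outstanding references at this point
    const std::string& last = v.back();  // taken after the last mutation
    return last;
}
```

The dangling-pointer bug needs both at once: a reference taken before the `push_back` and used after it.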