  • The following content is provided under a Creative

  • Commons license.

  • Your support will help MIT OpenCourseWare

  • continue to offer high-quality educational resources for free.

  • To make a donation or to view additional materials

  • from hundreds of MIT courses, visit MIT OpenCourseWare

  • at ocw.mit.edu.

  • CHARLES E. LEISERSON: Hi, it's my great pleasure

  • to introduce, again, TB Schardl.

  • TB is not only a fabulous, world-class performance

  • engineer, he is a world-class performance meta-engineer.

  • In other words, building the tools and such to make it

  • so that people can engineer fast code.

  • And he's the author of the technology

  • that we're using in our compiler, the Tapir

  • technology that's in the open-source compiler for parallelism.

  • So he implemented all of that, and all the optimizations,

  • and so forth, which has greatly improved the quality

  • of the programming environment.

  • So today, he's going to talk about something near and dear

  • to his heart, which is compilers,

  • and what they can and cannot do.

  • TAO B. SCHARDL: Great, thank you very

  • much for that introduction.

  • Can everyone hear me in the back?

  • Yes, great.

  • All right, so as I understand it,

  • last lecture you talked about multi-threaded algorithms.

  • And you spent the lecture studying those algorithms,

  • analyzing them in a theoretical sense,

  • essentially analyzing their asymptotic running times, work

  • and span complexity.

  • This lecture is not that at all.

  • We're not going to do that kind of math

  • anywhere in the course of this lecture.

  • Instead, this lecture is going to take a look at compilers,

  • as Professor Leiserson mentioned, and what compilers can and cannot do.

  • So the last time, you saw me standing up here

  • was back in lecture five.

  • And during that lecture we talked

  • about LLVM IR and x86-64 assembly,

  • and how C code got translated into assembly code via LLVM IR.

  • In this lecture, we're going to talk

  • more about what happens between the LLVM IR and assembly

  • stages.

  • And, essentially, that's what happens when the compiler is

  • allowed to edit and optimize the code in its IR representation,

  • while it's producing the assembly.

  • So last time, we were talking about this IR,

  • and the assembly.

  • And this time, they called the compiler guy back,

  • I suppose, to tell you about the boxes in the middle.
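
For reference, you can reproduce those stages yourself with Clang. A minimal sketch, assuming a hypothetical source file named test.c:

      clang -O0 -S -emit-llvm test.c -o test.ll   # unoptimized LLVM IR
      clang -O2 -S -emit-llvm test.c -o test.ll   # LLVM IR after the optimizer runs
      clang -O2 -S test.c -o test.s               # the resulting x86-64 assembly

Diffing the -O0 and -O2 versions of the IR shows the edits made by those boxes in the middle.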

  • Now, even though you're predominantly dealing with C

  • code within this class, I hope that some of the lessons from

  • today's lecture you will be able to take away into any job that

  • you pursue in the future, because there are a lot

  • of languages today that do end up being compiled, C and C++,

  • Rust, Swift, even Haskell, Julia, Halide,

  • the list goes on and on.

  • And those languages all get compiled

  • for a variety of different what we

  • call backends, different machine architectures, not just x86-64.

  • And, in fact, a lot of those languages

  • get compiled using very similar compilation technology

  • to what you have in the Clang LLVM compiler

  • that you're using in this class.

  • In fact, many of those languages today

  • are optimized by LLVM itself.

  • LLVM is the internal engine within the compiler

  • that actually does all of the optimization.

  • So that's my hope, that the lessons you'll learn here today

  • don't just apply to 6.172.

  • They'll, in fact, apply to software

  • that you use and develop for many years down the road.

  • But let's take a step back, and ask ourselves,

  • why bother studying the compiler optimizations at all?

  • Why should we take a look at what's

  • going on within this, up to this point, black box of software?

  • Any ideas?

  • Any suggestions?

  • In the back?

  • AUDIENCE: [INAUDIBLE]

  • TAO B. SCHARDL: You can avoid manually

  • trying to optimize things that the compiler will

  • do for you, great answer.

  • Great, great answer.

  • Any other answers?

  • AUDIENCE: You learn how to best write

  • your code to take advantage of the compiler optimizations.

  • TAO B. SCHARDL: You can learn how

  • to write your code to take advantage of the compiler

  • optimizations, how to suggest to the compiler what it should

  • or should not do as you're constructing

  • your program, great answer as well.

  • Very good, in the front.

  • AUDIENCE: It might help for debugging

  • if the compiler has bugs.

  • TAO B. SCHARDL: It can absolutely

  • help for debugging when the compiler itself has bugs.

  • The compiler is a big piece of software.

  • And you may have noticed that a lot of software contains bugs.

  • The compiler is no exception.

  • And it helps to understand where the compiler might have made

  • a mistake, or where the compiler simply just

  • didn't do what you thought it should be able to do.

  • Understanding more of what happens in the compiler

  • can demystify some of those oddities.

  • Good answer.

  • Any other thoughts?

  • AUDIENCE: It's fun.

  • TAO B. SCHARDL: It's fun.

  • Well, OK, so in my completely biased opinion,

  • I would agree that it's fun to understand

  • what the compiler does.

  • You may have different opinions.

  • That's OK.

  • I won't judge.

  • So I put together a list of reasons

  • why, in general, we may care about what

  • goes on inside the compiler.

  • I highlighted that last point from this list, my bad.

  • Compilers can have a really big impact on software.

  • It's kind of like this.

  • Imagine that you're working on some software project.

  • And you have a teammate on your team

  • who's pretty quiet but extremely smart.

  • And what that teammate does is whenever that teammate gets

  • access to some code, they jump in

  • and immediately start trying to make that code work faster.

  • And that's really cool, because that teammate does good work.

  • And, oftentimes, you see that what the teammate produces

  • is, indeed, much faster code than what you wrote.

  • Now, in other industries, you might just sit back

  • and say, this teammate does fantastic work.

  • Maybe they don't talk very often.

  • But that's OK.

  • Teammate, you do you.

  • But in this class, we're performance engineers.

  • We want to understand what that teammate did to the software.

  • How did that teammate get so much performance out

  • of the code?

  • The compiler is kind of like that teammate.

  • And so understanding what the compiler does

  • is valuable in that sense.

  • As mentioned before, compilers can save you

  • performance engineering work.

  • If you understand that the compiler can

  • do some optimization for you, then you

  • don't have to do it yourself.

  • And that means that you can continue

  • writing simple, and readable, and maintainable code

  • without sacrificing performance.
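
As a hypothetical illustration (this function is not from the lecture), a loop like the following can usually be left in its simple, readable form, because the compiler will typically vectorize and unroll it on its own at -O2 or -O3:

      // Illustrative only: a simple loop the optimizer can typically
      // vectorize and unroll without any manual tuning.
      void scale(double *restrict a, int n) {
          for (int i = 0; i < n; i++)
              a[i] *= 2.0;
      }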

  • You can also understand the differences between the source

  • code and whatever you might see show up in either the LLVM

  • IR or the assembly, if you have to look

  • at the assembly language produced for your executable.

  • And compilers can make mistakes.

  • Sometimes, that's because of a genuine bug in the compiler.

  • And other times, it's because the compiler just

  • couldn't understand something about what was going on.

  • And having some insight into how the compiler reasons about code

  • can help you understand why those mistakes were made,

  • or figure out ways to work around those mistakes,

  • or let you write meaningful bug reports to the compiler

  • developers.

  • And, of course, understanding compilers

  • can help you use them more effectively.

  • Plus, I think it's fun.

  • So the first thing to understand about a compiler

  • is the basic anatomy of how the compiler works.

  • The compiler takes as input LLVM IR.

  • And up until this point, we thought of it

  • as just a big black box that does stuff to the IR,

  • and out pops more LLVM IR, but it's somehow optimized.

  • In fact, what's going on within that black box

  • is that the compiler is executing a sequence

  • of what we call transformation passes on the code.

  • Each transformation pass takes a look at its input,

  • and analyzes that code, and then tries

  • to edit the code in an effort to optimize

  • the code's performance.

  • Now, a transformation pass might end up running multiple times.

  • And those passes run in some order.

  • That order ends up being a predetermined order

  • that the compiler writers found to work

  • pretty well on their tests.

  • That's about the level of insight that

  • went into picking the order.

  • It seems to work well.
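
If you want to experiment with individual passes, LLVM's opt tool can run a chosen sequence of passes over IR directly. A sketch, assuming a recent LLVM and the test.ll file generated earlier (instcombine and loop-unroll are real LLVM pass names, picked here just as examples):

      opt -passes='instcombine,loop-unroll' -S test.ll -o test.opt.ll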

  • Now, some good news, in terms of trying

  • to understand what the compiler does,

  • you can actually just ask the compiler, what did you do?

  • And you've already used this functionality,

  • as I understand, in some of your assignments.

  • You've already asked the compiler

  • to give you a report specifically

  • about whether or not it could vectorize some code.

  • But, in fact, LLVM, the compiler you have access to,

  • can produce reports not just for vectorization,

  • but for a lot of the different transformation

  • passes that it tries to perform.

  • And there's some syntax that you have

  • to pass to the compiler, some compiler flags

  • that you have to specify in order to get those reports.

  • Those are described on the slide.

  • I won't walk you through that text.

  • You can look at the slides afterwards.

  • At the end of the day, the string that you're passing

  • is actually a regular expression.

  • If you know what regular expressions are,

  • great, then you can use that to narrow down

  • the search for your report.

  • If you don't, and you just want to see the whole report,

  • just provide dot star (.*) as the string and you're good to go.
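
Concretely, the flags look something like this (test.c again hypothetical):

      clang -O3 -c test.c -Rpass='.*'            # transformations that succeeded
      clang -O3 -c test.c -Rpass-missed='.*'     # transformations that were tried but failed
      clang -O3 -c test.c -Rpass-analysis='.*'   # conclusions the compiler's analyses drew

Replacing '.*' with a narrower regular expression, such as loop-vectorize, restricts the report to the matching passes.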

  • That's the good news.

  • You can get the compiler to tell you exactly what it did.

  • The bad news is that when you ask the compiler what it did,

  • it will give you a report.

  • And the report looks something like this.

  • In fact, I've highlighted most of the report

  • for this particular piece of code,

  • because the report ends up being very long.
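
To give a flavor (this is an illustrative remark, not the actual report on the slide), a single line of such a report might read:

      test.c:4:5: remark: vectorized loop (vectorization width: 4, interleaved count: 2) [-Rpass=loop-vectorize]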

  • And as you might have noticed just

  • from reading some of the text, there are definitely

  • English words in this text.

  • And there are pointers to pieces of code that you've compiled.

  • But it is very jargony, and hard to understand.

  • This isn't the easiest report to make sense of.

  • OK, so that's some good news and some bad news

  • about these compiler reports.

  • The good news is, you can ask the compiler.

  • And it'll happily tell you all about the things that it did.

  • It can tell you about which transformation passes were

  • successfully able to transform the code.

  • It can tell you conclusions that it drew

  • about its analysis of the code.

  • But the bad news is, these reports

  • are kind of complicated.

  • They can be long.

  • They use a lot of internal compiler jargon, which,

  • if you're not familiar with that jargon,

  • makes them hard to understand.

  • It also turns out that not all of the transformation

  • passes in the compiler give you these nice reports.

  • So you don't get to see the whole picture.

  • And, in general, the reports don't really

  • tell you the whole story about what the compiler did

  • or did not do.

  • And we'll see another example of that later on.

  • So part of the goal of today's lecture

  • is to get some context for understanding the reports

  • that you might see if you pass those flags to the compiler.

  • And the structure of today's lecture

  • is basically divided up into two parts.