Video hosted by Apple at devstreaming-cdn.apple.com


WWDC12 • Session 410

What's New in LLVM

Developer Tools • iOS, OS X • 58:05

The Apple LLVM compiler has evolved at a staggering pace, providing remarkably quick compile times and generating lightning-fast code. Learn about the latest LLVM technologies from improvements in the Static Analyzer, to better performance and optimizations, to the latest advancements in C++ support.

Speakers: Doug Gregor, Ted Kremenek, Bob Wilson

Unlisted on Apple Developer site

Downloads from Apple

HD Video (219.1 MB)

Transcript

This transcript was generated using Whisper; it may contain transcription errors.

Good morning. My name's Bob Wilson, and together with a few of my colleagues, I'm going to talk with you today about some great new features in the LLVM compiler. So a better compiler is really significant, because we believe that a better compiler can help you make better apps. And there are three ways we can do that: performance, productivity, and quality. Performance is really important because you want your apps to run quickly and be responsive. The compiler can help you by optimizing your code to run quickly.

The compiler can help your productivity by making sure that it itself runs quickly, so that you don't have to spend a lot of time sitting around waiting for your builds. And we can also help your productivity by having the compiler support new language features that make it easier for you to write correct and efficient code.

And finally, quality is essential. Nobody wants to use a buggy app. Every time that you build your project in Xcode, the compiler is going to look for suspicious things, and it will warn you before you even hit those problems. You can also periodically run the static analyzer, which is based on the compiler, to find more subtle problems in your code. So these are three areas where the compiler can help you.

Xcode 4.4 has a new version of the Apple LLVM compiler, version 4.0, and we're going to talk about some of those new features here today. Before we do that, I want to say a few things about LLVM. LLVM, represented by the dragon logo here, is an open source project to develop compilers and other low-level development tools. One of the distinctive things about LLVM is its modular architecture, where it's built upon a set of reusable components that can be combined together in interesting ways. One of these components is the Clang parser, which is a unified C, C++, and Objective-C parser. It runs really fast, and it has great, informative diagnostics to help you understand the problems that the compiler finds. We have a sophisticated optimizer to make your code run fast. And we have code generators, assemblers, and disassemblers for both Intel and ARM processors. Together with those components, we also have some runtime libraries: libc++, which we're going to talk about in a few minutes, is our C++ runtime library, and we also have a lower-level compiler support runtime.

We can then combine all of those different components and runtime libraries to create standalone tools. The LLVM compiler is the most obvious one. It builds upon all of those components shown below it, and you'll see it whenever you build your project in Xcode. And of course, you can also run the compiler directly from the command line. The Xcode static analyzer is built on top of the Clang parser, combined with sophisticated analysis to understand what your code is really doing.

And this is also integrated into Xcode when you run the analyze step, with a nice way to actually visualize and see the results of that analysis. And finally, we have the LLDB debugger. The debugger uses the Clang parser, the code generator, and the disassembler, and forms the basis of the Xcode debugger.

The Clang parser itself is integrated directly into Xcode to support features like indexing, code completion, live warnings, and fixits. So I hope this gives you a feeling for how the architecture of LLVM really lets us do some powerful things. We can reuse these components, we get consistency across the tools, and we get this really powerful integration into the Xcode IDE.

Now, if you're familiar with our tools, you may notice one thing that's missing from this list here, and that's LLVM GCC. LLVM GCC is a hybrid: a combination of the old GCC 4.2 parser with an older version of the LLVM optimizer and code generator. This was a stopgap solution that we created to ease the transition from GCC over to LLVM. We've finished that transition now, and we're no longer adding any new features or even fixing bugs in LLVM GCC. So it's really time for everyone to stop using it. And in fact, we're going to be removing it from Xcode in the near future. So please move to the LLVM compiler.

So with that context, I'd like to go on and talk about performance, which is the first of the focus areas we're going to cover today. As a compiler team, performance is near and dear to our hearts. Some of us stay up at night thinking about how we can squeeze a few more cycles out of some important code.

And we also work really, really hard to make sure that the compiler itself runs quickly. There's no way I can talk about all the performance enhancements we've added recently, so I've just picked three of them to highlight today. Let's start by talking about the ARC optimizer. ARC, or Automatic Reference Counting, is this great feature we introduced last year to make you more productive. If you use ARC, you no longer have to manually insert retains and releases into your Objective-C code.

Let's look at how this works. I've put up a simple example here of a debug logging method. It takes a string as an argument and just writes it out with NSLog. From the perspective of ARC, the important thing here is that from the time you enter the method to the point where you call NSLog, we have to make sure that nothing is going to release the string out from under us. So the compiler will just automatically insert a retain for that string s at the beginning of the method and then release it at the end. And in fact, if you build an unoptimized debug version of this code, that's basically what you'll get.

Now, as an experienced programmer, you may look at this and say, that's kind of silly. There's really no way that the string could be released out from under you, because nothing happens before you call NSLog. If you run an optimized build of this code, the compiler will automatically run the ARC optimizer. It analyzes the code, realizes this is silly, and optimizes away that retain and release. This kind of optimization has been in LLVM from day one with ARC. So let's talk about something new this year. We have a lot of improvements to the ARC optimizer, and unfortunately I can't cover all of them, so again I've just picked one example: the case of nested retains. To illustrate that, let's look at a more complicated version of that debug logging method. Again, it takes a string, and so ARC will insert a retain for the string s. Now we copy from s to a new string t, a reference to the same string, and ARC will also insert a retain for t.

Now, at this point, well, I should say this is a contrived example. I'm sure you wouldn't actually write this code. But in a more complicated method, there may be reasons why you would insert a copy like that, and it can also happen when the compiler inlines functions. So the ARC optimizer can look at this and realize: I've retained the same string twice. As long as a string is retained at least once, there's no way it can be released, so that second retain is completely unnecessary, and the ARC optimizer, now in version 4 of the compiler, will optimize it away.

What about the other retain? Can we get rid of that as well? In this case, unfortunately, we can't. If you look inside the method, it's now a little more complicated. It's checking a logging-enabled flag, and then before it calls NSLog, it's sending an increment-log-count message. It's highly unlikely that the increment-log-count method is gonna release the string.

But the compiler can't know that, right? It's hypothetically possible that the string that was passed in is in a global variable. Somebody could make an increment-log-count implementation that would actually release that string. And the compiler has to be correct. It has to be safe. So we can't get rid of the retain in this case. But that doesn't mean we can't optimize it. The compiler is smart enough to realize it can move the retain and release inside that conditional, so that if logging is disabled, you don't pay the cost of the retain.

So that's just one example of a lot of nice improvements to the ARC optimizer that should make your code run faster than with previous versions of the compiler. Let's go on and talk about another performance area, which is support for Intel's AVX. AVX is a vector processing extension. It's available in recent Intel processors, the Sandy Bridge and Ivy Bridge processors.

So basically, most of the new Macs from the last year or so will support AVX, and it's really powerful. Compared to the older SSE, the AVX vectors are twice as wide: 256 bits instead of 128. It's not gonna help every application, though. It's entirely a floating-point feature.

So it's a good fit for applications that are floating-point intensive and typically works best where there's a high ratio of computation to memory bandwidth, where you're actually doing a lot of computation. So if you think that your application may fit that profile, it might be worth taking a look at optimizing for AVX.

And let's look at an example to show how you can do that. This is just a simple matrix addition. And the code up here is gonna step through two input matrices. It's gonna load eight elements at a time into two vectors, A and B. It's gonna add them to a new vector C and then store it back out to the result.

You can see it's using some intrinsic function calls here. Intel has defined a standard set of intrinsics, and LLVM 4.0 supports the full set of those standard intrinsics, which is really great. It's powerful because it gives you full access to the functionality of AVX and a lot of control. But it can be a little bit hard to read, especially if you're not familiar with all those intrinsics. So a nice option is to use the OpenCL vector syntax. You can directly dereference pointers to those vectors and just load them up, and we can now do the add as just a + b. It's much easier to understand what's going on. You can mix and match this syntax with the intrinsic calls as well: for simple operations like adds, you can use the OpenCL syntax; for more complicated things that you can't directly represent with C operators, you can use the intrinsics.
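As a rough sketch of that OpenCL-style vector syntax, here is a matrix add written with the Clang/GCC vector extension that backs it. The function name and layout are illustrative, not taken from the session's slides; built with AVX enabled on a Sandy Bridge or later machine, each `+` maps to a single 256-bit add, and elsewhere the compiler lowers it to narrower operations.

```cpp
#include <cstddef>

// An 8-wide float vector type via the Clang/GCC vector extension,
// which is what the OpenCL-style syntax described above uses.
typedef float float8 __attribute__((vector_size(32)));

// Add two arrays whose length is a multiple of 8 (and 32-byte aligned).
// The intrinsic spelling of the loop body would be roughly
// _mm256_add_ps(_mm256_load_ps(...), _mm256_load_ps(...)).
void matrix_add(const float *a, const float *b, float *c, std::size_t n) {
    const float8 *va = (const float8 *)a;
    const float8 *vb = (const float8 *)b;
    float8 *vc = (float8 *)c;
    for (std::size_t i = 0; i < n / 8; ++i)
        vc[i] = va[i] + vb[i];   // plain '+' instead of an intrinsic call
}
```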

How much performance does this buy you? I took that same example and implemented an SSE version and a scalar version that just adds one element at a time. This graph shows the speedup relative to the scalar version. The SSE implementation speeds up more than three times, so that's good. The AVX version running on the same machine speeds up by about 4.5 times, so it's a significant step up. This is kind of a small example; in your own code, you may see either larger or smaller speedups, but I think it's good enough to illustrate the point that there's potentially a really significant win here.

If you want to use AVX, there's one complication: you will want your code to run on older Macs as well. What that means is you need to implement not only the AVX-optimized version, but an SSE version, and then check at runtime whether the hardware you're running on supports AVX; if not, fall back and use the SSE version. To do that, put the AVX code into separate source files, compile just those source files with AVX enabled, and then insert a runtime check. You can use the sysctl interface to check if AVX is supported and then switch between those two implementations. In the cases where it works, it's really a significant win for floating-point-intensive applications. One last performance area, and that's the integrated ARM assembler.
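The fallback scheme above can be sketched as a small runtime-dispatch pattern. In a real app, the AVX routine would live in a source file compiled with AVX enabled, and the feature check would query the OS (for example, via sysctl on OS X); here both are stand-ins so only the structure is shown, and all names are hypothetical.

```cpp
#include <cstddef>

static bool has_avx() { return false; }   // stand-in for the real sysctl check

static void add_sse(const float *a, const float *b, float *c, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) c[i] = a[i] + b[i];   // SSE version (scalar stand-in)
}
static void add_avx(const float *a, const float *b, float *c, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) c[i] = a[i] + b[i];   // AVX version (scalar stand-in)
}

typedef void (*add_fn)(const float *, const float *, float *, std::size_t);

// Pick the best implementation once, based on the runtime check.
static add_fn choose_add() { return has_avx() ? add_avx : add_sse; }
```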

Previous versions of the LLVM compiler for ARM would write out an assembly file and then invoke the system assembler to read that file back in, parse it, translate it to machine code, and write out an object file. There's a lot of overhead involved in that, and we wanted to cut out that overhead so the compiler runs faster. Now, in version 4, the compiler can just generate that object file directly, so it runs faster. This is a feature we've had on Intel for a while now, and I'm really pleased to say it's supported for ARM as well.

Besides the build time improvements, a nice feature of this is that you get better error checking if you have inline assembly: the compiler will look at it, understand it, and report errors to you. The only thing you really have to watch out for is really old ARM assembly; the integrated assembler only supports ARM's modern unified syntax. So if you've got some really old code, you may need to either update the syntax or build with the integrated assembler disabled. But those are rare cases, and for almost all of you, the result of this will just be that your builds run faster than they used to.

So those are three performance improvements that I wanted to highlight for you today, and now I'd like to invite Doug Gregor to come up and talk about some new language features. So new language features. There's a bunch of new language features available in the Apple LLVM Compiler version 4. So we've talked a bit this week about the new Objective-C language features like numeric literals, array literals, dictionary literals, and so on. These are all available in the Apple LLVM Compiler version 4. But we've talked a lot about Objective-C. I don't want to talk about Objective-C anymore. I want to talk about C++.

Thank you. There are always some people in the room like that. So there's a new C++ standard, the C++11 standard. It was ratified late last year by the committee, and it had been 13 years since the original 1998 C++ standard. Since then, there have been a lot of improvements in the C++ language and the standard library that comes with it. Now, there are lots of improvements, but we can sort of pigeonhole them into a couple of different areas. Many of the improvements are targeted at simplifying common idioms: the things that C++ programmers do day in and day out, C++11 tries to make easier, more concise, and safer. C++11 is also about performance, so it introduces new features to help you make your code run faster. And finally, one of the areas C++ has traditionally been very good at is describing expressive, efficient software libraries, and there are new features in the C++11 language to make it easier to build those libraries. Now, all of this is done with a very strong focus on backward compatibility, so that your apps that build as C++98 apps today will compile and run the same way as C++11 apps. Last year, we actually started introducing C++11 support with version 3 of the Apple LLVM compiler.

And since then, we've been hard at work implementing more and more of this very large C++11 standard. With version 4 of the Apple LLVM compiler, we've introduced many of the highly requested features, such as generalized initializer lists, generalized constant expressions (constexpr), and lambda expressions. We're gonna talk about a couple of these features, skewing toward the ones that can improve productivity, which is what we find most important for people. But before we dig into the language side of the equation, we're gonna talk about the C++11 standard library.

Now, you can't have a great C++11 solution unless you have both a C++11 compiler to support the language and a C++11 standard library to provide all the library facilities. There are two general reasons for this. The first is that the language and the library in C++ are somewhat intertwined.

So some of the features we think of as language features, like initializer lists, are really only truly useful when you also have the C++11 standard library backing them up. Things like move semantics you can use without a C++11 library, but you're not gonna see the benefits unless your library has been optimized for move semantics.

But the C++11 library isn't just a language support library. There are actually a lot of great new features in C++11, like smart pointers. The smart pointers help automate memory management for your C++ objects, kind of like ARC does for your Objective-C objects, by describing your ownership: you have a unique pointer to describe unique ownership, a shared pointer for shared ownership, and a weak pointer to weakly point at a shared object. These can help you eliminate the trouble of matching up all of your news and deletes. There are also regular expressions, concurrency support through threading and atomics, and a lot of other features to help you build good C++ apps in C++11.

Now, here at Apple, we've been working on a new implementation of the C++11 standard library. It's part of the LLVM project, and it's called libc++. This is a ground-up implementation intended for complete C++11 standards conformance. It also gracefully degrades for C++98 and C++03, so you can migrate to the libc++ standard library with your C++98 app first, and then switch to compiling for C++11 to get a complete C++11 solution.

Now, libc++ is a new library. We've been working on it for several years, and because it's a ground-up implementation, we got to focus on performance. In focusing on performance, we've created better data structures that require less memory and are faster, and highly tuned algorithms, so that your uses of the standard library will be more efficient and take less memory. libc++ is a replacement for the current C++ standard library that you're probably all using. That old standard library is from GCC, and it hasn't been touched in years, so we want you to migrate forward to the new libc++ library. The good news is that migrating to libc++ is very, very easy, because libc++ as a standard library implementation and C++11 as a new standard are both largely backward compatible. For most applications, you just switch to libc++ and switch to C++11 at the same time, and your code will just compile and build. The only thing that we've seen trip up migration to libc++ is the TR1 library components.

These were some transitional components that the committee had worked on, and these features have actually moved into C++11. So if you're using them from the old GCC standard library, you just need to update your includes to remove the tr1/ prefix, update your namespace references to eliminate the tr1 there, and use the new C++11 versions of these components. So that's the new libc++ standard library. Let's talk about some language features and how they can actually help you be more productive when working in C++.
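Concretely, a TR1 migration is usually just the two mechanical changes described above; a minimal sketch (the map's contents are made up for illustration):

```cpp
// Before, with GCC's old library and the transitional TR1 components:
//
//   #include <tr1/unordered_map>
//   std::tr1::unordered_map<std::string, int> counts;
//
// After, with libc++ / C++11: drop the tr1/ prefix and the tr1 namespace.
#include <string>
#include <unordered_map>

std::unordered_map<std::string, int> counts = {{"a", 1}, {"b", 2}};
```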

So one of the major problems that C++ has is verbosity. So this is one of my least favorite loops because all I want to do is walk over the views in this vector. Simple operation, and yet I have these two lines of for loop header just to do that. One of the big problems there is I have to write this iterator type. I write hundreds of these iterator types during the day, and they're all almost identical. Okay, so C++11 introduces this wonderful new feature of auto.

What auto does is tell the compiler, "Please fill in the type for me. I don't want to write it." And this is not magic. It's just a placeholder, and when the compiler sees it, it will go look at the initializer. So it looks at the initializer, views.begin(), computes the type, which it had to do anyway, and then fills it in for you. Great feature; saves a ton of typing. Now, by default, when you use an auto variable, you're gonna get a copy of whatever you're initializing with. Almost always, that's the right answer. It's certainly the right answer here, 'cause that's what we're iterating over. But if you actually want a reference instead of a copy, that's perfectly fine. Just use auto &.
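The iterator loop above can be sketched like this, with std::string standing in for the NSView* elements (the function and names are illustrative, not from the session's slides):

```cpp
#include <string>
#include <vector>

int count_nonempty(const std::vector<std::string> &views) {
    int n = 0;
    // C++98 forced you to spell out std::vector<std::string>::const_iterator;
    // auto tells the compiler to deduce that type from the initializer.
    for (auto i = views.begin(); i != views.end(); ++i)
        if (!i->empty()) ++n;
    return n;
}
```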

Our loop is better, but our loop is not perfect. C++11 also has the for-range loop. The for-range loop should look very familiar to anyone who's used Objective-C, because it's almost identical to a fast enumeration loop. We're using the colon rather than in, but otherwise the syntax is the same. We just say: for every view in the container views, do something.

Notice that we're using auto here as well, because, hey, why bother to write NSView* again? We know that when we're walking over views, we've got a view. Now, the for-range loop works with any of the standard library containers, and it also works with anything else that has begin and end functions that return iterators. So if you have containers that follow the conventions of the standard library, the for-range loop just works with those, too.
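That begin/end convention means even a minimal hand-rolled type works with the for-range loop; a sketch, with all names made up for illustration:

```cpp
// A tiny half-open integer range whose begin()/end() return iterators.
struct Range {
    int lo, hi;
    struct Iter {
        int v;
        int operator*() const { return v; }
        Iter &operator++() { ++v; return *this; }
        bool operator!=(const Iter &o) const { return v != o.v; }
    };
    Iter begin() const { return Iter{lo}; }
    Iter end() const { return Iter{hi}; }
};

int sum_range(Range r) {
    int total = 0;
    for (auto v : r)   // same shape as fast enumeration: for (NSView *v in views)
        total += v;
    return total;
}
```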

Now, we love using auto. However, in Objective-C++, we have to be a little bit careful with auto, because sometimes using auto without care can change the meaning of your program in surprising ways. Here's a really simple method. All we're gonna do is grab a view out of an NSArray of views. Well, typing NSView is far too much typing for me. I want to use auto instead.

This program is still correct. It'll still compile, but we've actually changed the meaning here. The reason is that objectAtIndex: doesn't actually return an NSView. What it returns is id. So when the compiler sees an id on the right-hand side, it decides that view must have the type id. This is correct behavior; your program will probably still do exactly the same thing. But what you've actually done here is lose a little bit of the static type information that you used to have in your program. Because uses of view used to know that it was an NSView, you would get warnings if you accidentally converted that NSView over to an NSString, or if you tried to send a message that isn't available on NSViews. You no longer get those warnings, 'cause you can do anything with an id. You'll also notice that other features like code completion won't be as precise anymore, because that static type information is gone.

So the moral here is that auto is really great for C++ types; it can save you a lot of typing. But when you're using auto in Objective-C++ code and you're working with methods that may return id, or you're not sure, don't use auto. Just use the type that you've used before, so you keep that static type information. So let's talk about some new features.

Initializing containers in C++, and formerly in Objective-C, is actually really, really painful. Here we have to declare the vector, and then we call push_back over and over again, copying and pasting these push_back calls. It's really, really tedious. There must be a better way. In C++11, there is a better way, and that's to use the generalized initializer list syntax.

So obviously, this is far cleaner than calling push_back 100 times. It's also nice because it generalizes existing facilities of the language. Initializer lists have always been there: if we were initializing an array, we could have used an initializer list. Now we can do it for arbitrary containers as well. Of course, this works for any of the containers in the C++ standard library, so you can initialize a map with key-value pairs very easily with these nested initializer lists. The caveat here, of course, is that you need a C++11 standard library for this to work. So if you want to use generalized initializer lists, you're gonna need to use libc++, which provides the library facilities to make the initializer list feature work nicely.

Now, it turns out you can use initializer lists in other places than just after the equal sign when initializing a variable. For example, you can use them in an insert call into a map: where usually you'd have to write this big, long std::make_pair call to put the key and the value into a pair, you can just pass them, almost like a tuple of a key and value, using an initializer list. So hold that thought of the tuple, because it turns out that with generalized initializer lists and with the new features of the C++11 standard library, we've made it very, very easy to do something cool, which is multiple return values with a very nice, clean syntax. Here our minmax function wants to return two values. How do you do that in C++? Well, you don't.
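The initializer-list forms described above can be sketched like this (the contents of the containers are made up for illustration):

```cpp
#include <map>
#include <string>
#include <vector>

// One braced list replaces a run of push_back calls, and nested braces
// handle the key/value pairs of a map.
std::vector<int> primes = {2, 3, 5, 7, 11};
std::map<std::string, int> ages = {{"alice", 30}, {"bob", 25}};

// Braced lists also work in other positions, such as an insert call,
// in place of the big, long std::make_pair spelling.
void add_entry(std::map<std::string, int> &m) {
    m.insert({"carol", 41});   // instead of m.insert(std::make_pair(...))
}
```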

You return one value and pass the other by reference. But in C++11, you can return a tuple. It can be a tuple of any number of elements, and they can have any types, whatever you specify. And when you actually initialize that tuple, you just use an initializer list. Very clean, very natural syntax for returning multiple values. But what about using that? Because our caller probably doesn't actually want to get a tuple back. Well, there's really nice syntax for this as well.

And that's the tie library facility. You can tie up a couple of local variables and assign the result of the minmax to that tie, to say: get the minimum, get the maximum, as separate variables, and continue on with our code. It's a very, very nice, natural way to do multiple return values in C++11. Let's talk about lambdas. This is a very, very, very highly requested feature.
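The minmax discussion can be sketched as follows. This uses std::pair rather than the tuple of the session's example so it stays strictly C++11-safe, but the shape is the same: a braced initializer list builds the return value, and std::tie unpacks it at the call site (function names are illustrative).

```cpp
#include <tuple>
#include <utility>
#include <vector>

std::pair<int, int> my_minmax(const std::vector<int> &v) {
    int lo = v.front(), hi = v.front();
    for (int x : v) {
        if (x < lo) lo = x;
        if (x > hi) hi = x;
    }
    return {lo, hi};   // initializer list builds the return value
}

int spread(const std::vector<int> &v) {
    int lo, hi;
    std::tie(lo, hi) = my_minmax(v);   // unpack into separate locals
    return hi - lo;
}
```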

Lambdas are a nice way of packaging up code, in this case a comparison operation, and passing it off to another operation. This syntax should look very familiar to you if you've used Objective-C, because it's very, very similar to blocks. Lambdas and blocks are two different features that try to address the same problem: both of them essentially create anonymous functions, or closures, that we can pass off to another routine to do some operation. Now, you see some syntactic differences here. We're using the open and close square brackets rather than the caret, and the return type for a lambda is in a different place: it's this optional arrow type at the end. Other than that, the syntax and the usages are very similar.
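A lambda as a comparison passed to std::sort, mirroring the sorting example above; the comparison criterion here (string length) is illustrative. The trailing "-> bool" return type is optional when it can be deduced; it's shown because the session calls out its position.

```cpp
#include <algorithm>
#include <string>
#include <vector>

void sort_by_length(std::vector<std::string> &strings) {
    std::sort(strings.begin(), strings.end(),
              [](const std::string &a, const std::string &b) -> bool {
                  return a.size() < b.size();
              });
}
```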

However, the semantics, the behavior of blocks and lambda expressions are different in some subtle ways. So to understand those, we're gonna go a little bit into how blocks work, and then we can contrast those with how lambdas work so you can make an educated decision about what is best for your code.

So let's take our sorting example, and let's assume that this strings array that we were sorting is really, really, really big. It's so big that we really want to sort it on a different thread, and then later on we'll join up with our thread to actually look at the results. We'll do that with Grand Central Dispatch: we call dispatch_async, pass it a block, and inside that block we do the sorting of strings.

This code looks good, but it actually has an error in it. The reason for the error is that this block is referring to the vector strings. Now, what does that mean? Well, when you refer to something that's outside of the block from inside the block, that variable has to be captured, meaning it has to be pulled into the block. By default, a block is going to capture by value, which means that we make a copy of that variable's data structure and put that copy inside the block. Now, immediately, this sounds like both a performance and a correctness problem. First, we're making a copy of something that we already said was large. But second, if we sort the copy, we're never gonna see the results anywhere.

So that's not the reason for the compiler error. The reason for the compiler error is that because this is so dangerous to be working modifying a copy of something, by default, by-value captures are captured as const. You can't modify them. So the compiler error you're actually getting here is that you can't sort a constant vector of strings. Now, by-value captures within Objective-C++ for blocks also have this nice other behavior that when you capture something that's actually an Objective-C object, it's going to retain that object automatically for you, so you know it doesn't go away.

What we really want here is to fix this problem: make this code correct and eliminate the compiler error. The way we do that is with __block variables. If we mark the strings variable as __block, it tells any blocks that refer to that variable to capture it by reference, so that all of those blocks and the main thread of control all see the same version of this strings vector.

Now, one thing you may not know is that by-reference captured variables actually do get copied. They get copied once, at the time when Grand Central Dispatch takes in your block and does a block copy on it. The reason for that copy is that when we first come into this function, strings is allocated on the stack. When we pass a block off to Grand Central Dispatch, we have no assurance that that stack is still going to be there when we call the block. So what the compiler does, for safety reasons, is make one copy of that strings variable out on the heap. Then everyone that refers to that strings variable points at the heap copy, which can live longer than the stack frame. Altogether, this means that our application now actually works, because we're all referring to the same strings variable, we're allowed to sort it because it's mutable, and we will see the results later on once we sync up with GCD.

That's how blocks work. Let's look at the same issues for lambdas. We're gonna change our block's caret to open and close square brackets to introduce a lambda. The first thing we're gonna see is a compiler error, and this is for a different reason than our first compiler error with blocks. See, the open and close brackets introduce a lambda, but what they're really stating is that this is an empty lambda capture list.

And because it's an empty capture list, it means this lambda is not allowed to refer to any variables from the outside scope that would have to be captured. So when it tries to refer to strings, the error the compiler's gonna give you is: "Well, I can't capture strings, 'cause I'm not allowed to capture anything."

So let's tell it to capture strings. We can do that by putting the name of the variable we want to capture in between those square brackets. Now we're capturing the variable strings, and we're going to get another error. This one is similar to the first blocks error, because when you just name the variable strings here in the square brackets, it's going to capture by copy.

Same semantics as blocks, for the most part: we're going to copy the variable into the lambda, and it's going to be const, so we can't modify it. Again, that's our compiler error: we can't sort something that's const. The one difference between the by-value captures of lambdas and those of blocks is that Objective-C objects captured by value are not retained when creating the lambda. So they're retained by blocks, but not by lambdas.

Again, let's fix this example. Let's actually make it work with lambdas, and we're going to do that with a by-reference capture. Same solution as we saw with the blocks, except different syntax and slightly different semantics. So the by-reference capture, you put an ampersand before the variable name within the capture list to say, I want to capture this by reference. Variables captured by reference in a lambda are never copied. You're actually getting a direct reference to the element that is on the stack.

Good for performance. There's no copies. Possibly dangerous, however, because if your stack frame disappears before someone else is done with that lambda, they have essentially a dangling pointer into your stack, and they're very likely to crash. So it's more powerful, there's more performance, but you have to be very careful with this. Our example is okay because we're doing the dispatch_sync at the end, but watch your by-reference captures very carefully if you're going to be using lambdas. Now, lambdas allow you to write these explicit capture lists. Sometimes that's good because we want to be very careful about what we capture. We want to know what it is. However, there are also these capture defaults. So you can put just a lone ampersand to say capture everything by reference that I refer to, or a lone equal sign to say capture everything by value that I refer to. And these can compose, if you want to do something really fancy like capture everything by value except strings, which we want to capture by reference. You can do that with the lambda capture syntax. Now, some of you have probably noticed I've been very cavalier about using dispatch_async, a blocks-based API, with these C++11 lambdas. Well, we intentionally made this work, because on our platform there are quite a number of blocks-based APIs, and they're very, very important. So in implementing C++11 in Objective-C++, we provide a conversion from the lambda expressions that you write in your code over to a block. The way this works, it's almost entirely seamless. So here we're calling dispatch_async, providing a lambda. The compiler sees that what is actually expected is a block. And so long as the parameter types of the lambda and the block and, of course, the return types match up, we'll just do that implicit conversion for you to turn the lambda into a block.

Now, this is only available in Objective-C++. And the reason for that is that when we do this conversion from a lambda to a block, we create a new block, and that block's memory needs to be managed. And so we return it as retained/autoreleased, using the Objective-C memory subsystem, to give you back a block that's going to live long enough and, of course, can be block-copied if it needs to live longer.

Of course, the ARC optimizer will kick in here to eliminate this retain/autorelease pair in many, many cases, but it still can be there in certain rare cases. So let's step back and look at blocks and lambdas at a high level. They're intended to provide essentially the same feature: these anonymous functions, or closures. They both allow capture by copy.

They both allow capture by reference. Blocks skew toward safety, so they'll make sure that they retain Objective-C objects when you do a by-value capture, so those things don't go away. They make sure to copy your by-reference captures out to the heap so that you can't have dangling references within your blocks.

However, beyond that, when using our APIs, you have flexibility to use whatever is right for your problem, because both blocks and lambdas work with all of the blocks-based APIs on our system. So what should you use? Should you use blocks? Should you use lambdas? Our recommendation is, generally, for Objective-C++ code you should use blocks. The reason: they're succinct, they're well understood by the Objective-C community, and they skew towards safety. So you're more likely to have a correct application if you just use the blocks, because it's going to make sure that your objects are retained and your references don't dangle. Now, there are good reasons to use lambdas as well. You might be in a portable C++11 code base that already has lambdas. In that case, just continue using lambdas. You may want really precise control over how captures work: you want to write out that explicit capture list and say how each thing works. Lambdas are really good for that sort of thing. And finally, if you're working with C++ and templates, the compiler can do a lot more optimization when you're passing a lambda expression to a C++ template than it can with a block passed to that same template, even though both will work. So with that, let's talk about the deployment story for C++11. Of course, C++11 is available now. The language features work anywhere. For the libc++ library, to get the full C++11 experience, you can build your apps and deploy them back to iOS 5 and OS X Lion.

So we think C++11 is a really great revision of the C++ language and library, so we're moving the defaults towards C++11. In Xcode 4.4, you'll see that any new projects start with C++11 as the language. In Xcode 4.5, we're going to move the default to libc++, so that by default, new projects going forward will use the full C++11 language. Of course, there's absolutely no reason to wait until you start your next project to do C++11. You can just go in and change your build settings so that you're using GNU++11, which is the most compatible C++11 setting for the language, and use the libc++ C++11 library to give you a complete, very useful C++11 solution for your apps. If you're a command-line kind of person, the flags are up here to use GNU++11 and libc++.

And with that, I'm going to turn it over to Ted Kremenek, who's gonna talk about finding bugs using the Clang compiler and static analyzer. So just to recap, we've been talking about different ways that the compiler can improve the quality of your applications. We talked about performance enhancements to the compiler that we continue to make with every release to make your app run faster, so we encourage you to go and recompile your app. And then throughout the week, we've been talking about various language improvements we've been making. We've talked extensively about Objective-C; we just talked about C++. And this is a trajectory we'll continue going forward.

So in this last part of the talk, let's talk about ways that compiler-related tools can proactively find issues that are latent in your code. The motivation is very simple. We have this very large ecosystem of applications, and users know quality. Quality really matters. And they can be very vocal about what they think about your application, and quality can mean the difference between them buying your app and just steering clear of it, you know, forever. So it leaves a very lasting impression in the App Store. And we want to help with this by making the compiler-related tools more proactive in finding issues before you ship. And this is something we care very much about and will continue to improve going forward.

So how can we do this? One very obvious way that we've been doing for a long time is finding issues using warnings, right? Warnings are awesome. As you're coding, they find bugs early. And when we engineered the Apple LLVM compiler from the ground up, we cared very much about building a system that could give you very clear, explanatory diagnostics that could work in the presence of macros, templates, just all the natural stuff that you use in your code. And of course, we provide fix-its in many cases, when the compiler has fairly good intelligence about discovering, "Well, maybe this is what you meant. This is the likely fix." We've continued this trend in Xcode 4.4. We've added much deeper static analysis, which is the counterpart of the compiler. And we've added various compiler warnings and static analyzer checks that focus a lot on memory safety, security, and just general correctness. And we've also improved the ways in which you can control warnings, so you can tailor them more to your individual workflow, because some warnings make more sense for some code bases than others. So we're gonna touch on all these topics.

So for those of you not familiar with the differences between the compiler and the static analyzer, let's just step back and take a look at what they're intended to do. If you think about trading CPU time for finding issues, we can think of the compiler as being on one end of the spectrum, where it's fast, it's always available, and it's so fast we use it for code completion and we provide live issues within Xcode. And so it's really great. It gives you instant feedback. But because it has to be so fast, it inherently does a more shallow analysis of your code. And so we kind of look at it as it can find some really important bugs, but they're the ones that are just easy to see, you know, kind of with the naked eye. On the other end of the spectrum, we have the static source code analyzer. It's basically compiler analysis, like, on steroids. We trade CPU time for finding more issues, doing a deeper analysis of your code. And so we look at it as, like, kind of finding those bugs that are more subtle, harder to see. You won't necessarily want to run this analysis all the time, but you're encouraged, obviously, to proactively do so often. Moreover, we can engineer intelligence into the static analyzer that we just can't do with the compiler. We can teach it about common APIs like Grand Central Dispatch, Core Foundation, and so forth, things we just can't really do very well in the compiler.

So to kind of understand the differences between these tools on a more concrete level, let's look at one kind of bug that they both can find. And this is just a standard bug: using an uninitialized variable. So if you run this code example through the compiler, this is the exact output you'll see on the command line.

You'll get a diagnostic saying that you're using this variable uninitialized when you return it. It actually points to the line of code. We show the warning flag on the right side of the diagnostic, and we even have a suggested fix-it here, you know, that you can silence the problem by initializing the variable.

Now, if we make this code just slightly more complicated, right, and this is a little bit contrived, we're initializing a variable by reference, but we've all seen or written code that looks something like this, right? The compiler actually can't find the problem in this code, because finding it takes a little bit deeper analysis.

We run the compiler over it, you're going to get no issues at all. You run the static analyzer over it in Xcode 4.4, we get this very rich diagnostic saying, like, hey, you used this variable uninitialized when you took this particular path through your code, where you took the false branch in bar and called foo and skipped over initializing the object. So very clear, very explanatory.

So let's look at the improvements in the static analyzer. In Xcode 4.3, which is our current shipping version of Xcode, the analyzer works by essentially analyzing each file one at a time. And if we think of these as the functions in the file, it just iterates over those functions and does a deep code analysis.

And here we have these functions, foo and bar, and in previous versions of the analyzer, they're analyzed completely separately. So if we looked at the body of foo, we just see that this value is being written to some memory location. But if we don't really know anything about that memory location, there's no reason to assume there's anything wrong. Similarly, if we analyze bar without any understanding of what the function foo does, there doesn't look like there's anything wrong here. So if you just imagine that the body of foo wasn't here, you yourself wouldn't really guess that you were doing anything wrong, right? So, like, there's obviously a problem here, but this is essentially the limited reasoning boundary that was in previous versions of the analyzer.

In Xcode 4.4, we've enhanced the analyzer to look much more deeply into these cross-function dependencies. So in this case, we start by analyzing the function bar, because we can see on the call graph that it lies at the top, and then when we contextually see it calling foo, it will flag a null dereference. This one change greatly amplifies the power of the static analyzer to find a lot more issues in your code, and we think that you're absolutely going to love this enhancement.

Now, the one caveat is the analysis is still restricted to looking at individual files. And so if you had a function baz that did essentially the same thing as the code above, we won't flag a warning yet. So let's talk about some of these new warnings. This is just a quick tour, a highlight of a few things. The idea is to give you a flavor of where we're trying to go, how we think these warnings are going to be really helpful for you, and how you can best interpret them. And again, the focus is going to be on memory safety and security, something that matters a lot to the quality of your applications. As was mentioned in the kickoff, one major enhancement we've made to the compiler this year is format string checking for Objective-C methods. This is actually a huge deal. When we added this to the compiler, we found a large number of questionable issues.

Here, this is a very common API, stringWithFormat:. Here we're constructing an NSString using a variety of other arguments. If you run the compiler over this, you're going to get a very precise diagnostic, and it says that you're trying to construct this NSString using another NSString, but you used the wrong format specifier. This is a really common mistake. What's happening here is we used %s instead of %@, and so we're saying, in effect, pretend that you're passing in a C string instead of an NSString. And so stringWithFormat: will then interpret that object reference as if you were pointing to an array of characters that was null-terminated. In the best case, you're just gonna get a crash, right? Because you're just gonna start reading those bytes, and maybe you get a segfault. In the worst case, you're gonna construct some really garbage string, and then this is gonna get consumed somewhere else. Maybe it gets stored into some, like, key-value pair on iCloud. I mean, it's very easy to see that this garbage can just propagate throughout the rest of your application because of a one-character bug, right? So subtle things like this really do matter. Let's talk about some basic Unix APIs, things that we all might subtly be using throughout our code. Here's our old friend memcpy, which does raw copies of bytes, and here we're just copying one NSRect to another. This code, if you just glance at it, looks perfectly fine. We have two pointers, a source and a destination, and we're computing the number of bytes we want to copy. The problem here is that there's a one-character bug in this example.

And the compiler will now actually warn that when you're computing the number of bytes, it looks like you computed the size of the pointer as opposed to the size of the thing it was meant to be pointing to, right? A really, really subtle issue. And the fix is very simple: you just add a dereference inside the sizeof. So it's little things like this that make the big difference between a correct program and a buffer overrun or a partially initialized object.

Let's look at a cousin of memcpy, and that's memset. Basically the same thing: instead of copying bytes from some other location, you're just setting a range of bytes to the same value. Here it looks like we're doing everything right. We're computing the size of the destination object correctly. But if this is C++ code, this also might be subtly incorrect.

What if y is some object that has virtual methods? That means there's a vtable pointer in that object. And so when you're memsetting here, you're not just zero-initializing its fields, you're also nuking that vtable pointer. And the compiler will actually warn about this now, too. And we've heard from a very credible game developer that when they turned this warning on, they found a large number of bugs in their code base. So it might seem a little contrived, but this stuff happens all the time. Let's talk about some static analyzer issues. I mentioned before how the static analyzer has a deep knowledge of many of our framework APIs. Core Foundation is something that's obviously used extensively on OS X and iOS. We have containers like CFArray, CFDictionary, and so on.

One issue we've seen is with portability between OS X and iOS, where we can have the difference between 32-bit and 64-bit. And there are cases where people want to use their containers to store things other than objects. And they'll do clever things like, here we have an array of ints. Ints on a 32-bit architecture have the same size as a pointer value, so let's just do some tricks here so that we can stuff an array of ints into a CFSet. It turns out, if you run this on a 64-bit machine, this is complete nonsense. This is garbage, right? And you're gonna get completely unpredictable results. We strongly discourage writing this kind of code. It's a portability issue. It's just very brittle. It's not really how the APIs were intended to be used. And on a 64-bit architecture, this could be a security or a correctness issue.

So let's talk about the last bit on memory safety. And these are, again, just the highlights of some of the things that we have added to both the compiler and the static analyzer. And that's malloc and free. malloc and free are these low-level memory management APIs that we still use very frequently. ARC does a great job of managing your Objective-C objects, but for malloc and free, you still need to manage that memory yourself.

Now, we have great tools like Instruments and leaks to help you try and find these issues, but those tools are limited in various ways. First, you're only gonna be able to find leaks on code paths that you yourself test, right? And memory leaks can often happen in corner cases, maybe some case that you haven't tested but your users encounter.

And also, it would be great if you could just find these issues proactively without having to do the extensive dynamic analysis later. Oftentimes, we can find these memory leaks using static source code analysis. It's not perfect. You should still use tools like Instruments and so forth, but we think this is gonna greatly enhance your ability to find those problems early. So here's a real example. It's been slightly simplified to fit well on this slide, but essentially it's a case where we're calling malloc and then we're returning early. You run it through the static analyzer in Xcode 4.4, you're going to get a diagnostic like this, where we give an explanation of how the memory was allocated. We even see that the pointer was checked for validity. And then, on an early return path, we say that the memory was leaked, right? This is a very localized problem, right? And you wouldn't have been able to find it with a dynamic analysis tool unless you were able to test the case where that call failed, right? So very subtle, but you can find these issues proactively using the static analyzer. So let's look at a real bug. This is something slightly more complicated, and it's essentially the same bug that I just showed you, where we're returning early and we're failing to deallocate something.

The details of this code don't really matter. The thing to keep in mind is at the very top, we're calling this function parsePgArray, which returns allocated memory by reference. So if you looked at this in isolation, you wouldn't necessarily know that that was the case. But because we have this new cross-function analysis, the analyzer looks into the implementation of this function. And if you go to the navigation bar that shows up in the editor, you'll get a complete abstract call stack.

And you can dive in and see what's going on. If I click on that, I go into the body of parsePgArray. I see the allocation from malloc. And at the bottom, we see that memory getting assigned by reference to the return value. So this allows you to dive in, right into the code, to see what's going on. Really, really powerful.

So we just highlighted a few things. We've added enhancements for finding issues with Cocoa Touch APIs, Grand Central Dispatch, portability issues, and a whole bunch of the low-level Unix APIs. And we'll continue to improve the static analyzer and the compiler in these areas and others going forward. So with all this new awesomeness, how do you go and tailor it to your workflow?

We talked about in the kickoff about how controlling warnings is really important. I like to divide that for the compiler into two different approaches. The standard approach you have right now is the additive approach to warnings. If you imagine that this gray bar is the set of all possible compiler warnings, there's some set of warnings that are enabled by default. These are the ones that we, in our divine wisdom, have decided that you should always see, or at least see by default.

Then there are all these, like, magical flags that exist to turn on additional compiler warnings. We have the misnamed -Wall, which doesn't actually turn on all compiler warnings. This really just came about from historical expectations. You know, people started building their code with -Wall and pairing it with -Werror, which turns all warnings into errors. And then whenever compiler authors added new warnings to -Wall that those people didn't like, those authors got yelled at. So we have to add new warnings to -Wall with a lot of care. And then there are other esoteric flags like -pedantic, and then just a whole smorgasbord of other compiler flags that you can pass to turn on additional warnings.

So the problem is, how do you know what all these flags are? I mean, it's a real discoverability issue. We have to document these, of course, but let's say you just want to use all the warnings that are available that make sense for you. Should you go and look at the release notes? And I mean, there's hundreds of compiler warnings. So there's an inverse approach, where you start with all warnings, and you turn off the ones you don't want.

With the new -Weverything flag, you can truly turn on all the warnings that are in the compiler. Now, the one caveat is... well, there are two caveats. One, if you upgrade compilers, you should just basically expect your code not to compile anymore if you're passing -Werror. This is really the intended workflow, right? The idea is that you immediately draw attention to the new issues, and you can decide either to fix them or disable those warnings.

And you can do so simply by passing -Wno- followed by the warning name to the compiler. So it's a very powerful workflow. And as we saw earlier in the diagnostics, the compiler will tell you what the warning flag is when you see the warning, right? So there's no discoverability problem. You know exactly how to turn the warnings off. Now, the other caveat about this approach, besides the -Werror issue, is that there are many warnings, and some warnings are more like coding-style conventions. For example, we have the new default synthesis feature in Xcode 4.4. There is a warning for transitioning from older code that's using explicit synthesis to default synthesis.

And some people would still like to be warned about not explicitly synthesizing their properties. That is a coding-style warning. If you don't want to see it, which shouldn't apply to most of you, just turn the warning off. So that's the one caveat: there are a lot of different warnings here. Just cherry-pick the ones that make sense for you. Beyond just the command line, you continue to have the power to control compiler warnings within a source file. Many of you may not be aware, but we have these preprocessor pragmas, which allow you to conditionally suppress a warning within a scope of text, or even promote a particular warning to an error. This is documented on the LLVM open source web page, but the syntax is pretty simple. As you can see right here, you just say, map this warning to an error, and just give the warning name. Finally, we have improved the ability to control analyzer issues. We've expanded the Xcode build settings to allow you to turn on and off various checkers.

You can do this on a per-project and target level, and we will continue to enhance this workflow going forward. So to summarize, we're very passionate that a better compiler means better applications. We care very much about the quality of what you are producing and putting on the App Store. And your users obviously do too.

So in the Apple LLVM compiler 4.0, we've improved the performance of the compiler and of the code it generates. The compiler itself is faster. Great language improvements: it reduces boilerplate, it lets you write more elegant code that's less error-prone. And we've improved its ability to find more issues early, with improved compiler warnings and vastly improved static analysis. There are a lot of places you can look for more information: Michael Jurewitz, our developer tools evangelist; the open source web pages, which have a lot of information; and there are tips on using the static analyzer on the static analyzer open source page.

And we are directly available on the developer forums. So we're happy to meet with you at the labs. But we post on the developer forums all the time. So if you have questions or concerns, you can reach many of us directly there. And with that, we hope you enjoy the rest of the conference. Thank you.