Video hosted by Apple at devstreaming-cdn.apple.com


WWDC12 • Session 410

What's New in LLVM

Developer Tools • iOS, OS X • 58:05

The Apple LLVM compiler has evolved at a staggering pace, providing remarkably quick compile times and generating lightning-fast code. Learn about the latest LLVM technologies from improvements in the Static Analyzer, to better performance and optimizations, to the latest advancements in C++ support.

Speakers: Doug Gregor, Ted Kremenek, Bob Wilson

Unlisted on Apple Developer site

Downloads from Apple

HD Video (219.1 MB)

Transcript

This transcript was generated using Whisper; it may contain transcription errors.

Good morning. My name's Bob Wilson, and together with a few of my colleagues, I'm going to talk with you today about some great new features in the LLVM compiler. So a better compiler is really significant, because we believe that a better compiler can help you make better apps. And there are three ways we can do that: performance, productivity, and quality. Performance is really important because you want your apps to run quickly and be responsive. The compiler can help you by optimizing your code to run quickly.

The compiler can help your productivity by making sure that it itself runs quickly, so that you don't have to spend a lot of time sitting around waiting for your builds. And we can also help your productivity by having the compiler support new language features that make it easier for you to write correct and efficient code.

And finally, quality is essential. Nobody wants to use a buggy app. Every time that you build your project in Xcode, the compiler is going to look for suspicious things, and it will warn you before you even hit those problems. You can also periodically run the static analyzer, which is based on the compiler, to find more subtle problems in your code. So these are three areas where the compiler can help you.

Xcode 4.4 has a new version of the Apple LLVM compiler, version 4.0, and we're going to talk about some of those new features here today. Before we do that, I want to say a few things about LLVM. LLVM, represented by the dragon logo here, is an open source project to develop compilers and other low-level development tools. One of the distinctive things about LLVM is its modular architecture, where it's built upon a set of reusable components that can be combined together in interesting ways. One of these components is the Clang parser, which is a unified C, C++, and Objective-C parser. It runs really fast, and it has great, informative diagnostics to help you understand the problems that the compiler finds. We have a sophisticated optimizer to make your code run fast. And we have code generators, assemblers, and disassemblers for both Intel and ARM processors. Together with those components, we also have some runtime libraries: libc++, which we're going to talk about in a few minutes, is our C++ runtime library, and we also have a lower-level compiler support runtime.

We can then combine all of those different components and runtime libraries to create standalone tools. The LLVM compiler is the most obvious one. It builds upon all of those components shown below it, and you'll see it whenever you build your project in Xcode. And of course, you can also run the compiler directly from the command line. The Xcode static analyzer is built on top of the Clang parser, combined with sophisticated analysis to understand what your code is really doing.

And this is also integrated into Xcode when you run the analyze step, with a nice way to actually visualize and see the results of that analysis. And finally, we have the LLDB debugger. The debugger uses the Clang parser, the code generator, and the disassembler, and forms the basis of the Xcode debugger.

The Clang parser itself is integrated directly into Xcode to support features like indexing, code completion, live warnings, and fixits. So I hope this gives you a feeling for how the architecture of LLVM really lets us do some powerful things. We can reuse these components, we get consistency across the tools, and we get this really powerful integration into the Xcode IDE.

Now, if you're familiar with our tools, you may notice one thing that's missing from this list here, and that's LLVM GCC. LLVM GCC is a hybrid: a combination of the old GCC 4.2 parser with an older version of the LLVM optimizer and code generator. This was a stopgap solution that we created to ease the transition from GCC over to LLVM. We've finished that transition now, and we're no longer adding any new features or even fixing bugs in LLVM GCC. So it's really time for everyone to stop using it. And in fact, we're going to be removing it from Xcode in the near future. So please move to the LLVM compiler.

So with that context, I'd like to go on and talk about performance, which is the first of the focus areas we're going to cover today. As a compiler team, performance is near and dear to our hearts. Some of us stay up at night thinking about how we can squeeze a few more cycles out of some important code.

And we also work really, really hard to make sure that the compiler itself runs quickly. There's no way I can talk about all the performance enhancements we've added recently, so I've just picked three of them to highlight today. Let's start by talking about the ARC optimizer. ARC, or Automatic Reference Counting, is this great feature we introduced last year to make you more productive. If you use ARC, you no longer have to manually insert retains and releases into your Objective-C code.

Let's look at how this works. I've put up a simple example here of a debug logging method. It takes a string as an argument and just writes it out with NSLog. From the perspective of ARC, the important thing here is that from the time you enter the method to the point where you call NSLog, we have to make sure that nothing is going to release the string out from under us. So the compiler will just automatically insert a retain for that string s at the beginning of the method and then release it at the end. And in fact, if you build an unoptimized debug version of this code, that's basically what you'll get.

Now, as an experienced programmer, you may look at this and say, that's kind of silly. There's really no way that the string could be released out from under you, because nothing happens before you call NSLog. If you run an optimized build of this code, the compiler will automatically run the ARC optimizer. It analyzes the code, realizes this is silly, and optimizes away that retain and release. This kind of optimization has been in LLVM from day one with ARC. So let's talk about something new this year. We have a lot of improvements to the ARC optimizer, and unfortunately I can't cover all of them, so again I've just picked one example: the case of nested retains. To illustrate that, let's look at a more complicated version of that debug logging method. Again, it takes a string, and so ARC will insert a retain for the string s. Now we copy from s to a new string t, a reference to the same string, and ARC will also insert a retain for t.

Now, at this point, well, I should say this is a contrived example. I'm sure you wouldn't actually write this code. But in a more complicated method, there may be reasons why you would insert a copy like that, and it can also happen when the compiler inlines functions. So the ARC optimizer can look at this and realize: I've retained the same string twice. As long as a string is retained at least once, there's no way it can be released, so that second retain is completely unnecessary, and the ARC optimizer, now in version 4 of the compiler, will optimize it away.

What about the other retain? Can we get rid of that as well? In this case, unfortunately, we can't. If you look inside the method, it's now a little more complicated. It's checking a logging-enabled flag, and then before it calls NSLog, it's sending an increment-log-count message. It's highly unlikely that the increment-log-count method is gonna release the string.

But the compiler can't know that, right? It's hypothetically possible that the string that was passed in is in a global variable. Somebody could make an increment-log-count implementation that would actually release that string. And the compiler has to be correct. It has to be safe. So we can't get rid of the retain in this case. But that doesn't mean we can't optimize it. The compiler is smart enough to realize it can move the retain and release inside that conditional, so that if logging is disabled, you don't pay the cost of the retain.

So that's just one example of a lot of nice improvements to the ARC optimizer that should make your code run faster than with previous versions of the compiler. Let's go on and talk about another performance area, which is support for Intel's AVX. AVX is a vector processing extension. It's available in recent Intel processors, the Sandy Bridge and Ivy Bridge processors.

So basically, most of the new Macs from the last year or so will support AVX, and it's really powerful. Compared to the older SSE, the AVX vectors are twice as wide: 256 bits instead of 128. It's not gonna help every application, though. It's entirely a floating-point feature.

So it's a good fit for applications that are floating-point intensive and typically works best where there's a high ratio of computation to memory bandwidth, where you're actually doing a lot of computation. So if you think that your application may fit that profile, it might be worth taking a look at optimizing for AVX.

And let's look at an example to show how you can do that. This is just a simple matrix addition. And the code up here is gonna step through two input matrices. It's gonna load eight elements at a time into two vectors, A and B. It's gonna add them to a new vector C and then store it back out to the result.

You can see it's using some intrinsic function calls here. Intel has defined a standard set of intrinsics, and LLVM 4.0 supports the full set of those standard intrinsics, which is really great. It's powerful because it gives you full access to the functionality of AVX and a lot of control. But it can be a little bit hard to read, especially if you're not familiar with all those intrinsics. So a nice option is to use the OpenCL vector syntax. You can directly dereference pointers to those vectors and just load them up, and we can now do the add as just a + b. It's much easier to understand what's going on. You can mix and match this syntax with the intrinsic calls as well: for simple operations like adds, you can use the OpenCL syntax; for more complicated things that you can't directly represent with C operators, you can use the intrinsics.
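As a rough sketch of that OpenCL-style vector syntax, here is a matrix add written with the Clang/GCC vector extension that backs it. The function name and layout are illustrative, not taken from the session's slides; built with AVX enabled on a Sandy Bridge or later machine, each `+` maps to a single 256-bit add, and elsewhere the compiler lowers it to narrower operations.

```cpp
#include <cstddef>

// An 8-wide float vector type via the Clang/GCC vector extension,
// which is what the OpenCL-style syntax described above uses.
typedef float float8 __attribute__((vector_size(32)));

// Add two arrays whose length is a multiple of 8 (and 32-byte aligned).
// The intrinsic spelling of the loop body would be roughly
// _mm256_add_ps(_mm256_load_ps(...), _mm256_load_ps(...)).
void matrix_add(const float *a, const float *b, float *c, std::size_t n) {
    const float8 *va = (const float8 *)a;
    const float8 *vb = (const float8 *)b;
    float8 *vc = (float8 *)c;
    for (std::size_t i = 0; i < n / 8; ++i)
        vc[i] = va[i] + vb[i];   // plain '+' instead of an intrinsic call
}
```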

How much performance does this buy you? I took that same example and implemented an SSE version and a scalar version that just adds one element at a time. This graph shows the speedup relative to the scalar version. The SSE implementation speeds up more than three times, so that's good. The AVX version running on the same machine speeds up by about 4.5 times, so it's a significant step up. This is kind of a small example; in your own code, you may see either larger or smaller speedups, but I think it's good enough to illustrate the point that there's potentially a really significant win here.

If you want to use AVX, there's one complication: you will want your code to run on older Macs as well. What that means is you need to implement not only the AVX-optimized version, but an SSE version, and then check at runtime whether the hardware you're running on supports AVX; if not, fall back and use the SSE version. To do that, put the AVX code into separate source files, compile just those source files with AVX enabled, and then insert a runtime check. You can use the sysctl interface to check if AVX is supported and then switch between those two implementations. In the cases where it works, it's really a significant win for floating-point-intensive applications. One last performance area, and that's the integrated ARM assembler.
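The fallback scheme above can be sketched as a small runtime-dispatch pattern. In a real app, the AVX routine would live in a source file compiled with AVX enabled, and the feature check would query the OS (for example, via sysctl on OS X); here both are stand-ins so only the structure is shown, and all names are hypothetical.

```cpp
#include <cstddef>

static bool has_avx() { return false; }   // stand-in for the real sysctl check

static void add_sse(const float *a, const float *b, float *c, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) c[i] = a[i] + b[i];   // SSE version (scalar stand-in)
}
static void add_avx(const float *a, const float *b, float *c, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) c[i] = a[i] + b[i];   // AVX version (scalar stand-in)
}

typedef void (*add_fn)(const float *, const float *, float *, std::size_t);

// Pick the best implementation once, based on the runtime check.
static add_fn choose_add() { return has_avx() ? add_avx : add_sse; }
```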

Previous versions of the LLVM compiler for ARM would write out an assembly file and then invoke the system assembler to read that file back in, parse it, translate it to machine code, and write out an object file. There's a lot of overhead involved in that, and we wanted to cut out that overhead so the compiler runs faster. Now, in version 4, the compiler can just generate that object file directly, so it runs faster. This is a feature we've had on Intel for a while now, and I'm really pleased to say it's supported for ARM as well.

Besides the build time improvements, a nice feature of this is that you get better error checking if you have inline assembly: the compiler will look at it, understand it, and report errors to you. The only thing you really have to watch out for is really old ARM assembly; the integrated assembler only supports ARM's modern unified syntax. So if you've got some really old code, you may need to either update the syntax or build with the integrated assembler disabled. But those are rare cases, and for almost all of you, the result of this will just be that your builds run faster than they used to.

So those are three performance improvements that I wanted to highlight for you today, and now I'd like to invite Doug Gregor to come up and talk about some new language features. So new language features. There's a bunch of new language features available in the Apple LLVM Compiler version 4. So we've talked a bit this week about the new Objective-C language features like numeric literals, array literals, dictionary literals, and so on. These are all available in the Apple LLVM Compiler version 4. But we've talked a lot about Objective-C. I don't want to talk about Objective-C anymore. I want to talk about C++.

Thank you. There are always some people in the room like that. So there's a new C++ standard, the C++11 standard. It was ratified late last year by the committee, and it had been 13 years since the original 1998 C++ standard. Since then, there have been a lot of improvements in the C++ language and the standard library that comes with it. Now, there are lots of improvements, but we can sort of pigeonhole them into a couple of different areas. Many of the improvements are targeted at simplifying common idioms: the things that C++ programmers do day in and day out, C++11 tries to make easier, more concise, and safer. C++11 is also about performance, so it introduces new features to help you make your code run faster. And finally, one of the areas C++ has traditionally been very good at is describing expressive, efficient software libraries, and there are new features in the C++11 language to make it easier to build those libraries. Now, all of this is done with a very strong focus on backward compatibility, so that your apps that build as C++98 apps today will compile and run the same way as C++11 apps. Last year, we actually started introducing C++11 support with version 3 of the Apple LLVM compiler.

And since then, we've been hard at work implementing more and more of this very large C++11 standard. With version 4 of the Apple LLVM compiler, we've introduced many of the highly requested features, such as generalized initializer lists, generalized constant expressions (constexpr), and lambda expressions. We're gonna talk about a couple of these features, skewing toward the ones that can improve productivity, which is what we find most important for people. But before we dig into the language side of the equation, we're gonna talk about the C++11 standard library.

Now, you can't have a great C++11 solution unless you have both a C++11 compiler to support the language and a C++11 standard library to provide all the library facilities. There are two general reasons for this. The first is that the language and the library in C++ are somewhat intertwined.

So some of the features we think of as language features, like initializer lists, are really only truly useful when you also have the C++11 standard library backing them up. Things like move semantics you can use without a C++11 library, but you're not gonna see the benefits unless your library has been optimized for move semantics.

But the C++11 library isn't just a language support library. There are actually a lot of great new features in C++11, like smart pointers. The smart pointers help automate memory management for your C++ objects, kind of like ARC does for your Objective-C objects, by describing your ownership: you have a unique pointer to describe unique ownership, a shared pointer for shared ownership, and a weak pointer to weakly point at a shared object. These can help you eliminate the trouble of matching up all of your news and deletes. There are also regular expressions, concurrency support through threading and atomics, and a lot of other features to help you build good C++ apps in C++11.

Now, here at Apple, we've been working on a new implementation of the C++11 standard library. It's part of the LLVM project, and it's called libc++. This is a ground-up implementation intended for complete C++11 standards conformance. It also gracefully degrades for C++98 and C++03, so you can migrate to the libc++ standard library with your C++98 app first, and then switch to compiling for C++11 to get a complete C++11 solution.

Now, libc++ is a new library. We've been working on it for several years, and because it's a ground-up implementation, we got to focus on performance. In focusing on performance, we've created better data structures that require less memory and are faster, and highly tuned algorithms, so that your uses of the standard library will be more efficient and take less memory. libc++ is a replacement for the current C++ standard library that you're probably all using. That old standard library is from GCC, and it hasn't been touched in years, so we want you to migrate forward to the new libc++ library. The good news is that migrating to libc++ is very, very easy, because libc++ as a standard library implementation and C++11 as a new standard are both largely backward compatible. For most applications, you just switch to libc++ and switch to C++11 at the same time, and your code will just compile and build. The only thing that we've seen trip up migration to libc++ is the TR1 library components.

These were some transitional components that the committee had worked on, and these features have actually moved into C++11. So if you're using them from the old GCC standard library, you just need to update your includes to remove the tr1/ prefix, update your namespace references to eliminate the tr1 there, and use the new C++11 versions of these components. So that's the new libc++ standard library. Let's talk about some language features and how they can actually help you be more productive when working in C++.
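Concretely, a TR1 migration is usually just the two mechanical changes described above; a minimal sketch (the map's contents are made up for illustration):

```cpp
// Before, with GCC's old library and the transitional TR1 components:
//
//   #include <tr1/unordered_map>
//   std::tr1::unordered_map<std::string, int> counts;
//
// After, with libc++ / C++11: drop the tr1/ prefix and the tr1 namespace.
#include <string>
#include <unordered_map>

std::unordered_map<std::string, int> counts = {{"a", 1}, {"b", 2}};
```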

So one of the major problems that C++ has is verbosity. So this is one of my least favorite loops because all I want to do is walk over the views in this vector. Simple operation, and yet I have these two lines of for loop header just to do that. One of the big problems there is I have to write this iterator type. I write hundreds of these iterator types during the day, and they're all almost identical. Okay, so C++11 introduces this wonderful new feature of auto.

What auto does is tell the compiler, "Please fill in the type for me. I don't want to write it." And this is not magic. It's just a placeholder, and when the compiler sees it, it will go look at the initializer. So it looks at the initializer, views.begin(), computes the type, which it had to do anyway, and then fills it in for you. Great feature; saves a ton of typing. Now, by default, when you use an auto variable, you're gonna get a copy of whatever you're initializing with. Almost always, that's the right answer. It's certainly the right answer here, 'cause that's what we're iterating over. But if you actually want a reference instead of a copy, that's perfectly fine. Just use auto &.
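The iterator loop above can be sketched like this, with std::string standing in for the NSView* elements (the function and names are illustrative, not from the session's slides):

```cpp
#include <string>
#include <vector>

int count_nonempty(const std::vector<std::string> &views) {
    int n = 0;
    // C++98 forced you to spell out std::vector<std::string>::const_iterator;
    // auto tells the compiler to deduce that type from the initializer.
    for (auto i = views.begin(); i != views.end(); ++i)
        if (!i->empty()) ++n;
    return n;
}
```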

Our loop is better, but our loop is not perfect. C++11 also has the for-range loop. The for-range loop should look very familiar to anyone who's used Objective-C, because it's almost identical to a fast enumeration loop. We're using the colon rather than in, but otherwise the syntax is the same. We just say: for every view in the container views, do something.

Notice that we're using auto here as well, because, hey, why bother to write NSView* again? We know that when we're walking over views, we've got a view. Now, the for-range loop works with any of the standard library containers, and it also works with anything else that has begin and end functions that return iterators. So if you have containers that follow the conventions of the standard library, the for-range loop just works with those, too.
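That begin/end convention means even a minimal hand-rolled type works with the for-range loop; a sketch, with all names made up for illustration:

```cpp
// A tiny half-open integer range whose begin()/end() return iterators.
struct Range {
    int lo, hi;
    struct Iter {
        int v;
        int operator*() const { return v; }
        Iter &operator++() { ++v; return *this; }
        bool operator!=(const Iter &o) const { return v != o.v; }
    };
    Iter begin() const { return Iter{lo}; }
    Iter end() const { return Iter{hi}; }
};

int sum_range(Range r) {
    int total = 0;
    for (auto v : r)   // same shape as fast enumeration: for (NSView *v in views)
        total += v;
    return total;
}
```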

Now, we love using auto. However, in Objective-C++, we have to be a little bit careful with auto, because sometimes using auto without care can change the meaning of your program in surprising ways. Here's a really simple method. All we're gonna do is grab a view out of an NSArray of views. Well, typing NSView is far too much typing for me. I want to use auto instead.

This program is still correct. It'll still compile, but we've actually changed the meaning here. The reason is that objectAtIndex: doesn't actually return an NSView. What it returns is id. So when the compiler sees an id on the right-hand side, it decides that view must have the type id. This is correct behavior; your program will probably still do exactly the same thing. But what you've actually done here is lose a little bit of the static type information that you used to have in your program. Because uses of view used to know that it was an NSView, you would get warnings if you accidentally converted that NSView over to an NSString, or if you tried to send a message that isn't available on NSViews. You no longer get those warnings, 'cause you can do anything with an id. You'll also notice that other features like code completion won't be as precise anymore, because that static type information is gone.

So the moral here is that auto is really great for C++ types; it can save you a lot of typing. But when you're using auto in Objective-C++ code and you're working with methods that may return id, or you're not sure, don't use auto. Just use the type that you've used before, so you keep that static type information. So let's talk about some new features.

Initializing containers in C++, and formerly in Objective-C, is actually really, really painful. Here we have to declare the vector, and then we call push_back over and over again, copying and pasting these push_back calls. It's really, really tedious. There must be a better way. In C++11, there is a better way, and that's to use the generalized initializer list syntax.

So obviously, this is far cleaner than calling push_back 100 times. It's also nice because it generalizes existing facilities of the language. Initializer lists have always been there: if we were initializing an array, we could have used an initializer list. Now we can do it for arbitrary containers as well. Of course, this works for any of the containers in the C++ standard library, so you can initialize a map with key-value pairs very easily with these nested initializer lists. The caveat here, of course, is that you need a C++11 standard library for this to work. So if you want to use generalized initializer lists, you're gonna need to use libc++, which provides the library facilities to make the initializer list feature work nicely.

Now, it turns out you can use initializer lists in other places than just after the equal sign when initializing a variable. For example, you can use them in an insert call into a map: where usually you'd have to write this big, long std::make_pair call to put the key and the value into a pair, you can just pass them, almost like a tuple of a key and value, using an initializer list. So hold that thought of the tuple, because it turns out that with generalized initializer lists and with the new features of the C++11 standard library, we've made it very, very easy to do something cool, which is multiple return values with a very nice, clean syntax. Here our minmax function wants to return two values. How do you do that in C++? Well, you don't.
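The initializer-list forms described above can be sketched like this (the contents of the containers are made up for illustration):

```cpp
#include <map>
#include <string>
#include <vector>

// One braced list replaces a run of push_back calls, and nested braces
// handle the key/value pairs of a map.
std::vector<int> primes = {2, 3, 5, 7, 11};
std::map<std::string, int> ages = {{"alice", 30}, {"bob", 25}};

// Braced lists also work in other positions, such as an insert call,
// in place of the big, long std::make_pair spelling.
void add_entry(std::map<std::string, int> &m) {
    m.insert({"carol", 41});   // instead of m.insert(std::make_pair(...))
}
```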

You return one value and pass the other by reference. But in C++11, you can return a tuple. It can be a tuple of any number of elements, and they can have any types, whatever you specify. And when you actually initialize that tuple, you just use an initializer list. Very clean, very natural syntax for returning multiple values. But what about using that? Because our caller probably doesn't actually want to get a tuple back. Well, there's really nice syntax for this as well.

And that's the tie library facility. You can tie up a couple of local variables and assign the result of the minmax to that tie, to say: get the minimum, get the maximum, as separate variables, and continue on with our code. It's a very, very nice, natural way to do multiple return values in C++11. Let's talk about lambdas. This is a very, very, very highly requested feature.
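The minmax discussion can be sketched as follows. This uses std::pair rather than the tuple of the session's example so it stays strictly C++11-safe, but the shape is the same: a braced initializer list builds the return value, and std::tie unpacks it at the call site (function names are illustrative).

```cpp
#include <tuple>
#include <utility>
#include <vector>

std::pair<int, int> my_minmax(const std::vector<int> &v) {
    int lo = v.front(), hi = v.front();
    for (int x : v) {
        if (x < lo) lo = x;
        if (x > hi) hi = x;
    }
    return {lo, hi};   // initializer list builds the return value
}

int spread(const std::vector<int> &v) {
    int lo, hi;
    std::tie(lo, hi) = my_minmax(v);   // unpack into separate locals
    return hi - lo;
}
```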

Lambdas are a nice way of packaging up code, in this case a comparison operation, and passing it off to another operation. This syntax should look very familiar to you if you've used Objective-C, because it's very, very similar to blocks. Lambdas and blocks are two different features that try to address the same problem: both of them essentially create anonymous functions, or closures, that we can pass off to another routine to do some operation. Now, you see some syntactic differences here. We're using the open and close square brackets rather than the caret, and the return type for a lambda is in a different place: it's this optional arrow type at the end. Other than that, the syntax and the usages are very similar.
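A lambda as a comparison passed to std::sort, mirroring the sorting example above; the comparison criterion here (string length) is illustrative. The trailing "-> bool" return type is optional when it can be deduced; it's shown because the session calls out its position.

```cpp
#include <algorithm>
#include <string>
#include <vector>

void sort_by_length(std::vector<std::string> &strings) {
    std::sort(strings.begin(), strings.end(),
              [](const std::string &a, const std::string &b) -> bool {
                  return a.size() < b.size();
              });
}
```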

However, the semantics, the behavior of blocks and lambda expressions are different in some subtle ways. So to understand those, we're gonna go a little bit into how blocks work, and then we can contrast those with how lambdas work so you can make an educated decision about what is best for your code.

So let's take our sorting example, and let's assume that this strings array that we were sorting is really, really, really big. It's so big that we really want to sort it on a different thread, and then later on we'll join up with our thread to actually look at the results. We'll do that with Grand Central Dispatch: we call dispatch_async, pass it a block, and inside that block we do the sorting of strings.

This code looks good, but it actually has an error in it. The reason for the error is that this block is referring to the vector strings. Now, what does that mean? Well, when you refer to something that's outside of the block from inside the block, that variable has to be captured, meaning it has to be pulled into the block. By default, a block is going to capture by value, which means that we make a copy of that variable's data structure and put that copy inside the block. Now, immediately, this sounds like both a performance and a correctness problem. First, we're making a copy of something that we already said was large. But second, if we sort the copy, we're never gonna see the results anywhere.

So that's not the reason for the compiler error. The reason for the compiler error is that because this is so dangerous to be working modifying a copy of something, by default, by-value captures are captured as const. You can't modify them. So the compiler error you're actually getting here is that you can't sort a constant vector of strings. Now, by-value captures within Objective-C++ for blocks also have this nice other behavior that when you capture something that's actually an Objective-C object, it's going to retain that object automatically for you, so you know it doesn't go away.

What we really want here is to fix this problem: make this code correct and eliminate the compiler error. The way we do that is with __block variables. If we mark the strings variable as __block, it tells any blocks that refer to that variable to capture it by reference, so that all of those blocks and the main thread of control all see the same version of this strings vector.

Now, one thing you may not know is that by-reference captured variables actually do get copied. They get copied once, at the time when Grand Central Dispatch takes in your block and does a block copy on it. The reason for that copy is that when we first come into this function, strings is allocated on the stack. When we pass a block off to Grand Central Dispatch, we have no assurance that that stack is still going to be there when we call the block. So what the compiler does, for safety reasons, is make one copy of that strings variable out on the heap. Then everyone that refers to that strings variable points at the heap copy, which can live longer than the stack frame. Altogether, this means that our application now actually works, because we're all referring to the same strings variable, we're allowed to sort it because it's mutable, and we will see the results later on once we sync up with GCD.

That's how blocks work. Let's look at the same issues for lambdas. We're gonna change our block's caret to open and close square brackets to introduce a lambda. The first thing we're gonna see is a compiler error, and this is for a different reason than our first compiler error with blocks. See, the open and close brackets introduce a lambda, but what they're really stating is that this is an empty lambda capture list.

And because it's an empty capture list, it means this lambda is not allowed to refer to any variables from the outside scope that would have to be captured. So when it tries to refer to strings, the error the compiler's gonna give you is: "Well, I can't capture strings, 'cause I'm not allowed to capture anything."

So let's tell it to capture strings. We can do that by putting the name of the variable we want to capture in between those square brackets. Now we're capturing the variable strings, and we're going to get another error. This one is similar to the first blocks error, because when you just name the variable strings here in the square brackets, it's going to capture by copy.

Same semantics as blocks, for the most part: we're going to copy the variable into the lambda, and it's going to be const, so we can't modify it. Again, that's our compiler error: we can't sort something that's const. The one difference between the by-value captures of lambdas and those of blocks is that Objective-C objects captured by value are not retained when creating the lambda. So they're retained by blocks, but not by lambdas.

Again, let's fix this example. Let's actually make it work with lambdas, and we're going to do that with a by-reference capture. Same solution as we saw with the blocks, except different syntax and slightly different semantics. So the by-reference capture, you put an ampersand before the variable name within the capture list to say, I want to capture this by reference. Variables captured by reference in a lambda are never copied. You're actually getting a direct reference to the element that is on the stack.

Good for performance. There's no copies. Possibly dangerous, however, because if your stack frame disappears before someone else is done with that lambda, they have essentially a dangling pointer into your stack, and they're very likely to crash. So it's more powerful, there's more performance, but you have to be very careful with this. Our example is okay because we're doing the dispatch_sync at the end, but watch your by-reference captures very carefully if you're going to be using lambdas. Now, lambdas allow you to write these explicit capture lists. Sometimes that's good because we want to be very careful about what we capture. We want to know what it is. However, there are also these capture defaults. So you can put just a lone ampersand to say capture everything by reference that I refer to, or a lone equal sign to say capture everything by value that I refer to. And these can compose, if you want to do something really fancy like capture everything by value except strings, which we want to capture by reference. You can do that with the lambda capture syntax. Now, some of you have probably noticed I've been very cavalier about using dispatch_async, a blocks-based API, with these C++11 lambdas. Well, we intentionally made this work, because on our platform there are quite a number of blocks-based APIs, and they're very, very important. So in implementing C++11 in Objective-C++, we provide a conversion from the lambda expressions that you write in your code over to a block. The way this works, it's almost entirely seamless. So here we're calling dispatch_async, providing a lambda. The compiler sees that what is actually expected is a block. And so long as the parameter types of the lambda and the block and, of course, the return types match up, we'll just do that implicit conversion for you to turn the lambda into a block.

Now, this is only available in Objective-C++. And the reason for that is that when we do this conversion from a lambda to a block, we create a new block, and that block's memory needs to be managed. And so we return it as retained/autoreleased, using the Objective-C memory subsystem, to give you back a block that's going to live long enough and, of course, can be block-copied if it needs to live longer.

Of course, the ARC optimizer will kick in here to eliminate this retain/autorelease pair in many, many cases, but it still can be there in certain rare cases. So let's step back and look at blocks and lambdas at a high level. They're intended to provide essentially the same feature: these anonymous functions, or closures. They both allow capture by copy.

They both allow capture by reference. Blocks skew toward safety, so they'll make sure that they retain Objective-C objects when you do a by-value capture, so those things don't go away. They make sure to copy your by-reference captures out to the heap so that you can't have dangling references within your blocks.

However, beyond that, when using our APIs, you have flexibility to use whatever is right for your problem, because both blocks and lambdas work with all of the blocks-based APIs on our system. So what should you use? Should you use blocks? Should you use lambdas? Our recommendation is, generally, for Objective-C++ code you should use blocks. The reason: they're succinct, they're well understood by the Objective-C community, and they skew towards safety. So you're more likely to have a correct application if you just use the blocks, because it's going to make sure that your objects are retained and your references don't dangle. Now, there are good reasons to use lambdas as well. You might be in a portable C++11 code base that already has lambdas. In that case, just continue using lambdas. You may want really precise control over how captures work: you want to write out that explicit capture list and say how each thing works. Lambdas are really good for that sort of thing. And finally, if you're working with C++ and templates, the compiler can do a lot more optimization when you're passing a lambda expression to a C++ template than it can with a block passed to that same template, even though both will work. So with that, let's talk about the deployment story for C++11. Of course, C++11 is available now. The language features work anywhere. For the libc++ library, to get the full C++11 experience, you can build your apps and deploy them back to iOS 5 and OS X Lion.

So we think C++11 is a really great revision of the C++ language and library, so we're moving the defaults towards C++11. In Xcode 4.4, you'll see that any new projects start with C++11 as the language. In Xcode 4.5, we're going to move the default to libc++, so that by default, new projects going forward will use the full C++11 language. Of course, there's absolutely no reason to wait until you start your next project to do C++11. You can just go in and change your build settings so that you're using GNU++11, which is the most compatible C++11 setting for the language, and use the libc++ C++11 library to give you a complete, very useful C++11 solution for your apps. If you're a command-line kind of person, the flags are up here to use GNU++11 and libc++.

And with that, I'm going to turn it over to Ted Kremenek, who's gonna talk about finding bugs using the Clang compiler and static analyzer. So just to recap, we've been talking about different ways that the compiler can improve the quality of your applications. We talked about performance enhancements to the compiler that we continue to make with every release to make your app run faster, so we encourage you to go and recompile your app. And then throughout the week, we've been talking about various language improvements we've been making. We've talked extensively about Objective-C; we just talked about C++. And this is a trajectory we'll continue going forward.

So in this last part of the talk, let's talk about ways that compiler-related tools can proactively find issues that are latent in your code. The motivation is very simple. We have this very large ecosystem of applications, and users know quality. Quality really matters. And they can be very vocal about what they think about your application, and quality can mean the difference between them buying your app and just steering clear of it, you know, forever. So it leaves a very lasting impression in the App Store. And we want to help with this by making the compiler-related tools more proactive in finding issues before you ship. And this is something we care very much about and will continue to improve going forward.

So how can we do this? One very obvious way that we've been doing for a long time is finding issues using warnings, right? Warnings are awesome. As you're coding, they find bugs early. And when we engineered the Apple LLVM compiler from the ground up, we cared very much about building a system that could give you very clear, explanatory diagnostics that could work in the presence of macros, templates, just all the natural stuff that you use in your code. And of course, we provide fix-its in many cases, when the compiler has fairly good intelligence about discovering, "Well, maybe this is what you meant. This is the likely fix." We've continued this trend in Xcode 4.4. We've added much deeper static analysis, which is the counterpart of the compiler. And we've added various compiler warnings and static analyzer checks that focus a lot on memory safety, security, and just general correctness. And we've also improved the ways in which you can control warnings, so you can tailor them more to your individual workflow, because some warnings make more sense for some code bases than others. So we're gonna touch on all these topics.

So for those of you not familiar with the differences between the compiler and the static analyzer, let's just step back and take a look at what they're intended to do. If you think about trading CPU time for finding issues, we can think of the compiler as being on one end of the spectrum, where it's fast, it's always available, and it's so fast we use it for code completion and we provide live issues within Xcode. And so it's really great. It gives you instant feedback. But because it has to be so fast, it inherently does a more shallow analysis of your code. And so we kind of look at it as it can find some really important bugs, but they're the ones that are just easy to see, you know, kind of with the naked eye. On the other end of the spectrum, we have the static source code analyzer. It's basically compiler analysis, like, on steroids. We trade CPU time for finding more issues, doing a deeper analysis of your code. And so we look at it as, like, kind of finding those bugs that are more subtle, harder to see. You won't necessarily want to run this analysis all the time, but you're encouraged, obviously, to proactively do so often. Moreover, we can engineer intelligence into the static analyzer that we just can't do with the compiler. We can teach it about common APIs like Grand Central Dispatch, Core Foundation, and so forth, things we just can't really do very well in the compiler.

So to kind of understand the differences between these tools on a more concrete level, let's look at one kind of bug that they both can find. And this is just a standard bug: using an uninitialized variable. So if you run this code example through the compiler, this is the exact output you'll see on the command line.

You'll get a diagnostic saying that you're using this variable uninitialized when you return it. It actually points to the line of code. We show the warning flag on the right side of the diagnostic, and we even have a suggested fix-it here, you know, that you can silence the problem by initializing the variable.

Now, if we make this code just slightly more complicated, right, and this is a little bit contrived, we're initializing a variable by reference, but we've all seen or written code that looks something like this, right? The compiler actually can't find the problem in this code, because finding it takes a little bit deeper analysis.

We run the compiler over it, you're going to get no issues at all. You run the static analyzer over it in Xcode 4.4, we get this very rich diagnostic saying, like, hey, you used this variable uninitialized when you took this particular path through your code, where you took the false branch in bar and called foo and skipped over initializing the object. So very clear, very explanatory.

So let's look at the improvements in the static analyzer. In Xcode 4.3, which is our current shipping version of Xcode, the analyzer works by essentially analyzing each file one at a time. And if we think of these as the functions in the file, it just iterates over those functions and does a deep code analysis.

And here we have these functions, foo and bar, and in previous versions of the analyzer, they're analyzed completely separately. So if we looked at the body of foo, we just see that this value is being written to some memory location. But if we don't really know anything about that memory location, there's no reason to assume there's anything wrong. Similarly, if we analyze bar without any understanding of what the function foo does, there doesn't look like there's anything wrong here. So if you just imagine that the body of foo wasn't here, you yourself wouldn't really guess that you were doing anything wrong, right? So, like, there's obviously a problem here, but this is essentially the limited reasoning boundary that was in previous versions of the analyzer.

In Xcode 4.4, we've enhanced the analyzer to look much more deeply into these cross-function dependencies. So in this case, we start by analyzing the function bar, because we can see on the call graph that it lies at the top, and then when we contextually see it calling foo, it will flag a null dereference. This one change greatly amplifies the power of the static analyzer to find a lot more issues in your code, and we think that you're absolutely going to love this enhancement.

Now, the one caveat is the analysis is still restricted to looking at individual files. And so if you had a function baz that did essentially the same thing as the code above, we won't flag a warning yet. So let's talk about some of these new warnings. This is just a quick tour, a highlight of a few things. The idea is to give you a flavor of where we're trying to go, how we think these warnings are going to be really helpful for you, and how you can best interpret them. And again, the focus is going to be on memory safety and security, something that matters a lot to the quality of your applications. As was mentioned in the kickoff, one major enhancement we've made to the compiler this year is format string checking for Objective-C methods. This is actually a huge deal. When we added this to the compiler, we found a large number of questionable issues.

Here, this is a very common API, stringWithFormat:. Here we're constructing an NSString using a variety of other arguments. If you run the compiler over this, you're going to get a very precise diagnostic, and it says that you're trying to construct this NSString using another NSString, but you used the wrong format specifier. This is a really common mistake. What's happening here is we used %s instead of %@, and so we're saying, in effect, pretend that you're passing in a C string instead of an NSString. And so stringWithFormat: will then interpret that object reference as if you were pointing to an array of characters that was null-terminated. In the best case, you're just gonna get a crash, right? Because you're just gonna start reading those bytes, and maybe you get a segfault. In the worst case, you're gonna construct some really garbage string, and then this is gonna get consumed somewhere else. Maybe it gets stored into some, like, key-value pair on iCloud. I mean, it's very easy to see that this garbage can just propagate throughout the rest of your application because of a one-character bug, right? So subtle things like this really do matter. Let's talk about some basic Unix APIs, things that we all might subtly be using throughout our code. Here's our old friend memcpy, which does raw copies of bytes, and here we're just copying one NSRect to another. This code, if you just glance at it, looks perfectly fine. We have two pointers, a source and a destination, and we're computing the number of bytes we want to copy. The problem here is that there's a one-character bug in this example.

And the compiler will now actually warn that when you're computing the number of bytes, it looks like you computed the size of the pointer as opposed to the size of the thing it was meant to be pointing to, right? A really, really subtle issue. And the fix is very simple: you just add a dereference inside the sizeof. So it's little things like this that make the big difference between a correct program and a buffer overrun or a partially initialized object.

Let's look at a cousin of memcpy, and that's memset. Basically the same thing: instead of copying bytes from some other location, you're just setting a range of bytes to the same value. Here it looks like we're doing everything right. We're computing the size of the destination object correctly. But if this is C++ code, this also might be subtly incorrect.

What if y is some object that has virtual methods? That means there's a vtable pointer in that object. And so when you're memsetting here, you're not just zero-initializing its fields, you're also nuking that vtable pointer. And the compiler will actually warn about this now, too. And we've heard from a very credible game developer that when they turned this warning on, they found a large number of bugs in their code base. So it might seem a little contrived, but this stuff happens all the time. Let's talk about some static analyzer issues. I mentioned before how the static analyzer has a deep knowledge of many of our framework APIs. Core Foundation is something that's obviously used extensively on OS X and iOS. We have containers like CFArray, CFDictionary, and so on.

One issue we've seen is with portability between OS X and iOS, where we can have the difference between 32-bit and 64-bit. And there are cases where people want to use their containers to store things other than objects. And they'll do clever things like, here we have an array of ints. Ints on a 32-bit architecture have the same size as a pointer value, so let's just do some tricks here so that we can stuff an array of ints into a CFSet. It turns out, if you run this on a 64-bit machine, this is complete nonsense. This is garbage, right? And you're gonna get completely unpredictable results. We strongly discourage writing this kind of code. It's a portability issue. It's just very brittle. It's not really how the APIs were intended to be used. And on a 64-bit architecture, this could be a security or a correctness issue.

So let's talk about the last bit on memory safety. And these are, again, just the highlights of some of the things that we have added to both the compiler and the static analyzer. And that's malloc and free. malloc and free are these low-level memory management APIs that we still use very frequently. ARC does a great job of managing your Objective-C objects, but for malloc and free, you still need to manage that memory yourself.

Now, we have great tools like Instruments and leaks to help you try and find these issues, but those tools are limited in various ways. First, you're only gonna be able to find leaks on code paths that you yourself test, right? And memory leaks can often happen in corner cases, maybe some case that you haven't tested but your users encounter.

And also, it would be great if you could just find these issues proactively without having to do the extensive dynamic analysis later. Oftentimes, we can find these memory leaks using static source code analysis. It's not perfect. You should still use tools like Instruments and so forth, but we think this is gonna greatly enhance your ability to find those problems early. So here's a real example. It's been slightly simplified to fit well on this slide, but essentially it's a case where we're calling malloc and then we're returning early. You run it through the static analyzer in Xcode 4.4, you're going to get a diagnostic like this, where we give an explanation of how the memory was allocated. We even see that the pointer was checked for validity. And then, on an early return path, we say that the memory was leaked, right? This is a very localized problem, right? And you wouldn't have been able to find it with a dynamic analysis tool unless you were able to test the case where that call failed, right? So very subtle, but you can find these issues proactively using the static analyzer. So let's look at a real bug. This is something slightly more complicated, and it's essentially the same bug that I just showed you, where we're returning early and we're failing to deallocate something.

The details of this code don't really matter. The thing to keep in mind is at the very top, we're calling this function parsePgArray, which returns allocated memory by reference. So if you looked at this in isolation, you wouldn't necessarily know that that was the case. But because we have this new cross-function analysis, the analyzer looks into the implementation of this function. And if you go to the navigation bar that shows up in the editor, you'll get a complete abstract call stack.

And you can dive in and see what's going on. If I click on that, I go into the body of parsePgArray. I see the allocation from malloc. And at the bottom, we see that memory getting assigned by reference to the return value. So this allows you to dive in, right into the code, to see what's going on. Really, really powerful.

So we just highlighted a few things. We've added enhancements for finding issues with Cocoa Touch APIs, Grand Central Dispatch, portability issues, and a whole bunch of the low-level Unix APIs. And we'll continue to improve the static analyzer and the compiler in these areas and others going forward. So with all this new awesomeness, how do you go and tailor it to your workflow?

We talked about in the kickoff about how controlling warnings is really important. I like to divide that for the compiler into two different approaches. The standard approach you have right now is the additive approach to warnings. If you imagine that this gray bar is the set of all possible compiler warnings, there's some set of warnings that are enabled by default. These are the ones that we, in our divine wisdom, have decided that you should always see, or at least see by default.

Then there are all these, like, magical flags that exist to turn on additional compiler warnings. We have the misnamed -Wall, which doesn't actually turn on all compiler warnings. This really just came about from historical expectations. You know, people started building their code with -Wall and pairing it with -Werror, which turns all warnings into errors. And then whenever compiler authors added new warnings to -Wall that those people didn't like, those authors got yelled at. So we have to add new warnings to -Wall with a lot of care. And then there are other esoteric flags like -pedantic, and then just a whole smorgasbord of other compiler flags that you can pass to turn on additional warnings.

So the problem is, how do you know what all these flags are? I mean, it's a real discoverability issue. We have to document these, of course, but let's say you just want to use all the warnings that are available that make sense for you. Should you go and look at the release notes? And I mean, there's hundreds of compiler warnings. So there's an inverse approach, where you start with all warnings, and you turn off the ones you don't want.

With the new -Weverything flag, you can truly turn on all the warnings that are in the compiler. Now, the one caveat is... well, there are two caveats. One, if you upgrade compilers, you should just basically expect your code not to compile anymore if you're passing -Werror. This is really the intended workflow, right? The idea is that you immediately draw attention to the new issues, and you can decide either to fix them or disable those warnings.

And you can do so simply by passing -Wno- followed by the warning name to the compiler. So it's a very powerful workflow. And as we saw earlier in the diagnostics, the compiler will tell you what the warning flag is when you see the warning, right? So there's no discoverability problem. You know exactly how to turn the warnings off. Now, the other caveat about this approach, besides the -Werror issue, is that there are many warnings, and some warnings are more like coding-style conventions. For example, we have the new default synthesis feature in Xcode 4.4. There is a warning for transitioning from older code that's using explicit synthesis to default synthesis.

And some people would still like to be warned about not explicitly synthesizing their properties. That is a coding-style warning. If you don't want to see it, which shouldn't apply to most of you, just turn the warning off. So that's the one caveat: there are a lot of different warnings here. Just cherry-pick the ones that make sense for you. Beyond just the command line, you continue to have the power to control compiler warnings within a source file. Many of you may not be aware, but we have these preprocessor pragmas, which allow you to conditionally suppress a warning within a scope of text, or even promote a particular warning to an error. This is documented on the LLVM open source web page, but the syntax is pretty simple. As you can see right here, you just say, map this warning to an error, and just give the warning name. Finally, we have improved the ability to control analyzer issues. We've expanded the Xcode build settings to allow you to turn on and off various checkers.

You can do this on a per-project and target level, and we will continue to enhance this workflow going forward. So to summarize, we're very passionate that a better compiler means better applications. We care very much about the quality of what you are producing and putting on the App Store. And your users obviously do too.

So in the Apple LLVM compiler 4.0, we've improved the performance of the compiler and of the code it generates. The compiler itself is faster. Great language improvements: it reduces boilerplate, it lets you write more elegant code that's less error-prone. And we've improved its ability to find more issues early, with improved compiler warnings and vastly improved static analysis. There are a lot of places you can look for more information: Michael Jurewitz, our developer tools evangelist; the open source web pages, which have a lot of information; and there are tips on using the static analyzer on the static analyzer open source page.

And we are directly available on the developer forums. So we're happy to meet with you at the labs. But we post on the developer forums all the time. So if you have questions or concerns, you can reach many of us directly there. And with that, we hope you enjoy the rest of the conference. Thank you.