WWDC09 • Session 419

Objective-C and Garbage Collection Advancements

Mac • 56:21

Objective-C for Mac OS X continues to advance at a rapid pace with the addition of properties, garbage collection, and now with Snow Leopard, blocks and other performance-oriented improvements. Learn to take full advantage of the modern Objective-C runtime. Explore how to utilize garbage collection in conjunction with Grand Central Dispatch to create lightning-fast applications all while the runtime manages memory for you. Get on the path to becoming an expert Objective-C developer.

Speakers: Blaine Garst, Patrick Beard

Unlisted on Apple Developer site

Downloads from Apple

SD Video (94.3 MB)

Transcript

This transcript has potential transcription errors. We are working on an improved version.

[Patrick Beard]

Good afternoon. I'm Patrick Beard. And during the first half of this session I'm going to be talking about advancements we made to the Objective-C and garbage collection runtime systems. Later on, Blaine Garst will be coming up and telling you about Blocks, a major advancement we've made to the C programming language.

So first up, I'm going to talk about performance. And this is the kind of performance that everybody likes. It's for free when you run your programs on Snow Leopard. So before Leopard, there was prebinding. Shared libraries had their own assigned address ranges, which were maintained by update prebinding, and they were all loaded, memory mapped, one at a time.

In Leopard, we added the shared library cache. This cache holds copies of all the prebound shared libraries, and we memory map them all at once. And by doing that, we saved space, we saved time, we're able to share all these libraries across all the processes of a particular architecture.

And at launch time, performance was significantly improved. In Snow Leopard, Objective-C is now part of the shared cache. All of the selectors used by all of the Cocoa frameworks are stored in a unique table, and all the references to them are fixed up in all the shared libraries. That amounts to about 700 K of shared data in a 64-bit process. That sharing allows processes to launch even faster. But we also sped up message send.

The Objective-C 64-bit compiler now knows about a set of 16 selectors, and it speeds up the calls to those by using a hybrid virtual function table. So there's a few of the selectors in that list, you know, the common ones you might expect. And we only use the top 16 for various important reasons. The speedup that you get by doing this comes at a cost, a space cost. It's an extra word per call site, and of course it's the 16 words for the vtable.

But this set of selectors covers about 70% of all message sends that happen globally in all programs, all Cocoa programs. So we're able to speed up all of these calls by about 40% without increasing space usage too much. So we save space in the previous optimization, and here we're using a little bit of that space to get some speed.

Also, our vtables, unlike C++'s, are non-fragile, just like we have non-fragile instance variables. So there's no C++-style release-to-release binary incompatibility. The vtable slots are actually assigned at runtime, and can actually be unassigned and unused. You'll occasionally find that you have some methods you want to call that would be using a vtable dispatch in retain/release code, but not in GC. I'll tell you more about that in a minute. So to get a feel for the gist of how this stuff works, let's look at a simple program that sums an array of NSNumbers. objectAtIndex: is one of those accelerated selectors.
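
As a rough, minimal sketch of the kind of program being described (the slide's actual code isn't reproduced here, and the helper function name is made up), summing an NSArray of NSNumbers with objectAtIndex: might look like this:

    // Minimal sketch: summing an NSArray of NSNumbers.
    // objectAtIndex: is one of the vtable-accelerated selectors.
    #import <Foundation/Foundation.h>

    static double sumOfNumbers(NSArray *numbers) {
        double sum = 0.0;
        for (NSUInteger i = 0; i < [numbers count]; i++) {
            sum += [[numbers objectAtIndex:i] doubleValue];
        }
        return sum;
    }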

So let's zoom in to the message send. For every unique message send in a program, and across all the frameworks, there is a message structure generated by the compiler. It's static, and it contains two entries: the selector and a function to call to actually do the message send. The first time it's called, the message send function is objc_msgSend_fixup.

This is where the vtable slot assignment is done, and it changes the actual message structure and converts it to the faster one. So all subsequent calls go through that. As I said earlier, this one takes about seven instructions to execute versus the 20 or so of the standard objc_msgSend. So it's fast, because it's a direct map, it doesn't have to do any hashing.

When you're debugging vtable dispatch code, you'll see some changes when you're looking at the symbols. You might encounter objc_msgSend_fixup, which I showed you. objc_msgSend_fixedup will appear when you make a call that was supposed to be a vtable dispatch, but it didn't actually end up using a vtable dispatch. This would be like retain and release in garbage collection mode. The runtime says we don't need to accelerate those, they don't do anything. So when you see these symbols in the debugger, don't worry about it, don't panic, it's just like objc_msgSend.

You can step through your methods as ever. So that's Vtable dispatch. We've also enhanced the CrashReporter so it can give you more information in the unlikely event that your program crashes.

[ Laughter ]

When you call objc_msgSend, the old-style crash logs look like this. You just saw some obscure information, you go look at the registers, you scratch your head, you go, what happened? Well, how do I know from this which method caused the crash? The new one gives you the name of the selector you're trying to send.

[ Applause ]

Thanks a lot, Greg. So I don't think you're going to applaud for this, but the next feature I want to talk about is synthesized instance variables. So in Objective-C 2.0, you can actually synthesize the accessor methods for your properties. And if you're using the 64-bit ABI or the iPhone OS, you can leave the instance variables out.
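
A minimal sketch of what that looks like, assuming the 64-bit ABI or iPhone OS (the class and property names are made up):

    // Minimal sketch: a property with no declared instance variable.
    #import <Foundation/Foundation.h>

    @interface Person : NSObject
    @property (copy) NSString *name;   // no matching ivar declared anywhere
    @end

    @implementation Person
    @synthesize name;                  // the compiler generates the ivar along with the accessors
    @end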

Unfortunately, there was one glitch. The compiler would generate the instance variable, add it to the class, but you couldn't touch it. You couldn't actually reference it by name. So this is fixed in the Snow Leopard compilers; GCC 4.2 and Clang can both handle this now. But if you're writing code that needs to compile for iPhone or the Simulator, just go ahead and declare the instance variable, it works just fine. So now I want to talk about a brand new feature of the Objective-C runtime in Snow Leopard.

It's called associative references. So imagine you're writing an application, and you get the bright idea that whenever a window -- every unique kind of window -- comes to the front, it plays a sound. So you could model this using a category. You just say, well, all my windows will have a property called sound.

And you could use the key window notifications and use that category, see if the sound exists yet, and dynamically, lazily bind it to the window like this. So whenever the window comes to the front, a sound is loaded and it's played. Whenever the window goes to the back or is no longer front, it stops.

So how would you write a category like this? In a language like JavaScript, a dynamic scripting language, you can add instance variables on the fly, it's no problem. But Objective-C doesn't support instance variables being added to objects on the fly. So we have a couple of problems we need to solve to make this work. One is where are you going to store the sound, where do you put it, and then how do you retrieve it later.

So this seems like an ideal candidate for a map table, a global map table stored in a global variable, it's off to the side. And it would use weak keys and strong values. But if you're writing a garbage collected program, which I hope you are, this can cause subtle leaks if the value object points back to the key. It creates an uncollectable cycle.

That's because the map table is global. It's not a problem in this case, because the sound doesn't point to the window. The harder problem is lifetime. How do we find out that the window has gone away or is about to go away so we can clean up after it? In this particular case, we get a notification when windows close. But in general, you might want to try to solve this problem by subclassing. But in this particular scenario we want to actually catch every window type, NSWindows and NSPanels and save panels and things.

And some of them we're not going to be able to subclass. So the solution, it's very simple. We have a new API called associative references. This is a complete implementation of that category. So notice there's a static variable called soundKey. The address of this variable is what's important here. It is used as a globally unique identifier that represents the association between a window and a sound. So to create an association between two objects you call objc_setAssociatedObject.

You pass a pointer to an object, in this case, the window. The key, which is the unique identifier, the sound, which is the value object, and a final parameter that's called the policy. In this case, the property attribute said we're going to retain the sound. So this tells the associative reference system, go ahead, when you create the association, call retain.
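
A minimal sketch of that category, built on the runtime functions being described (the soundKey name and the NSSound value type are assumptions, not taken from the slide):

    // Minimal sketch: a "sound" property on every window via associative references.
    #import <AppKit/AppKit.h>
    #import <objc/runtime.h>

    static char soundKey;   // its address is the globally unique key for the association

    @interface NSWindow (Sound)
    @property (retain) NSSound *sound;
    @end

    @implementation NSWindow (Sound)
    - (NSSound *)sound {
        return objc_getAssociatedObject(self, &soundKey);
    }
    - (void)setSound:(NSSound *)sound {
        // OBJC_ASSOCIATION_RETAIN matches the "retain" property attribute.
        objc_setAssociatedObject(self, &soundKey, sound, OBJC_ASSOCIATION_RETAIN);
    }
    @end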

So here's the API -- a piece of it, there is another function for getting. But you'll notice these policies are defined to correspond to the property attributes assign, retain, and copy. So it's very easy to implement properties in terms of associative references. In fact, I would say it would be a really good idea to be able to synthesize properties.

There's no syntax for this yet. So there you have it. In Snow Leopard, Objective-C programs launch faster, the common selectors are accelerated, and you now have the freedom to associate arbitrary objects with each other. Pretty cool stuff. So now I'm going to tell you about what we've done to improve garbage collection for Snow Leopard.

So in Snow Leopard, or since Leopard, all the Cocoa frameworks and all the Core Foundation frameworks have supported garbage collection. We added a ton more frameworks for you to use, we still support that, we're going to be supporting that from now on. What's more, the 64-bit system applications, System Preferences, the screen saver, and Automator, are now using GC in 64-bit mode. So Apple is embracing 64-bit and GC at the same time, and we would like to encourage everybody to do the same.

In Snow Leopard, garbage collection runs even faster. In Leopard, the garbage collector runs in two modes: full collections, which collect lots of objects, and generational collections, which don't collect as many, but run about twice as fast. The generational collector was designed with short-lived objects in mind. The new thread-local collector is based on the hypothesis that there are going to be lots of threads in Snow Leopard that allocate lots of temporary objects.

This is actually the same rationale behind the Autorelease pool itself. You make an object on a thread, you put it in the autorelease pool, you're done with it, it goes away on that same thread. The thread-local collector provides even faster allocation because it uses a thread local cache.

It also collects faster, because all it has to do is look at one thread's stack and all the objects reachable from that. So the full and generational collectors run in a time that's roughly proportional to the size of your program's heap. The thread-local collector is more efficient.

It only has to collect as many objects as a particular thread is allocating. It also is able to return the memory that it collects directly back to its allocation cache, which is a huge efficiency win. And it's more scalable. As I said, the amount of memory it has to look at is proportional to the thread, the single thread. So each thread-local collector only has to worry about one thread.

The thread-local collector only manages a particular set of objects, a particular size range, between 1 and 96 bytes in 64-bit. But all the classes on Snow Leopard, 68% of them are in that size range, in fact, 99% of the objects we provide, their classes are less than 1 K.

So it simplifies the collector to only support the smaller sizes; it makes it faster. And it also only supports objects that have retain counts of zero, which is what you get when you create Cocoa objects. Core Foundation objects start off with retain count 1, and you have to CFRelease them to get rid of them.

They're off the table for now. So the thread-local collector, or TLC for short, so I don't trip over it too much, works best if you trigger it explicitly. If you happen to know there are some objects that are ready to be collected, then you can use one of these APIs. You can use the autorelease pool API. That works great if you're writing code like, well, a System Preferences pane, for example, which needs to be retain/release in 32-bit and GC in 64-bit. Otherwise, you can use the garbage collector API and ask the collector to collect.

When you do that, you're only actually giving a hint. If you want to force the collector to run, and you want to see if an object will go away, you can use exhaustive collection. And that is provided also in two flavors. That will always run TLC on the current thread, and it will push the background collector to run even more aggressively.
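
For example, a minimal sketch of hinting versus forcing collection through the Foundation NSGarbageCollector API (the exact calls shown on the slide may differ):

    // Minimal sketch: hinting vs. forcing collection under GC.
    #import <Foundation/Foundation.h>

    static void finishedBatchOfTemporaryObjects(void) {
        // A hint: collect if the collector thinks it's worthwhile.
        [[NSGarbageCollector defaultCollector] collectIfNeeded];
    }

    static void collectEverythingNow(void) {
        // Exhaustive collection: runs TLC on the current thread and pushes the
        // background collector to work more aggressively.
        [[NSGarbageCollector defaultCollector] collectExhaustively];
    }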

Thread local objects-we call them local for short. If they escape the control of the thread local collector we say they go global. This happens in response to calling CFRetain, for example. When you call CFRetain, this makes an object ineligible for collection. It allows you to store that object anywhere. You can store it in malloced memory, and the collector wouldn't get rid of it.

Well then the thread-local collector won't get rid of it either. Simply assigning it to a global variable does the same thing. We use a write barrier, and that tells the thread-local collector where the objects are being stored, if it's about to lose control of them. Also, weak or associative references: if you use those, objects stop being local. The reason is those systems use locks, and we want to keep TLC really fast.

Finally, if you have a local object and you assign it into an instance variable of an already global object, that will make it go global. A good example is you create a new view, and you put it into the view hierarchy of a window. That causes all the objects connected to that view to go global.

So now a couple of cautionary statements. You probably know this, or maybe you learned it the hard way, but it deserves repeating. When you use the autorelease pool, objects might go away. Their dealloc methods might be called. Dealloc methods can take locks. If you happen to be holding locks in your code, it's probably a bad idea to call drain, because you might deadlock your own code.

The same is true now under garbage collection with thread-local collection. The same thing can happen: a finalize method might be taking a lock. So if you hold locks, don't call into the collector. And better yet, don't write finalize methods at all, or finalize methods that take locks. So that's the collector, that's thread-local collection.

Now I want to talk about our performance tools. Instruments has been enhanced to know a lot more about Garbage Collection in Snow Leopard. The garbage collector instrument can actually show you a log of all the collections that have occurred, and it will tell you their kind and how much time they're taking.

So right here I'm showing a local collection that has run, and it collected 100 objects, and it did that in about a third of the time of a generational collection, which is also collecting 100 objects. As I said before, the time to run these collections is proportional to the number of objects being managed.

Also over here on the right side is the stack crawl where that collection occurred. So you can monitor where you're calling into the collector. It's a great idea to call in on a shallow stack, because more likely, there's more garbage to collect. The object graph instrument tells you about how your objects are connected to each other and to the heap and to global variables.

So here I'm showing a path from a global object, the NSApplication, to its delegate. And here it actually shows the name of the instance variable that's connecting it. So you can use this to look at long chains of objects to get a grip on why an object isn't going away, what it's connected to.

You can get a snapshot of this connectivity. Finally, the leaks instrument works with garbage collection. The only objects you're going to see in this list are objects that have a retain count that's non-zero. Because those are the only objects that can leak in a garbage collection program. The other kind of leak is over rooting, but we don't show that here. And we don't really call it a leak.

You can go find it in the object graph. We also show the list of leaks and over here on the other side, we show the allocation stack crawl, which is a really important way to start finding leaks. Look where it's allocated. Now I want to talk about GDB. GDB is always available when you're debugging in Xcode or on the command line for you diehards. So we added three new commands to it.

They're all prefixed with "info". One is totally general purpose. It accesses the malloc history information. You set up a particular environment variable, and it records the allocation stack crawls for all the allocations. Well, it also works in garbage collection mode. And it tells you about all the places objects were allocated and all the places the collector got rid of them.

So you can find out why an object that maybe is corrupted doesn't exist any more. Very handy for that kind of debugging. info gc-references gives you all the pointers to a particular object that there are. And info gc-roots gives you all the pointers plus all the paths from the roots that lead to your object.

So this gives you similar information to Instruments. Also, there are some handy one-liners. You can call objc_collectingEnabled from the debugger and find out, are we really running the collector? Sometimes things don't happen like you expect, because you're not actually running the collector. CFGetRetainCount tells you if an object has a retain and, therefore, it's not going away. malloc_size you can call on any kind of allocated pointer, malloc or the collector's memory. It works with either one. It tells you how big the object is.

And objc_is_finalized you can use during finalization to find out if a particular object that your object references is about to be finalized, is going away. You can use that to find out whether it's safe to store that object or do other operations. So we have some environment variables; let me highlight the important ones. Auto use guards is now available, which gives you guard pages and always allocates at least two pages, puts your object in the first one, and allocates a read-only page after it. Great for detecting buffer overruns.

You've got to have a lot of memory, but it can be really handy. There's also auto reference count logging, which as I said earlier enhances the malloc history. It will give you all of the activity the collector observed for reference counting. And it only logs the information that the GC itself sees. When you're using CFRetain, only the first CFRetain on an object will actually talk to the collector. So it's actually kind of handy, because it's going to be less voluminous to go through those logs.

So the collector is now open source. We released the 10.5.3 version of it late last year. It's the version of the collector that was shipped as part of 10.5.3 Leopard. And here's the link to it. It's released under the Apache Open Source License, Version 2. And it's not just for Objective-C. The collector is actually language agnostic.

The open source project MacRuby is using it, and we'd like to encourage other projects to use it too. So if you're interested in doing that, please contact Michael Jurewitz. His contact information will come up at the end, and let us know. In March we presented our extensions to C, including garbage collection and Blocks, in a paper to the C Standards Committee, to the working group.

And there was interest in it. And if you're interested in it yourself, the link to that paper is published there, on their web site. So thank you very much. I'd like to invite my colleague now, Blaine Garst, to come up and tell you all about Blocks.

[ Applause ]

[Blaine Garst]

Thank you, Patrick. So I want to tell you all about Blocks. I'm borrowing a few slides from a talk I gave a little bit earlier this afternoon, but here you're going to learn all about syntax, you're going to learn all about lifetime issues, you're going to learn more about Objective-C specialization. You're going to learn the full story. It's not that big, but it's a little bit bigger.

But if you didn't come to my talk earlier, I want to start over again. If you've ever programmed with lambdas, then of course you've been using either Scheme or LISP. If you programmed in Smalltalk, you see those things in square brackets, that funny syntax, and that was called a Block. It was actually an object for them.

Every if-then-else control structure kind of thing was an object, that code in the square brackets was actually a block. Interesting idea. In Ruby, we also see this concept coming back again. And in LISP and in Scheme this concept of having sort of this function that you pass to somebody was called a closure.

And the reason for that was that you got to carry along the variables locally for you. But if you're a programmer in C, or one of its related families, Objective-C or C++, or the union of those two, Objective-C++, you've been out of luck for a long time. But no longer. In Snow Leopard, we have the same kind of a construct. We call it a Block and it looks like this.

It's the funny ^ symbol ahead of a curly brace, a compound statement expression. You're creating sort of a local function and passing it off to something that's going to do something more interesting with it, something that you want done. So in this case, what's going on is we're carrying along the value d in that local compound-statement kind of block that you're passing along. And we do this very efficiently.

Those other languages were not only garbage-collected, but they were mostly interpreted. And that's a very big expense. And of course C is neither garbage collected nor interpreted. So we know how to do this in a highly compiled, highly optimized way. So let's see how that repeat function might be implemented. So the syntax we use to declare a block reference is just like function pointers, except we use the ^ symbol. So semi-similar syntax and form, it's not so bad.

Except as a parameter you've got to worry about the return type and the parameters, and when you're passing one or when you're trying to declare one. So we highly recommend that you use a Typedef to kind of hide the clutter. And so if you use the Typedef, the implementation of the repeat function becomes fairly simple. It's passed in a block, and then it simply calls it like a function pointer, like a function. So you know, you can pass parameters, get the results back, it's very natural to use as well as to create.
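
A minimal sketch of that pattern; the repeat function and its exact signature are assumptions for illustration, not lifted from the slide:

    // Minimal sketch: a block typedef, a function that takes a block, and a caller.
    #include <stdio.h>

    typedef void (^work_block_t)(int i);        // the typedef hides the declarator clutter

    static void repeat(int count, work_block_t block) {
        for (int i = 0; i < count; i++) {
            block(i);                           // invoked just like a function
        }
    }

    static void demo(void) {
        int d = 42;                             // captured as a const copy
        repeat(3, ^(int i){ printf("%d: d is %d\n", i, d); });
    }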

So I use Blocks. It leads to smaller, easier, more precise code. Who can argue with that? We found over a hundred uses inside Snow Leopard already, and boy, we've not closed the book on that yet. There's going to be even more coming along. For the most part, we use them for callbacks, we use them for concurrency, using Grand Central Dispatch, and we use them for all the 40 years of accumulated wisdom on how to use closures.

And that is for sort of these iterate, map, and reduce kinds of stuff. So I'm not going to touch on these hundred APIs, I'll give a couple examples here to give you a flavor of what's there. And you know, all the details of syntax and stuff. So my canonical example for this is qsort_r, which is the existing way to get to 50 years' worth of collected wisdom on sorting.

So if you have a particular problem where you need to, say, sort according to different parameters, I mean, not always the same way, then what you have to do is supply a callback argument to the qsort_r routine, such that whenever it calls the compare function, and the compare function takes the left and the right, it will also then pass you this callback parameter that you've supplied it. So in practice, let's say we're trying to sort some kind of array of people objects, and age-first might be one of the options, it happens sometimes, and other times other things.

So the first thing you have to do is declare this custom structure somewhere in your code. You have to write the custom compare function, which takes that context pointer, does a lot of casting, gets to the thing of interest, and does its thing. You have to, in the function where you're using qsort_r, declare that custom structure and fill it in with the right values, and finally, somewhere where you really want to use it, you get to call it.

Well, that's a lot of work, especially if you change your mind. You know, how often do you write your code once and then you're done, right? You're always modifying your code. If you have to modify the criteria by which you sort these things, then you've got to go modify your code in three different places. Could be in a pair of files, could be in separate files, could be bunched together, I mean, it's just a pain.

So the better way to do it is with qsort_b, which we're introducing in Snow Leopard. And that is you just simply pass in the things that are important to qsort, and that is the array, the number of the elements, the size of each element, and the comparison block. And the comparison block just takes the left and the right. So to use this, you write your qsort_b with the natural reference to, you know, the age-first thing.
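
A minimal sketch of that call, assuming a made-up person struct with an age field:

    // Minimal sketch: sorting with qsort_b; the comparison logic lives right at the call.
    #include <stdlib.h>

    typedef struct { int age; } person_t;       // made up for illustration

    static void sortByAgeFirst(person_t *people, size_t count) {
        qsort_b(people, count, sizeof(person_t),
                ^(const void *left, const void *right) {
                    const person_t *a = left;
                    const person_t *b = right;
                    return a->age - b->age;      // no context struct, no separate function
                });
    }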

And if you need to change it, you change it. And there is no Step 2 of course. Your code is exactly what you want to use for sorting that. So in general, this is kind of, you know, the callback case. I spoke about concurrency. We have a brand new system in Snow Leopard called Grand Central Dispatch. And so the idea there is it's really easy to just kind of push your code off onto a queue, and then that queue gets serviced by some thread somewhere, sometime. And you don't have to worry about it.

So it's just one line of code and you can push work off, and there's a ton of talks on how to program with GCD. It's really a fun system. Another one I think we'll all appreciate is the initialization of a singleton kind of situation. This is a case where you need to run some code once. Now, how many people enjoy writing mutexes, pthread_mutex_t declarations?

How many people enjoy using them, how many people love to put in the little if (!singleton) check, you know, do this kind of stuff for a little speed enhancement. Anyway, there are huge debates on this kind of stuff. Here is the code to get a singleton initialized. This is it. There are no fancy initializations; you just pass dispatch_once the address of the unique thing that guards your singleton, and then the block of code that is needed to initialize it. And the dispatch system inside Grand Central Dispatch makes it work. It's fast.
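
A minimal sketch of that singleton pattern (the class name is made up):

    // Minimal sketch: one-time initialization with dispatch_once.
    #import <Foundation/Foundation.h>
    #include <dispatch/dispatch.h>

    @interface MySingleton : NSObject
    + (MySingleton *)sharedInstance;
    @end

    @implementation MySingleton
    + (MySingleton *)sharedInstance {
        static MySingleton *shared = nil;
        static dispatch_once_t once;        // the unique token that guards the singleton
        dispatch_once(&once, ^{
            shared = [[MySingleton alloc] init];
        });
        return shared;
    }
    @end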

It's faster than a pthread mutex is. So let's recap a little bit. What is a block? So a block is a data structure. It's not a function alone, it is a data structure. And that data structure references that function expression, which in truth is, you know, sort of a hidden function we write for you. It references that function.

It acts like a function when you invoke it. It carries along constant versions of the stack-local variables that you use within that block expression. So in the case earlier, the d variable, there was a local copy of it made when that block was constructed, when that expression was evaluated. So it holds const versions.

And that's very cool when you send that off to some other thread, you carry the memory with you, and you get all kinds of thread-local, core-local kinds of benefits, performance benefits, out of that. Sometimes, though, you do need to mutate a variable, and const variables, if you try to mutate them, are going to give you a compiler warning.

So in the cases where you need to mutate something, and carry, you know, carry the value back, we introduce a new construct with the __block keyword, and it's called block storage. It's a storage class, like register and auto for stack variables, and static for sort of local globals. So it's a new storage class within C.
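
A minimal sketch of __block storage in use (names made up):

    // Minimal sketch: a __block variable can be mutated inside the block,
    // and the updated value is visible back in the enclosing scope.
    #include <stdio.h>

    static void demoBlockStorage(void) {
        __block int count = 0;              // block storage class
        void (^increment)(void) = ^{ count++; };
        increment();
        increment();
        printf("count is %d\n", count);     // prints 2
    }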

And you use it in that space. So I mentioned the qsort case, and that's a synchronous case. That's where you hand your work off and it comes right back. But in other callback situations, say with a timer, you need to hand some code off, and it will come back, and then that timer is going to fire, and it needs to execute your block. So in order to make that happen, the timer sub system has to make a copy of your block.

And that block copy moves that block onto the heap and it obeys heap lifetime rules. And so blocks start out on the stack, because that's very efficient. That's how, you know, a huge number of uses are made. They're synchronous uses, there's no need to copy it to the heap. But if there is, they are easily copied to the heap where they live for as long as you need them.

And so the block subsystem manages the lifetime of all the variables that it references, and that is a real win. Because there are many callback APIs that take these sort of context parameters, where the lifetime of that context thing, you know, could be malloced off the heap or who knows what its lifetime is and how to get rid of it, once you're done using that callback. With Blocks, we manage that lifetime in a very consistent and simple manner. So here's an example. I'm going to actually go through this line-by-line.

So the idea here is that we're going to write a function called findKeyForValue. We're going to pass in the Cocoa dictionary, and we want to find the key that matches the value that is sought after. So we're going to return the result. So we need a __block variable for that.

And we're going to pass a block of code to the enumerateKeysAndObjectsUsingBlock: method, which is a new method in Snow Leopard. It will pass to the block the pairs, the key and object pairs, as long as the stop variable is left alone. So in this case, we want to see whether or not the object passed in matches the sought-after object. And if it does, we set the result to the key, we set the stop variable to say no more, we're done, quit now, we're happy. And the enumerateKeysAndObjectsUsingBlock: method completes, and we return the result.
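
A minimal sketch of that function as just walked through (the slide's exact code may differ slightly):

    // Minimal sketch: findKeyForValue using enumerateKeysAndObjectsUsingBlock:.
    #import <Foundation/Foundation.h>

    static id findKeyForValue(NSDictionary *dictionary, id soughtValue) {
        __block id result = nil;            // written to from inside the block
        [dictionary enumerateKeysAndObjectsUsingBlock:^(id key, id object, BOOL *stop) {
            if ([object isEqual:soughtValue]) {
                result = key;               // found it
                *stop = YES;                // no more, we're done
            }
        }];
        return result;
    }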

So there are many different APIs we've added in Snow Leopard for different kinds of iteration. It's a great way to express styles of iteration. For example, with arrays you can pass in a version that iterates, or goes through the array backwards, and it passes the index in so you know which index it was. So we have several, quite a few different Blocks APIs, including ones that will go and do some work for you concurrently in the background.

So there have been talks already on that, and I encourage you to go explore those to find out, you know, some really cool ways to use Blocks. Let's talk a little bit more about __block variables. So generally, you know, you allocate them inside, you know, a local function context or actually inside a block itself. So they're mutable, as I said before. They're shared both with the stack, or the scope of allocation, as well as any blocks that reference them.

They also start on the stack, so they're very fast. And we have a couple of restrictions on them. You can't use variable-length arrays with them, because it's hard for us to track that varying number; that goes for both kinds of varying arrays. And taking the address of one is currently implemented, but not recommended, because the address changes as we move it to the heap. So let me go back and let's talk about block literals.

But I'm going to talk about lifetime a little bit more also. But the key thing for creating blocks, this is the simplest syntax you can use. It's a ^ and a set of statements. Now we're trying to do something really nice for you here, and that is we're going to infer the return type of the block.

Because it's really kind of a function. And so we infer the return type simply by the presence or absence of a return statement. So that's fine. In this case, printf, there's no return statement in the first one, so it returns void. And in the second case, the return type is an integer.

Many blocks take parameters. No big deal. Same rules apply about, you know, inferring the return type of the thing. We do provide a syntax, though, where you can explicitly state the return type that you want out of this block expression. So in this last example, here's a block that takes a character and returns a character.

If the character is greater than zero, it's going to return c, which is of type char. Otherwise, it's going to return 0. Well guess what, 0 is of type int. So the compiler doesn't like this. It says you're returning a character in one place, an integer in the other, what's the deal. So it will complain.

There are a couple of ways to resolve that dilemma, and one of them is to specify the return type explicitly, and that's with that ^char; that first char part says this thing is returning a char. So then it uses normal C language type conversion, implicitly casting that 0 into a character type as it returns. There are some subtleties, we have discovered, with inferred return types. It turns out that a single-quoted character is an integer. Hmm. So if you're wanting this block to return a character, then you can cast the result, and that works.

Or you can supply that, you know, return type parameter. So these are the two styles for resolving these subtleties. I say subtlety number 1 because there's a subtlety number 2. A == B is not a bool; it's not a BOOL by the B-O-O-L, you know, Cocoa convention, or a bool as in lowercase, it's just an int. So if you want a bool back, you have to cast or, again, redeclare.
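
A minimal sketch of the two styles for subtleties 1 and 2 (the variable names are made up):

    // Minimal sketch: forcing the return type you want out of a block expression.
    #import <Foundation/Foundation.h>

    static void returnTypeSubtleties(void) {
        // Subtlety 1: 'a' is an int literal, so either cast the result...
        char (^castStyle)(void) = ^{ return (char)'a'; };
        // ...or state the return type explicitly.
        char (^declaredStyle)(void) = ^char(void){ return 'a'; };

        // Subtlety 2: a == b is an int, not a BOOL, so declare the return type.
        BOOL (^isEqual)(int, int) = ^BOOL(int a, int b){ return a == b; };

        castStyle(); declaredStyle(); isEqual(1, 1);
    }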

There is a subtlety number 3. This one stumped quite a few of us for a while. It turns out that an enumeration, no matter how it's initialized, is always an int. So if you want to return a long out of this expression -- out of this block, you again -- sorry, the solution's on the next slide. A very pragmatic problem comes in with real Cocoa API. We have added a new method, sortUsingComparator:, and an NSComparator is one of those typedefs, a block, you know, comparing two items. And it returns an NSComparisonResult.

Well, an NSComparisonResult happens to not always be an int. And so you can get a compiler error on this on the 64-bit system. And so the solution in these cases: in the first case, you know, casting works pretty well. But in the second case, you really want to put in the NSComparisonResult return type, instead of repeating that cast three times. Wherever you can, use the recommended declared return type. So, block declaration syntax.

I showed you that sort of function pointer style syntax? Well, okay, remember that. In Objective-C we use sort of this abstract declarer form that looks funny, because we don't normally pass function pointers in, and you just have to get used to it if you don't have a typedef handy. This is the C syntax for declaring an array of function pointers or blocks. It's not very pretty. It gets worse. If you try to return a function pointer or another block you've got really funny patterns of parameter lists of the thing returned, anyway, use typedefs.

Let's talk about lifetime once again. If we have a function that's going to take a block, then what happens? We declare some memory on the stack; there's room for local, there's the initial version of the __block shared variable, and the block itself. So if it gets copied to the heap because it needs to be preserved, what's going to happen? Well, first of all we're going to move that __block shared variable over. We have created, when we evaluated that block expression, a local copy of the value of local. Here I represent it as _local. And that gets copied also.

So every version of the block carries that -- all the local variables that it's sort of snapped up. And so things are fine. We have two situations to think about in terms of how we get rid of that memory. So the first one is the function might return. So of course we let go of our explicit reference to that now-heap version of the __block variable, and we return off the stack, and we let the heap version of the block expression keep track of the memory.

And so that's fine. Second case: say the heap object goes away first. So if the heap's going to go away first, it lets go of its reference to the shared variable, and the heap object is reclaimed. But because we're sharing that reference, the stack still has a valid reference to even the heap version of the __block variable. And that's fine. When the stack's ready to return, it lets go of that. The heap memory is reclaimed, and the stack returns, leaving us in our initial state in both cases.

We take care of this for you. We never do this copy to the heap for you automatically. This is always an explicit action on your part, so that you know what's going on. Now, a subsystem might do it on your behalf, but they make that choice to copy it. But Blocks behave the same no matter whether they're still on the stack or whether they're on the heap.

That's a transparency thing for you. Objective-C specializations. All Blocks are objects. All the time. Even if they're created in C programs. So here I have a typedef for that generic work block kind of thing. So if you want to keep a block and you're in Objective-C, you can send the copy message to it. They're like ids. No matter what syntax you use, whether it's the funny caret-style stuff or a typedef, you can send the copy message to it directly.

We prefer you to use the copy method, because if you use the system-provided primitives Block_copy or Block_release, the Block_copy calls and the Block_release calls have to be matched. And under GC, they can cause unrecoverable cycles. That's not good. So we much prefer the copy method. Autorelease and release work on them as well, in the retain/release world.

So as I alluded, you can actually send a copy message directly to the ^ literal, that's fine. The nice thing about being objects, being copyable objects, is that they participate as first-class citizens, even with our @property syntax. So this is a very natural way to just build one of these blocks as part of your, you know, as part of your object design.
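
A minimal sketch of a block held as a copy property (the class and typedef names are made up):

    // Minimal sketch: a block as a copied property; the synthesized setter
    // copies the block to the heap for you.
    #import <Foundation/Foundation.h>

    typedef void (^WorkBlock)(void);

    @interface Worker : NSObject {
        WorkBlock work;
    }
    @property (copy) WorkBlock work;
    @end

    @implementation Worker
    @synthesize work;
    @end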

It's fully supported, synthesized, all that sort of stuff works. So when you're in Objective-C, you can do a lot of fun stuff inside the block. In this example, we've got a local variable, we've got one of those __block variables, and inside that block expression, which we're going to be copying, we use both the local and the __block variable, and what are we doing? We're actually going to update an instance variable. Hmm, okay, what happens when that thing is copied? So the local will be retained if you're using retain/release. That's fine.

That's the way -- that's what we've got to do to keep it alive if it's going to persist beyond, you know, the function. Implicitly, because you use the ivar, we're going to retain self. So that lets you modify that instance variable within that block. You don't have to play __block funny games or anything like that.

So instance variables are directly -- remain directly mutable. That __block object is going to be left alone. Whatever value it had is going to be sort of bit-copied into the heap. We've gone around a couple of times with what we should do there, but this has proved to be pragmatically the hands-down winner as to how we want to operate on __block variables when they're objects themselves.

Hmm. Some surprises you might run into. We did. A block reference isn't quite a void *. So I tell you, in my code I go %p and then the compiler gets mad at me all the time, so I have to put a (void *) cast in there. Hmm, maybe that will get fixed someday. I don't know. We've implemented blocks such that the stack version of a block can be retained and released.

So you can stick it inside, you know, as a retained property of some other object; as long as you let go of it before you're done, this works. It's not that recommended though, because big -- lots of crashes and booms can happen if you're not very careful about that. For standard practice, you should do the copy. And if, you know, you're under-in this case, it's sort of like an instance variable, so release it before you reset it. So copy is the primitive to use. Copies of blocks that are already on the heap are cheap.

So don't be afraid to use copy here. Another subtlety that came up is reflected in this by analogy. So as C programmers, we're very conscious of stack-allocated memory. So if I had an array of pointers to things and I allocated, inside that for loop, a structure, and remembered the pointer, are we going to get 10 different versions of that? No. We're going to get a pointer to dangling memory, 10 copies of that pointer to dangling memory, and things are going to go boom. Or if they don't go boom, you wish they would have gone boom, because something funny is going to happen.

So the same thing is going to happen with Blocks. If you try to capture blocks inside either an if statement, you know, an if... then...else compound statement, or inside a for loop, it's going to sort of not work, you know. It's going to go boom. And the answer here, of course, is to make a copy, to try to capture all the locality of each version of that iteration.
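
A minimal sketch of what that copy might look like under retain/release (names made up):

    // Minimal sketch: preserving a block created on each loop iteration by copying it.
    #include <stdio.h>
    #import <Foundation/Foundation.h>

    typedef void (^WorkBlock)(void);

    static void captureBlocks(WorkBlock copies[], int count) {
        for (int i = 0; i < count; i++) {
            WorkBlock stackBlock = ^{ printf("iteration %d\n", i); };
            copies[i] = [stackBlock copy];  // copy to the heap so it outlives the iteration
            // Under retain/release, remember to release copies[i] when you're done with it.
        }
    }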

You're going to need to copy them to preserve them. And if you're not using a Garbage Collection, you'll need to remember to do releases on them as well. I saw this once, thought it was worth telling you about. In this case, we're trying to build a smart kind of logging function.

And only sometimes do we use this boilerplate method, or a boilerplate object. But when this block that uses that gets copied, the runtime doesn't know that you're not going to use it sometimes. It's always going to try to copy the value of that boilerplate variable. In this case, it's uninitialized.

And so you get a big memory smash, or something happens that's bad. So you know, this might be obvious after 20 minutes of thinking about it. But if you remember this, and it happens to you, hopefully it will take less than 20 minutes. Debugging, I just have a really easy story to say. It just works.

These block expressions are, as I said, compiled by the compiler into a function. It's very much like that sorting example: the compiler generates a custom structure, it allocates one right near where you use it, fills it all in. But from the debugging viewpoint, you know, you can set breakpoints in the middle of them, the local variables look like -- I mean, all the variables you capture show up just the way you want them to, and it's really nice. And Instruments does a great job of monitoring Blocks use within GCD.

So we've gotten full support of Blocks for C and Objective-C in both GCC 4.2, and the new compiler, Clang. Our support for C++ is not as complete. I'm sorry. I wish it were. But there's only so many hours in the day and there is preliminary support available, but don't start a huge project expecting - anyway, be careful.

We have published the specification externally as part of the Clang open source. As of four hours ago, we have published the runtime as well. We like the idea of people using blocks in places other than Apple, and we're trying to really put some meat behind that statement. So much so that in fact in addition to Garbage Collection, we presented Blocks to the C Standards Committee.

The C Standards Committee's reaction was kind of interesting. They've taken a lot of work that's been falling out of the C++ standardization efforts, and there's a C++ kind of idea around closures, which they call lambdas, but lambdas can't really be copied and stored and used as callbacks, like those timer things.

They only work in the synchronous cases. We think that Blocks are far more versatile, and it turns out they're kind of, you know, compatible and interoperable. So we don't see this as a collision. But from the C perspective, we can implement Blocks in C whereas the lambda proposal cannot be brought back to C.

So we'll see what happens with this. More information, of course, contact Michael Jurewitz, who is actually not going to be here today for this talk. Under documentation there's a Cocoa programming topics area, and then there's a sub-topic, the Objective-C language, and under there are 3 other docs that all describe stuff that Patrick talked about, garbage collection, the new runtime, we've got lots of APIs in our new runtime, and then of course we have a good document on Blocks themselves.