Mac • 1:01:22
Snow Leopard has increased concurrency support in the Foundation and Application Kit frameworks. Find out how to make effective use of NSOperation to manage tasks and write multithreaded code to maximize your application's use of multiple CPU cores.
Speaker: Chris Kane
Transcript
Good morning and welcome to this session, Concurrent Programming in Cocoa. My name is Chris Kane and I'm an engineer on the Cocoa Frameworks team. So what we're going to be talking about today is, first of all, NSOperation and NSOperationQueue; we'll review what we had in Leopard and talk about some new stuff in Snow Leopard.
Then I'll be talking about some new Collection APIs and NotificationCenter APIs which have to do with concurrency; and then we'll move on up to the AppKit and talk about Concurrent Document Opening and Concurrent Drawing. Now, except for NSOperationQueue and NSOperation which exist in Leopard, most of what I'm going to be talking about today is Snow Leopard specific. It's not available in Leopard and it's not available in iPhone OS.
So let me begin by talking a bit about Operations and Queues. When I talk about Operations and Queues I'm talking about a particular kind of concurrent programming model where you have work to do or small jobs or independent bits of work that need to be done; and you feed those bits of work in some way to some kind of execution engine which executes them on your behalf in the background usually.
In the case of NSOperation and NSOperationQueue, we introduced these in Leopard and they have become really fundamental to the concurrency story within Cocoa in just a short time. In fact, already in Snow Leopard, we've seen NSOperation and NSOperationQueue, and operations and queues in general, begin to replace threads as a principal concurrency mechanism within the operating system.
Operations and Queues as a concurrent programming model is so significant that in Snow Leopard we've introduced these two concepts down at basically the lowest API level, the C API level, within the operating system. First, we've introduced Blocks; we've modified the compilers for the C, C++ and Objective-C languages to accept a new kind of syntax which we call blocks; and blocks, within the context of concurrency, are essentially analogous to operations.
Blocks allow you to define bits of work very conveniently and locally at the place you need it done, and so operations now have an analog at the programming language level. For queues we've introduced Grand Central Dispatch, also called GCD and I'll be referring to it as GCD in this talk often.
Grand Central Dispatch introduces queues with a C API down at the lowest level of the operating system APIs. To align all of the various queue and operation implementations, NSOperationQueue and NSOperation have been re-implemented in terms of Grand Central Dispatch. So GCD is now essentially the central execution engine for all operation and queue APIs within Snow Leopard.
One side benefit of that, which I'll mention at this point, has been a good performance boost, and I'll be talking about that more later. But also, because NSOperationQueue has been re-implemented on top of GCD, we've adopted a lot of the-- well, I hate to call it a paradigm, but the programming models that GCD uses, for example serial queues as a replacement for locking, and that itself has also given us a good performance boost.
So let's review where we stood when Leopard shipped, a little over a year and a half ago. First we had NSOperation, and NSOperation is an object which encapsulates and represents work to be done, or a job if you will. Often an NSOperation will also encapsulate some data necessary for that work to do its thing. NSOperation as an abstract class also encapsulates various properties that are common across all types of operation objects.
An operation can tell you whether it's currently executing or whether it is finished; operations have an advisory Cancellation state and behavior; when you put an operation into an OperationQueue, the operation can specify what priority it should have within that OperationQueue; and operations also have a dependency mechanism whereby you can set up a situation where one operation will not run until another operation has finished.
To define an operation or to make use of an operation, you can use either of the two subclasses that we've built in for you-- NSBlockOperation which is new in Snow Leopard and I'll be talking about that in a short while; or NSInvocationOperation. If you need to though, you can subclass NSOperation yourself and unless you have fairly complex needs, you simply override the method called main and put all the code you want executed as part of that operation into main.
It's fairly straightforward for most kinds of NSOperations. To get an NSOperation to be run, well, you can execute operations directly yourself if you want to, but normally you would give the operation to an NSOperationQueue, and the NSOperationQueue will be the engine that causes your operation to be run at some point in the background.
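To make that concrete, here's a minimal sketch of an NSOperation subclass; the class name and the work it does are hypothetical, not from the session:

    #import <Foundation/Foundation.h>

    // A minimal NSOperation subclass; for simple, non-concurrent
    // operations, overriding -main is all that's needed.
    @interface MyWorkOperation : NSOperation
    @end

    @implementation MyWorkOperation
    - (void)main {
        // All the work of the operation goes here; a queue will run it
        // in the background at some point.
        NSLog(@"working on thread %@", [NSThread currentThread]);
    }
    @end

    // Handing the operation to a queue to be executed:
    NSOperationQueue *queue = [[NSOperationQueue alloc] init];
    [queue addOperation:[[[MyWorkOperation alloc] init] autorelease]];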
So we have NSOperations; NSOperationQueues are then the mechanism for concurrently executing those operations. Normally an operation queue is executing operations concurrently; that is, as you feed in the operations and there is execution capacity on the machine, the operations are run. But you can also set up an operation queue to only execute one operation at a time. And this has various uses itself which I won't be going into within this talk, but you may have seen references to that last year in Ali Ozer's talk.
And the queues that you can create within Grand Central Dispatch are also serial queues. Once you have an NSOperationQueue you can do various things to it: of course you can add operations; it wouldn't be much of a queue if you couldn't. You can get the list of operations out of the queue; and you can adjust the width of the queue, which is what I was referring to just a moment ago: you can specify how many operations you want the operation queue to run at any one time.
If you need to, you can suspend a queue, for example if you don't want the queue to run things for a little while, and later resume it; and finally, if you have to, you can wait for all the operations that have been put into an operation queue to finish. So I'm going to illustrate this: adding operations, suspending and resuming a queue, and waiting for all the operations to finish. Over on the right there you see that I have three Cores.
For simplicity I'm just going to show three Cores and the operations that end up in those Cores and I have an operation queue that operations may appear in. Let's see what happens. So first of all, I'm going to suspend the operation queue-- that's what this little stop sign means here. So now if any operations happen to be put in the queue, they're just going to sit there.
You see the operations are not being executed because the operation queue is suspended; that's fairly straightforward. What we'll see then is that if I unsuspend the queue the operations will begin to be pushed, if you will, onto the different Cores and they'll be executed. Notice that the operations wait in the queue until one of the Cores becomes available and then they can be pushed onto that Core.
And now of course we're all waiting for the OperationQueue to get empty for all the operations to finish; so the talk is basically blocked until this animation is done. Now watch this. Notice there's a little flash of light here. I'm not promising that there's a little flash of light when all your operations finish.
Do not take this illustration too literally. So what have we done to NSOperation in 10.6? Well, we've added some new API, as we often do. Two of the most important additions are the waitUntilFinished method, which allows you to block if you need to, waiting for an individual, single operation to finish.
This is often convenient to be able to do, so we added that convenience in for you. And we've added a completionBlock property to NSOperation. The completionBlock property allows you to assign a block to any operation, and that block will be executed when the operation finishes. Now I want to point out two important things about this while I'm talking about it.
The completion block is not run in any particular execution context. If you, for example, put the operation onto an operation queue and run it, the completion block is not necessarily run in the same execution context as the operation queue. The other is that the execution of the completion block is concurrent with execution of any dependencies that the operation may have.
So when an operation finishes, any dependencies that were waiting for it to finish might, of course, become ready to run, and they might begin running at that point. Well, the completion block is running at the same time as those dependencies are starting up.
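As a sketch, attaching a completion block looks like this (reusing the hypothetical operation and queue from the earlier sketch):

    MyWorkOperation *op = [[[MyWorkOperation alloc] init] autorelease];
    op.completionBlock = ^{
        // Runs when the operation finishes; remember the two caveats:
        // no particular execution context is promised, and this runs
        // concurrently with any dependent operations that just became
        // ready.
        NSLog(@"operation finished");
    };
    [queue addOperation:op];

As I mentioned earlier, we've added a new subclass of NSOperation called NSBlockOperation.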
And the NSBlockOperation holds one or more blocks that you've given to it. These blocks are the work of the operation. So basically what an NSBlockOperation does is run the blocks that you've given it; and when those blocks are done the operation's finished-- fairly straightforward. If you give multiple blocks to NSBlockOperation, those blocks are executed concurrently with respect to one another. And the NSBlockOperation becomes finished when they have all finished. The blocks that you give an NSBlockOperation have no parameters and return no result.
So if you need parameters, need the blocks to be parameterized in some way, then those values, that information, needs to be captured when the block is created, and possibly when it's assigned to the NSBlockOperation. What does a block type look like? Well here, I'm going to illustrate very briefly the block syntax, just in case you aren't familiar with it or in case you didn't see it discussed yesterday in the What's New in Cocoa talk. On the left we have the return value.
These blocks have no return value so we have void there; and then we have the caret in the middle which indicates that this is a block type; and then on the right side we have the parameter list just like an ordinary function declaration. And there are no parameters accepted by these blocks and so we have a void there. Now I don't have time unfortunately to go into greater depth on what blocks are.
We'll see a couple of examples later in the talk, but if you want to know more about blocks I'll give you a brief pointer now to the Objective-C and Garbage Collection Advancements talk at 5:00PM later today and I'll have this again at the end of the talk in case you missed it now.
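Putting those two pieces together, here's a sketch of creating an NSBlockOperation; the work inside the blocks is hypothetical:

    // A block variable with the no-parameters, no-result type just
    // described:
    void (^work)(void) = ^{
        NSLog(@"first chunk of work");
    };

    // The convenience constructor takes one block; addExecutionBlock:
    // adds more, and the blocks may all run concurrently with respect
    // to one another.
    NSBlockOperation *op = [NSBlockOperation blockOperationWithBlock:work];
    [op addExecutionBlock:^{
        NSLog(@"second chunk of work");
    }];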
What have we done to NSOperationQueue? Well, again we've added some new API; we've added a convenience method called addOperationWithBlock: that allows you simply to give a block to an operation queue, and what the operation queue will do is create an operation for you and add it to itself.
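For example, a sketch (the work is hypothetical):

    [queue addOperationWithBlock:^{
        // The queue wraps this block in an operation and runs it in
        // the background.
        NSLog(@"ran via addOperationWithBlock:");
    }];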
There's another method called addOperations:waitUntilFinished:. This allows you to provide an NSArray of operations to an operation queue all at once, and optionally wait until all of those particular operations that you just gave it have finished. Finally, we've added an operationCount property to NSOperationQueue. We found that a lot of people were getting the entire list of operations out of a queue, which is a fairly expensive operation, just to ask the resulting array for its count and then throw the array away; so this is a much cheaper way of getting the count of operations currently in the queue. But of course, just to remind you, as with all things concurrent, by the time you've gotten the operation count it may have changed, because other threads may have put operations in that queue or operations may have finished.
So operationCount is of course always an approximation. Now, as I mentioned earlier, NSOperation and NSOperationQueue have been re-implemented in terms of Grand Central Dispatch. Before going further, I should remind you that Grand Central Dispatch also provides queues, and they're called dispatch queues.
If you're familiar with toll-free bridging, however, there is no toll-free bridging between NSOperationQueues and dispatch queues, even though NSOperationQueues are implemented on top of dispatch queues. There's also no 1:1 correspondence between them; you cannot convert between a dispatch queue and an NSOperationQueue. What actually happens is that the NSOperations you put into an operation queue are run, that is, they're started, on one of the special concurrent dispatch queues that Grand Central Dispatch provides. The exception to this, and of course there has to be an exception, is NSOperationQueue's main queue and Grand Central Dispatch's main queue. Now if you're familiar with NSOperationQueue in Leopard, you may be saying: what the heck is the main queue? I never saw that before.
Well, that's another thing we've now added in Snow Leopard. There's a new class method called mainQueue which returns the single queue which is bound to the main thread. Operations which are put into the mainQueue are executed on the main thread. Because there's only one thread which can be, if you will, the destination of the operations in this special queue, this is of course a serial queue.
It runs one operation at a time. Because this is a shared queue, most of its properties cannot be modified. You cannot suspend the mainQueue. You cannot change the width, that is, the maximum concurrency count, of the mainQueue. This mainQueue corresponds to and uses GCD's main queue. So it's equivalent whether you put an operation into the main NSOperationQueue or you put a block or function pointer to be executed into Grand Central Dispatch's main queue.
Things that are put in either of those two queues are run in the same way. Now, the main thread in a Cocoa application is usually busy handling user input events and running the run loop. So how do these two main queues get serviced? Well, the main run loop on the main thread services these two main queues, and it services them just as if they were a run loop source; specifically, it's as if they were a run loop source registered in the special "common modes".
So these main queues are only serviced when the run loop is running in one of the common modes; they are not serviced when the main run loop is running in a private mode. They are also not serviced re-entrantly; that is, while a block from the GCD main queue or an operation from NSOperationQueue's main queue is running, no other blocks will be started. Now, some of what I'm mentioning here is a little esoteric, a little advanced, and if this is going over your head, don't worry about it.
You probably won't have to worry about it at this point if you aren't already familiar with some of these terms that I'm using. But essentially what we have here with the main queues is a new way to get work back to the main thread, which is often useful for interacting with the AppKit, for example, since we're talking about Cocoa. And it's basically similar in effect to using the performSelectorOnMainThread method with the waitUntilDone parameter NO; that is, asynchronously. So work you put in these queues will happen asynchronously at some point later on the main thread.
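Here's a sketch of that pattern; the background computation and the text field are hypothetical stand-ins:

    [queue addOperationWithBlock:^{
        NSString *result = ExpensiveComputation();   // hypothetical function
        // Hop back to the main thread to touch the UI; this happens
        // asynchronously, much like using performSelectorOnMainThread:
        // with waitUntilDone:NO.
        [[NSOperationQueue mainQueue] addOperationWithBlock:^{
            [textField setStringValue:result];       // hypothetical NSTextField
        }];
    }];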
One last API that I'm going to talk about that we've added to NSOperationQueue is another new class method called currentQueue. currentQueue returns the queue which is running the currently executing operation; that is, if you call currentQueue from within the execution context, within the code, of an NSOperation, the queue that is running that operation will be returned; and if there is no currently executing operation, this method is going to return nil. The exception, and of course again there has to be an exception, is that on the main thread the mainQueue is always returned by currentQueue, and this is to align the behavior of NSOperationQueue with the behavior of Grand Central Dispatch.
Now this method can be a little tricky to use. If you call currentQueue expecting to do something with that queue, for example add operations to the queue, you might be surprised. If you get the current queue, well somebody created that queue of course, and that somebody is the logical owner of that queue. And so if that owner goes and suspends that queue and you've put work in whatever the current queue happened to be when you called this method, well your work isn't going to run.
Also if it isn't your queue, you shouldn't be messing with it. You should not be changing its properties; you should not suspend that queue; you should not change the maximum concurrency count of that queue because of course you may surprise the actual owner, the creator, of that OperationQueue. So the currentQueue method can be useful for debugging but you should think carefully before actually using it for any actual purpose.
The final thing I want to talk about with respect to NSOperationQueue and the new stuff in Snow Leopard is better performance. We've improved the performance over Leopard, and to a significant degree this is a result of adopting Grand Central Dispatch. One example that I'll give here is that adding operations, many of them in a loop as fast as you can, is about 2.8 times faster on typical multicore Intel machines. So going through a loop and creating lots of operations is a fair bit faster.
But also, the re-implementation in terms of Grand Central Dispatch has introduced greater asynchrony and simplified the concurrency safety within the implementation; for example, there's less contention on locks compared to Leopard. What this has allowed us to do is lower the execution overhead for each operation. And by overhead, you can think of it as meaning this: if you have a do-nothing operation, an operation whose main method, for example, is empty and doesn't do anything, the overhead is the amount of time for that operation to be executed by and processed through an operation queue.
Well I'm going to show a chart. Much easier to visualize what's going on. So first of all I'm going to show Leopard. This is the overhead in Leopard for 1, 2, and 8 Cores. And of course the lines are overlapping one another so it doesn't really matter which is which in this case because they're all basically the same.
But what we see here is that once you get to 128,000 or 256,000 or more operations, you've begun to hit a wall in the overhead, where it becomes impractical to try to execute that many operations within a short period of time. If you're trying to execute, say, 256,000 operations over the course of a day, well, about 220 seconds of overhead would be about a quarter of a percent of one day, and so the overhead in that case, a quarter of a percent, is not really noticeable. But as you try to push more operations through that queue, we see this sort of exponential or quadratic growth, and I didn't have the patience to run larger numbers, so I gave up at 256,000.
Here's what we see in Snow Leopard. The lines are much lower, and the curves are flatter for a longer period of time. One interesting thing to note is that although in Leopard the lines were fairly coincident, that is, they overlapped one another, in Snow Leopard they're diverging. At least one good thing in Leopard was that the overhead wasn't getting worse as you added more Cores, which could have happened.
But in Snow Leopard, something even better is happening. The top line is one Core, all these operations running on one Core; the middle line is two Cores; and the lower line is 8 Cores. So what we see is that as you add more Cores and are executing these operations on more Cores, the overhead is lower as well and the curve is more gentle. What this is showing is that the overhead of running operations is itself being distributed better among the various Cores that are available. So that's operation overhead. And it's just operation overhead that I'm talking about.
I've put up this chart and I have it going from 32,000 to 8 million; that is not to say that you should run out and write applications which try to execute 1 million operations in a short period of time. Of course, the operations themselves require RAM just to represent the operation objects, and the operations-- your operations, as they run will probably use memory. So what's going to happen as you try to create more and more operations is you're going to begin to swap. Once you begin swapping, the Operation Overhead is really the least of your problems.
You have much bigger problems once you begin swapping, and that's in fact why the chart doesn't go off to 16 million because that's where it began swapping on the machine I was collecting these numbers on. Now let's go over to the collection APIs that we've added in Snow Leopard within Foundation. So on NSArray, NSDictionary, NSSet, and NSIndexSet, we've added some new APIs for enumeration and searching.
Well, it used to be that to enumerate you used an NSEnumerator or, in the case of NSArray, indexed access to enumerate over an array. Then in Leopard we added the for-in syntax to allow you to enumerate over collections, and that was faster and nicer. Now that we've introduced blocks into the languages, we've added yet more ways to enumerate your collections; and the particular benefit in this case is that we've added an option for doing this concurrently: the NSEnumerationConcurrent option.
And when you provide that option, the block may be invoked concurrently. Now, to illustrate a point that's going to come up a little bit later in the talk, I want to point out the example there in the middle. This is what the block signature of the NSArray method enumerateObjectsWithOptions:usingBlock: looks like.
It gets the object being enumerated, it gets the integer index of that object and it gets a Boolean parameter that allows the block to stop the enumeration early if it wishes to. Now of course each collection has its own signature. For dictionaries, the dictionary gets the key object and the value object and the Boolean stop parameter instead of an object and an index.
Well, why do we pass in that index? If you've turned on concurrent enumeration by passing this option to this method, there would be no practical way for the block to find out what the index is, if the block needs it. When you've turned on concurrent enumeration, the objects in the array will be enumerated in an arbitrary order.
There's no defined order that they will be enumerated in, and so there would be no practical way for the block to know what the index was; calling indexOfObject: would just be horrendously expensive if we didn't provide this parameter. So that's why the array method, at least, takes the index of the object.
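A quick sketch of the NSArray method in use; the per-object work is hypothetical:

    NSArray *array = [NSArray arrayWithObjects:@"a", @"b", @"c", nil];
    [array enumerateObjectsWithOptions:NSEnumerationConcurrent
                            usingBlock:^(id obj, NSUInteger idx, BOOL *stop) {
        // With NSEnumerationConcurrent this block may be invoked
        // concurrently, in arbitrary order, so it must be
        // concurrency-safe; idx tells us where obj lives in the array.
        NSLog(@"index %lu: %@", (unsigned long)idx, obj);
    }];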
To illustrate concurrent enumeration, what I'm going to do is I'm going to implement a map function in a category on NSArray. A map is a function that's found often within functional programming languages, and what it does is it takes an input array or source array and processes each element in the array through a mapping function, and puts all the result objects into a result array and that looks something like this.
On the left side there I have my source array; each object is given to the mapping function. And what the mapping function is supposed to do is return a new object, or potentially the same object, but an object which then ends up in the result array. And notice that, as I'm illustrating in this diagram, there's a correspondence between the two arrays; that is, the result object from the mapper which is at, say, index 7 came somehow from the source object, the object which was in the source array at index 7. So there is a correspondence here between the source objects, which are the inputs, and the result objects which end up in the result array.
So what are my design points for the map function which I'm about to create? Well, the caller is going to specify a mapping function using a block, since we have this nice new block syntax. I'm going to give the block the caller provides the object and the index parameter; I wouldn't necessarily need to give it the index, but in case the mapping function wants the index I'm just going to pass it along, since I already get it myself, as you'll see.
I'm not going to give the mapping function the option to stop the mapping early, though; I don't want to have to deal with and think about what it would mean for the mapping to end prematurely. The mapping block should return a new object, to put in the result array of course; it should return that object autoreleased, and it must not return nil; I'm not going to handle nil in my mapping function. Because this is a talk about concurrency in Cocoa, of course I'm going to invoke the block concurrently, and so the mapping block has to be concurrency-safe.
And finally, the map method is going to return the new array, the new results array. Well, that looks like this. I begin, of course, by declaring my method, and I'm going to call it my_mapUsingBlock in this case; it takes only one parameter, the mapper block, which takes an object and an index parameter. Now, the source array in this case, because this is a category on NSArray, is the receiving object, self.
So the first thing I do is get self's count and that's of course going to be the size of the result array as well since there's a correspondence between them. And I create a temporary C array using malloc of that size to hold the result objects. Well why am I doing this? Why am I creating a temporary C array? Well, the mapper block is going to be invoked concurrently as I said, so in arbitrary order. It's not going to be invoked from index zero on up necessarily.
So because I want the result array to have a correspondence to the source array, I would have to keep track of the order and the index of each result; but because I'm going to use a C array I can just store the new object at the specific index in the C array that the source object came from and you'll see that in a minute.
It also means that I don't have to thread-safely access any higher-level collection. With the C array, I'm only going to touch and modify one element from each invocation of the block, and the block is only executed one time for each index. And because the block is only executed one time for each index, I don't need to do anything further for thread safety. As I mentioned, the mapper block should return an autoreleased object; that's conventional in Cocoa, so we'll try to stick with that kind of convention for my API. And the object is being autoreleased on whatever thread the block happens to run on, because this concurrency is going on across various threads.
Well the enumerateObjectsWithOptions method on NSArray and the other methods as well that we added, take care of putting the autorelease pool in place for you on those various threads that are in use, so it's going to be safe for the mapper to return an autoreleased object-- there's not going to be a leak.
But the problem then will be that I will have to retain that return value of the mapping function temporarily as I store it in my C array, to ensure that that object which is being created potentially on another thread, lives long enough so that I can store it then in my result NSArray. Having created my C array called temp what I'm going to do is call that enumerateObjectsWithOptions method and I'm going to pass in the NSEnumerationConcurrent option, and I'm going to pass in a block with the correct syntax.
For NSArray the correct signature takes an object argument, an integer which is the index, and the BOOL * stop parameter. What I'm going to do in the block that I pass to that enumerate method is call the mapping block that I got as an argument, and pass it the object and the index; that's fairly straightforward; and it's going to do its little computation and return a result. Then I'm going to have to retain, as I just said, that potentially new object to make sure it lives long enough, at least beyond the call to the enumerate method, and then I'm going to store that object at temp[idx].
So I'm going to store it at the specific index of the source object within the temporary C array. The enumerateObjectsWithOptions method is doing a bunch of concurrency within itself, within the execution of that method, because I'm passing in the concurrent option. But the method itself is synchronous; that is, it's not going to return until all that concurrency, all those invocations of the block, have actually finished.
So that will end up being very convenient because once that method returns, I know that all of the concurrency, all the new results have been stuffed into my temporary C array so I don't have to worry about doing any waiting for anything to happen or what have you.
So what I'll do is simply create a new array with arrayWithObjects:count: using that temporary temp buffer; and having created the array, what I have to do then is release all those temporary retains that I talked about, that I put in my block on the objects I stored in that temporary array. So I'm just going to loop, sending each of them release, because they are now retained by that new array. Then I'm going to free my temp buffer. OK, so here is what that looks like.
In the pink or purple or whatever you call that color, is the contents of that block I was talking about. I call the mapping function, I retain its return value to make sure it lives beyond the execution of the enumerateObjectsWithOptions method and then I simply store it in the C array at the proper index.
Then I create my new NSArray with that C array full of objects; I release each of them, as I mentioned, because I need to balance my previous temporary retains; I free the temporary C array; and I return the new result NSArray. That's what that looks like.
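Since the slide code isn't reproduced in the transcript, here is a sketch reconstructing the method as just described; the exact details are assumptions:

    @implementation NSArray (MyMapping)

    - (NSArray *)my_mapUsingBlock:(id (^)(id obj, NSUInteger idx))mapper {
        NSUInteger cnt = [self count];
        // Temporary C array to hold each result at its source index.
        // (This is the retain/release version; under GC, allocate
        // scanned memory instead, e.g. with
        // NSAllocateCollectable(cnt * sizeof(id), NSScannedOption),
        // as noted below.)
        id *temp = (id *)malloc(cnt * sizeof(id));
        [self enumerateObjectsWithOptions:NSEnumerationConcurrent
                               usingBlock:^(id obj, NSUInteger idx, BOOL *stop) {
            // Call the mapper, retain its autoreleased result so it
            // survives until it's stored in the result NSArray, and
            // file it at the source object's index.
            temp[idx] = [mapper(obj, idx) retain];
        }];
        NSArray *result = [NSArray arrayWithObjects:temp count:cnt];
        // Balance the temporary retains; the new array owns the
        // objects now.
        for (NSUInteger i = 0; i < cnt; i++) {
            [temp[i] release];
        }
        free(temp);
        return result;
    }

    @end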
Now of course, I'll just mention in passing, because I used malloc, I allocated unscanned memory for that temporary C array. So this is the retain/release version of this method. Under Garbage Collection you need to allocate scanned memory for that temporary C array, so the garbage collector doesn't come along and collect those result objects from the mapper block; retain does nothing under Garbage Collection, and if it's unscanned memory the collector won't see the objects there in temp. So let's move on quickly to Concurrent Sorting.
We've added some new methods on NSArray and NSMutableArray that allow you to sort using a block, so that's one benefit. We're taking advantage again of the new block syntax, which is extremely convenient for these sorts of things where you just need a tiny bit of code to do a comparison.
You pass in the NSSortConcurrent option to enable concurrent sorting. The comparator block looks like what you might expect: you have to return an NSComparisonResult, one of those enumerated values, from the block, and the block gets as its parameters the two objects to be compared. Now, if you're going to use the NSSortConcurrent option, your block of course has to be concurrency-safe. One other thing I want to mention here, I guess this is the last bullet: there's also an NSSortStable option that we've added in Snow Leopard, so you can get stable sorting out of this.
But just because you're using concurrent sorting doesn't mean you can't also get stable sorting at the same time. Concurrent sorting is of course arbitrarily sort of invoking the comparator function across different pairs of objects, but stable sorting is also available at the same time. So you can combine these two options and still get concurrent yet stable sorting. So what does this look like in a graph? A graph of course being an easy way to visualize all this.
Well, what we have here is a little program I wrote to sort an array of random, fairly short strings, about 20 characters each. Doing a standard nonconcurrent sort we see a line like this; the specific values aren't really material, just how the different curves I'm going to show relate to one another.
Using concurrent sort with two Cores on the same machine, we see a curve like this, much lower. And on 8 Cores we see a lower curve yet. So basically what this means is that the sorting is faster on more Cores, or the latency of finishing the sort operation is less, if you want to look at it that way.
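As a sketch, here's the block-based sort with both options combined; the array of strings is hypothetical:

    NSArray *sorted = [strings sortedArrayWithOptions:NSSortConcurrent | NSSortStable
                                      usingComparator:^NSComparisonResult(id a, id b) {
        // May be invoked across arbitrary pairs, possibly concurrently,
        // so the comparator must be concurrency-safe; stability is
        // still preserved because NSSortStable is specified.
        return [a compare:b];
    }];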
Now let's go on to NSNotificationCenter. In Snow Leopard we've added another new API, and you would have seen this in the What's New talk yesterday, called addObserverForName:object:queue:usingBlock:. Now, if you recall, the traditional API for adding an observer takes a target object, which is the observer object, and a selector to invoke. Instead, this new method takes advantage of the new block syntax we've added, and you can specify essentially an observer block, a block to be invoked.
So now you don't have to define some private method or some method on some class somewhere to be invoked. You can simply put all the code you want to run when the notification happens or is posted, into that block. The other difference that I'll point out at this point is that the return value of this new method is id whereas the traditional method returns void; and I'll explain what the significance of that is in a second. The third parameter is also different-- that queue parameter there.
What does that mean? Well, if you pass a non-nil queue for that parameter, the block will be executed in the context of that operation queue. And so this can be a convenient way to get the handling of a notification onto the context of a particular operation queue.
The most obvious example would be: if you use the main queue as the queue parameter here, you can get the notification handled on the main thread. If there are multiple observers registered via this method for the same notification about the same object, those observer blocks are all, where possible, executed concurrently; that is, at the same time. So all the observers are potentially handling the notification at the same time, but the posting, that is, the thread or the queue or whatever code is posting the notification, is still synchronous.
That is, the posting operation still blocks, waiting for all the observers, wherever they're going to be executed, to have finished their handling of the notification before the posting method returns. This of course is crucial in the case where you have a notification like NSWindowWillClose. If the observers were simply processed asynchronously at some later point and the posting were allowed to continue, the posting would occur and the window would get closed, and potentially an hour later your observer might get executed, saying hey, the window's about to close, but of course it's long since closed at that point.
So the posting still has to be synchronous. If you don't want to specify a queue, you can pass in nil and the handler block there will be executed on the posting thread just as is traditional with the old style addObserver method, where the observers were all handled, called synchronously on the thread which is doing the posting.
The other thing is that return value as I mentioned. That return value can be thought of as the new observer if you want, but basically it has to be retained and held on to as long as you want this notification registration to exist. When you want to remove the observer; that is, stop handling this notification, you need to pass this return object to the removeObserver method as its parameter. So essentially this return object is like the observer.
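Here's a sketch putting that together; the window and the choice of notification are hypothetical:

    id token = [[NSNotificationCenter defaultCenter]
        addObserverForName:NSWindowWillCloseNotification
                    object:myWindow             // hypothetical window
                     queue:[NSOperationQueue mainQueue]
                usingBlock:^(NSNotification *note) {
            // Handled on the main thread, because mainQueue was given.
            NSLog(@"window closing: %@", [note object]);
        }];
    // Retain the token for as long as the registration should live;
    // later, to stop observing, hand it back:
    [[NSNotificationCenter defaultCenter] removeObserver:token];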
Well, let's take a little break here and move on up to the AppKit. In AppKit in Snow Leopard we added two new major pieces of functionality that have to do with concurrency, the first of which is Concurrent Document Opening. NSDocument, as you may have seen alluded to yesterday in the What's New talk, can now open files concurrently, and TextEdit does this for most document types. The Sketch example is one you can look at as well; Sketch has been modified to do concurrent document opening.
Enabling concurrent document opening is a fairly simple process. The first step is the hardest, of course: it's to make your document reading code concurrency-safe. If your document reading code is doing things like modifying global data, or even just reading or accessing global data which might be modified by some other thread, you of course have to make that thread-safe. Once you've done that, you simply override the canConcurrentlyReadDocumentsOfType: method in your NSDocument subclass.
This is a class method that returns a BOOL. It gets the type of the document, and for those types you recognize as ones you're able to concurrently open, you simply return YES. Once you've done that, documents of those types may then be loaded in the background.
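A sketch for an NSDocument subclass; the document type string and the parsing helper are hypothetical:

    + (BOOL)canConcurrentlyReadDocumentsOfType:(NSString *)typeName {
        // Return YES only for types whose reading code is
        // concurrency-safe.
        return [typeName isEqualToString:@"MyDocumentType"];
    }

    - (BOOL)readFromData:(NSData *)data ofType:(NSString *)typeName
                   error:(NSError **)outError {
        // May now run on a non-main thread: read the data and build the
        // model objects only; no UI work here.
        BOOL ok = [self buildModelFromData:data];   // hypothetical parser
        if (!ok && outError) {
            // On failure, fill in the out parameter; don't put up an
            // alert panel from here.
            *outError = [NSError errorWithDomain:NSCocoaErrorDomain
                                            code:NSFileReadCorruptFileError
                                        userInfo:nil];
        }
        return ok;
    }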
The AppKit of course does not promise to load them in the background. It may not load them in the background for reasons of its own. The practical impact that you should be aware of, of course once you've done this is that some NSDocument Controller and some NSDocument methods will be invoked on nonmain threads. "Nonmain" isn't a word.
So another thing you should be doing, of course, and it's called out with its own bullet here, is not doing UI from your document reading code; that's part of making it concurrency-safe. If your document reading code ends up actually being invoked on another thread, it's not being invoked on the main thread, and so you shouldn't be doing UI from the document reading. You should only read the data in the file or files and create the model objects that represent the contents of your document.
If there is a failure, just create an NSError and stuff it in that out parameter to the various methods. Don't try to put up an alert panel from within your document reading code, because that document reading code is not necessarily running on the main thread. Another thing is you should disable undo registration during your document reading.
Just in general, it's not particularly interesting to allow a user to undo the opening of a document; they can simply close it. But it matters particularly in the case of concurrent document opening, because undo is essentially a UI facility, modifying the Undo and Redo menu items and so on, and basically tracking what the user does to the documents as they're opened.
You should disable undo registration while you are doing the document reading so that undo items don't end up all scrambled, as they would be if there are many documents being opened at the same time; the user would just be totally confused about what's going on. Then we have Concurrent Drawing.
This is a new facility on NSView that's been added in Snow Leopard. Views can be set so that they can be drawn concurrently; again, this is an advisory state. You simply call the setCanDrawConcurrently: method and pass it the Boolean value YES; and of course the operative word is "can".
This is just indicating to the AppKit that it can, if it wants to, draw the view that receives this message concurrently with the drawing of other views when display happens. Subviews don't inherit this property from their superview, so each individual view that you want to enable concurrent drawing on has to have this set on it. Once you've done that, a view may be drawn on a background thread concurrently with other views which haven't had this set on them.
Those will still be drawn on the main thread. The point, of course, is to produce a performance improvement. If you have multiple slowly drawing views within a window, then you may see a performance improvement if those slow views can draw concurrently with respect to one another. Views not set to draw concurrently, as I said, still draw on the main thread for backwards compatibility; and display, the whole display process, is still done on the main thread as it always has been, and is still synchronous.
That is, display, the display process does not return until all of the drawing, if there's any concurrent drawing, has finished. But it also means that any other activity other than the display that wants to happen on the main thread is held up while the display is going on.
So suppose your model objects which you are displaying can only be modified by a user input event, like a button click on an Add button, that kind of thing. User input events are blocked while display is going on, and so your model objects aren't going to be changed while display is going on in that case. And so the thread-safety burden on your model does not change just because you've enabled concurrent drawing. If it was already possible for your model objects to change on other threads, you already had to deal with making your model accesses thread-safe, regardless of concurrent drawing.
That's the general rule. So what do you do to enable concurrent drawing? Well, first you measure the draw timing. You should always measure first, so you can tell where you've ended up once you've made the change. And the point is to find those custom views of yours which are the expensive views.
If those views have side effects during their drawing, for example they're incrementing a global counter or modifying some data in their drawing, which is maybe a little suspect to begin with, but if you're doing that, then you need to make the drawing concurrency-safe. Then you mark those views that you want to try concurrent drawing on as being able to draw concurrently; and then you measure again. Now, drawing views concurrently adds some overhead, because some graphics state and whatnot has to be replicated across different contexts, which normally doesn't have to happen. And so you may not see a performance improvement, so you should always measure.
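The marking step itself is just this, as a sketch (the views are hypothetical):

    // Mark the expensive custom views found by measuring; subviews do
    // not inherit this, so set it on each view individually.
    [chartView setCanDrawConcurrently:YES];
    [previewView setCanDrawConcurrently:YES];
    // Then measure again; if there's no improvement, leave it off.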
If you don't see a performance improvement, don't turn this on. There are a couple of caveats that I want to point out with respect to Snow Leopard specifically. Layer-backed views are not drawn concurrently in Snow Leopard; and overlapping sibling views are not handled correctly. That is, if you turn on concurrent drawing for one or more overlapping sibling views, not subviews of one another but sibling views in the view hierarchy, you may see some drawing artifacts.
And as I alluded to earlier, this is mostly an option for your custom views. The AppKit's own views and controls have not yet been fully vetted to be concurrency-safe; and so this is an option at this point where you should just be looking at your own custom views for turning this on.
You should absolutely go and see the AppKit release notes for more information. I mentioned you should measure draw timing. Well how do you do that? Well the AppKit release notes have some helpful tips on how to do that and also some debugging tips for debugging potential issues with concurrent drawing.
So let's wrap up. For more information I'll point you to our Developer Tools and Performance Evangelist, Michael Jurewitz. You should absolutely go look at the Concurrency Programming Guide; this is a new document that we've just added with Snow Leopard, and there is a link to it on this session's page on the attendee site. You can also get the AppKit and Foundation release notes, I think, from there; I haven't checked that specifically, but they should also be available in Xcode or from Xcode.