Application Technologies • 1:03:59
Multithreading can boost the performance and responsiveness of your application. Learn how to use multithreading effectively in Cocoa to get the most out of dual-processor Macs.
Speaker: Chris Kane
Unlisted on Apple Developer site
Transcript
This transcript was generated using Whisper; it may contain transcription errors.
I'm Chris Kane. I'm a software engineer on the Cocoa team. You, hopefully, are here to hear about something to do with multithreading in Cocoa. I'm going to be talking a little bit today about a few general multithreading topics. And then I'll be getting into some of the Cocoa APIs that we provide for multithreading and to support you in your multithreading.
And I'll also be covering some of the issues that arise. I'm going to have to assume, though, because I'm very time constrained, that you come to the talk here with some background, that you have heard of locks before, you know roughly what a thread is. Perhaps you took a course in college or have read something about it in the meantime or have experience with Java or some other programming system in which you use threading. So I'm going to have to go very quickly and just touch very briefly on the issues that I'm going to try to address. There's a lot of depth in this topic, and in the one talk that I have here, I cannot hope to cover anything in any real depth. Now, another thing I want to note right up front is that I'm not going to be talking about any particular processor architecture's behavior or how you do things on a particular processor architecture. I'm going to be talking in general terms about how processor architectures work, not about the details of, say, how the Intel processor works or how the PowerPC processor works.
Well, of course, at the beginning, we have to give our definitions. And so here's my definition of what is multithreading. When I talk about multithreading, I'm talking about executing multiple chunks of code simultaneously in your program. Now, on a single processor machine, you can do this by using the preemptive multitasking that the operating system provides, and so it's simulated concurrency. But as we move forward, we're going to be seeing a lot more true concurrency from these dual-processor, now quad-processor machines that we've been shipping.
Now, another way to achieve multithreading is, of course, to spawn other processes. But I'm not going to be talking about that. Of course, the user is running many apps while they're logged in and using the system. And each of those apps has their own set of threads and is doing its own thing. I'm not going to be talking about concurrency or multithreading via separate processes, but rather just about threads within the same process.
Now, the traditional approach, or the traditional reason for multi-threading was to avoid blocking the event loop. That is, if you had a long-running task that you wanted to do, say when the user hit a certain button or whatever, you could spawn off another thread, do that long-running processing in the background, and remain responsive to the user as the user continued to fiddle with your UI.
That was a traditional reason for multithreading. Sometimes multithreading is useful because it can simplify the code. Another approach to achieving apparent concurrency, I'll call it, to the user is to use asynchronous APIs and register to receive callbacks when things happen and so on. But that can cause you to have to chunk up the logic of your program across several different types of callback functions and whatnot. And so sometimes multithreading can simplify code. But the main reason that we're going to be seeing now and going into the future, it seems, is to achieve performance improvement.
There was a time when machines ran at 100 megahertz. And then the machines-- the next generation of machines came out and they ran at 150 megahertz. And so the existing applications got a little performance boost out of that. And eventually, there were 500 megahertz machines, and gigahertz machines, and 3 gigahertz machines, and so on. And the existing apps, the apps which didn't change, I mean, were getting performance improvements for free by way of the processor speed increasing.
What we see, it looks like going into the future, is that processors are going to be gaining multiple cores. And in order to take advantage of the multiple cores, you have to use threads. Because threads are the way that the operating system allows you to address each core, if you will. And so to put work on a core, you have to use multiple threads. I'm going to do a quick demo here. Now this demo, if we can go to the demo machine, oh, we already have.
All I've set up here is a very trivial app. And what I have is a bunch of work to be done on a bunch of images, an array of images I have. And the work is-- maybe it's some sort of warp or some sort of color changing operation to each image that I'm going to do. The details don't matter. What I have here is the images in the left column. What I want to point out is that in the parentheses, I've put an estimate, based on the size of each image, of the amount of work it's going to take.
So first what I'm going to do, I've got a one up there in that top box. And what that top box is, is the number of threads that I'm going to use to process these images. So let me go ahead, I'm going to start this up, and of course it begins chunking away. As each image is finished, it shows up in the rightmost column.
Now at the same time now, I'm going to show what happens if I use instead three threads. And of course the obvious thing happens. I'm doing things three times, well maybe roughly three times faster. And of course using three threads, I can process the array that much faster.
Now, in some sense, I've faked up this demo. And I'm not really doing a lot of serious work underneath the covers. But what this allows me to do is I'm going to pretend I have 10 cores and spawn off 10 different threads to process these images and, of course, bing. There we go. The point is just that the way you got processor speedup in the past, with increasing clock speed, is going to change. If we can go back to the slides: we're going into a world in which multiple threads are going to be a key tool to get performance improvement in your applications. Now, unfortunately, not everything works out just so nicely. You can't just create threads and go do the work. There are many problems that come about when you start to do multithreading.
By and large, most of them devolve down to one key issue: changes to shared data by multiple threads end up being the cause of most multithreading issues. Now, there are actually two flavors of this. The first is the obvious one, and hopefully, since I'm assuming here that you have some experience, you've run into this one or heard about it before. If you are changing data in multiple threads at the same time, of course, you can produce invalid data, invalid results, invalid states in the data. That is, the threads can conflict with one another in producing their changes.
The second problem is a little less familiar, I think, to many people, and that is that changes made by one thread, let's say it's a thread running on one particular core, may not be visible to other threads running potentially on other processor cores. And this comes about because processor architectures have various layers of memory caches going back to main memory. And so from the point of view of a single processor, it writes out its information, say, into the L1 cache, and maybe that goes to an L2 cache, and eventually it proceeds out to main memory. But threads on other cores do not necessarily see those changes right away. You have to, on most processor architectures, include some additional assembly instructions or do some additional steps in order to inform the other cores that, say, their L1 cache for that particular address which you've changed is now invalid. And so there's a class of problems called visibility problems, where changes made by one thread may not necessarily be seen by other threads without additional action. Let me illustrate the first problem: simultaneous changes can produce conflicting results. So we're going to start with an int in memory, and both threads in our example are going to increment the variable. So the first processor, the first core, is going to load the value from memory into the processor. It's going to increment it and then write it back out. Simultaneously, the second one is going to start and do the same thing. So when the first core goes to write out the value back into memory, the second thread, on a second core, for example, is still incrementing the value.
The second thread gets done with its increment and writes its incremented value out to memory. And what's happened? Well, the change that thread one made has been lost. This is a classic example. B only got incremented once, which in some sense could be an invalid state in your data. You've lost the work that the first thread tried to do.
Let me show quickly an example of a visibility hazard. So we start with the same integer in memory. And both threads, we're going to assume, see the value as 5 out in memory. The first thread is going to increment the value, and it writes 6 out to memory. At some later point, perhaps it's briefly later, or it might be quite a bit later, quite a bit in terms of CPU time at least, the second thread is going to come along, and it's going to load the value at that address, that B value. But it may only load it from, say, its L1 cache, and it's going to print that value out. But it prints out 5 because it still sees the value 5. The first thread on the first processor has not broadcast the information that it's changed B out to the other processors. And so that is the source of the visibility problem in this particular case.
So because there are these problems that come about when you try to execute code at the same time, this notion of thread safety is introduced. And for this particular purpose, I'm focusing on intrinsic thread safety. So what I mean by intrinsic thread safety here is that a caller does not have to worry about the activities of that API, for example, or that functionality, that feature.
when the caller is using that functionality on multiple threads. That is, the caller does not have to take any additional action, and that is a piece of functionality, a piece of software, which is intrinsically thread-safe. It's, in some sense, built-in. The thread safety is built-in. Now, typically, though, intrinsic thread safety does not protect compound operations, so that if you make one call, and then you make another call, and then you make another call into the same functionality, its intrinsic thread safety does not necessarily provide safety across all of those separate calls.
That is, the first call may happen safely, then the second call may happen safely, but things may change in between those calls. Now the Cocoa documentation talks about what things are intrinsically thread safe, and I'll be mentioning this again a little bit later. Many things in Cocoa are not intrinsically thread-safe, but most can be used in a thread-safe way by taking additional action. So, to review, the main problem turns out to be that changes to shared data by multiple threads cause visibility problems and conflicts between the changes.
Now, how do you avoid these problems? Well, if the main problem is multiple threads changing shared data, causing conflicts or visibility issues, well, you can choose not to share data among threads. If all your threads are looking at different sets of data, then there's no problem. Or conversely, if you don't change any data, if you don't change memory, then again, there's no issue.
In practice, of course, this is not realistic. And most actual functional types of solutions devolve-- again, here we have two types. The first is you can change the way that the code or data is structured. And the second is you can augment the code with more code. So let me look at the first of those. Well, if you have data and it's immutable, say you're using an immutable collection or you're simply treating it as immutable, as constant, then, of course, there's no issue, because you're not changing data. And changing shared data is one of the contributing factors to most threading issues.
The second approach that I'm going to talk about, at least, is thread confinement. Thread confinement means only one thread is going to look at or use or change a particular value. The most trivial example is, of course, all the local variables on the stack, which are generally confined to that particular thread. You don't have to worry about the thread safety of writing and reading local variables. The second type of thread confinement is thread-specific data. Most thread packages offer some sort of thread-specific data, and of course, Cocoa does as well.
Another approach is to use dedicated threads. Suppose you have a very complex data structure, and it would be very complicated, perhaps, to try to allow multiple threads to access that complicated data structure thread safely. Well, what you can do is dedicate a thread to be responsible for accessing that data structure, funnel all requests to process or work with that data structure through that thread, and have that thread actually do the work and look at that data structure.
So if you have a dedicated thread, of course, you may need to communicate with it. And so another form of changing how your data or your code is structured is, of course, to add communication between threads. Now, one has to be a little careful with thread confinement because you have to be careful not to publish to global data structures. You can publish, say, something to a global data structure by assigning it to a static global in a source file.
Or there are certainly global structures, like NSNotificationCenter. The notification center is global to the entire process, not to any particular thread. And so when you put an observer in the notification center, you've published that object to all the threads, and you've lost any hope of thread confinement for that particular object.
Object confinement is another approach. This is one where you have, perhaps again, a complex data structure that you don't want to get into trying to make thread safe itself. What you can do is create a facade type object, a front object, some sort of proxy through which you funnel all your requests to that complex data structure. So you create an object which acts as a front And you can make that object thread safe without necessarily making a whole big gnarly object graph behind it thread safe. And there are also other special purpose data structures and various sorts of lockless techniques. I don't have nearly enough time to go into those kind of issues. So I'm going to gloss over that.
The second general approach to making something thread safe, I should say, is to augment the code. And this is the usual technique that most of you would be, I hope, familiar with coming into this talk. You add locks around the code that is doing the, shall we say, dangerous operation. So locks are one form. Condition objects are another way to protect data, to protect chunks of code from executing simultaneously. There are also atomic operations or barrier instructions that one can insert.
And the nice thing about these is that they take care of both of the issues, both the change conflict type issues and the visibility issues. That is, if you use a lock, the operating system has -- the people working on the operating system, I should say -- have written the locks so that both the visibility issues are addressed and the change conflict issues are addressed.
Now, once one solves the threading issues with the simple changes, one might introduce secondary problems. And so there are whole classes of secondary issues that come about by trying to make things thread safe. Deadlock and livelock are classic examples. Performance problems can occur. One in particular would be if many threads are trying to get access to a particular data structure; you often run into issues like lock contention, where if only one thing can change the data structure at a time, all the other threads are blocked waiting, trying to get access to that data structure, and you've possibly lost some of the performance improvement you could have had if there was no contention. Another example is execution environment coupling. And I want to call this out because it's a little subtle, and most people don't think about this kind of issue.
It comes about when you use per-thread data. If you use per-thread data and you very carefully set up a data structure on a particular thread in a particular way, and the code then is going to come along later and use that per-thread data structure, what happens is that the code now is coupled, is bound in some sense, to that very carefully initialized data structure. And so that code can't just be arbitrarily run on another thread. Another thread will have a different copy of that per-thread data, if it has any at all. It doesn't necessarily have the copy that exists on the thread the code was supposed to run on. And so this, being execution environment coupling, limits your ability to move code around and run code on new threads when you're trying to take advantage of multithreading. A classic example of per-thread data that runs into this is, of course, the run loop system, where you put sources in the run loop, and the run loop is a per-thread object, and then you try to run some code on a different thread, and those sources aren't registered with the run loop on that thread.
So those are some of the general issues. I'm going to get into a little bit more about Cocoa now. And I'm going to cover some NSThread APIs and some of the lock APIs that we have and some of the ways that you can communicate between threads. Cocoa also has immutable objects like immutable arrays and immutable data structures and so on. I'm not going to talk any more about that, but of course, as I said before, using immutable objects is a way of achieving a form of thread safety, because immutable objects can't be changed, and so, in some sense, you know you're safe.
I'm also going to be talking about NSOperation and NSOperationQueue, and you've probably seen references to those before if you heard Simon Patience on Monday talk about them. And, of course, the standard POSIX pthread API, a C API, is also available to Cocoa programmers, as well as some functions provided by the operating system in the OSAtomic.h header. I'm not going to be talking about that functionality in any more detail in this talk.
So, NSThread. Well, NSThread is the way, of course, that you create new threads, and threads in general are the way that you separate work to be done by different cores. Now, you can create a new thread using the detachNewThreadSelector:toTarget:withObject: method, and that was the traditional way. In Leopard, we now allow you to create NSThread objects without creating the underlying operating system thread. And what this allows you to do is create threads which aren't yet running. When you want the thread to run, you use the thread method start. And that creates the underlying operating system thread and starts the thread off doing its work. Now, what is the thread doing? The thread has called the main method of NSThread. This is the point where, if you're subclassing NSThread, you would override main and in main do the work that you want the thread to do. If you were doing a dedicated thread, for example, you would probably approach that by subclassing NSThread and overriding main to do the body of the work. When the start method is called, what it does is it creates a new underlying operating system thread and causes the main method to be invoked in the context of that new thread. If you have an existing method on an existing object, we also have a convenience method on NSThread now, where you can sort of wrap an NSThread around that method on that object and have that method be your thread body, the work that the thread is doing.
We've added a few more features to NSThread, like cancellation. But this isn't cancellation in the sense of POSIX pthread cancellation. Rather, all it is is an advisory state, a Boolean, which you can set by calling the cancel method on a thread. And what's supposed to happen is that the code running on the thread can poll the cancellation state once in a while, at safe points presumably, and decide that it should shut itself down. So it's a way of requesting that the thread cancel itself, that the work being done by main stop.
We've added other state like isExecuting and isFinished, so you can find out what the threads are doing. And NSThread objects are going to be KVO compliant. I don't think they're fully KVO compliant in the Leopard seed that you have, but the intent is that threads will be KVO compliant, and so you can hook them up with bindings and, say, observe an array of threads if you wish, and perhaps even display a UI like the activity viewer in Mail to show what the threads are doing.
Now, of course, Cocoa also has locks. These are lock classes wrapping the underlying POSIX pthread implementation. The Objective-C language also offers the @synchronized directive, which creates a block which is synchronized on the object that you give the @synchronized directive. And all this means is that the lock is essentially the object itself, or the object acts like the lock, perhaps, would be a better way to put that.
And again, I can't talk about these things in any detail. You have to go to the documentation to read more about them. Condition objects wrap the underlying POSIX pthread condition objects. And we have a new one that we've exposed called NSCondition in Leopard, but it existed in Tiger and all previous releases as well; it simply wasn't exposed as API. NSCondition, since it's new, I'll talk about it a little bit: it's a more powerful, I would say, way of using condition objects than NSConditionLock was. You can do more complicated predicates using NSCondition. But again, I can't talk about these things in any detail because I'm very time constrained. The clock is ticking here.
I talked about dedicated threads, and one thing we have to support use of dedicated threads is the performSelector:onThread:withObject:waitUntilDone: method. So you can send this message to any object and tell that object to, in the context of the given thread, perform that method. This is a very powerful way to communicate information between threads. It's also a powerful way, if you have a dedicated thread, to give that dedicated thread work to do.
The work is the selector, the method that you're telling it to invoke, and of course the object you send this message to is the implicit receiver of that message as well. And so you can give that dedicated thread work to do by using performSelector:onThread:, as well as the more typical use, which would be to just communicate information. The performSelectorOnMainThread: method that has existed since Panther at least, I forget if we added it to Jaguar or if it was Panther, is now sort of a special case of this where the thread parameter is simply the main thread object.
Now, what is one way in which you can use this perform selector mechanism? There's something I like to call the receptionist pattern, where you create objects that act as a proxy or a front for a real intended receiver. And this is useful when you have an object you want to message from the context of some thread, but it isn't safe for that object to actually receive and invoke that method on that or any particular thread. For example, if you have an object which is only safe to use on the main thread, you can use something like a receptionist to get the information over to the main thread. So what the receptionist does is it's simply an object that records the messages that it receives and arranges to have them delivered on the main thread instead. So for example, I have a setTitle: method here that I implemented that I want to invoke on the real destination object. Maybe it's a window object. What I do is I implement setTitle: on the receptionist object class. And what the receptionist simply does is it turns around and tells its real target, which I've had to initialize the receptionist with, to perform setTitle: on the main thread. So this is fairly straightforward.
What you would do then is use the receptionist object in place of the real object whenever you needed to address some message to, of course, the real object. Now sometimes you have methods which have more complicated parameters than a simple object. And you may need then to create a memo object, which is just a data-bearing object that remembers all the arguments. In this particular example, I'm creating a memo object where I'm passing the id and the int, the two arguments to the complex method, to the memo for it to remember. And instead of the real target, I'm going to tell the memo to dispatch itself on the main thread, in this particular case, with the given target. And what will happen is that on the main thread at some point, the dispatchToTarget: method will be called on the little memo object.
And all the memo object does is call the complex method then on the real target that I intended to receive the message. And so that has gotten the invocation of the complex method over to the main thread. Of course, the logical extension of this is you can have a receptionist object which is actually a proxy. It doesn't implement any methods. And NSInvocation would be your memo object: the NSInvocation in the forwardInvocation: method becomes the memo. And you can tell the invocation to invoke itself over on the main thread by using performSelectorOnMainThread:.
Now, how do you go about making work concurrent? Let me take a little sidebar here. Well, the traditional approach was to do several completely different things. That was a typical approach. In the future, we're going to be seeing more of what I've highlighted in the second bullet, that is decomposing a chunk of work, a large chunk of work, into multiple pieces.
The simplest type of example would be where you have an array of the same type of object, as in my demo program. I had an array of images, an array where you want to do the same thing to every object in the array. That's a very simple approach. And there are many systems, like OpenMP and so on, that some of you may be familiar with, that help facilitate splitting up work over an array of something across multiple threads.
Another thing, another type of decomposition is where you have a big thing that you want to work on, is sort of a logically single unit where you can split it up in some way and then successfully merge the result back together. For example, you might have a large image which you want to color process in some way, And perhaps it's possible for you to split the image in half, do the processing of the two halves on separate threads, and then stitch the results back together. Of course, if what you're doing is rotating the image or applying some very complex mesh warp algorithm or whatever it happens to be, then it may be nearly impossible to do that on two different threads and successfully stitch the results back together. So of course, this is a technique that only applies in some cases. One case would be, for example, sorting an array. If you have a large array and you want to sort it, well, sorting is sort of a logically single operation. That is, you're operating on all the elements of the array. But you can divide an array up into two pieces, sort the two halves, and apply a merge sort type merge operation to the two sorted arrays to produce your final array. So there's a way to go about doing that. Oops.
In Leopard, we've added an abstraction for a task called NSOperation. And what I mean by task is just a bit of work to do, not task in the sense of a new process. The intent here is to offer an abstraction and a way, an approach, perhaps, to decomposing your code, to structuring your code. So the intent is to help you design your programs. Operations can themselves be concurrent or not. That's a key element. An operation which is concurrent does its own threading, but operations do not have to be concurrent. They can just be straight-line pieces of code without doing any particular threading at all.
Operations also introduce a concept of readiness. In the base class, NSOperation, an operation is ready when all of its dependencies, all of the operations it depends on, have finished. So operations can depend on other operations. Of course, the classic example of this would be linking. When you hit the Xcode hammer button and it goes off and builds things, well, it can't do the link until it's done all the compile steps. And so the link step depends on all the compile steps having finished before it can begin its work.
Another point I should make, of course, is that NSOperations are KVO compliant. So you can observe them through key-value observing and also bind to them and display them in the UI. And NSOperationQueue we introduced in Leopard to allow you to apply some form of flow control, basically, to your execution of operations.
So you may have, for example, 1,000 operations. The user hits a button, and now there's 1,000 operations that need to be invoked. And of course, the Xcode example applies here as well. You hit Build, and all your source files need to be compiled. Well, obviously, somebody at some point tried that, spawned off 1,000 GCC processes and built them all simultaneously and then did the link step when they were all done. And they tried that, and it was an abysmal failure because... Well, there's any number of reasons, of course. But, you know, if each GCC is using, say, just 50 megabytes and you spawn off 1,000 of them, well, now you have 50,000 megabytes of RAM that are desired by all the processes running.
So, you know, it just doesn't work to spawn off and run all the GCCs at the same time. So what Xcode does, of course, is spawns off two at a time if you're on a dual core machine or four at a time for a four core machine. And that's the rule of thumb they've chosen for applying concurrency.
What NSOperationQueue does is let you tune the amount of concurrency you get in execution of the operations that you put in the queue. And so if you want two at a time to run, you can turn the knob and set two as the amount of concurrency, or four at a time, and so on. And what happens is that operations that are put in the queue that are ready are started as previous operations finish. And so what the operation queue is doing is churning through the operations for you, and you can sort of turn it loose and let it do its thing once you've put in all the operations that you have. And operation queue is also KVO compliant, and so you can bind up UI like a table view to it and display all the operations that are going on in the background, much again like the activity viewer in Mail.
So let me give you a simple example. This example comes from the demo. I have a warp-an-image operation, which I'm going to apply over an array of images. So for each image, I'm warping it, and I'm adding the resulting new image to the new-images array.
What I'm going to do is create a new class called ImageWarp to abstract the loop, basically. I'm going to make it a non-concurrent subclass of NSOperation, so I'm not going to worry about doing the threading myself. When a non-concurrent operation is discovered in the queue, NSOperationQueue creates the thread for the operation and runs the operation in the context of that new temporary thread.
I'm also going to move that loop to a new class method on the ImageWarp class, which I'll call processImages. For one thing, this move reduces the one, two, three, six lines above down to one line, but it also allows me to do something different in that loop. I'm embedding the logic of the loop off in a method so I can change it in the future if I want to.
What does processImages look like, then? Well, I'm going to create an operation queue, fairly straightforward. I'm going to create an array to receive all the new images, like I did before. And for each image, I'm going to create instead an ImageWarp operation object. I'm going to tell it which image it's supposed to process and the array to which it's supposed to add the resulting image. Then I'm going to add that operation to the operation queue. As soon as I begin adding operations to the queue, it's going to start churning on them right away. Meanwhile, my for loop will finish, and I'll have put all the operations in the queue. I'm going to wait until all the operations are finished, release the queue, and return the array of new images. Now, of course, there was a call to warp an image in my previous loop. Well, that moves to the main method of NSOperation. So the body of the loop becomes the core of the operation.
Fairly obvious. What the main method is going to do is warp the image that it was given when I initialized the object, and add the result to the result array that I gave it when I initialized the operation object. Now, of course, many operation objects are poking at this array, so I have to add some synchronization around that to protect it.
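The shape of that example, an operation object that holds its input and appends to a shared, lock-protected results array, can be sketched as follows. This is a language-neutral Python sketch with hypothetical names (`warp_image`, `WarpOperation`), not the actual demo code:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def warp_image(image):
    # Stand-in for the real image-warping work.
    return "warped-" + image

class WarpOperation:
    """A non-concurrent 'operation': holds its input image and a
    reference to the shared array receiving the results."""
    def __init__(self, image, results, lock):
        self.image, self.results, self.lock = image, results, lock

    def main(self):
        new_image = warp_image(self.image)
        # Many operations poke at the shared array, so the append
        # needs synchronization around it.
        with self.lock:
            self.results.append(new_image)

def process_images(images):
    results, lock = [], threading.Lock()
    ops = [WarpOperation(img, results, lock) for img in images]
    with ThreadPoolExecutor() as queue:
        futures = [queue.submit(op.main) for op in ops]
        for f in futures:
            f.result()   # "wait until all the operations are finished"
    return results

new_images = process_images(["a", "b", "c"])
```

Note the results arrive in whatever order the operations happen to finish, which is why the shared array is the thing being protected rather than ordered.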
Now, NSOperation: we're introducing it as a tool to help you structure your code. It's not the be-all and end-all of threading. There are many approaches to doing multithreading, many abstractions one can use to assist in writing or decomposing programs and making them concurrent. And of course, you can still use threads as you have before; we haven't changed that. One thing I'm going to point out, though, is that an operation queue is a place where you put an operation to happen later. So if you want something to occur right away, say you want to start an animation, you wouldn't necessarily create an operation for that animation (by which I mean structure your code into an operation) and do that operation later by putting it in an operation queue. You probably want the animation to start right away.
Operations then don't have to be used in the context of a queue. You can call start on one right away. If you put it in the queue, of course, at some later point, the queue will get to it and call start and start that operation. But you don't have to do that.
So the key point I'm trying to make here is that NSOperation and NSOperationQueue are just one way to approach task decomposition within your programs. Now, I've used this phrase, task decomposition. Maybe some of you have already started thinking about this: well, I have an array that I want to process, or I have some work I want to do. A few questions come up as you try to do this. How small should the pieces of subwork be? How many threads should I have? Well, unfortunately, these are very complex issues, and there are no simple answers that I can just give you here on stage. It very much depends on what's going on and what the individual threads are going to be doing. For example, if threads are doing a lot of I/O, they're going to be spending time blocked, and so you could potentially create more threads to keep the, say, two or four cores saturated with work. Whereas if the threads are very compute-bound, say they're all computing pi or e or some other of your favorite numbers, then you don't want to create a lot more threads than the number of cores that are on the machine.
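That I/O-bound versus compute-bound distinction can be made concrete as a very rough illustration only; the multiplier here is an assumption for the sketch, not a recommendation from the talk:

```python
import os

def suggested_workers(io_bound, cores=None):
    # Crude heuristic only: compute-bound work gains little beyond the
    # core count, while I/O-bound threads spend much of their time
    # blocked, so more of them can keep the cores saturated. The 4x
    # factor is an arbitrary illustrative choice.
    cores = cores or os.cpu_count() or 1
    return cores * 4 if io_bound else cores

compute_workers = suggested_workers(io_bound=False, cores=4)
io_workers = suggested_workers(io_bound=True, cores=4)
```

As the talk goes on to say, any such number is workload- and machine-dependent, so treat this as a starting point for measurement, not an answer.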
So I could give you a rule of thumb for splitting up the pieces, for how small the work should be. I could give you an example, like: don't make the work that's going to be done by that thread less than 10 microseconds of computation. And there's your answer.
But 10 microseconds: how do I know how long this work is going to take before I actually do it? That's one problem. And if it takes 10 microseconds or 100 microseconds on this machine, well, it might take a completely different amount of time on that other machine. So there's no simple answer that applies across various architectures. And of course, as machines speed up in the future, all your answers change. So what I would suggest is: don't worry too much about decomposing your work into too-fine pieces at this point. It's just a waste of time; in a sense, you'd be over-optimizing. And tuning to a particular processor or particular architecture, like your dual-core at home, won't necessarily improve things for the user.
Let me get back to multi-threading in Cocoa. Well, I started by talking about thread safety. Well, unfortunately at this point, I have to just point you to the documentation to answer the question what APIs are thread safe in Cocoa and which aren't. There's no way I can cover that in the time frame allowed here.
Now, as I pointed out at the beginning, many things aren't intrinsically thread safe. And if you don't see that a class says it's thread safe in the documentation, you have to assume that it's not intrinsically thread safe. But of course, you can use locks and other safety measures that I've discussed already to use these things in a multi-threaded environment.
I talked earlier about transferring work. This is what I call what happens when you use performSelectorOnMainThread:. You're transferring the work you wanted to do on the current thread over to a different thread, to have it do the work instead. And AppKit, of course, in some places does this sort of thing for you. If you tell a view to display, for example, it doesn't actually do the drawing on that thread. What the view does is arrange for the view to be redrawn off in the context of the main thread.
But again, not all is ideal here. Transferring work can cause its own issues, and I'm going to illustrate that. What I have here is a doWork method which only wants to run in the context of the main thread. So it tests to see whether it's running in the context of the main thread, and if it's not, it tells itself to perform doWork in the context of the main thread by using performSelectorOnMainThread:.
If the thread that doWork is called on is the main thread, it will actually just go ahead and do the work. Now we introduce a subclass, and the subclass wants to enhance doWork to do some more work. So what the subclass does is call super's doWork.
And then it does its own thing, whatever the subclass wants to do. But this has introduced two very subtle problems. The first is that the superclass has not actually done its work when the subclass goes to do more stuff. The superclass pushed its work over to the main thread to be done at a later time. Well, what does that mean for the subclass? Has it assumed that the superclass, because it called super's doWork, has done that work? Have all the data structures that the superclass was supposed to update been updated?
No. They're going to be updated later on the main thread. The second problem is that the subclass is going to do its thing twice, at least in this particular formulation. doWork has been called on the background thread, the subclass called super's doWork, and then it did its own work.
Well, when that doWork method is re-invoked on the object on the main thread, the subclass is going to receive that method. It's going to call super's doWork, and super is actually going to do the work, because it's the main thread. And then the subclass is going to do its more-stuff thing again, in the context of the main thread. Well, what does that mean? Who knows? Is it okay that the subclass's work is going to be done twice? Is the subclass prepared for that? It depends on the particular example, the particular thing that's going on. So transferring work like this is not a panacea for making things thread-safe, or for approaching thread safety.
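That double-execution pitfall can be made concrete with a small sketch. The run loop, class names, and log here are all hypothetical stand-ins for the performSelectorOnMainThread: scenario just described:

```python
class Worker:
    def __init__(self, runloop):
        self.runloop = runloop   # stands in for the main thread's run loop
        self.log = []

    def do_work(self, on_main_thread=False):
        if not on_main_thread:
            # Transfer the work: as with performSelectorOnMainThread:,
            # the real work happens later, and this returns immediately.
            self.runloop.append(lambda: self.do_work(on_main_thread=True))
            return
        self.log.append("superclass work")

class SubWorker(Worker):
    def do_work(self, on_main_thread=False):
        super().do_work(on_main_thread)
        # Problem: this runs now, BEFORE the superclass's deferred work,
        # and then runs AGAIN when do_work is re-invoked on the "main
        # thread", because the subclass receives that call too.
        self.log.append("subclass work")

runloop = []
w = SubWorker(runloop)
w.do_work()                    # invoked from a background thread
first_snapshot = list(w.log)   # superclass work has NOT happened yet
for deferred in runloop:       # the main thread drains its run loop
    deferred()
```

The log ends up showing the subclass's work done twice, bracketing the superclass work that the subclass assumed was already finished.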
Let me talk a little bit about KVO as another example here. The KVO subsystem itself is intrinsically thread-safe. That is, if you invoke methods like adding an observer, KVO takes locks and protects its own data structures. But compound operations are not made thread-safe by KVO itself being intrinsically thread-safe. And what's a compound operation? Well, the whole willChange/didChange process is itself a compound operation. That is, when an object wants to cause a KVO notification to go out for a change that's occurring, willChangeValueForKey: will be called on the object, and that sets things up.
Then the object actually goes and does the change, and then didChangeValueForKey: will be called. (I'm talking about automatic KVO here in this particular case.) And the didChangeValueForKey: call will go on and notify all the observers. So it's a compound operation; there are multiple steps there. The class can try to be thread-safe by putting a lock around the change to its property, that is, the assignment of the value to foo in this particular case.
But what's not made thread-safe thereby are the operations that occur outside the context of the set method. KVO calls willChangeValueForKey: for you before your setFoo: method is invoked, and those calls are not being wrapped by the lock you introduced. So those things are not thread-safe. What happens then is that in the didChangeValueForKey: method, when the observers are all being notified, the KVO subsystem goes down the line, telling all the observers observeValueForKeyPath:, and so on. All of that is happening outside the context of a lock, and it's all happening in some sort of arbitrary order. So if an observer believes the information that it receives in the change dictionary, that information may be out of date. If multiple threads are changing this property of this object, those observances, those notifications, are being received in essentially an arbitrary order.
Well, one solution, of course, is to use manual notification and put the lock around the willChangeValueForKey:, the change itself, and the didChangeValueForKey:. What this does is make sure that no other thread changes that foo property until all the observers have been notified of the change that resulted from the current thread changing it.
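A sketch of that manual-notification idea, in Python with hypothetical names; the point is that one lock spans both the change and the observer callbacks, as just described:

```python
import threading

class Observable:
    def __init__(self):
        self._foo = 0
        self._lock = threading.Lock()
        self.observers = []   # callables taking (old, new)

    def set_foo(self, value):
        # Manual-notification style: the lock covers the "willChange",
        # the change itself, and the "didChange" notifications, so no
        # other thread can slip in a second change before every observer
        # has seen this one with consistent old/new values.
        with self._lock:
            old = self._foo
            self._foo = value                 # the change
            for observe in self.observers:    # the "didChange" step
                observe(old, value)

obj = Observable()
seen = []
obj.observers.append(lambda old, new: seen.append((old, new)))
obj.set_foo(1)
obj.set_foo(2)
```

With automatic notification, only the assignment would sit inside the lock and observers could see interleaved, out-of-date change information.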
Of course, having to use manual KVO notification is kind of a bummer. The automatic notification is a very convenient mechanism. And so one approach to fixing that would be to use a receptionist type pattern to push the actual change and the resulting KVO observances over to the main thread.
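A minimal sketch of such a receptionist-type pattern, with hypothetical names; the pending queue stands in for work transferred to the main thread:

```python
import queue

class Receptionist:
    """Stands between background threads and the real observer, which
    may only run on the main thread: it captures notifications as they
    arrive and replays them later on the one safe thread."""
    def __init__(self, target):
        self.target = target
        self.pending = queue.Queue()   # stand-in for the main run loop

    def notify(self, key, value):
        # May be called on any thread: just record the observation.
        self.pending.put((key, value))

    def drain(self):
        # Called on the "main thread": deliver what was captured.
        while not self.pending.empty():
            key, value = self.pending.get()
            self.target(key, value)

delivered = []
r = Receptionist(lambda key, value: delivered.append((key, value)))
r.notify("foo", 1)   # e.g. from a background thread
r.notify("foo", 2)
r.drain()            # later, in the context of the main thread
```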
Another class of things that arises, relatedly, is bindings in Cocoa. Cocoa views, and currently Cocoa controllers, are not particularly thread-safe. So what one has to do is get those KVO notifications over to the main thread in some way, to be invoked in the safe context, in that views should generally be accessed on the main thread for the work that the KVO notification is going to cause in them.
The problem is that it's not clear how to do that with bindings at this point, and this is an active area of investigation for us. Cocoa is doing most of the work for you under the covers. That is, it's doing the KVO addObserver: calls, the KVO registration. You don't really have a hook into that, so you can't substitute a different object; you can't substitute, say, a receptionist-type object to capture those KVO messages and send them over to the main thread. So as I said, this is an active area of investigation for us: how to enable multithreading with bindings, or at least make it more convenient.
Core Data, another aspect of Cocoa. Well, of course, if you're on the cocoa-dev mailing list, you've seen any number of discussions on this, and I can't go into any detail. But the recommendation is that if you use a separate managed object context for each thread, the locking will be taken care of for you by Core Data. Some classes in Core Data implement NSLocking and can be locked explicitly if you need to. But you really have to go see the documentation to find out more about this.
Sometimes, Cocoa already offers built-in concurrency. You can tell a file handle, for example, readInBackgroundAndNotify, and Cocoa will spawn off the thread and do the read in the background. The pulsing button animations that occur, say, when you bring up an open panel and the OK button is pulsing: those are occurring in a background thread, and those things are taken care of for you. The ideal situation for you as a developer, of course, would be for Cocoa to do a lot of the concurrency, to make a lot of use of the separate cores available in the machine, so you could just write your straight-line code without having to think about any of these issues.
The problem comes about because, when we need to call back into you, we don't know that you're thread-safe. For example, I talked about sorting an array earlier. In the array-sorting function, we could split the array into two, sort the two halves, and merge the results back together. But the sorting process involves calling your compare function. Now, it seems like compare functions, compare methods, whatever, should be immutable; they shouldn't be changing things as a result of being called. That is, nothing out in memory should change as a result of the compare operation. But we don't know that. That's the problem.
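The split-sort-merge idea with a caller-supplied comparator can be sketched as follows; this is an illustration of the hazard being described, not Cocoa's sorting implementation:

```python
from concurrent.futures import ThreadPoolExecutor
from functools import cmp_to_key

def parallel_sort(items, compare):
    # Split the array in two, sort the halves concurrently, then merge.
    # This is safe only if `compare` has no side effects; the framework
    # cannot know that about caller-supplied code, which is exactly why
    # it can't just do this for you.
    if len(items) < 2:
        return list(items)
    mid = len(items) // 2
    key = cmp_to_key(compare)
    with ThreadPoolExecutor(max_workers=2) as pool:
        left = pool.submit(sorted, items[:mid], key=key)
        right = pool.submit(sorted, items[mid:], key=key)
    a, b = left.result(), right.result()
    merged, i, j = [], 0, 0          # merge the two sorted halves
    while i < len(a) and j < len(b):
        if compare(a[i], b[j]) <= 0:
            merged.append(a[i]); i += 1
        else:
            merged.append(b[j]); j += 1
    return merged + a[i:] + b[j:]

result = parallel_sort([5, 1, 4, 2, 3], lambda x, y: x - y)
```

If the comparator mutated shared state, the two concurrent halves would race on it, which is the trust problem the talk is pointing at.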
Another example would be, say, computing a table view's row update information, or saving documents. In the case of a table view, it needs to compute its row information, its row updates, what goes in the boxes, for example, by talking to the data source. The data source comes from you. We don't know that it would be safe to start calling the data source concurrently in multiple threads. Saving a document is another example.
We don't know that the delegate won't be surprised when it's called on a background thread to actually go and package up the data and do the save operation. So one thing we want you to think about going forward is making chunks of your code like delegates, data sources, and compare operations (places where you hook into the Cocoa system) thread-safe. Then in the future, suppose we add functionality like a setThreadSafeDelegate: on a particular class; at that point, you can immediately take advantage of that kind of new capability.
So I've talked about a lot of things here, but very shallowly. I didn't have time to go into any real detail, unfortunately. But it seems that multiple cores are the way processors are going in the future, and to use multiple cores, you have to use multiple threads, because the thread is the concept that the operating system provides for addressing work to those different cores.
But there are a lot of complex issues. Cocoa has some APIs to help you deal with these multithreading problems and the various issues, and in the future we expect to have more and do more; we ourselves are investigating these things on an ongoing basis. But to get a real nice performance improvement in the future, you're going to have to start doing threading and being multithreaded. And perhaps after a few years, when we have these eight-processor systems, your apps will be just so much better than they are today. Thank you.