Cocoa Performance Techniques - WWDC 2004

Application • 1:26:45

Customers value application performance and responsiveness as highly as great new features. This session will explain, through example code and demos, how to increase the performance of your application. We explore a variety of performance topics and techniques, such as view display optimization, and how to organize your data to help you develop fast Cocoa applications. This is an intermediate-level session.

Speaker: Troy Stephens

Unlisted on Apple Developer site

Transcript

This transcript was generated using Whisper, it has known transcription errors. We are working on an improved version.

Hello. Welcome to session 434, Cocoa Performance Techniques. My name is Troy Stephens. I'm a software engineer in the Cocoa Frameworks group at Apple. And I'm here today to talk to you about performance and optimization, specifically as they apply to developing Cocoa applications on Mac OS X. Now, as we all probably know, performance is work and diligence. But it can also be exciting and rewarding.

It can pay off in applications that your users want to use. If there's one thing that users love, it's a well-optimized, responsive application, one that not only empowers them to do something of interest, but is ready to respond to their requests and provide feedback with quick turnaround to interact with them in a way that is responsive. So optimization.

[Transcript missing]

So turnaround is part of it. And one thing that you want to do is figure out where the opportunities for optimization in your application lie. And how do you do that? How do we figure out where to do things? Well, it's partly a matter of metrics, of using measurement techniques and measurement tools.

It's partly a matter of having some general ideas about optimization that may apply. You may have brought these from other platforms that you've worked on. And it's partly a matter of knowing the frameworks that you're working with and the platform that you're working with, knowing the various capabilities that it provides, their performance characteristics, and so forth.

So that's what I want to talk to you about today. The other issues, the platform-independent issues about performance, are easily covered elsewhere. They're comprehensively covered elsewhere. Today I want to talk specifically about Cocoa. And in particular, I'll try to point out to you some common performance techniques, some issues that you may encounter commonly as you're developing Cocoa applications, things lots of people run into.

I'll try to point out some API usage techniques and recommended usage patterns that you can use to help improve performance. And we'll also look at alternatives: Cocoa often provides a variety of ways to perform a particular task, which gives you the opportunity to choose the way that is most conducive to getting the best performance.

So as prerequisites for this talk, I assume only that you have some general experience developing in Cocoa, some familiarity, more or less, with the various APIs and classes and capabilities that it provides. And if you have some general knowledge of performance techniques that you've brought from another platform, that could be helpful too, but it's not strictly required.

So first, before we dive into the Cocoa specifics, I want to cover some general concepts, just to get us into the right mindset for thinking about optimization. There's a lot of wisdom that's been accumulated from various other platforms that we all have worked on. Some things about performance are constants, and we can learn from that.

So one of the most important things about performance tuning is it involves trade-offs, some give and take. On the one hand, you have this ideal of achieving high performance in your applications, creating a responsive application. But there are a number of various areas in which you may need to give a little and compromise. There are trade-offs involved.

One of those trade-offs is resource usage. A common technique for producing higher performance is, rather than recalculating results, say, caching them. You use a little more resource, keep some block of memory around, and reuse that. You're trading memory for performance in that case. There are lots of other examples of that with other types of resources.
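As a rough illustration of trading memory for speed, here is a minimal sketch in Python rather than Cocoa. The function `expensive_layout` and the counter `call_count` are hypothetical stand-ins, not code from the session; the cache is bounded so the memory cost stays capped.

```python
from functools import lru_cache

call_count = 0  # counts how often the expensive work really runs


@lru_cache(maxsize=128)  # bounded cache: at most 128 results are kept in memory
def expensive_layout(width: int) -> int:
    """Hypothetical stand-in for a costly computation."""
    global call_count
    call_count += 1
    return sum(i * i for i in range(width))


first = expensive_layout(1000)   # computed on the first request
second = expensive_layout(1000)  # served from the cache, no recomputation
```

Whether a cache like this pays off depends on how often results are reused, so it is worth measuring with and without it.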

Engineering time is always at a premium. There's never enough of it. And there are any number of aspects of your application that you can spend time tuning. You don't want to spend time optimizing every little bit of your application, because in many cases that can be premature, and it can lead to more complex code that's just completely unwieldy to maintain. So since engineering time is a limited resource, you need to really focus in on where you'll get the most benefit from the performance optimizations you're doing.

Convenience and simplicity. A lot of times, though not always, the most convenient way to do something is not the way that is most conducive to the highest performance. So sometimes you have to make your code a little more complicated to optimize it for the special case that it's really going to have to handle so that you can get maximum performance out of it.

Loose coupling, flexibility to change. These are the hallmarks of object-oriented programming. And to some extent, you can achieve performance while still retaining some loose coupling. But the essence of programming and developing an application is, in some sense, making assumptions, defining the assumptions of your problem domain. So often, the more assumptions you deny yourself the ability to make, the more general your code is, and the fewer the optimization opportunities you're really taking advantage of. So sometimes you have to give up a little flexibility in order to hardwire your code for the fastest path for the best performance.

Lastly, strict correctness and safety. Almost all the time, you want to make sure, no matter what, that your code is correct and does the right thing. Only in a few cases can you actually get away with approximating. But there are some such cases, such as when you're drawing: if you're drawing something that doesn't need to be exact, where maybe speed is more important, you can approximate.

And also, in terms of safety, if you're developing generic code, let's say, that you're going to put in a framework, and you don't know who's going to use it yet. It might be used by some unknown application or applications. You might not know whether those applications will need those objects that you're providing to be thread safe, let's say. So if you have to provide locking and thread safety, that's a whole other burden that can affect performance. You have to be able to balance the desire for performance against these other competing demands.

So there are some general recommendations we can make about performance. One of the most important things that you can do is to choose a scalable application architecture. Think about the ways that your application is going to be used. Try to anticipate the numbers of objects that your application is going to have to deal with. And choose appropriate algorithms and data structures. On any platform, this can make a huge difference in how your application performs. Also, when you're working with the framework APIs, don't fight the framework.

Try to leverage as much as possible what the framework provides. And in terms of Cocoa in particular, where Cocoa provides a facility to do something, it's in many ways to your advantage to leverage that facility to its maximum capability and use it, because then you have the benefit of a whole team of frameworks engineers at Apple who are constantly making not only performance improvements, but also enhancements and functionality.

And you get to inherit all of that for free. Also, wherever possible, when you're defining your own internal data structures and your own objects, try to use API-compatible types. You notice, for example, in AppKit APIs, we largely use NSArray. I don't think we have any APIs, maybe a few, that use NSSet.

So if you use a lot of sets, let's say, for example, in your code, in your model code, you may find you have to do a lot of conversions between your internal representation and what the framework expects every time you make a message send that uses some particular type of data. So try to avoid that sort of type impedance mismatch by using API-compatible types.

And you can do that with a lot of other types where you can. Also, try to keep in mind easily overlooked costs, things like heap allocations, including allocating and deallocating objects. Heap operations do take some time. There's some overhead involved in that. We don't normally think about that.

We kind of think that it's just kind of magic. We ask for some memory and get it for free. Also, keep in mind indeterminate time operations, particularly synchronous operations, such as synchronous file system and network operations, things where your application is going to block for a fair amount of time.

You don't want to freeze up your user interface, obviously, while the user is waiting for some file system or network access to happen. So if you have things like that that may block for an unpredictable amount of time, try to make those asynchronous. And we'll look at techniques for doing things like that. Consider caching expensive results, where that may help. If you have something that you do, that you compute, that you draw, that may take a lot of time and that you may reuse many times before you have to change it again, it may pay off to cache it.

Keep in mind at the same time that it may not pay off to cache it. If the framework, say AppKit, is already doing some caching for you and you add your own caching on top of that, you may find that there's not a net win, that you've actually added just another layer of extra code that doesn't necessarily have any effect.

So you want to measure to see whether the caching actually helps. But in general, caching can be a useful technique. Also, lastly, consider deferring operations when you can. And this is a good one. This is an interesting and powerful technique, and so I'll go into a little more detail. Think about deferring operations.

The benefits of deferring operations include not having, obviously, not having to perform an operation immediately. If you don't need the result right away, well, maybe you don't have to perform the operation right away. Maybe you just need to make a note somewhere that, hey, I need to compute this later. You obviously then don't incur the cost of the operation right now.

And if you have a number of requests, a series of requests, that come in maybe from different parts of the system to perform that given operation, but they don't need the, again, don't need the result immediately, you may be able to save doing that operation several times by coalescing those requests, just setting that flag. And every time you set that flag, well, it's a constant time operation, and it's really cheap.

And then later, when you actually need the result, you can go back and say, oh, I need to compute this. And there you've got it, and you only compute it once instead of several times. Lastly, if you're really lucky, you may never have to incur the cost of the operation at all if the result isn't needed. So try to think as you're coding about whether what you're computing really needs to be done.

Is this really necessary? Right now. Now, on the flip side, one of the costs of this is that when you do ask, when some part of the application does ask for the result later, it's not going to be there ready to go. The computation has to be done then.
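The deferral and coalescing idea described here can be sketched as a "dirty flag". This is a minimal Python illustration, not Cocoa code; the class `Summary` and its members are hypothetical names chosen for the example.

```python
class Summary:
    """Recomputes a total lazily; all names here are hypothetical."""

    def __init__(self, values):
        self._values = list(values)
        self._total = 0
        self._dirty = True        # nothing has been computed yet
        self.recompute_count = 0  # instrumentation for the example only

    def add(self, value):
        self._values.append(value)
        self._dirty = True        # constant-time note: "recompute later"

    @property
    def total(self):
        if self._dirty:           # pay the cost only when a result is needed
            self._total = sum(self._values)
            self.recompute_count += 1
            self._dirty = False
        return self._total


s = Summary([1, 2, 3])
for v in (4, 5, 6):
    s.add(v)          # three change requests, coalesced into one flag
result = s.total      # the single recomputation happens here
```

The cost noted above is visible too: the first read of `total` after a change is the one that waits for the computation.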

So obviously, you want to use some discretion in applying this technique, but it is very useful and powerful. We use it, for example, in AppKit when we're drawing views. You're familiar probably with the setNeedsDisplay: mechanism that we recommend. Rather than asking views to display themselves immediately, you tell them, hey, you're dirty in this area. You're going to need to draw here. And you can have potentially many requests to mark different parts of the view dirty as needing display at some later point.

And all that drawing doesn't happen normally until you get back to the run loop, to the top of the run loop. And then we look and say, oh, is part of the window dirty? Yeah, OK, we've got to redraw the parts of the window that are dirty. But we can do it all at once.

And so we can have thousands of requests to dirty different parts of the window satisfied by a single drawing pass. Similarly, in your own applications, if you break your application into components, if you do initialization in stages, that can be a useful way to improve performance by deferring results, operations, till later. Lastly, some other things to keep in mind.

Some of the techniques that I recommend here, or point out to you rather, may not be appropriate in certain situations, may actually degrade performance. And the only way you can really know, although our intuition as engineers is very useful, the only way you can really know whether an attempted optimization has had a positive effect is to measure. And remember to not just measure after, but measure before, so you have something to compare with. That's the only way to really know whether you've made things better or worse. And often the result is surprising.
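A before-and-after measurement can be as small as the following sketch. It uses Python's standard `timeit` module; the two string-building functions are hypothetical candidates chosen only to have something to compare, and no claim is made here about which one wins on a given machine.

```python
import timeit


def join_slow(n):
    out = ""
    for i in range(n):
        out += str(i)       # baseline: repeated concatenation
    return out


def join_fast(n):
    return "".join(str(i) for i in range(n))  # candidate optimization


def measure(fn, n=2000, repeat=5, number=20):
    """Best-of-`repeat` wall time in seconds for `number` calls of fn(n)."""
    return min(timeit.repeat(lambda: fn(n), repeat=repeat, number=number))


before = measure(join_slow)   # measure first, so there is a baseline
after = measure(join_fast)    # then measure the attempted optimization
same_result = join_slow(100) == join_fast(100)  # the change must stay correct
```

Comparing `before` and `after` on the target hardware is what tells you whether the change was worth keeping.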

So along these lines, we provide some very powerful tools for you to use, including Shark, which is covered in another session. Any of you who are here right now and not in the Shark session, I recommend looking into Shark on your own later. It's a very powerful tool that enables you to sample not just your application, but the performance of the entire system while your app is running. Also, if you happen to be using OpenGL, OpenGL Profiler is a tremendously powerful tool for getting better performance. So be aware of these tools and use them to measure to find out what's really going on in your apps.

So OK, now we're ready to get-- we're in the right frame of mind. We're ready to get into Cocoa in particular. I'm going to divide the rest of the talk into two general parts. First, we'll look at optimization techniques and API usage techniques that apply to the APIs that AppKit provides. And then we're going to dive deeper and look at some Foundation-related issues.

So first, the AppKit-related stuff, things that I refer to as user interface issues. There are a number of techniques I want to cover with you today, including reducing launch time, taking long operations, as I was saying earlier, and making them asynchronous, optimizing drawing and scrolling in various ways, and so forth.

So what about reducing launch time? This isn't exactly, maybe strictly, an optimization technique. This is a matter more of deferring operations until later. You can actually use both techniques here. Remember that the user's experience of using your app begins with launching the app. The sooner your application can be ready to go, the happier your user is going to be. So there are a number of ways you can reduce launch time. Obviously, brute force-- if there are things that you do at launch time and you can optimize them to run faster, well, obviously, that's one way to do it.

If you can defer loading of data, initializing different subsystems-- if your app has a lot of different subsystems and maybe your user isn't going to touch them immediately, certain features they may not use even in the session of launching the application, using it, and quitting it-- you can actually defer initialization of those subsystems if it's costly until later. And one technique you can use for doing this, you can use nib files and plug-in bundles.

These are two very powerful capabilities and facilities that Cocoa provides for factoring your user interface and your code. You can actually dynamically load code in bundles at runtime. You can factor your application into components that are loaded in as needed. And that also, incidentally, provides a way for other developers to be able to extend your applications, potentially. So plug-ins are great.

Where you perform your initialization may be important. You all know awakeFromNib-- this is our friend-- when we instantiate objects in nib files, and we load those nib files at runtime, awakeFromNib is sent to every object at a stage after all of the objects' interconnections, as were made in the nib, have been established. And the network of objects is intact. They're ready to go and wired up. That's usually a very good and convenient time to perform initialization.

One thing about that is that awakeFromNib is usually sent, say, for your main menu nib, or for your document nib, if you're writing a document-based app, before the UI is brought on screen. So any initialization that you do there is going to make the user wait, potentially, if it takes a long time. A lesser-known place to perform initialization that's very useful is applicationDidFinishLaunching:. This is a message that is sent either to the application's delegate, if it has one, or to any observer, any object that registers as an observer of the NSApplicationDidFinishLaunchingNotification notification.

It is sent significantly after the UI is already on screen, after any files that were requested to be open when the app was launched have been opened, and the run loop is ready to begin handling events. Your app is ready to go. So if there are initialization operations you can perform later, at applicationDidFinishLaunching:, that's a useful place to hook in.

What about long blocking operations that may block your user interface? Remember, in a Cocoa application, by default, unless you do otherwise, you basically have one thread on which everything happens-- your event handling, your drawing, any processing that your application does when I click this button over here, and so forth. And so any long synchronous operation that you perform is performed on the main thread and blocks your user interface normally.

So there are a couple of ways to deal with this. We can make the operation take less time, again, obviously. Or you can use any of various techniques to move the operation into the background. And a few of the facilities that we provide for doing this are asynchronous notifications, which are also sometimes called idle timers, run loop observers, and threads. Let's look at those in some detail.

Asynchronous notifications are simply notifications that are added to a notification queue. They are enqueued with the posting style NSPostWhenIdle. These notifications are handled when the run loop figures out that the application is sitting idle and there's time to do something. It's just sitting there spinning, waiting for something interesting to happen.

Optionally, you can ask for such notifications to be coalesced. So if there are a number of different places you want to post them from to stimulate this idle activity to happen, whatever activity you've designated to happen in response to the notification, you can have them coalesced either on the notification name or on the notification sender, the poster.
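To make the coalescing behavior concrete, here is a toy model in Python. It is not the Cocoa notification queue API; `IdleQueue`, `enqueue`, and `run_idle` are hypothetical names, and the matching rule (a new notification is dropped when a queued one matches on every requested criterion) is an assumption made for the sketch.

```python
class IdleQueue:
    """Toy queue of notifications delivered when the loop is idle."""

    def __init__(self):
        self._pending = []   # (name, sender) pairs awaiting idle time

    def enqueue(self, name, sender, on_name=True, on_sender=True):
        if on_name or on_sender:  # coalescing requested
            for pending_name, pending_sender in self._pending:
                if on_name and pending_name != name:
                    continue
                if on_sender and pending_sender is not sender:
                    continue
                return  # an equivalent notification is already queued
        self._pending.append((name, sender))

    def run_idle(self, handler):
        """Called by a (hypothetical) event loop when nothing else is pending."""
        pending, self._pending = self._pending, []
        for name, sender in pending:
            handler(name, sender)
        return len(pending)


idle_queue = IdleQueue()
document = object()
for _ in range(5):
    idle_queue.enqueue("NeedsReindex", document)  # five posts from five places

delivered = []
count = idle_queue.run_idle(lambda name, sender: delivered.append(name))
```

Five posts result in one delivery, which is the point of coalescing: the idle-time work runs once no matter how many callers asked for it.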

Another powerful technique that exists actually in Core Foundation-- remember, you can use Core Foundation from your Cocoa apps-- is the CFRunLoopObserver. And what this lets you do is hook into your run loop not just at the idle time point, but at any point you want, really. And we have a number of different options here: on entry to the run loop, before timers are handled, and so forth.

So you can hook in-- actually, this is a mask. So you can specify several different places that you want to hook in. And it's basically a mechanism for you to provide a callback function that will be called when the run loop reaches these various stages of processing. And then you can do whatever you want there.
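The mask-plus-callback shape of this mechanism can be modeled in a few lines. This is a Python toy, not the Core Foundation API: `Activity`, `ToyRunLoop`, and the numeric flag values are hypothetical, with stage names only loosely modeled on the stages mentioned above.

```python
from enum import IntFlag


class Activity(IntFlag):
    """Hypothetical run loop stages; the values are made up for this sketch."""
    ENTRY = 1
    BEFORE_TIMERS = 2
    BEFORE_SOURCES = 4
    BEFORE_WAITING = 8
    EXIT = 16


class ToyRunLoop:
    def __init__(self):
        self._observers = []   # (mask, callback) pairs

    def add_observer(self, mask, callback):
        self._observers.append((mask, callback))

    def _notify(self, activity):
        for mask, callback in self._observers:
            if mask & activity:        # observer asked for this stage
                callback(activity)

    def run_once(self):
        """One pass through the stages, notifying interested observers."""
        for stage in (Activity.ENTRY, Activity.BEFORE_TIMERS,
                      Activity.BEFORE_SOURCES, Activity.BEFORE_WAITING,
                      Activity.EXIT):
            self._notify(stage)


seen = []
loop = ToyRunLoop()
# One observer, two stages: the mask is the bitwise OR of the stages of interest.
loop.add_observer(Activity.ENTRY | Activity.BEFORE_WAITING, seen.append)
loop.run_once()
```

After one pass, `seen` holds only the two stages named in the mask, in the order the loop reached them.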

Well, these are two very powerful techniques. But obviously, they're still doing their processing on the main thread. They're maybe a little more friendly to responsiveness of the application, because they're letting you do things sort of when the application figures out that it's idle. And if you don't do your operations in too big chunks, then you're OK. But what if you have a long operation, and there's no two ways about it? You've got to find another way to do it, and keep it from disturbing your UI. Another possibility is to use a background thread.

Spin off another thread. This is supported through the NSThread class, and in particular, is very easy to do in Cocoa. The hard part, as with multi-threaded programming on any platform, is getting synchronization right and synchronizing object accesses and getting thread safety. But spinning off the thread is very easy. NSThread provides the detachNewThreadSelector:toTarget:withObject: method. You specify a message that you want to send and the target object. You can optionally specify a parameter object to be passed in.

And then here we have a method that basically is the thread. It becomes the thread. When the thread is spun off, that method begins execution, and that method can run forever if it expects to exist, if it wants to exist for the lifetime of the app, or it can quit out if it's done processing, and then that terminates the thread.

And one thing that you want to do here is put an auto-release pool in there so that any objects that you auto-release during the course of processing in your background thread do get cleaned up. Otherwise, you're going to get messages in the console. When you're just doing main thread programming, you get an auto-release pool for free that's automatically provided for you. So we'll see that again.

So as I said, the main thing to do when you are doing multi-threaded programming is to make sure that no two threads are simultaneously accessing and especially trying to mutate a given object. We provide locking semantics for that. And another thing that you need to do when you're doing the processing in a background thread: obviously your main thread at some point wants to know about, well, what were the results of that processing? They usually affect something on the main thread, maybe some UI. And you need a way to communicate back to your main thread what the results of that processing were and maybe even some intermediate results.

So there are various ways to do that. The easiest way to do it is to use performSelectorOnMainThread:withObject:waitUntilDone:. This is a method that's defined in a category on NSObject. You'll find it's hidden away in NSThread.h. If you don't know about it, this is a great, powerful method.

And what it enables you to do is not just send a message to the main thread, but it schedules your message so that it's performed. It doesn't interrupt the main thread while it's in the middle of doing something else. And so you can pass objects across thread boundaries fairly safely. So this is interesting stuff. And to go into this further, let's see what we have cooking on demo one.

So here I have a document-based application that I call Scrapbook. And it's fairly simple. Got my model view controller divisions here. Basically what it does is it gives you a page on which you're able to arrange a number of photos. And you have a page view that you can use to see that and visualize it. So let's see. I've launched the application.

Here is a clean new document. Let's drag some pictures in. And it loads them up, and we can drag them around and move them. Simple document-based application. And this is a custom NSView that I simply wrote that displays the photos when it's asked to draw. It's loading it. Oh, there it goes. Well, that took a little while. Let's try that again. Drag the document down.

It took a few seconds to do it. Now we're running, fortunately, on a dual-proc 2 gigahertz G5 here, but you can imagine easily on a machine that's more memory constrained or that's slower that this could take a lot more time. And especially if we only have two dozen photos here, that's not a whole lot.

So how could we make this a more satisfying experience for the user? Can we make it so that the document comes up more immediately? Sure we can. And in fact, here we have a multi-threaded version of Scrapbook that's not much more complicated. And when I run it--

[Transcript missing]

Here we go. And we just display a placeholder image in place of each image until the image arrives on the main thread. And if I do this fast enough, I can even grab the window, start resizing and interacting with it while the images are still loading.

So let's look at the code for that. It's fairly simple. Any venture into multi-threaded programming comes with certain warnings about how complicated it can be to get it right and to not have your app crash. But this is, in fact, not very complicated. I've got a background bitmap loader class that's simply a subclass of NSObject that I created here. It's got as its attributes a path queue that keeps track of-- this is basically its to-do list.

These are the paths of the image files that I'm supposed to load. We've got a lock that we use to ensure safe access to the path queue. So any time some code accesses the path queue, it takes this lock and then releases the lock when it's done. I've got a simple flag here to keep track of whether the thread is running, because I don't start it immediately when I create the object. And now we've got a delegate here, which is our connection back to the object on the main thread. In this case, I've chosen the document object as the object that receives the images as they're loaded.

So you create this object with a delegate. You can set its delegate. We create one instance of the background bitmap loader, and when we want to load an image, we send it this request bitmap at path message with the path. That returns immediately to the caller, having stuck the path in the queue already, so it's on the loader's to-do list of images to load.

We also wanted to have a cancel all requests message that I can send so that if we close the document, let's say, before all the images are done loading, we can safely abort out of this so that we're not messaging back to a non-existent document object. And in the implementation--

The interesting methods here: request bitmap at path. As I said, we take the lock on the path queue before we manipulate it. We add the object, add the path to the path queue. I actually make a copy here to be completely safe. Similarly, for later, when we want to cancel all requests, all we do is take the lock, drain the queue of all the paths that have been queued up, and release the lock.

So this is the workhorse method of the thread, load queued images. We create an auto-release pool here to make sure auto-released objects are automatically taken care of once the method exits. And then I've got a loop here where I basically go until I run out of paths. Each time through the loop, I look and see if I have another path to process. I've got to take the lock on the path queue before I touch it.

I grab the next path in the queue, remove it, and unlock the queue, because I'm done with the queue. So the main thread is now free to add requests to it. And now, if I've got a path, if the queue wasn't empty, I go ahead and basically load my bitmap image rep using NSData's initWithContentsOfFile:, and then NSBitmapImageRep's initWithData:.

I check whether I still have a delegate, and whether the delegate knows how to handle this message. And I message it back by sending it-- because I don't want to just send it the image. It's going to get the image and say, well, yeah, OK, which image is that? I want to send it also the path to the image. So I just wrap those up in a dictionary. Let's see. I need some code wrap here.

All right. Yes, I mean now. Okay, that's much better. So I create a dictionary. And the only sort of unusual thing to notice here that I'm doing is in terms of the normal object ownership rules, usually you hand something off to another method and you want to auto-release it so that you've discharged yourself of responsibility for releasing it.

But when you're passing an object across thread boundaries, you kind of want to make sure that that object's still going to be around by the time the other thread grabs hold of it and retains it and so forth. So I'm sidestepping the usual rules and I'm alloc-initing this dictionary and I'm passing it across using performSelectorOnMainThread:withObject:waitUntilDone:.

I tell the object on the main thread, the delegate, the bitmap has been loaded and here's the dictionary that has the path and the bitmap itself. Passing NO for waitUntilDone basically says return immediately as soon as you register this message to be sent on the main thread. Just come back immediately.

It's safe to do because I've already made the bitmap info dictionary. It's got a retain count of one. It's going to stick around until the main thread takes responsibility for it. And then that's it. Once I've handed off, I can let go of the bitmap, the data, and the path.

And I'm also using an inner auto-release pool here, because I may be loading hundreds of images. This is an important point when you're writing a secondary thread method, is that I'm loading potentially many images. I don't know how many. It could be hundreds. It could be thousands. And if I'm auto-releasing things, I could end up filling up memory with all these images that have been handed off and are still sitting there waiting to be auto-released. So what I want to do is clean up each time through the iteration. You don't necessarily have to do this every time through an iteration.

You might, if you're doing 10,000 iterations, do 100 iterations of 100 and have an inner auto-release pool only in the inner loop. But basically, every time we hit this inner pool release, we're going to clean up any auto-released objects that were created during the course of that loop.

So that's the most interesting part of it. In the photo object itself, where we used to load the image immediately, we simply load a placeholder image that's small and quick to load, and it's only loaded once because it's part of our resources. And in the page view itself-- Oops, I'm sorry, in the document class itself, we have the bitmap loaded method that receives this dictionary. This is the other side of the gate, and this receives the dictionary passed off by the secondary thread.

It's handed off to us. We get the path and the photo that are in the dictionary. And basically all we have to do here is identify the photo model object that that image belongs to and assign it. And then we tell the page view, hey, you've got to redisplay this photo because something's changed about it. And that's that. We've got a multi-threaded version of our scrapbook app without a whole lot of effort. Let's go back to the slides if we could. Is this going to be a wait-and-fill? It will be eventually.
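The overall shape of the loader walked through in this demo (a lock-protected to-do list, a worker that starts on demand, and a handoff of path-plus-result back to the main thread) can be sketched as follows. This is a Python approximation, not the Scrapbook source: `BackgroundLoader`, its method names, and the use of a thread-safe queue for the handoff are assumptions made for the example.

```python
import queue
import threading


class BackgroundLoader:
    """Loads items on a worker thread and hands results to the main thread."""

    def __init__(self, load_fn):
        self._load_fn = load_fn
        self._paths = []                 # to-do list of paths to load
        self._lock = threading.Lock()    # guards _paths and _running
        self._running = False            # the worker is started on demand
        self._thread = None
        self.results = queue.Queue()     # thread-safe handoff to the main thread

    def request(self, path):
        """Queue a path and return immediately."""
        with self._lock:
            self._paths.append(str(path))
            start = not self._running
            if start:
                self._running = True
        if start:
            self._thread = threading.Thread(target=self._load_queued, daemon=True)
            self._thread.start()

    def cancel_all(self):
        with self._lock:
            self._paths.clear()          # drain everything still waiting

    def _load_queued(self):
        while True:
            with self._lock:
                if not self._paths:
                    self._running = False
                    return               # out of work: the thread ends
                path = self._paths.pop(0)
            # Lock released here, so the main thread can queue more work
            # while this (possibly slow) load is in progress.
            self.results.put({"path": path, "bitmap": self._load_fn(path)})

    def wait(self):
        """For the example only: block until the current worker finishes."""
        if self._thread is not None:
            self._thread.join()


loader = BackgroundLoader(lambda path: path.upper())  # stand-in for decoding
for name in ("a.jpg", "b.jpg", "c.jpg"):
    loader.request(name)
loader.wait()

loaded = {}
while not loader.results.empty():        # the "main thread" side of the gate
    info = loader.results.get()
    loaded[info["path"]] = info["bitmap"]
```

In a real UI the main thread would drain `results` from its event loop instead of calling `wait()`, so the interface stays responsive while loads are still in flight.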

So what are some other techniques you can use that relate to AppKit components and facilities?

[Transcript missing]

Here's a real simple, cheap optimization. This is an easy one-liner you can add to your custom views. And the reason for the existence of the isOpaque method, I should point out, is that by default, AppKit cannot assume that a view will draw with complete opaque coverage of its background. Our Aqua widgets are often non-rectangular in shape. They have nice rounded corners and so forth. So this is less an issue of transparency than of coverage.

Basically, this is saying, if I'm not opaque, if I'm a custom view and I'm not opaque, I'm saying that I may need some of the background provided by the views higher up than me and the window background to show through to complete my drawing. I only draw some stuff on top of that.

If you don't need the background to show through, declare yourself to be opaque. This saves AppKit from drawing all the stuff behind your view. If you've got this huge document view in particular that's covering your window, if you don't make it opaque, we're going to think, OK, we've got to draw the window background behind it.

OK, now we've got to draw your view. We don't know that you're covering it. We don't know what you're doing in your drawRect:. So be opaque when you can. Also, when your drawRect: method is called, draw minimally. Try to draw only what we're asking you to draw.
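The opacity declaration described above is a one-method override. A minimal sketch follows; the class name `OpaqueDocumentView` and the plain white fill are assumptions, not the session's sample code.

```objc
#import <Cocoa/Cocoa.h>

// Hypothetical custom view that paints every pixel of the area it is asked to draw.
@interface OpaqueDocumentView : NSView
@end

@implementation OpaqueDocumentView

// Promise AppKit that drawRect: fully covers this view, so nothing behind it
// (superviews, window background) has to be drawn first.
- (BOOL)isOpaque {
    return YES;
}

- (void)drawRect:(NSRect)dirtyRect {
    // Fill the requested area completely to honor the isOpaque promise.
    [[NSColor whiteColor] set];
    NSRectFill(dirtyRect);
}

@end
```

Only return YES if the view really does cover its whole area; otherwise stale pixels can show through where nothing was drawn.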

The simplest first-level optimization you can do is simply use and observe the single rectangle parameter that is passed to drawRect: that gives you basically a bounding box on the area that needs display. And you can use NSIntersectsRect, for example, to test against that bounding box.

If you have a number of objects, for example, here, drawable things that we need to draw in our view, this is a common case. For each object, before I go to the trouble of drawing it, which may be costly, see if its bounding box intersects the bounding box that I've been given. If not, don't bother drawing.
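A minimal sketch of that bounding-box test, assuming a hypothetical `ShapesView` whose model is just an array of rectangles wrapped in `NSValue` (the slide's actual drawable objects are not in the transcript). The `copy` property assumes ARC.

```objc
#import <Cocoa/Cocoa.h>

// Hypothetical view whose "drawable things" are plain rectangles.
@interface ShapesView : NSView
@property (nonatomic, copy) NSArray<NSValue *> *shapeFrames;
@end

@implementation ShapesView

- (void)drawRect:(NSRect)dirtyRect {
    [[NSColor grayColor] set];
    for (NSValue *value in self.shapeFrames) {
        NSRect frame = [value rectValue];
        // Trivial rejection: skip anything outside the area that needs display.
        if (!NSIntersectsRect(frame, dirtyRect)) {
            continue;
        }
        NSRectFill(frame); // stand-in for potentially costly drawing
    }
}

@end
```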

On Panther and Tiger, you can further constrain drawing beyond that. Beginning in Panther, we've been keeping a more detailed accounting than we used to of specific regions within a view that need drawing. We can give you, if you ask for it, a list of rectangles that are non-overlapping that specify the area that needs to be drawn.

So you can get this directly using the getRectsBeingDrawn:count: method if you want direct access to that rectangle list. And if you use those rects, you can still use the parameter to drawRect: as a bounding box to do trivial rejection testing. You can first check your objects against that bounding box. If they lie within it, OK, well, maybe they're in this list of rects somewhere.

Maybe they intersect that list of rects and need drawing. Alternatively, in the common case, like we had before on the previous slide, where you're just iterating over a list of things to be drawn, you just want to test each of those things in sequence and not do anything fancy.

You can use needsToDrawRect:, which was a new method provided in Panther, that will do all the testing against this list of rects for you. It will also do the trivial rejection test against the bounding box. So use that. That's sort of the more modern alternative to NSIntersectsRect that we have in this code sample. If you're targeting Panther and later, you can use that.
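Both Panther-era facilities can be sketched together. The class name `DetailedShapesView` and its rectangle model are assumptions; `getRectsBeingDrawn:count:`, `NSRectFillList`, and `needsToDrawRect:` are the AppKit calls the talk refers to. The `copy` property assumes ARC.

```objc
#import <Cocoa/Cocoa.h>

// Hypothetical view whose model is an array of rectangles wrapped in NSValue.
@interface DetailedShapesView : NSView
@property (nonatomic, copy) NSArray<NSValue *> *shapeFrames;
@end

@implementation DetailedShapesView

- (void)drawRect:(NSRect)dirtyRect {
    // Ask AppKit for the non-overlapping rects that actually need drawing.
    const NSRect *rects = NULL;
    NSInteger count = 0;
    [self getRectsBeingDrawn:&rects count:&count];

    // Fill only those rects with the background, not the whole bounding box.
    [[NSColor whiteColor] set];
    NSRectFillList(rects, count);

    [[NSColor grayColor] set];
    for (NSValue *value in self.shapeFrames) {
        NSRect frame = [value rectValue];
        // needsToDrawRect: tests against the detailed rect list for us.
        if ([self needsToDrawRect:frame]) {
            NSRectFill(frame); // stand-in for costly drawing
        }
    }
}

@end
```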

What about speeding up window resizing? Window resizing, obviously, is partly dependent on view drawing. Window resizing is something that heavily exercises view drawing. When the user grabs a corner of your resizable window and starts dragging it, basically, at the worst case, we have to redraw the entire window for every iteration, every step of that drag.

So if it takes your window longer to draw, you have slower updates, then your resize is going to chunk along. On the other hand, if you make the updates shorter and faster, we get a higher frame rate, a much more fluid motion, and a much more satisfying user experience.

So obviously, anything you can do to optimize view drawing will also give you a payoff in window resize. There are also some further things you can do specifically for the case where a window is being resized. Not just drawing your views more quickly, but preserving view content wherever possible.

One thing you can do that's been around for a long time is the concept of live resize, I think since the beginning of Cocoa. A view, when it's drawing itself, can check whether it's being drawn in the middle of a window resize using the inLiveResize message that's highlighted in orange up at the top there.

If you are in live resize, you may want to take advantage of the opportunity to do some less expensive drawing than you might otherwise do. Because you're deciding, hey, if I'm just going to be drawing a whole bunch repeatedly real fast, maybe I don't need to be pixel exact.

Maybe I can take a cheaper, faster route and trade some accuracy for performance, let's say. We also have viewWillStartLiveResize and viewDidEndLiveResize, two methods that you can override to prepare to enter live resize mode and to do any cleanup you need to do after exit.

So those are a couple of additional hooks that you get. All views get these messages sent to them, one before the window starts going into live resize and the other after it ends. So if you need to do setup or cleanup, that's where you would do it.
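A minimal sketch of these live-resize hooks, assuming a hypothetical `ResizeAwareView` whose "expensive" content is just a filled box. Redisplaying the whole view in both hooks is one reasonable choice so that the cheap and full renderings never appear side by side.

```objc
#import <Cocoa/Cocoa.h>

// Hypothetical view that draws a cheap approximation during live resize.
@interface ResizeAwareView : NSView
@end

@implementation ResizeAwareView

- (void)drawRect:(NSRect)dirtyRect {
    [[NSColor whiteColor] set];
    NSRectFill(dirtyRect);

    NSRect content = NSInsetRect([self bounds], 10.0, 10.0);
    if ([self inLiveResize]) {
        // Cheap path while the user is dragging: just an outline.
        [[NSColor grayColor] set];
        NSFrameRect(content);
    } else {
        // Stand-in for the full-quality, more expensive rendering.
        [[NSColor darkGrayColor] set];
        NSRectFill(content);
    }
}

- (void)viewWillStartLiveResize {
    [super viewWillStartLiveResize];
    [self setNeedsDisplay:YES]; // switch the whole view to the cheap look
}

- (void)viewDidEndLiveResize {
    [super viewDidEndLiveResize];
    [self setNeedsDisplay:YES]; // restore the full-quality rendering
}

@end
```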

In addition, on Tiger, we've been doing some further optimizations to allow windows to preserve content when they're resizing. One thing you'll notice when you resize a window, oftentimes a lot of the pixels aren't really changing. They don't need to be redrawn, rerendered. And to help support that, we've added new API in Tiger on NSWindow and NSView.

A window has the capability to be put in a mode where it preserves its content during live resize, and a view that is savvy about this mechanism can implement certain functionality to help support it, to make it possible. This is something that propagates down the view hierarchy. So in order for a view to take advantage of this, all of its super views have to be clued in and know how to use this mechanism. And we're working now actively to optimize the various views, especially the container views in AppKit that will help support this.

And an example of how you would use this: if you have a custom view class, MyView, you override the preservesContentDuringLiveResize method. This is a simple thing like isOpaque. You're just returning a BOOL to say, hey, I support this feature. I'm clued in. I'm in the know. And I know what to do to help support this live resize mode.

Then you override, usually, setFrameSize:. Or if you have some other tile method, sometimes you'll override a different method to do this. But usually, setFrameSize: is a good point to do this. You hook in. You pass on the message to super first to do whatever normally would need to be done. And then in addition, you can check whether you're in live resize mode. This is, again, that inLiveResize message.

If we are in live resize, then what we become responsible for doing, if we have declared that we support this feature, is dirtying the parts of myself, of the view, that will need to be redrawn to accommodate the new size. Usually, when you're redrawing in a window, if your view's content is pinned to the upper left corner of the window, which is the case that we support right now, then usually you have sort of an L-shaped update region on the right-hand side and the bottom side as the user is resizing the view larger. So you have these two strips that need to be drawn. And in fact, AppKit computes these and figures them out for you. So all you really need to do at this point is get the rectangles from AppKit-- and we have this method-- getRectsExposedDuringLiveResize:count:.

And you get back the count of the number of rects, and you get back the rectangles. It's guaranteed to be never more than four rectangles. And then you can iterate over that list of rectangles and mark yourself dirty in those areas. So this is something that you do only during live resize. When we're not in live resize, we simply do a full setNeedsDisplay: to require the entire view to be redrawn. So that's kind of not so interesting in code. Let's look at a demo and see how this works.
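The two overrides described above can be sketched as follows. The class name `MyView` comes from the talk; the rest is an assumption about what the slide showed, built only from the NSView methods named in the session.

```objc
#import <Cocoa/Cocoa.h>

@interface MyView : NSView
@end

@implementation MyView

// Declare that this view knows how to keep its existing pixels during live resize.
- (BOOL)preservesContentDuringLiveResize {
    return YES;
}

- (void)setFrameSize:(NSSize)newSize {
    [super setFrameSize:newSize]; // let NSView do its normal work first

    if ([self inLiveResize]) {
        // AppKit computes the newly exposed strips (at most four rects).
        NSRect exposed[4];
        NSInteger count = 0;
        [self getRectsExposedDuringLiveResize:exposed count:&count];
        for (NSInteger i = 0; i < count; i++) {
            [self setNeedsDisplayInRect:exposed[i]];
        }
    } else {
        // Outside live resize, redraw the entire view.
        [self setNeedsDisplay:YES];
    }
}

@end
```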

So I've got my old scrapbook app here. And one of the things you may have noticed as I started resizing my view is, boy, it's really chunking along there. I mean, that's just too slow. Now, to be completely honest in disclosing what I'm doing here, I'm not exactly trying hard to be very smart about the drawing I'm doing. We've only got two dozen photos up here. But what I'm doing is, these are full-size images in memory.

And I'm scaling them down every time I draw them. So I'm doing sort of an expensive, unnecessary operation. If I really wanted to make this fast, one of the things I might do is resample the images as they're loaded down to thumbnail size, since that's all I'm drawing them as. I don't need the full detail. But you could suppose that you have some other objects that are complex and expensive to draw.

So this is just kind of embarrassing. On a dual-proc G5, we shouldn't be chunking along this slow. And in fact, if we run Quartz Debug-- this is a useful tool for gross display speed measurement. If you're not familiar with it, there is a frame meter in Quartz Debug that's available from the Tools menu: Show Frame Meter.

And it tells you basically your number of frames per second on the red gauge there as you move windows around and as you resize. And if we start measuring this, just sort of informally dragging it around, I'm not even breaking 10 frames a second there. This is really embarrassing. So what can we do about this? Start by dragging OK Scrapbook to the trash. And luckily, I made a copy of it first, and we've got this better scrapbook.

And what I've done in my page view-- this is my custom document view class-- I have overridden, just as I said, preservesContentDuringLiveResize. I've got an ivar for this setting so that I can show it in the demo, but basically we'd always return YES if we know that we want to support this feature.

And then in setFrameSize:, just as I showed you, this is almost an identical snapshot of the code that was on the slide. We figure out if this option is turned on for the demo. OK, yeah, we want to preserve content. If we do want to do that, we check whether we're in live resize.

And just as a paranoia check, I checked whether we actually support this message in NSView in case I was running this on Panther. Remember, you can do these kinds of checks so that you can take advantage of a feature only if you're running on the newer version of the system. We get the rectangles being exposed. We mark them as needing display. And let's see if that was even worth all that trouble of two methods.

So we've got the application here, drag the album in, we've got our background image loading happening. So again, slow before. If I go here and turn on preserving content during live resize-- look at that. That's just silky smooth now. We go up to 60 frames a second. So that was significant in this case.

And in fact, while we're at it, before we leave the demo machine, let's look at another optimization that I put in there. So we've got preserving content, but supposing also that you just want to take advantage of inLiveResize, as I was saying, to just do simpler drawing. I've gone to an extreme here.

And I've got a mode where I can have my view just draw the photos as simple outlines, just real simple stroked boxes during live resize mode. I enable that optimization, and as soon as I-- things are normal when I'm just dragging things around-- but as soon as I grab the window corner and start to resize, everything changes to boxes.

And I can resize all I want. Obviously, this is extremely cheap drawing, so I could be on any old machine. Even if I'm on a limited memory iBook or something like that, this would be very fast. And this technique is usable way back on earlier versions of OS X.

So you probably wouldn't want to go to this extreme. I'm sort of doing it to make a point that you can do very different drawing, entirely different drawing if you want, or you can do somewhat similar drawing is the more likely case, that is somehow less expensive and approximate in some way while you're in live resize mode. So basically, you can do anything you want in this mode. And what I've done-- to implement this is fairly simple. There we go.

Here we have the drawRect: method. And I guess we didn't look at this before. I'm doing some of the other optimizations I recommended. We're getting a list of rects that we're drawing so that when we go to do our background fill, we can fill only that list of rects instead of trying to fill the whole view. We do clip for you when we're drawing just specific areas of the view.

So even if you don't obey that list of rects, even if you just look at the bounding box or just try to draw all over the view, we will clip your drawing just to the area that drawing is being requested in. But you still save something by not even issuing those drawing commands in the first place when you have a choice about it.

So we're drawing the background fill using that list of rects. And the interesting part here is, if I have this Draw as Outlines in Live Resize mode enabled, and we're in live resize mode, then instead of drawing the whole photo, I'll just get the frame of the photo, and I'll just use NSBezierPath to stroke a rect in white where it would be.

And that's pretty much all we have to do. The only other thing that I need to add is in viewWillStartLiveResize and viewDidEndLiveResize, I need to tell the view to redisplay itself. Because when I grab that corner of the window and I start dragging, if I don't do this, then I will get the complete photo drawing on the one hand.

And then as I start dragging, new stuff that appears will be drawn in outline. And that looks kind of inconsistent and weird. And same thing when you're exiting live resize mode. You want to redisplay the whole view at that point to go back into the old mode and reflect the change. So OK, if we could go back to slides, please.

Thank you. Scrolling-- not a whole lot to say about scrolling, except that obviously it hinges somewhat on view drawing performance. And in some sense, it's related a bit to window drawing, window resize performance, because it exercises view drawing intensively. So if you have a document view that you're going to put in a scroll view, and you want your scrolling to be smooth and fluid, one thing you can do is make sure that your document view uses these facilities that I pointed out earlier, and can efficiently draw small bands of its content.

Because those are the kinds of requests that you're likely to get during a scrolling operation-- lots of small little incremental scrolls, where you just have to draw a little strip. And the more efficiently you can determine, oh, I don't need to draw this, I don't need to draw that, all this other stuff that's not in that strip, the more quickly you can draw and the smoother your scrolling will be.

Another little feature of NSScrollView and NSClipView to be aware of: by default they are opaque. They are set to draw an opaque background behind the document view. So if the clip view area is sized bigger than the document view, you will get opaque fill all around there.

For best performance, leave it that way, unless you really, really want that outside margin area to show through to the views and the window behind. Because it does significantly impact drawing performance and scrolling performance within the scroll view. So that feature is there if you want to use it, but be aware that it has a performance cost.

String drawing, we provide very convenient methods as categories on NSString. You can draw a string at any point or within a bounding rectangle and supply a dictionary of text attributes, and we'll just do that for you. I mean that's very little work to draw a string. That's a one-liner, right? It's very convenient. But note that when you invoke these methods to do this convenient string drawing, what we have to do under the hood to implement that is potentially set up the text system, wire together the various objects.

If you've worked with the Cocoa text system, you know there's a lot of different things involved, text containers, layout manager, and the text view itself and so forth. So we've got to figure out how to lay out the glyphs for the string that you've given us, and then we've got to do the actual rasterization or drawing of the glyphs.

This is a lot of stuff going on under the hood. Now this is an area that's being actively optimized. We do provide some degree of caching. Obviously, it's sort of a general caching. We can't know what the behaviors of your applications will be because every application is different. But we do attempt to avoid these costs.

So you should see this improving on Panther and Tiger. This is continuing to evolve. But it may still be advantageous to use an old technique that was discussed years ago and is still best illustrated by the Worm example that's available in /Developer/Examples/AppKit. If you're repeatedly drawing the same string-- with the same layout-- you need only compute the glyph layout once.

The glyph layout isn't going to change. And so for that reason and others, you can get some performance benefit by assembling your text objects and hanging onto them. If you just do a little manual mucking with the text system, allocate your text storage object-- text storage is an object that contains both the characters and the attributes applied to runs of those characters. Allocate your own layout manager. Allocate a text container.

Wire them together and hang onto them. Keep them around. And use them. Every time you need to draw the glyphs, then all you have to do is message the layout manager directly. Say, drawGlyphsForGlyphRange:atPoint:. And if you play with the Worm example, you'll see that this makes a tremendous difference when you're rendering text repeatedly, especially on older versions of the system.
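A minimal sketch of that wiring, assuming a hypothetical wrapper class `CachedString` (this is not the Worm example's code). It uses `NSTextContainer`'s `initWithSize:`, which is available on macOS 10.11 and later; the instance variables assume ARC.

```objc
#import <Cocoa/Cocoa.h>

// Hypothetical wrapper that builds the text objects once and reuses them.
@interface CachedString : NSObject
- (instancetype)initWithString:(NSString *)string attributes:(NSDictionary *)attributes;
- (NSUInteger)glyphCount;
- (void)drawAtPoint:(NSPoint)point;
@end

@implementation CachedString {
    NSTextStorage *_textStorage;       // characters plus attributes
    NSLayoutManager *_layoutManager;   // glyph generation and layout
    NSTextContainer *_textContainer;   // region the text is laid out in
}

- (instancetype)initWithString:(NSString *)string attributes:(NSDictionary *)attributes {
    if ((self = [super init])) {
        _textStorage = [[NSTextStorage alloc] initWithString:string attributes:attributes];
        _layoutManager = [[NSLayoutManager alloc] init];
        _textContainer = [[NSTextContainer alloc] initWithSize:NSMakeSize(1.0e6, 1.0e6)];
        // Wire the three objects together once; layout is cached from here on.
        [_layoutManager addTextContainer:_textContainer];
        [_textStorage addLayoutManager:_layoutManager];
    }
    return self;
}

- (NSUInteger)glyphCount {
    return [_layoutManager numberOfGlyphs];
}

// Call from drawRect: (a flipped view gives the expected top-down text origin).
- (void)drawAtPoint:(NSPoint)point {
    NSRange glyphRange = [_layoutManager glyphRangeForTextContainer:_textContainer];
    [_layoutManager drawGlyphsForGlyphRange:glyphRange atPoint:point];
}

@end
```

A view would create one `CachedString` per repeated label, keep it in an instance variable, and call `drawAtPoint:` on each redraw instead of the NSString drawing conveniences.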

Bezier paths-- not a whole lot to say about Bezier paths. But if you're drawing really complicated Bezier paths-- I'm talking about paths with hundreds or thousands of elements-- be aware that there are crossing calculations that have to be done. And the scaling of the algorithms that necessarily need to be used to get the closures right and get everything right with the Bezier path rendering, it doesn't scale too well when you get to very complex paths.

So one thing you can do, if you're drawing paths like this that are very complicated and you don't need to get exact, exact results, is split those paths into segments and render them in sequence. And it looks like pretty much the same thing, except you get a big performance win from doing that. Also, if you have paths that you're creating over and over again, the same path, drawing it over and over again, stroking or filling it, keep them around.
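One way to split a long path is sketched below, under the assumption that the path is a simple polyline; the function name `PolylineChunks` is hypothetical. Each chunk starts at the last point of the previous one so the stroke stays continuous, although line joins at chunk boundaries are not computed, which is the loss of exactness mentioned above.

```objc
#import <Cocoa/Cocoa.h>

// Hypothetical helper: splits a polyline of `count` points into several
// NSBezierPath objects of at most `segmentsPerPath` line segments each
// (segmentsPerPath must be greater than 0).
static NSArray<NSBezierPath *> *PolylineChunks(const NSPoint *points,
                                               NSUInteger count,
                                               NSUInteger segmentsPerPath) {
    NSMutableArray<NSBezierPath *> *paths = [NSMutableArray array];
    for (NSUInteger start = 0; start + 1 < count; start += segmentsPerPath) {
        NSUInteger last = MIN(start + segmentsPerPath, count - 1);
        NSBezierPath *path = [NSBezierPath bezierPath];
        [path moveToPoint:points[start]];
        for (NSUInteger i = start + 1; i <= last; i++) {
            [path lineToPoint:points[i]];
        }
        [paths addObject:path];
    }
    return paths;
}
```

In `drawRect:`, the view would stroke each returned path in sequence, and could keep the array around between redraws rather than rebuilding it.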

And this applies in general to a lot of different types of graphics objects. If you create them and hang on to them, there's a lot more caching that we can do and that other layers below us, like Core Graphics, can do to recognize, oh, yeah, that's the same thing I saw before. I'll just draw that again. Object allocation, initialization, deallocation.

There are a couple of interesting things to be aware of with relation to NSImage. First, setDataRetained:. We have this notion where an image is able to reference its data source rather than actually containing the image data. If you initialize an NSImage using either initByReferencingFile: or initByReferencingURL:, you will get an image that, for example, when it's archived, instead of archiving itself with all the image data that it loaded from that file, just archives the reference to the file or URL. So that creates a very compact representation. Some people use it for that purpose so that they can maintain the reference rather than just copy the image in.

But be aware that if you're drawing that image repeatedly, and in particular if you're changing its size, this may cause NSImage to go back to disk or go back to retrieve the image from the URL every time that it needs to resize it, recache it at a new size to draw. Remember, NSImage does do some caching in the background that's transparent to you.

So if you're using a referenced image and you're resizing a window and things are chunking along, this is a thing to look into. Check whether this setting is on and be aware of it. And if you need to resize an image, try and resize it once and get the size you want and leave it that way. There's also this notion of caching separately.

When you've used NSImages, if you've ever dumped the debug description for an NSImage, you may have seen cached image reps. They're part of the public API, as NSCachedImageRep. A cached image rep is a representation of an image that usually is generated from something else, some drawing you've captured from a view, or from resizing an image that you originally loaded as a bitmap.

A cached image rep is an image that lives in an offscreen window somewhere, and that's where the window server is keeping it in a screen-compatible pixel format, usually ready to go. This is sort of an optimization, keeping these cached images. Now by default, an image is not cached separately in its own window. NSImage is allowed to cache images together and to just sort of grow this shared window or shared windows as it sees fit when it needs to load additional images.

If you have large transient images that you're loading using NSImage, and you load them, and then they go away, and you're not going to replace them with another large image, and you want to avoid the potential resource usage cost of this cache window growing and growing and growing, you may want to call setCachedSeparately: with YES, to force the use of a separate offscreen window for that particular image.

That image will stand alone in the offscreen window server cache that NSImage uses. So this is a good strategy to use for transient images. So those are issues that generally pertain to AppKit APIs. Let's dive in deeper now and look at things that are more at the Foundation level.

In particular, we'll look at notifications. These are a venerable and widely used, and still very powerful and important, mechanism in AppKit. We'll look at ways of accessing collections more efficiently (and strings are sort of a kind of collection); using autorelease and the memory management mechanisms that Objective-C and Cocoa provide; the immutability concept; and also techniques for working with property lists.

So notifications, first of all. Notifications are a very powerful and flexible mechanism for a given object to broadcast, volunteer information about some interesting event or potentially interesting event that's occurred in it to any number of other observers that the object isn't aware of. The observers don't need to be aware of one another. They can all register with the central notification center. And the notification center mechanism provides sort of the loose coupling that allows all these objects to communicate about a particular topic without necessarily knowing about one another.

As an observer, you can subscribe for a specific notification posted by a specific object, or you can cast wider nets and ask for all notifications posted by a given object or all instances of a particular notification, regardless of the object that posted them. You can also use the distributed notification mechanism to pass notifications across process boundaries. This is a very powerful general facility. It's been in the AppKit since the beginning.

But there are certain performance issues associated with its use, particularly abuse. One thing to be aware of is that when you post a notification, there's a cost there, even if nobody is listening. Obviously, we've tried to optimize this to make it a fast path within the notification center's dispatch mechanism. But even if nobody's listening, there's a certain cost involved in checking to see whether anyone is listening. And this gets to be a problem in particular if you are sort of speculatively writing classes.

You're writing code maybe that's going to go in a framework, once again, that's sort of intended for general reuse by potentially unknown clients, clients you may not know about right now. You may not know, well, what sorts of notifications might they be interested in? What kinds of changes in me might they want to know about? So you get into this problem of sort of speculative posting. You tend to err on the side of caution and post notifications for all kinds of things, potentially, just in case somebody might be interested in listening for that.

[Transcript missing]

So what can you do if you are going to use notifications to get around this? In general, help us help you. Be selective as much as you can about what kinds of events you really need to broadcast to the world. Be as specific as you can as an observer when you're registering for notifications.

If you know the name of the notification you want and you know the object, register using a non-nil name and object. These types of entries in the dispatch table are, in general, easier for us to optimize. You can add yourself as an observer with nil as the name and just the object. Remember, as an alternative to that, you can also add yourself several times as an observer, once for each notification. Depending on what version of the system you're running on, that may or may not be a performance advantage. So measure to see if that's an issue.
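A minimal sketch of the most specific kind of registration, with both a name and an object supplied. The notification name `PhotoDidChangeNotification` and the class `PhotoObserver` are hypothetical; the code assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical notification name for illustration.
static NSString * const PhotoDidChangeNotification = @"PhotoDidChangeNotification";

@interface PhotoObserver : NSObject
@property (nonatomic, readonly) NSUInteger changeCount;
- (instancetype)initWithPhoto:(id)photo;
@end

@implementation PhotoObserver

- (instancetype)initWithPhoto:(id)photo {
    if ((self = [super init])) {
        // Non-nil name AND non-nil object: the narrowest possible registration.
        [[NSNotificationCenter defaultCenter] addObserver:self
                                                 selector:@selector(photoDidChange:)
                                                     name:PhotoDidChangeNotification
                                                   object:photo];
    }
    return self;
}

- (void)photoDidChange:(NSNotification *)notification {
    // Keep handlers short: the poster is blocked until this returns.
    _changeCount++;
}

- (void)dealloc {
    [[NSNotificationCenter defaultCenter] removeObserver:self];
}

@end
```

An observer registered this way hears only changes posted by the one photo it was given; posts of the same name from other objects never reach it.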

Also, obviously, the notification handler methods block while they're processing and don't return control to the poster until they're done, so make these as efficient as possible. And if they don't have to do work right away and you can defer it, defer that work.

And as I said, avoid repeated removal and addition of observers if you possibly can. One way around this is to consider using a longer-lived intermediary object that keeps track of all of these transient objects and directly messages them when it receives the notification. As an example of this, let's say we have an originator object that volunteers some information about itself by posting a notification.

It posts the notification to the Notification Center. And if we do this in a really simplistic way, we'll have all these transient objects here. They're coming and going. They're adding themselves as observers to the Notification Center. They receive the notifications directly from the Notification Center when they are posted.

An alternative to this is to drop a longer-lived object in the middle there and have it be the only observer that is added to the Notification Center. It's maybe some object that knows about all of these other transient objects and can more efficiently dispatch this information to them by, say, sending a simple Objective-C message to each object.

You've got less overhead there because you've got fewer entries in the Notification Center dispatch table at any one time. And also, you're not adding and removing them and churning that table over and over again. So this is one technique you can use if you're seeing a lot of load in the Notification Center. Consider using intermediate objects.
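The intermediary pattern can be sketched like this. All names here (`ChangeRelay`, `ChangeListener`, `Thumbnail`, `OriginatorDidChangeNotification`) are hypothetical, and holding listeners weakly in an `NSHashTable` is one possible design rather than anything the session prescribes. The code assumes ARC.

```objc
#import <Foundation/Foundation.h>

static NSString * const OriginatorDidChangeNotification = @"OriginatorDidChangeNotification";

// What the relay sends directly to each transient object.
@protocol ChangeListener <NSObject>
- (void)originatorDidChange:(id)originator;
@end

// Example transient object.
@interface Thumbnail : NSObject <ChangeListener>
@property (nonatomic, readonly) NSUInteger updateCount;
@end

@implementation Thumbnail
- (void)originatorDidChange:(id)originator { _updateCount++; }
@end

// Long-lived intermediary: the only observer in the notification center.
@interface ChangeRelay : NSObject
- (instancetype)initWithOriginator:(id)originator;
- (void)addListener:(id<ChangeListener>)listener;
- (void)removeListener:(id<ChangeListener>)listener;
@end

@implementation ChangeRelay {
    NSHashTable *_listeners; // weak references; listeners may come and go
}

- (instancetype)initWithOriginator:(id)originator {
    if ((self = [super init])) {
        _listeners = [NSHashTable weakObjectsHashTable];
        [[NSNotificationCenter defaultCenter] addObserver:self
                                                 selector:@selector(relayChange:)
                                                     name:OriginatorDidChangeNotification
                                                   object:originator];
    }
    return self;
}

- (void)addListener:(id<ChangeListener>)listener { [_listeners addObject:listener]; }
- (void)removeListener:(id<ChangeListener>)listener { [_listeners removeObject:listener]; }

- (void)relayChange:(NSNotification *)notification {
    // Plain message sends; the notification center's table is never touched.
    for (id<ChangeListener> listener in [_listeners allObjects]) {
        [listener originatorDidChange:[notification object]];
    }
}

- (void)dealloc {
    [[NSNotificationCenter defaultCenter] removeObserver:self];
}

@end
```

Transient objects register with the relay as they appear and disappear, so the notification center's dispatch table holds a single stable entry.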

As an alternative to using notifications, you may not have considered key value observing and key value binding. These are powerful technologies that were introduced in Panther that basically enable one object to observe value changes in an attribute of another object or bind one of its attributes to the attributes of another object. You may not have thought of these as alternatives, but really they're just sort of another way of propagating changes through your application.

One advantage to this approach is you don't have to anticipate what others might be interested in. You just have to make your accessor methods key value coding compliant. There's no performance penalty for unobserved changes because you're not posting, trying to anticipate what others may be interested in. There's no overhead involved for an object whose attributes aren't observed or for a particular attribute that isn't observed to change.

So obviously that lifts the burden from you, too, of having to figure out what others might be interested in. And it facilitates a similarly loose coupling to what notifications provide. I'm not saying that notifications are going away by any means. They're hardwired into a lot of what we do in the kit. We have notifications that objects advertise. They'll continue to provide those.

Like NSView provides NSViewBoundsDidChangeNotification and NSViewFrameDidChangeNotification when its geometry changes. There are lots of things that are built on that. But if you're writing new modern code, you might want to consider KVO and KVB as another way to go if notifications aren't appropriate performance-wise.
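For comparison with the notification approach, here is a minimal key-value observing sketch. The `Photo` and `TitleWatcher` classes are hypothetical; the model only needs a KVC-compliant accessor and posts nothing itself. The code assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical model object: nothing here anticipates who might observe it.
@interface Photo : NSObject
@property (nonatomic, copy) NSString *title;
@end

@implementation Photo
@end

static void *TitleContext = &TitleContext; // unique token for this observation

@interface TitleWatcher : NSObject
@property (nonatomic, readonly) NSUInteger changeCount;
- (instancetype)initWithPhoto:(Photo *)photo;
@end

@implementation TitleWatcher {
    Photo *_photo;
}

- (instancetype)initWithPhoto:(Photo *)photo {
    if ((self = [super init])) {
        _photo = photo;
        [_photo addObserver:self
                 forKeyPath:@"title"
                    options:NSKeyValueObservingOptionNew
                    context:TitleContext];
    }
    return self;
}

- (void)observeValueForKeyPath:(NSString *)keyPath
                      ofObject:(id)object
                        change:(NSDictionary *)change
                       context:(void *)context {
    if (context == TitleContext) {
        _changeCount++;
    } else {
        [super observeValueForKeyPath:keyPath ofObject:object change:change context:context];
    }
}

- (void)dealloc {
    [_photo removeObserver:self forKeyPath:@"title" context:TitleContext];
}

@end
```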

Also, obviously, if your problem is simpler and you don't have a whole lot of objects you have to notify, consider the delegation pattern that we use throughout the app kit and foundation. Just have an object have a delegate that it passes off a message to. It's a simple Objective-C message send to a single receiver with the same cost.

[Transcript missing]

If you have an ordered collection, and specifically an NSArray, you can use the objectAtIndex: method and getObjects:range: to directly access the elements much more efficiently than you might be able to with an enumerator. And in particular, if you're doing random access, this is the only way to really do it.

So if we're iterating through an array using these methods instead of using an enumerator, one thing you want to remember to do is cache your count. And in fact, this goes for any properties of an object that are invariant over the course of an iteration. Move the getting of those values outside of the iteration, and you save time. In other languages that you may have used, this may not be as important.

But the Objective-C compiler is not able to know that. For example, if I had written i < [anArray count] there, the compiler would have no way of knowing that the count is not changing. It's got to send that message every time it checks the termination condition for the for loop. So keep that in mind. If there are messages that you send to get invariant quantities, cache those values instead of asking for them each time through the loop.
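A small sketch of hoisting the invariant `count` message out of the loop condition; the function name `TotalLength` is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: sums the lengths of all strings in an array.
static NSUInteger TotalLength(NSArray<NSString *> *strings) {
    NSUInteger total = 0;
    NSUInteger count = [strings count]; // sent once, not once per iteration
    for (NSUInteger i = 0; i < count; i++) {
        NSString *string = [strings objectAtIndex:i];
        total += [string length];
    }
    return total;
}
```

Writing `i < [strings count]` in the loop condition would behave the same but send one extra message per iteration.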

Calling a method through an IMP. I don't know how many of you are familiar with this technique. It's been around for a long time. It should be used with care, as for one thing, it breaks polymorphism. It only works when all of the objects that you are sending the message to are in general of the same type. Actually, they really just need to have the same method implementation. So usually, maybe a common base class would be sufficient for a given message that you want to send them.

What an IMP is, is an instance method pointer, a pointer to the implementation of a method for a particular message. You can get that and then use that as basically a function pointer to call that method directly. And this is particularly useful where you have a whole bunch of objects of the same type, and you want to really minimize the overhead of communicating with them.

Here in this code sample, we say we have a doSomethingUsefulWith: message that we want to send to all the objects in an array. First thing we need to do is get the IMP for that message. And I take the object at index zero as an arbitrary object in the array. I'm assuming that this is a heterogeneous collection, I mean a homogeneous collection rather, excuse me. And these objects are all going to have the same implementation. This code will break otherwise. So I get the IMP. So basically now I've got the function pointer.

I don't have to go through the Objective-C dispatch tables for methods. I just call that method directly as if it were a function. So I get my count of the objects in the array. And I'm also here using, it's not highlighted, but I'm using the getObjects: method, which enables you to retrieve at once a whole bunch of objects. In fact, in this case, all of the objects out of an array. There's also a getObjects:range: variant that has a range parameter. That lets you specify a subrange of the array whose elements you want to get.

So here what we're doing is we're getting all of the objects out and we're mallocing a memory block to hold the ids, the instance pointers for all of these objects so that we have them in a standard C array and then we can access them any way we want. It's not essential in this case, but it's something that might be useful if we wanted to do other really fast random accesses. So we get the objects out and then for each iteration, all we have to do is call through the function pointer.

And the important thing here to note is the first two parameters. When you send a message to an object using the usual Objective-C syntax, the target is specified as the first thing in the brackets. And then the selector is sort of collected together as the message to send, from the different selector elements if you have more than one parameter.

These are sort of implicit parameters to the method implementation that Objective-C usually hides from you. So when you do call through an IMP, you have to provide those yourself. So as the first parameter to this IMP that we're calling, we provide the object that's the target of the message.

We provide the selector as the second parameter, in case the implementation wants to check it. The method is obviously going to need self, and it might actually check the command that is being sent to it, the selector. So we send those. And then we provide an argument here, because in this case our doSomethingUsefulWith: needs a something to do something useful with.
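The slide's code is not part of this transcript, so what follows is only a sketch of the technique as described, not the speaker's sample. It assumes manual reference counting (no ARC), and the class `Worker`, the typedef `DoSomethingIMP`, and the function `ApplyToAll` are hypothetical names. It uses the `getObjects:range:` variant over the full range, and casts the IMP to a function-pointer type whose first two parameters are the target and the selector:

```objc
#import <Foundation/Foundation.h>
#include <stdlib.h>

// Hypothetical element class standing in for the objects in the array.
@interface Worker : NSObject {
    NSUInteger _uses;
}
- (void)doSomethingUsefulWith:(id)something;
- (NSUInteger)uses;
@end

@implementation Worker
- (void)doSomethingUsefulWith:(id)something {
    if (something != nil) _uses++;
}
- (NSUInteger)uses { return _uses; }
@end

// Function-pointer type for the method: self, _cmd, then the declared argument.
typedef void (*DoSomethingIMP)(id, SEL, id);

static void ApplyToAll(NSArray *workers, id something) {
    NSUInteger count = [workers count];   // cached once
    if (count == 0) return;

    SEL sel = @selector(doSomethingUsefulWith:);
    // Look the implementation up once, from an arbitrary element. This is
    // only valid if every element shares the same implementation.
    DoSomethingIMP imp =
        (DoSomethingIMP)[[workers objectAtIndex:0] methodForSelector:sel];

    // Copy all of the object pointers into a plain C array in one call.
    id *objects = (id *)malloc(count * sizeof(id));
    if (objects == NULL) return;
    [workers getObjects:objects range:NSMakeRange(0, count)];

    for (NSUInteger i = 0; i < count; i++) {
        // Target and selector are passed explicitly; no message dispatch here.
        imp(objects[i], sel, something);
    }
    free(objects);
}
```

If the array could ever hold objects of unrelated classes, an ordinary message send is the safe choice, since the cached pointer would then call the wrong implementation.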

So that was collections in general. String operations are, in some sense, sort of a special case. I mean, a string is really just a collection of characters, right? So a lot of the same conceptual ideas apply. In particular, you want to avoid intensive character-by-character access to a string if you can. characterAtIndex: is there if you just need to grab a character or two out of a string once in a while. But it's not really intended for heavy use.

And again, this is the kind of thing that the Objective-C compiler can't really inline. It needs to send the message each time in order to actually get the character. So as alternatives, look at all of the methods that NSString provides. It's a big header, one of the bigger headers we have.

And there are all kinds of useful methods that can directly access the internals of the string and can do more efficient operations if you're looking for substrings, doing searches through a string, and so forth, or doing mutations to mutable strings. Try to use an NSString method when there's one available. When you're doing sequential scanning or parsing through a string, searching for substrings, and wanting to keep track of your position as you go, remember NSScanner provides a whole bunch of methods for doing that.
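To illustrate the NSScanner suggestion, here is a small sketch that parses a comma-separated list of integers while the scanner tracks the position. The function name `SumOfIntegers` and the input format are assumptions made for the example, and `scanInteger:` is a newer method than the session itself:

```objc
#import <Foundation/Foundation.h>

// Illustrative helper (not from the session): adds up the integers in a
// comma-separated string such as @"10, 20,30".
static NSInteger SumOfIntegers(NSString *text) {
    // By default the scanner skips whitespace and newlines between tokens.
    NSScanner *scanner = [NSScanner scannerWithString:text];
    NSInteger sum = 0;
    while (![scanner isAtEnd]) {
        NSInteger value;
        if ([scanner scanInteger:&value]) {
            sum += value;
        } else if (![scanner scanString:@"," intoString:NULL]) {
            break;  // neither a number nor a separator: stop parsing
        }
    }
    return sum;
}
```

The scanner advances through the string internally, so there is no per-character `characterAtIndex:` call in the parsing loop.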

And if neither of those facilities provides what you need, you might just want to fetch the characters in blocks. And you can use the getCharacters:range: method to do that. It's a similar thing to what I was doing with getObjects: for NSArray. You can get all of the characters out as Unicode characters into a standard C array. And then you can do whatever you want with them, including using C functions and so forth.
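A minimal sketch of block-wise character access, assuming a fixed buffer of 256 characters (an arbitrary size chosen for the example) and a hypothetical function name `CountSpaces`:

```objc
#import <Foundation/Foundation.h>

// Illustrative helper (not from the session): counts space characters by
// pulling the string's contents out in blocks instead of one at a time.
static NSUInteger CountSpaces(NSString *string) {
    enum { kBlockSize = 256 };            // arbitrary block size for the sketch
    unichar buffer[kBlockSize];
    NSUInteger length = [string length];  // cached once
    NSUInteger spaces = 0;

    for (NSUInteger start = 0; start < length; start += kBlockSize) {
        NSUInteger n = length - start;
        if (n > kBlockSize) n = kBlockSize;
        // One message fetches up to kBlockSize UTF-16 code units.
        [string getCharacters:buffer range:NSMakeRange(start, n)];
        for (NSUInteger j = 0; j < n; j++) {
            if (buffer[j] == ' ') spaces++;
        }
    }
    return spaces;
}
```

The inner loop works on a plain C array, so it costs one message send per block rather than one per character.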

Object management and managing responsibility for objects is a big part of any object-oriented application, and Cocoa applications are no different. We provide a number of convenience methods, factory methods, class methods on various objects for very quickly and easily, with very little code, getting back an auto-released object. It's important to remember that these are auto-released by convention when you use a factory method.

So if I call [NSMutableArray array], I'm getting back an array that has been auto-released, so it's been added to the auto-release pool for the current thread, and it's going to have to be processed by that auto-release pool later. This is a lot more convenient than writing, say, [[NSMutableArray alloc] init], and then remembering to release the array later.

It's less code, and so it's convenient, and it's okay to use in the majority of cases, unless performance is really being impacted. That can happen if you have large numbers of objects or large numbers of iterations that you're dealing with.

Also remember that there's no need to auto-release a temporary object that you're using just within a method body. It's something you're not returning back from the method. You're just creating it, and you're going to use it temporarily to do some operation and get rid of it. That would be a case where you can alloc init and release instead. Again, not in every case is this really necessary. But if it becomes a significant fraction of what you're doing, you might want to consider that. Keep that in mind.
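As a sketch of that pattern, assuming manual reference counting (no ARC) and a hypothetical function `CountDistinct`, a temporary that never leaves the function can be released explicitly:

```objc
#import <Foundation/Foundation.h>

// Illustrative helper (not from the session): counts distinct objects in
// an array using a temporary set that is never returned to the caller.
static NSUInteger CountDistinct(NSArray *items) {
    // alloc/init gives us ownership, so the set is never placed in the
    // autorelease pool. [NSMutableSet set] would be autoreleased instead.
    NSMutableSet *seen = [[NSMutableSet alloc] init];
    [seen addObjectsFromArray:items];
    NSUInteger distinct = [seen count];
    [seen release];   // disposed of immediately, not at pool cleanup
    return distinct;
}
```

The trade-off is the one described in the talk: one more line of code and a release to remember, in exchange for the object going away right when you are done with it.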

This is obviously most worthwhile for objects that are created and auto-released in large numbers, and it avoids potential spikes in memory or resource usage. Remember, when an object is auto-released, all these auto-released objects start to sort of pile up and stick around because they're waiting around to be disposed of when the auto-release pool is cleaned up.

By default, unless you're creating your own auto-release pools, that's when your code kind of finishes up and returns control back to the run loop, usually after processing a user input event or something like that. When you get back to the run loop, all that stuff gets cleaned up.

But in the interim, if you're doing a whole lot of operations and creating a lot of auto-released objects and maybe a lot of big auto-released objects, if you're doing, say, image processing or image loading, as I was doing, you can get spikes in memory usage, where instantaneously you get peaks in your usage of memory. So, for example, that's the kind of thing that you would see using ObjectAlloc, one of the performance tools that we provide for you, and it's also a useful debugging tool.

So we have an application here, and I'll magnify, where we've gone through a loop and we've basically allocated 10,000 auto-released strings, a real simple, trivial example. We have a current count of only 1,097 strings, but at the peak, we had about 11,000 strings that were running around in this application, hanging around, waiting to be auto-released at the end of the loop and as we got back to the main run loop. So this is what an auto-release spike looks like.

When you see these long bars in ObjectAlloc, you may want to look at what your auto-release patterns are and whether there are cases where you could use additional auto-release pools to provide cleanup before memory usage gets out of control. It's not a big deal if you're not using a whole lot of memory, but if you're really using big chunks of memory, remember you're sharing the system with a whole bunch of other processes, other applications, and if memory is tight, you may start swapping and thrashing, and it becomes a performance issue.

So again, an application's main thread has a default auto-release pool that's provided for you when you auto-release. You don't have to worry about it. It just goes to that. You can bracket operations with your own auto-release pools, as we've already seen.
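A minimal sketch of bracketing a long loop with your own pool, assuming manual reference counting (NSAutoreleasePool cannot be used under ARC, where an `@autoreleasepool` block plays the same role). The function name `TotalDescriptionLength` and the interval of 1,000 iterations are illustrative choices, not values from the session:

```objc
#import <Foundation/Foundation.h>

// Illustrative helper (not from the session): creates many autoreleased
// temporaries but cleans them up periodically instead of letting them
// pile up until control returns to the run loop.
static NSUInteger TotalDescriptionLength(NSUInteger iterations) {
    NSUInteger total = 0;
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];

    for (NSUInteger i = 0; i < iterations; i++) {
        // stringWithFormat: returns an autoreleased string.
        NSString *s = [NSString stringWithFormat:@"item %lu", (unsigned long)i];
        total += [s length];

        // Every 1000 passes, release the pool (which releases the
        // temporaries collected so far) and start a fresh one.
        if ((i + 1) % 1000 == 0) {
            [pool release];
            pool = [[NSAutoreleasePool alloc] init];
        }
    }

    [pool release];
    return total;
}
```

Creating and releasing a pool has its own cost, so cleaning up every so many iterations, rather than on every pass, is one reasonable compromise when the temporaries are small.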

[Transcript missing]

So we're almost done here. One last thing I want to point out to you. We have a binary format for property lists. You've all seen the XML format. Property lists provide a convenient way, if you have data structures, data storage that uses dictionaries, arrays, strings, and all the sort of standard foundation types, they provide a convenient way to serialize those out, save them out to disk, and so forth.

The XML format is great. It's human readable. You can go in and hand edit it. If your user has corruption in a document, they can send it to you, and you can read it and say, oh, this is where this went wrong. But we also have a binary format that is smaller. It is faster to read and parse. It is very easy to specify it when you're serializing out your data. It's just another parameter. And we have a plutil command line tool that you can use to convert back and forth between binary and XML.

All you have to do is specify the format to be binary when you're serializing your property list using the NSPropertyListSerialization APIs. That's it. And then you get a binary representation handed back to you instead of the textual XML representation. When you're reading back that data, you don't even have to worry about it.

It doesn't matter. The NSPropertyListSerialization APIs will figure out for you whether they're being handed binary or textual XML property list data. So if you want, that information is available to you. It will tell you, oh, I opened this document. It was XML, or this was binary. But you don't even have to be concerned with it when you're reading the files back.
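A sketch of writing and reading a binary property list follows. The wrapper names `BinaryPlistData` and `PlistFromData` are hypothetical, and the selectors used are the current NSPropertyListSerialization ones, which may differ from the methods shown on the slides at the time:

```objc
#import <Foundation/Foundation.h>

// Illustrative helper: serializes a property list object (dictionary,
// array, string, and so on) to the binary format. Returns nil on failure.
static NSData *BinaryPlistData(id plist) {
    return [NSPropertyListSerialization
        dataWithPropertyList:plist
                      format:NSPropertyListBinaryFormat_v1_0  // the only change from XML
                     options:0
                       error:NULL];
}

// Illustrative helper: reads property list data in either format. If
// outFormat is non-NULL, it receives the format that was detected.
static id PlistFromData(NSData *data, NSPropertyListFormat *outFormat) {
    return [NSPropertyListSerialization
        propertyListWithData:data
                     options:NSPropertyListImmutable
                      format:outFormat
                       error:NULL];
}
```

Passing `NSPropertyListXMLFormat_v1_0` instead would produce the textual representation, and the reading side stays the same for both. For files already on disk, `plutil -convert binary1` and `plutil -convert xml1` do the equivalent conversion from the command line.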

So that's it for Foundation Techniques. Just some quick conclusions to help you optimize better. Know the framework. That's, I hope, the take-home lesson today. And I hope I've provided something new for everyone with respect to that. Know the functionality it provides. Know where there are different ways to perform a given task that may have different performance characteristics.

Work with the framework wherever you can. Where appropriate, look beyond the framework. Remember, we have a lot of facilities available to Cocoa apps, things like Core Data and QuartzCore you've heard about at the talk here. OpenGL is the fast path to the hardware for rendering not just 3D, but also 2D.

We have the Accelerate framework that encapsulates AltiVec-based image processing and general computation. And remember, use the provided performance tools to measure. Because if you don't measure, you don't really know what's going on. And you may actually waste time optimizing in areas that you think a lot of time is being spent in.

But you may be surprised to find that time is really being spent elsewhere. And you could spend your time more productively. For more info, we have documentation about performance in general and drawing performance in particular that you might want to look at on the ADC home page. Matthew Formica is the contact for any questions that you may have subsequent to this conference.