Getting Started with Instruments - WWDC 2008

Tools • 53:23

Instruments is a versatile and powerful software analysis tool introduced in Mac OS X Leopard, with added support for iPhone OS. Instruments brings context to your analysis, allowing you to view multiple aspects of your application's performance over time and easily correlate events. This introductory session will help you understand how you can utilize this tool in your own development, rapidly identify problems in your code, and write better performing applications for the Mac and iPhone.

Speakers: Daniel Delwood, Lynne Salameh

Unlisted on Apple Developer site

Downloads from Apple

SD Video (649.2 MB)


Transcript

This transcript was generated using Whisper, it has known transcription errors. We are working on an improved version.

Howdy. I'm Daniel Delwood, and this is Getting Started with Instruments. So today we're going to talk about Instruments and try to take you from zero to 60 with the tool so that you can go out and improve your existing Mac and iPhone applications and really get the best performance and efficiency out of them.

So starting off, what are we going to talk about today? Well, first of all, what is Instruments? We'll talk about when to use it. And then we'll go on to talk about some key elements in Instruments that you need to understand. And then we'll use those in an example workflow demo.

And then we'll go on to actually diagnose an existing performance problem using sampling technologies. And then finally, Lynne will come up to talk about memory tools, analyzing dispatch, and some other features we don't have time to fully cover today. So first of all, what is Instruments? Well, Instruments is a powerful, convenient, and flexible meta-analysis tool with the simplicity of an iApp.

Now, I know that sounds like a lot of a mouthful, but what I want you to understand is that it's an analysis tool, not just a performance tool. So you can look into the behavior of your application, as well as gather performance metrics such as CPU usage, memory usage, etc.

Now, by simplicity, what we've tried to do is put a unifying user interface on top of analyzing both the Mac and the iPhone. And so, while we're going to talk exclusively about the Mac in these examples and demos, everything we talk about today is directly applicable to the iPhone, and will be necessary for understanding lots of the stuff later on this week in the iPhone-specific talks. So first of all, when would you use Instruments? Well, if your application is performing badly, if it's spinning, you may want to find out what's taking the time, what part of your code is responsible.

What files is your application accessing during an operation? This is a behavioral question that you can also ask. Maybe it's when you click a button. Maybe you're interested in files accessed on launch. Is your application leaking memory? When or how much is your method called? Again, it's also about behavior analysis. Is your application causing lots of paging on the system? Or maybe it's just time to optimize and you want to get that low-hanging fruit to get your performance up and a better running app.

So Instruments can help with these because of the following characteristics. First of all, it's got event-driven profiling. And this is where you really sort of cast out your net and say, I'm interested in certain events. And instead of polling for those events, Instruments waits for them to happen and reports them instantly to you.

There's live analysis, which allows you to run both Instruments and your target application at the same time, and really interact with both. So if you want to click a button and see what happens and analyze the data, you can do that. We've got time correlation, so you can see different types of data in time and see when they happened relative to each other.

We've got many powerful data mining abilities for you to really dig in and find out what part of your code is responsible. And the whole goal of all of this is to streamline your workflow so that you can very quickly move from editing in Xcode, building, analyzing, finding out what your problems are, and fixing, and really just completing the cycle there. Lastly, Instruments harnesses the power of DTrace. And you may have heard of this technology. It was introduced in Leopard. And it's a very, very cool thing. Now, what is DTrace?

Well, it's a dynamic tracing framework originally developed by our friends at Sun. Now, by dynamic, this means that you don't have to recompile your code. You can instrument any running code. So this can be debug, release, anything. The idea is that DTrace is based on probes. And these probes you can specify at different points, whether that's a system call or the beginning of a function or the end of a function. Very flexible. And when the event occurs, these probes fire.

When they fire, they execute script actions, and these can be very powerful things like gathering a backtrace. You can find out the parameters to the function that you're probing. And it's just overall very powerful. And what we've tried to do is bring that to you in Instruments, letting you use the power of DTrace without having to understand D scripting. Although if you do understand D scripting, you can build your own custom instruments and go from there. If you'd like to learn more about DTrace on Mac OS X, there's a session later on this week.
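As a rough analogy for the probe model described here (this is not DTrace, and not how Instruments is implemented), the following Python sketch uses the standard `sys.setprofile` hook to fire an action on function entry and return without changing the target function. The names `probe`, `fired`, and `save_document` are hypothetical.

```python
import sys

fired = []  # records collected by the probe action

def probe(frame, event, arg):
    """Profile hook: fires on Python function entry ('call') and exit ('return')."""
    name = frame.f_code.co_name
    if name != "save_document":   # only instrument the function we care about
        return
    if event == "call":
        # Action on entry: capture the arguments, like reading probe parameters.
        fired.append(("entry", name, dict(frame.f_locals)))
    elif event == "return":
        # Action on exit: capture the return value.
        fired.append(("return", name, arg))

def save_document(path):
    return len(path)

sys.setprofile(probe)      # attach the probe; save_document itself is unchanged
save_document("test.sketch")
sys.setprofile(None)       # detach

for record in fired:
    print(record)
```

The point of the sketch is only that the instrumented code needs no recompilation or edits; the probe is attached and detached at run time.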

So let's talk about some instrumentation. Well, first of all, built on top of the Darwin Foundation, we've got lots of instruments built on the OS X framework, such as lots of memory tools, which we'll talk about later, sampling tools to find out where your application is spending its time.

And then very specific tools, such as the OpenGL Driver Monitor or the user interface event recorder. And then finally, on top of the power of DTrace, there's a lot of tools, such as the file activity instruments and the garbage collection statistics that you can get, and also, like I said, you can build your own DTrace-based instruments.

I've been talking a lot about what Instruments is. What are some of the key elements you should understand before using it? When you start up Instruments, the first thing you'll see is the trace templates. These are task starting points. So the idea is when you usually launch this tool, you've got a question, an idea in mind that you want to solve.

If you're looking for leaks, there's a trace template for that. And maybe you're just curious about the performance of core data in your application. So these are very targeted templates to give you a starting point to the questions you may want to ask. I've mentioned an instrument many times, and what is it? Well, it's really a tool that just gathers data. Now, this is analogous to a doctor's stethoscope, which you'd use to find out the behavior and really diagnose what's going on with your heart.

And in Instruments, we've got lots of things, such as this file activity instrument, which records open, close, and stat operations, and lets you dig into those. Now, where do you get the instruments? We've got a library, and this lists all of our instrumentation. So if you're interested in sampling, there's tools for that.

There's memory tools. And with the search field, you can quickly filter down to what you're interested in. Say you're interested in everything that has to do with files. You can find those. And finally, they're sorted by category. So if you're not exactly sure what the instrument's called, and you need to take a look at the categories and see what type of data you're looking for, you can find those too.

So, all right. The trace document is where all your instruments and data really gets stored. And so, as with most document-based applications, this is the thing that you will spend most of your time dealing with, that you can save, that you can mail off to your fellow developers. And there's a couple parts of it I'd like to call out.

First of all, the track view, which, like I said earlier, gives you time-correlated data, sort of a high-level overview over what type of data you're gathering. And then if you're really trying to dive in and analyze, there's the detail views at the bottom. And this is where all of the data is organized in three different views, the table view, the outline view, and the diagram view. So, all right, I've told you about lots of the key elements. Let's see these together in just an example workflow demo and put them together. So, I'm going to go over to the demo machine. And I'm just going to fire up Instruments. So -- oh, other one.

So I just launched Instruments, and you can see the template chooser comes up. And there's a bunch of different things I can choose. I can choose Leaks or Core Data. And there's also iPhone-specific templates, such as the Core Animation template and the OpenGL ES template. And you can create your own user templates. But for now, I'm just going to select Blank and hit Choose.

And I'm at a blank trace document. Well, what's the first thing I want to do? I want to ask myself what data I'm interested in, and in this case, I'm just going to start recording Sketch. I'm interested in how Sketch saves its documents. Does it use the AppKit document model? Maybe I just want a backtrace. So, starting from here, I'm going to start by going to the toolbar and selecting the library and typing in "file."

So here are all the instruments that deal with files. I'll bring the file activity instrument and apply it to my trace document. Drag and drop. But since we can record multiple types of data at once, I'll also select the CPU monitor, which is a lot like what you'd see in Activity Monitor. Oops. All the way.

And I'll go ahead and look for user interface recorder. Now, this will allow me to find out when the file activities happened while viewing at the same time what I did in the user interface. All right, so I've answered the first question, what do I want to record? Second question is, where do I want to get the data from? Now, from the toolbar, there's this target chooser, and I can either attach to a running process, or I can launch an executable. In this case, I'll launch Sketch.

So I hit record. And Sketch immediately launches, and Instruments starts recording data. I'm going to move Sketch over so you can see that it's already gathering data in the background. And I'm just going to do something interesting by drawing two rectangles. Fun. And I'll hit Command-S. And you'll notice in the background that my Command-S was recorded by the user interface recorder. I'll type in test, save it to my desktop, and replace.

And there we go. So I'll go ahead and quit my application and take a look at the data. And first off, you can see in the track view when the file activities occurred. And so I'll go ahead and click on file activity. And you can see in the detail view that the data is updated to show me a lot of detailed information, such as when I opened Info.plist files, and if I'm interested in even more data about a specific event, I can bring in the extended detail view from the toolbar.

[Transcript missing]

Now, since I'm interested in, well, the test document I saved out, I'm going to go to the filter box in the bottom and type in test. And it immediately filters down to the saving of my file. And if we take a look at the stack trace here, we see, well, first at the top, that the file descriptor is 19. And it looks like we are using the NSDocumentController model. Very cool.

All right. So we've looked at some of the behavior of our application. But sometimes it's not necessarily that simple. You may have a bug that reproduces one in five times, or you may want to go back, change your code, and then run the same trace document against it.

Well, the user interface recorder doesn't just record what I did, but it allows me to play them back. So if you've noticed that the record button in the top left now says drive and record, I can click it, and it's going to repeat what I did, even though I'm not actually touching the machine. So it goes over, it's going to draw my boxes for me.

And it's gathering other data in the background, the file activity and the CPU monitor. All the while, my user interface recorder is replaying the events from the first run. So there we go. The idea is really to simplify your workflow. I'm gonna go ahead and go back to the slides.

So what are we talking about here? Well, the idea is really to get you familiar with setting up a trace document and using Instruments by first asking what data do you want? This helps you decide what type of instrumentation you want to use and really filter down to exactly that problem set. Second, what are you going to target?

This, in our case, was Sketch, but this gives you the context for your trace, whether it's all processes on the system, a single launched process, or a process that's already running. And finally, the goal of all this is just to simplify your workflow by giving you this record, iterate, and repeat model.

The whole goal, to take you from this, a blank template, to this, where all your data is right at your fingertips and you can solve your problem. So cool, we did a workflow demo, and you saw how to use Instruments, but let's use it to solve an existing performance problem. So for sampling, this is a very common technology to use to find out why an application is running slow and where it's spending its time. So I'm going to go back to the demo machine here.

I'll open up a project called Image Enhance. And so I'll just build and run this and explain what it does. The idea here is that you've got a lot of low-contrast images, and you'd like to improve them. And I'll just select some images. You can see that they're not all that great. The rail in the bottom left is hardly visible. This one's just horrible.

And they really could be improved. And so you've heard of histogram equalization. It's a two-pass algorithm that's very common. And while you could use preexisting implementations of it, you'd like to try implementing it yourself, perhaps make some modifications, and see how fast it runs. Unfortunately, in this case, it runs really ridiculously slow.
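The demo's source code is not shown in the transcript, so the following is only a minimal sketch of two-pass histogram equalization on a grayscale image held as a list of rows with values from 0 to 255. The function names `gather_pixel_counts` and `equalize` are illustrative, and the remapping formula is the common textbook form, assumed rather than taken from the session.

```python
def gather_pixel_counts(image, levels=256):
    """Pass 1: count how many pixels have each brightness value."""
    counts = [0] * levels
    for row in image:
        for value in row:
            counts[value] += 1
    return counts

def equalize(image, levels=256):
    """Pass 2: remap each pixel through the cumulative histogram."""
    counts = gather_pixel_counts(image, levels)
    total = sum(counts)
    # Cumulative distribution: number of pixels at or below each level.
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(running)
    cdf_min = next(v for v in cdf if v > 0)
    if total == cdf_min:          # flat image: nothing to stretch
        return [row[:] for row in image]
    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[v] - cdf_min) * scale) for v in row] for row in image]

low_contrast = [[100, 101], [102, 103]]
print(equalize(low_contrast))
```

In this toy input the four nearly identical brightness values are spread across the full range, which is the contrast improvement the demo is after.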

So let's go ahead and sample this and figure out what's going on. So right within Xcode, instead of just running, I can go up to the Run menu, and there's a Start with Performance Tool item. And this allows me to choose any of the Instruments templates that I've created. So I'll use the CPU Sampler template, and my application launches right into Instruments. It starts recording, and here it is. And I'll go ahead and get it processing so we can see some data.

So right as I opened the files, you could see there was a spike in CPU usage as it loaded all those files. And I can hit enhance and it starts chugging away. So I'll just leave that in the background. So, Sampler, in the track view, shows me the CPU usage of this one single application. And in the detail view, it shows me lots of the sample data I've collected. And what Sampler does is it takes backtraces of your application, of all the threads, and it then aggregates these so that you can find out what's spending the most time.

So, in this case, I've got a call tree that's showing me that most of the time I'm just sitting in mach_msg_trap. Well, that's interesting, but I really want to see sort of the top-down, high-level view of what my application's doing. So, I'll go ahead and choose to uninvert the call tree, and I'll talk a little bit more about what this does later. But there's a lot of options on the left, all really geared toward mining your data.

So here you can see that we're taking samples of every single thread. And since I'm really interested in focusing on the operations that are running on the CPU and getting my CPU usage down, I'm going to select running sample times instead of all sample counts for my sample perspective. And again, we'll talk a bit more about what this does later.

But it shows that my enhance operations, in this case we're using NSOperationQueue to do our work, are taking the majority of the time. And if I want to see exactly how, I can bring in the extended detail view to take a look at the stack.

So here we go. We've got a bunch of frames. It says that NSOperation start is calling EnhanceOperation main. And I'm just going to stop the app, give our poor little computer a break. So main is calling enhance. Enhance is calling gather pixel counts. And that's calling into NSBitmapImageRep getPixel:atX:y:.

So I can go ahead and select my code, and I can even double click to go directly there in Xcode. And so here's my gather pixel counts for image rep function. And it looks like what we're doing is the first pass in our two pass algorithm by looping through all the pixels and getting the brightness value.

However, when we call getPixel:atX:y:, it seems to be quite slow. Why is that? Well, in Instruments, from this trace, we can see that we're grabbing an AppKit lock, a recursive lock, every single time we make this call. Since our use case here is that we're looping through every single pixel, we can probably do better. So I'm going to open up some code so I don't actually have to write it for you on stage.

And what we're going to do here is gather our raw bitmap data right before the loop by calling bitmapData. And then, instead of gathering the brightness by calling getPixel:atX:y:, we will use that data that we gathered just a minute ago. All right, so we save, go up to the Build menu, and select Build and Go.
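The shape of this fix can be sketched in Python. `ImageRep` below is a hypothetical stand-in, not the AppKit class: its `get_pixel` takes a recursive lock on every call, while `bitmap_data` hands back the whole buffer once so the loop can index it directly. The lock counter exists only to make the difference visible.

```python
import threading

class ImageRep:
    """Hypothetical stand-in for an image representation; not the AppKit class."""
    def __init__(self, width, height, pixels):
        self.width, self.height = width, height
        self._data = bytes(pixels)          # one brightness byte per pixel
        self._lock = threading.RLock()
        self.lock_acquisitions = 0

    def get_pixel(self, x, y):
        # Every call pays for a recursive lock, as seen in the trace.
        with self._lock:
            self.lock_acquisitions += 1
            return self._data[y * self.width + x]

    def bitmap_data(self):
        # One locked call hands back the whole buffer.
        with self._lock:
            self.lock_acquisitions += 1
            return self._data

def counts_slow(rep):
    counts = [0] * 256
    for y in range(rep.height):
        for x in range(rep.width):
            counts[rep.get_pixel(x, y)] += 1
    return counts

def counts_fast(rep):
    counts = [0] * 256
    data = rep.bitmap_data()                # fetched once, before the loop
    for y in range(rep.height):
        for x in range(rep.width):
            counts[data[y * rep.width + x]] += 1
    return counts

pixels = [10, 10, 20, 20, 30, 30, 40, 40]
slow_rep, fast_rep = ImageRep(4, 2, pixels), ImageRep(4, 2, pixels)
assert counts_slow(slow_rep) == counts_fast(fast_rep)
print(slow_rep.lock_acquisitions, fast_rep.lock_acquisitions)
```

Both functions produce the same histogram; the second simply moves the per-call overhead out of the inner loop.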

And what it does is it launches right back into the same document we were working with. Again, to simplify the workflow. So I'll choose images, I'll say enhance them. Whoa! The first half was very, very fast. And that's what we're looking for. But the second half is still slow. And so this is all about really iterating and finding the performance problems that are taking the most time.

So if I go back to Instruments, we can take a look at the sample here. Again, it looks like we're calling getPixel:atX:y:. Well, we're actually calling colorAtX:y: first, and that calls getPixel:atX:y:. Where is this? If I double click on the frame, it takes me directly to the code. And I can see this is the second part of the two-pass algorithm. Well, I should probably do the same thing.

So from here, I simply get the raw image data. And instead of calculating the new color like that, I'll use the new data. This data pointer right here. Okay, so I save, go up, build and go, and launch it back into the same Instruments template. So let's see how we did. Choose some images.

Hit enhance, and wow, there we go. In fact, we'll probably even get to see some better contrast images here. So that one's much better. You can see the bar on the lower left. And this, this is a whole lot better. So we've succeeded in improving our performance. All right, so back to the slides.

So let's talk a little bit more about sampling. First off, like I said, it's a technique to gather backtraces for all of your threads at a given time interval. And it gets all of the threads that are in your application. The idea is it's statistical profiling, not function profiling. And so if you have very short calls or calls that you make very infrequently, they may not show up in the sample because, well, they're not taking up much of your application's time.

The idea is to find hot spots in your code. And the whole goal here is to tell what your application spends its time doing. Or possibly what it's not doing. And so if you're in blocking reads or perhaps you're waiting on a lock, you can find out what threads are waiting on and really trace that back to its source.
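To make the aggregation step concrete, here is a small sketch that turns a list of sampled backtraces into top-down call tree counts. The sample data and frame names are made up for illustration, and a real sampler would of course collect the backtraces itself on a timer.

```python
from collections import Counter

def top_down_counts(samples):
    """Count how many samples passed through each call path (root first)."""
    tree = Counter()
    for backtrace in samples:
        for depth in range(1, len(backtrace) + 1):
            tree[tuple(backtrace[:depth])] += 1
    return tree

# Hypothetical samples, one backtrace per timer tick, outermost frame first.
samples = [
    ["main", "enhance", "gather_pixel_counts", "get_pixel"],
    ["main", "enhance", "gather_pixel_counts", "get_pixel"],
    ["main", "enhance", "gather_pixel_counts"],
    ["main", "run_loop"],
]
tree = top_down_counts(samples)
for path, count in sorted(tree.items()):
    print("  " * (len(path) - 1) + path[-1], count)
```

Because the counts are statistical, a function that runs only briefly between two ticks contributes nothing, which is why short or rare calls can be missing from the tree.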

So all right, there are a lot of options there on the left. Let's talk about a couple of them that are the most useful. First of all, if we've got a sample here of, let's say, your application Sketch, you can see that a lot of the frames are from AppKit and from HIToolbox, and you're kind of interested in your code.

So one frame here is NSApplicationHandleKeyEquivalent, which probably is going to call into your code, but it'd take a little bit of digging to find. And SKTGraphicViewMouseDown is your code. This is the sort of thing you're interested in. So by using the Hide System Libraries data mining option, you can get those all out of your way and go directly to what you're looking for.

So, sample perspective. Now, I toggled this from all sample counts to running sample times during the demo. What does it do? Like I said, sampling gets backtraces of all the threads in your application, whether they're running on the CPU or whether they're not. So if you're interested in finding out where your thread states are, then you probably want to leave it that way. You want to have all sample counts on. So if you have six threads, and these four are blocked, and these two are running on your cores, then it'll show you all six, and you can see where they're blocked.

Now, if you're interested only in the two that are running, and you really want to optimize the time that you're spending on your CPU, you can switch this over to running sample times. And the idea is that you can find out which ones are executing and optimize there.

Again, the idea is blocked versus running. Set it to all sample counts for blocked threads, and set it to running sample times for running threads. Okay, so what does that look like? Well, with this example Sketch sample that we have here, you can see that SKTGraphicViewMouseDown and SKTGraphicViewDrawRect took about the same wall clock time. They had about 160, 170 samples.

So this means that from the perspective of Sampler, they were about the same. However, when we switched this to running sample times, we quickly see that drawRect spent about three times as long running on the CPU as mouseDown did. So this means mouseDown was blocked a bit more and that drawRect really was doing more work. This is probably where we want to spend our first time optimizing.
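The difference between the two perspectives can be sketched as two ways of summarizing the same samples. The data below is invented to mirror the mouseDown versus drawRect comparison, and the 10 ms interval is an assumption of the sketch, not a documented Instruments default.

```python
from collections import Counter

SAMPLE_INTERVAL_MS = 10  # assumed sampling interval for this sketch

def all_sample_counts(samples):
    """Count every sample for each frame, whether the thread was running or blocked."""
    return Counter(frame for frame, _running in samples)

def running_sample_times(samples, interval_ms=SAMPLE_INTERVAL_MS):
    """Estimate CPU time per frame by keeping only samples taken while running."""
    counts = Counter(frame for frame, running in samples if running)
    return {frame: n * interval_ms for frame, n in counts.items()}

# Hypothetical data: (frame, thread_was_running_on_a_cpu)
samples = (
    [("mouseDown", True)] * 2 + [("mouseDown", False)] * 6 +
    [("drawRect", True)] * 6 + [("drawRect", False)] * 2
)
print(all_sample_counts(samples))
print(running_sample_times(samples))
```

With this data both frames have the same total sample count, but drawRect accounts for three times the running time, so it is the better first target for CPU optimization.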

Last thing, call tree inversion. Let's talk about call trees a little bit. You've got your function that does something. And it takes 800 milliseconds. Well, the call tree shows what your application -- or excuse me, what that doing something frame called. And you can see how the time is broken down.

And it's sort of the top-down view. So when you have call tree inversion off, you're asking where was overall time spent and what did my code call? Now, if we're interested in sort of the hotspots in your code, perhaps what should I optimize first, you can see in this example that the red circles are taking a lot of time, and if we added them together, we'd probably see that we wanted to look at them first. Well, we can turn call tree inversion on and flip all of that upside down.

It coalesces the leaf frames and allows you to find the individual hotspots in your code, and also ask the question, what called those hotspots? So let's have a concrete example of this. Again, with the Sketch example we had, with inversion on, you can quickly see at the top that SKTGraphicViewDrawContentsInView, at least for this sample, was taking the most time running on our CPU, 1,450 milliseconds. And also interesting is that right below it is SKTGraphicViewDrawRect, which was where it was called from most of the time. So, all right, that's sampling. I'd like to ask Lynne to come up to talk about memory tools.
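Call tree inversion can be sketched as the same aggregation run on reversed backtraces, so that samples are grouped by leaf frame first and by callers second. The backtraces below are hypothetical and only loosely modeled on the Sketch example.

```python
from collections import Counter

def invert(samples):
    """Group samples by leaf frame, then by the chain of callers above it."""
    inverted = Counter()
    for backtrace in samples:            # backtrace is outermost frame first
        leaf_first = tuple(reversed(backtrace))
        for depth in range(1, len(leaf_first) + 1):
            inverted[leaf_first[:depth]] += 1
    return inverted

samples = [
    ["main", "drawRect", "drawContentsInView"],
    ["main", "drawRect", "drawContentsInView"],
    ["main", "mouseDown", "drawContentsInView"],
    ["main", "mouseDown"],
]
inverted = invert(samples)

# Top-level entries of the inverted tree are the hotspots (leaf frames).
hotspots = {path[0]: n for path, n in inverted.items() if len(path) == 1}
print(hotspots)
# Second-level entries answer "what called this hotspot?"
print(inverted[("drawContentsInView", "drawRect")])
```

Leaf frames reached through different callers are coalesced into one entry, which is what lets the hotspot rise to the top of the list.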

Thank you, Daniel. So now that we've seen how to use Sampler to find hotspots in our code in order to optimize them, let's move on and let's talk about a new type of performance problem, namely memory management. Now, regardless of whether you're programming for the iPhone or for the Mac, memory is a limited resource.

And as such, it's very important to manage it efficiently. But you might find yourself faced with some common memory issues that prevent you from doing that. First is leaked memory. This occurs when your program unintentionally fails to free a region of memory when it's done using it. Now, as regions of leaked memory accrue, this could reduce your performance due to paging to your disk.

Second is having a large memory footprint. Now, your goal when you're doing efficient memory management is to keep the working set of your memory down in order to increase the performance of your application and prevent paging to disk, which would, in fact, reduce this performance. And that's why you should avoid having a large memory footprint. Third of all is pointers to freed objects. This can cause many problems. For example, your system crashing or your application crashing when you send a message to a freed object.

Fourth, and this is a more subtle problem, is dynamic memory usage. Now, allocating and deallocating small regions of memory, or just allocating and deallocating, incurs a certain overhead. And if you're doing this continuously, so you're allocating and deallocating small regions of memory, this overhead can accrue and cause performance problems.

What you really want to be doing is allocating a large region of memory and then freeing it when you're done. And finally, for garbage collected code, your garbage collector might fail to collect a certain object in a timely fashion if it has a lot of unnecessary references to it from other objects. And this is called over-rootedness, and you want to try to avoid that.

Five memory problems. Three of these are very pertinent to the iPhone. Since you're working with a smaller memory, leaked memory, large footprint, and dynamic memory usage become very noticeable on the iPhone, so you want to try to avoid them. All right. So we've talked about all these problems, and your question is, well, since finding these problems is non-trivial, how do you go about finding them in the first place? Well, Instruments tries to simplify that process for you by providing you with the necessary tools.

First of all, we have the ObjectAlloc instrument. And this is a very powerful instrument in the sense that it tracks all of the memory allocations that are performed by your application. And it keeps detailed addresses. And it's useful for finding programs with a large memory footprint and which objects are responsible for causing this large footprint. And it helps you find pointers to freed objects and whether your application is using memory dynamically. You can also use it to find things such as over-released or over-retained objects.

Second is the Leaks Instrument, which does precisely what its name suggests, which is checking for unreferenced memory blocks in your code. Now, if you're interested in higher level statistics for your processes, like for example, the size of the virtual and real memories or when your process is paging in and paging out, you will need to use the memory monitor instrument.

And since you are more intimately knowledgeable about your code, you might have that one particular memory problem that you're trying to find and solve. Now, Instruments is powerful in the sense that it allows you to create your own custom DTrace instruments that are tailored for finding that particular memory problem. For example, you can create a custom DTrace instrument that will tell you when NSAutoreleasePools are created or freed. And you can also create an instrument that will tell you when certain regions of data are mapped and unmapped from your process address space.

Finally, and this is new in Xcode 3.1, we've introduced the Object Graph Instrument. Now, for garbage collected code, it helps you visualize directed cyclic graphs of memory block references. What this means is that it helps you find those objects that have unnecessary references to them that will prevent them from getting garbage collected in a timely manner.

All right. Let's talk a little bit more about the ObjectAlloc instrument. So ObjectAlloc doesn't only keep track of objects that your application has allocated, but all regions of memory that have been allocated by your application. It also keeps track of pointer histories. For example, it keeps track of when an object has been malloc'd, freed, retained, released, or auto-released.

And it also keeps track of higher level statistics about certain types and certain classes of objects. For example, for the class NSString, it will tell you about all the NSString objects that have been allocated by your application versus the net objects that have been allocated and freed. It also keeps track of call trees for when your objects have been allocated, retained, and freed, and it works with garbage collected applications.
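The per-class statistics described here, overall allocations versus net (still live) allocations, can be sketched as a fold over a stream of allocation events. The event tuple layout and the category names are assumptions made for the example, not the instrument's actual data format.

```python
from collections import defaultdict

def allocation_stats(events):
    """Return {category: (net_bytes, overall_bytes)} from malloc/free events."""
    live = {}                                    # address -> (category, size)
    stats = defaultdict(lambda: [0, 0])          # category -> [net, overall]
    for kind, address, category, size in events:
        if kind == "malloc":
            live[address] = (category, size)
            stats[category][0] += size
            stats[category][1] += size
        elif kind == "free" and address in live:
            freed_category, freed_size = live.pop(address)
            stats[freed_category][0] -= freed_size
    return {category: tuple(values) for category, values in stats.items()}

# Hypothetical event stream: (kind, address, category, size_in_bytes)
events = [
    ("malloc", 0x1000, "NSString", 32),
    ("malloc", 0x2000, "NSString", 64),
    ("free",   0x1000, None, 0),
    ("malloc", 0x3000, "NSBitmapImageRep", 4096),
]
print(allocation_stats(events))
```

A category whose overall bytes keep growing while its net bytes stay flat points to heavy allocate-and-free churn; one whose net bytes keep growing points to a footprint problem.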

A little bit more about Leaks. Now, Leaks is a static memory analysis tool, which means it's going to suspend your process and go through all the memory regions in order to find the ones that are unreferenced. And it works alongside the ObjectAlloc instrument, because ObjectAlloc provides Leaks with the necessary pointer histories so that you can find the harder, more insidious leaks in your code. And this is not for garbage collection.
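The idea of scanning for unreferenced blocks can be sketched as a reachability search. This is a heavily simplified model, assuming we already know which addresses each block stores; it is not a description of the Leaks instrument's actual scanning logic, and the integer addresses are placeholders.

```python
def find_leaks(blocks, roots):
    """Return addresses of allocated blocks not reachable from any root.

    blocks maps each allocated address to the addresses stored inside it.
    roots is the set of addresses held in globals, stacks, and registers.
    """
    reachable = set()
    pending = [address for address in roots if address in blocks]
    while pending:
        address = pending.pop()
        if address in reachable:
            continue
        reachable.add(address)
        pending.extend(ref for ref in blocks[address] if ref in blocks)
    return sorted(set(blocks) - reachable)

# Block 1 is referenced from a root and points at block 2.
# Blocks 3 and 4 point only at each other, so nothing live can reach them.
blocks = {1: [2], 2: [], 3: [4], 4: [3]}
print(find_leaks(blocks, roots={1}))
```

Note that blocks 3 and 4 still reference each other, yet both count as leaked because no path from a root reaches either one.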

So, now that we've talked about what these tools are, let's see them in action. Switch over to the demo machine. All right, so I'm going to actually use my leaks instrument to try to find leaks in my application, Image Enhance. First of all, I'm going to launch it from Xcode.

I'll just open it up in Xcode. And then I'm going to go down and start Instruments. So the first thing I see, once again, is the template chooser. And what I want to do is I want to go up here, select the leaks template, and hit Choose. Now, as you can see, the leaks template added two instruments to my trace document, the ObjectAlloc instrument and leaks.

And as I've said before, ObjectAlloc is needed to provide you with pointer histories about the leaked objects. So up here, I'm going to choose my target process. I'm going to choose executable, and I'm going to go back and find my target application. Which I'm guessing hasn't been built yet. All right, let me go to Xcode and launch it from there.

Okay, so from Xcode, I'm going to hit Run and Start with Performance Tool. I'm going to select the Leaks template. And there we go. So here's my application, Image Enhance. And let me talk about what you can see over here in the trace document for the ObjectAlloc instrument. Up here in the track view, I'm plotting the total objects that have been allocated, sorry, the total bytes that have been allocated by my application. And in this case, it's been about 776 kilobytes.

Down here in my detail views, the first detail view I'm looking at is the table view. And as you can see, it shows me certain statistics about my different types of classes. For example, for CFString, I can see the net bytes allocated versus the net bytes in proportion to the overall bytes that have been allocated.

Down here I'm going to switch over to my call tree view by clicking the icon in the bottom left. And the call tree view shows me all the objects that have been allocated and their backtraces for when these allocations occur. And finally, my diagram view shows me all the objects that have been allocated and their pointer histories.

For example, if I choose, let's say, an NSLock and I click this arrow right here, it tells me precisely when the NSLock has been allocated in memory. All right. Well, let's run our application and see whether we can actually use these tools to find leaks. I'm going to bring it up here and I'm going to hit choose images. And I'm going to hit open.

So let's forget about Leaks for a second, and it seems like we have a large memory spike over here in our ObjectAlloc instrument. So it seems like a more pressing issue than Leaks, and let's go in and investigate it in more detail. So I'm going to stop my application, and I'm going to take a look at why this is occurring.

So, first of all, I'm going to zoom in. And as you can see, my working set, my working memory set is about 21 megabytes. But in this region over here, it seems that I'm accumulating a bunch of allocated objects that are pushing my memory up to about 100 megabytes, which might reduce my performance if my application starts paging to disk.

So if I want to investigate this in more detail, I want to do something called time filtering. And to do that, I'm going to hit the option key, I'm going to click and drag around the area that I want to investigate. If I switch back to my table view, you realize that the data has been refiltered and redrawn to show me just the data that I've filtered for.
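As a rough illustration of what time filtering does, here is a minimal Python sketch. It is not how Instruments is implemented; the event tuples, the sample values, and the function name `bytes_by_category` are all made up for this example. It keeps only the allocation events inside a chosen time window and recomputes the per-category totals, which is the kind of refiltered data the table view shows.

```python
from collections import defaultdict

# Hypothetical sample events: (timestamp in seconds, category, bytes allocated).
events = [
    (0.5, "CFString", 64),
    (1.2, "NSBitmapImageRep", 4_000_000),
    (1.8, "NSBitmapImageRep", 4_000_000),
    (3.0, "CFString", 128),
]


def bytes_by_category(events, start, end):
    """Total the bytes per category for events with start <= timestamp <= end."""
    totals = defaultdict(int)
    for timestamp, category, size in events:
        if start <= timestamp <= end:
            totals[category] += size
    return dict(totals)


# Selecting the window from 1.0 s to 2.0 s leaves only the image rep allocations.
print(bytes_by_category(events, 1.0, 2.0))
```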

Now, I want to find which objects are responsible for this large memory spike. And to do that, I'm actually going to go ahead and go to the call tree view. And see where these objects have been allocated. So I'm going to hit All Allocations. I'm going to click the arrow. I'm going to bring up the extended detail view by clicking the icon in the bottom.

So as you can see, it seems that most of my objects have been allocated from this method call in my code. EnhanceController, thumbnail image for path with size. And what this does is create an image from the contents of a file, and the NSImage is going ahead and creating an NSBitmapImageRep.

So now we've found out that NSBitmapImageReps are responsible for about 100 megabytes worth of allocations in my code. So these are the objects responsible for the spike. Now, let's see what I'm doing in my code to actually cause this spike. I'm going to go back to this frame. I'm going to double click it.

And as you can see, this is my thumbnail image for path method. And what it's doing is I'm loading my original image, my large image from file. I'm using that large image over here to create a thumbnail image. And after I'm done creating that thumbnail image, I'm releasing my large image. All right. So I'm releasing this large image. Why am I getting the large memory spike, even though I'm releasing the NSBitmapImageRep? To answer that question, let's go back to Instruments and move to the diagram view.

And find all the NSBitmapImageReps that have been allocated by my application. So down here in the search field, I'm going to type in NSBitmapImageRep. I'm going to find the one that has been allocated from my code, or any NSBitmapImageRep allocated from my code, such as this one, for example. I'm going to hit the arrow. Now, as you can see in this view, this is the pointer history for NSBitmapImageRep. You can see where it's been allocated.

And if we scroll back up, you can see when it's allocated, it has a retain count of one. And you can see when it's been retained, released. And this is when it's released right before it's freed. Now, as you can see, the NSBitmapImageRep is released in an NSAutoreleasePool context, which means that when it's created, it's placed in an NSAutoreleasePool, but as my application is running, my run loop isn't flushing this autorelease pool in time for me to get rid of that large memory spike. So this is a very easy fix in order to prevent the spike from happening. If I go back to my code, so if I go back over here, take this out.

Go back to my code by double-clicking this frame. I can fix this easily by encapsulating or surrounding where I allocate my large image or load it from file with an NSAutoreleasePool creation and release at the end. So I'm going to save. I'm going to build. And I'm going to start my application again. So I'm going to hit record. All right, let's see whether my memory spike happens this time around. I'm going to choose images, hit open.
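To see why a local autorelease pool removes the spike, here is a small Python model of the idea. It is only a sketch: the `AutoreleasePool` class below is a toy stand-in written for this example, not the Cocoa NSAutoreleasePool API, and the image sizes are invented. Objects placed in a pool stay alive until the pool is drained, so draining once per image keeps the peak at one image instead of all of them.

```python
class AutoreleasePool:
    """Toy stand-in for an autorelease pool: objects stay alive until drain()."""

    def __init__(self):
        self.pending = []  # sizes (in MB) of objects waiting to be released

    def autorelease(self, size_mb):
        self.pending.append(size_mb)

    def drain(self):
        self.pending.clear()


def peak_memory_mb(image_sizes_mb, use_inner_pool):
    """Return the peak MB held alive while thumbnailing every image."""
    outer = AutoreleasePool()  # models the pool the run loop drains later
    peak = 0
    for size in image_sizes_mb:
        # With the fix, each large image goes into its own short-lived pool.
        pool = AutoreleasePool() if use_inner_pool else outer
        pool.autorelease(size)  # the full-size image rep
        live = sum(outer.pending)
        if pool is not outer:
            live += sum(pool.pending)
        peak = max(peak, live)
        if use_inner_pool:
            pool.drain()  # released right away, before the next image loads
    outer.drain()  # the run loop finally gets around to draining
    return peak


sizes = [20, 20, 20, 20]  # hypothetical image sizes in MB
print(peak_memory_mb(sizes, use_inner_pool=False))  # 80
print(peak_memory_mb(sizes, use_inner_pool=True))   # 20
```

The total amount allocated is the same in both runs; only the moment of release changes, which is why the second run in the track view shows no spike.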

So we've seen this large memory allocation, but let's compare it to the previous run where we had our spike. So if I go back to Instruments, I'm actually going to stop the application right now. And in order to see your previous run, all you need to do is pull down this triangle right here. And I'm going to actually zoom out so you can see both runs on the same scale. So do you see the large memory spike over here? And now it's been eliminated. All right.

So now we've used the ObjectAlloc instrument to tell us about our memory working set, and we used it to eliminate this large memory spike. But if we look down here, it seems that we have a bunch of leaks. My leaks instrument tells me that I'm leaking a whole load of NSStrings.

So I want to find out where I'm leaking the strings from. So to do that, I'm going to go back here and click on my call tree view. In my call tree view, if I click on this frame right here, it shows me that all of my leaked strings that I've seen are being leaked from this method call.

And it's EnhanceController chooseImages. So let's investigate this in our code. We double click. You can see that what I'm doing is I'm allocating a string, but I'm never releasing it, and this is causing my leak. And the string here is just the path name for the thumbnail.

So, in order to fix this problem, all I need to do is autorelease my string. Save, hit Run, Start with Performance Tool, Leaks. And here we go. So, I'm going to open up my application again. Choose Images, hit Open, and let's just wait for it and see whether any leaks occur. All right, so it seems that we fixed our leaks problem.

And... or not. It seems like I've just built the wrong project. Anyway, if... let's go see what sort of leaks we have this time. It seems we're still leaking some strings, but we've reduced the amount of strings that we've leaked in the first time, which was one problem we've solved.

Finding leaks is an iterative process. So you could fix one leak and find more leaks and keep iterating and continuing to find all the leaks that your application incurs. All right. So now that we've seen how to use the ObjectAlloc instrument and the Leaks instrument to find some memory management issues, if we could switch back to the slides, please.

Let's talk about how Instruments can help you analyze and leverage new technologies that have emerged in Snow Leopard. So, as you all know, the number of cores in our devices is increasing. And if you want to write better performing applications, your goal is to keep all these cores busy. And to do that, you need to write applications with a higher degree of concurrency. The current way to do that is to use threads.

But as those of you who have written multi-threaded code in the past should agree with me, threading is hard. It's hard because explicit thread management is difficult because you have no knowledge of the underlying hardware. And you might fall into the trap of underutilizing your CPUs because you don't know how many threads to create that will scale between one CPU versus eight CPUs.

If you're dealing with shared data, you might face synchronization issues, such as deadlocks and race conditions. Finally, if you're dealing with threads, you might face problems with blocking API and contended memory access, when two threads are trying to read and write from the same cache line at the same time. So, if threading is hard, and since you have no knowledge of the underlying hardware, but your system does, then why can't your system deal with threads for you?

Well, in Snow Leopard, it can. As you've seen from Bertrand's talk, Snow Leopard introduces a new concurrency-oriented programming model called Grand Central Dispatch. This is in essence a new infrastructure that provides you with simple API that works with the operating system to provide you with concurrency without the previous obstacles that we talked about.

So how does it work? Well, it's based on the notion of a queue, where a queue is an asynchronous unit of concurrency. What you need to do is you need to divide your code up into work units or work items. And then add these work items to the queue. And then GCD goes ahead and executes these items concurrently and assigns the necessary threads that will perform these actions without you having to worry about them.
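The queue model described here can be sketched in Python. This is an analogy only: it uses the standard library's `concurrent.futures.ThreadPoolExecutor` rather than the Grand Central Dispatch API, and `work_item` and `run_on_queue` are names made up for this example. The point it shows is the division of labor: the caller splits the job into work items and submits them, and the runtime decides how many threads to use.

```python
from concurrent.futures import ThreadPoolExecutor


def work_item(n):
    """One independent unit of work (here, just squaring a number)."""
    return n * n


def run_on_queue(items):
    """Submit every work item and collect the results in submission order."""
    # The executor chooses the number of worker threads; the caller never
    # creates or manages a thread directly.
    with ThreadPoolExecutor() as queue:
        futures = [queue.submit(work_item, n) for n in items]
        return [future.result() for future in futures]


print(run_on_queue(range(8)))  # [0, 1, 4, 9, 16, 25, 36, 49]
```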

And with Instruments, we've introduced a new GCD instrument in Snow Leopard. The GCD instrument records queue events, such as the creation of a queue and when a queue is deleted. It also records when an item has been added to a queue and when it's been popped off and when its work has been executed.

And it also keeps track of the number of queues that are active versus inactive. Now, active queues are defined as queues that currently have items on them that are executing, versus inactive queues, which are queues that have items on them, but they aren't executing at this time. In essence, the GCD instrument tells you about the interaction of dispatch queues with your application. So let's take a look at the GCD instrument in action.

If we switch over to the demo machine, and I'm actually going to switch over to the... There we go. All right. So, I'm going to launch Instruments, and as you can see, I'm going to select the Dispatch instrument from my template chooser. I'm going to hit Choose. And my target application over here, I'm going to launch GCD Mandelbrot and hit Record.

So Mandelbrot is really a fractal drawing application. And the way it works is that for each row, I'm creating a new queue. And for that queue, I'm pushing on items that are 10 by 10 cells. And GCD is taking care of rendering these 10 by 10 cells for me. So what I can do is I can zoom in, and it re-renders.
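The tiling scheme described for the Mandelbrot demo can be approximated in Python as follows. This is a sketch under stated assumptions, not the demo's source: it uses one shared `ThreadPoolExecutor` instead of one GCD queue per row, and the image size, the iteration limit `MAX_ITER`, the coordinate mapping, and all function names are invented for illustration. Each work item renders one 10 by 10 cell, and the results are stitched back into the image.

```python
from concurrent.futures import ThreadPoolExecutor

TILE = 10      # each work item renders a TILE x TILE cell
MAX_ITER = 50  # assumed iteration limit for the escape test


def escape_count(cx, cy):
    """Iterations before z = z*z + c escapes |z| > 2, capped at MAX_ITER."""
    z = 0j
    c = complex(cx, cy)
    for i in range(MAX_ITER):
        z = z * z + c
        if abs(z) > 2.0:
            return i
    return MAX_ITER


def pixel_value(px, py, width, height):
    """Map a pixel to the region [-2, 1] x [-1, 1] and run the escape test."""
    cx = -2.0 + 3.0 * px / width
    cy = -1.0 + 2.0 * py / height
    return escape_count(cx, cy)


def render_tile(x0, y0, width, height):
    """Work item: render the cell whose top-left corner is (x0, y0)."""
    rows = [
        [pixel_value(px, py, width, height)
         for px in range(x0, min(x0 + TILE, width))]
        for py in range(y0, min(y0 + TILE, height))
    ]
    return x0, y0, rows


def render(width, height):
    """Submit one work item per cell, then copy each result into the image."""
    image = [[0] * width for _ in range(height)]
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(render_tile, x0, y0, width, height)
            for y0 in range(0, height, TILE)
            for x0 in range(0, width, TILE)
        ]
        for future in futures:
            x0, y0, rows = future.result()
            for dy, row in enumerate(rows):
                image[y0 + dy][x0:x0 + len(row)] = row
    return image


image = render(30, 20)
print(len(image), len(image[0]))  # 20 30
```

Because every cell is independent of the others, the order in which cells finish does not matter, which matches the out-of-order completion events visible in the detail view.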

Over here, I'll zoom in again. And as you can see in the background, I'm recording the active versus inactive queues as my application is running. So in the track view, you can see the active queues in red versus the inactive queues. Let me just do it again. Here we go.

Now let's go back, and this is what you see in your track view. If you take a look at your detail view, it shows you all the queue events that are occurring. For example, when you're creating queues, when you're pushing items onto a queue, et cetera. Now I could filter for, I'm going to actually stop my application first, and I'm going to filter for queue create.

And you can see all the places where my queues have been created. For example, here I've created my render at y equals 0 queue. And down here I've created my render at y equals 220. Let me clear the filter. And if I scroll down, you can also see when queue work items have been completed. For example, I've completed a work item at y equals 310 before I completed a work item at y equals 390.

And finally, if you go to your call tree view, I'm going to select Separate by Thread. And you can see all the threads that have been created by GCD in order to take care of rendering this Mandelbrot for you. So this is an eight-core machine, and you can see that GCD has created a whole bunch of threads. And you see that some of the threads made 308 calls versus some threads that made 26 calls. So we just saw how to use the GCD instrument on Snow Leopard. If we switch back to slides, please.

So now we've seen how Instruments can simplify the performance analysis for you. And let's talk about how it integrates seamlessly into your workflow. So as you've seen, I can run Instruments directly from Xcode by selecting the Run menu and then selecting Start with Performance Tool. And then I can select any one of these templates to start Instruments with.

You can also launch Instruments using Quick Start Keys. Now, sometimes you might face performance problems that are transient, which appear and disappear suddenly. And with Quick Start Keys, Instruments does not have to be running. All you need to do is to place your cursor over that application, hit a Quick Start Key, and Instruments will launch with the template that you've assigned to that Quick Start Key. And you can assign these Quick Start Keys in the Preferences pane of Instruments. A third way you can launch Instruments is from the command line. All you need to do is specify the template to use and the target PID or application and launch Instruments right away.

And in addition to integrating seamlessly into your workflow, you can also customize instruments as well. You can create your own custom templates. For example, you've selected a group of instruments you want to use together and you want to save that for later use. You can go to the File menu and hit Save as Template, and you'll be prompted with a save dialog, and you can save that template that you've chosen for later use.

You can also run Instruments in mini mode, which will save you the overhead of the graphical interface. Finally, you can use multiple graph styles to plot the data in the track view. In this case, you see the block graph style, the point graph style, and the filled line graph style.

The bottom line is, Instruments simplifies your performance analysis because it integrates seamlessly into your workflow. Now, even though it's simple, it still retains the breadth and depth that allows you to write competitive and compelling apps. And with that, I hope you all enjoyed this session. If you have any questions, please don't hesitate to email Michael Jurewitz. He's our developer tools evangelist. More information about Instruments, including the Instruments documentation, can be found in the Xcode documentation. And finally, for more information about DTrace, you should go ahead and check out Sun's website. It has a comprehensive guide to DTrace.

Now, if you're really interested in GCD, I really recommend the session right after this in North Beach. And I believe the title on this is wrong, but it's session 382. And if you're interested in how to use Instruments to debug and profile iPhone applications, I also really recommend the session at 10:30 on Thursday in Presidio. And there are two more sessions about DTrace that you can see up there.

And finally, we'd like you to come talk to us and tell us about your problems, and we will help you with your performance problems and how to use Instruments. We're going to be in the labs right after this in OS X Foundation Lab C. We're also going to be in labs tomorrow at the time shown.