Developer Tools • iOS, OS X • 53:49
Instruments is Apple's premier tool for analyzing the performance of iOS and OS X applications. Watch the experts reveal deep performance issues and explain the collected data. Learn critical skills that will help you find memory leaks, improve network efficiency, and display the smoothest graphics possible. A must-attend session for anyone looking to better use Instruments.
Speakers: Joe Grzywacz, Victor Hernandez, David O'Rourke
Unlisted on Apple Developer site
Transcript
This transcript was generated using Whisper, it has known transcription errors. We are working on an improved version.
Good afternoon. How is everyone doing? Show of hands, how many people are at their first conference? Wow. How many people have used Instruments even once? Whoa. Impressive. So we're here to talk about learning Instruments. I've had the pleasure of working with most of you in the lab for the past couple days. We'll be in labs for the rest of the week. But let's get started.
What are we going to go over today? First, we're going to go over elements of performance. This is important to set a baseline and for us to figure out what we're talking about and what sorts of things we're going to optimize our applications for. We're going to give you a methodology for performance. This is a technique that's proven, a technique I've gone over with many of you in the labs who have come and spoken with us and the rest of the Instruments team all this week.
We're going to give you a brief tour of Instruments, although with the number of people who have already seen it, maybe I should cut that part out. And then we have some excellent iOS optimization demonstrations. We made an app special for the conference. We have a few interesting problems. And we'll show you how you can use Instruments to find those problems. I think you'll relate to these problems and you'll find them to be very interesting.
What is performance? This seems fairly obvious. The first one is probably a gimme, but we'll start there. We all want our application to be fast. That's fairly obvious. But what some people don't think of sometimes is we also want it to be responsive. And lastly, we want it to be efficient.
So you can use Instruments to accomplish all of these goals. And there's an interesting side effect. If you focus on these three elements, you also tend to save battery life. And we're working with mobile devices. Our customers expect and demand that our applications don't drain their battery. So, if you use Instruments to profile your application, you'll be able to also optimize for battery life. If you don't use Instruments to profile your application, this might be one of your app review comments. Or this one. And most certainly, this will be your rating.
So we consider performance a feature of your application. It's just as important as what your application does. Use Xcode to author and build your application. Use Interface Builder to put the bits on the screen, decide how your layouts are going to be. Use Instruments to profile your application. It's just as important as the compile. It's just as important as Interface Builder. You use it to optimize your performance. You can use it to reduce crashes and terminations. And you can use it to improve your power usage. Thank you.
What is the process? I've had a lot of you come to the lab this week and say, "What should I do?" And we have a really excellent example in my opinion. A lot of you are familiar with debugging your app, and when you're debugging your app, you reproduce your problem.
You then use the debugger to inspect the code and you maybe add some logging messages. Based on what you see in the debugger, you form a hypothesis. "Hey, if I make this change, my app's going to stop crashing." You then actually make the change and reproduce the problem. Well, the profiling process is nearly identical.
But instead of debugging a problem, we're going to measure the problem. So you have an area of your application that's going a little slow. Reproduce the slowness. Use Instruments to profile your application while you're reproducing the slowness. Use the data Instruments is showing you in order to form a hypothesis about what you could do differently in your application that might speed the code up. Make the change that you hypothesized about. And most importantly, go back and measure that you actually improved the situation. The number of people who sometimes don't do this is interesting.
Where can I find Instruments? Well, the good news is you don't have to do anything else. It's included with Xcode. All of you who have Xcode on your hard drive right now have Instruments. We've given you at least two ways to access it. There are more, but we'll go over the top two. The first way is under the Xcode menu. You can use Open Developer Tool. And it's so important, it's the first application listed.
There's also the Profile command under the Product menu, next to Run. Profile is one of my favorite features of Instruments because what it actually does is launch Instruments, pre-target your application, and stop and start your application as you start and stop traces in Instruments, which you'll see later. So this is the easiest way to get into Instruments.
We have a little pro tip here. When Instruments is running, it's actually a separate app bundle. It shows up in the dock. You can right-click on the dock icon and choose to keep Instruments in the dock so that you always have quick access to it to profile your applications.
So we're going to give you a brief tour right now on the screen behind me. Instruments is a document model. What that means is when you launch Instruments, we ask you to pick a template, much like Keynote asks you to pick something, Pages asks you to pick a template. But Instruments templates are common profiling operations that you'd want to perform.
What you see on the screen is Time Profiler, Leaks, and Allocations. Once you pick a template, you now have what we call a trace document. And all of the data that you record will go into that trace document and you can save it after you've made your recording and review it later or review it right after you're done. So it's a document model, standard document window, fairly common.
Once you have the document window open, there are several elements on the screen, and we're going to tour these elements on the screen so that all of you have an idea what function they perform. There's a toolbar, which lets you control recordings and such. I'll go into more detail on that later. The left-hand side we call the Strategy section. These are what instruments you've added to your document. Normally what's there is what you selected when you picked a template, but you can actually customize that and add any instrument you want.
There's a Timeline view. This is where we will show the time-based data of what we were recording over time from your application. At the bottom, there's a Details section. This is really interesting. All sorts of interesting information shows up here. But what's most interesting is the Details section automatically configures itself and customizes itself to show the most relevant information based on the instrument that you used to record.
For example, when doing a time profile, we get CPU usage, but when doing allocations, you get the amount of memory that's been allocated, as shown in the Details section. And the Extended Detail section is also very powerful and can occasionally illuminate data that's hard to see in the Details section. All of these will be used in the demonstration. I think you'll enjoy seeing what they can do.
The toolbar is fairly important. The first and most important menu in the toolbar is what are you going to target? Now, all processes is something we do at Apple because we want to profile the entire operating system, but most of you are app developers. And so it's important for you to pick your application from the target menu before you start recording. In this case, we're picking Safari, and I'm going to profile Safari and see if I can give the team some feedback.
After you've picked your target, you can use the record button to start and stop a trace. This will allow you to get your application launched, get to where you want to do the profile step, start recording, and go reproduce the problem with your application. When the problem reproduces, you come back and stop the recording, and now you have a precise trace of what you were trying to analyze or what you were trying to profile.
We have a timeline, and this lets you set start and end points on the timeline. Sometimes you have too much data, sometimes the data is too small or too compressed. You can use these controls to zoom in on the timeline, pick a range of the timeline so that you can focus on just the data that's most relevant to you. This is a time window. It shows you how long the trace data is, how much trace data you have, how many runs you have. One of my other favorite features is you can run a trace multiple times, and Instruments lets you compare multiple runs side by side.
Instruments is a paned window, so you can actually choose what panes are showing. What's highlighted here is that the left-hand side is shown. The bottom and the extended detail are hidden. You can click on that to use your screen real estate appropriately to show more or less data based on what you want to achieve.
And the library. The library window lets you show other instruments that maybe weren't part of the template and add them to the strategy section so that you could maybe combine file I/O tracing along with your CPU load tracing and see if your file I/O has something to do with your CPU load problems.
This is a sample timeline of a time profile I took of Safari reloading the Apple website over and over and over again. The purple lines are CPU load, and you can see it's a timeline from left to right. The three buttons above the Time Profiler icon are what we call the strategy view.
And the strategy view you're seeing here is the instrument view. And what we mean by that is this is the way the instrument has chosen to represent its data. Different instruments will draw the timeline differently. This instrument has an alternate view. This actually lets you see all of the threads that your application was running.
And each one of those little stopwatches on there represents a different sample from where instruments took a sample. And that represents an entire backtrace at that point in time of where your application was executing when we took that sample. This is very useful for debugging concurrency problems or optimizing performance where you can see how the threads are interacting and ping-ponging.
The last strategy view in Time Profiler is the Core Strategy View or the CPU Strategy View. And this lets you show what CPUs your process was actually scheduled on by the operating system. This lets you see how much parallelism you're actually achieving. If you've optimized your code and made it parallel, what you should see is activity on all of the cores simultaneously, and we'll have a demonstration of that later.
The timeline can be filtered. We've switched here from a time profiler view to a system trace view, and you can see the data is presented differently. The highlighted section there I used by shift-clicking, and that allows me to zoom in on that section and see the data in more detail.
In this view is the detail pane at the bottom. You can see what we call a call tree. This is probably the bread-and-butter data that Instruments presents. When Instruments runs, most of the time we're sampling a backtrace and we have the actual symbol from when we took a sample of your application of what it was doing at that exact second. We sort, sift, and present all of that information to you in the detail view. And we can normally show you the line of code that allocated too much memory or took too much CPU time or was on the CPU too frequently.
The call tree view is in the detail section, as you can see here on the screen. And what you're seeing here on the screen is that the main thread took 53.6% of our time. We could drill down there and find out more details about why the main thread was taking up that much of our time.
I can change the views in the jump bar by choosing a pop-up view, and you can see the actual samples that we took. Again, what's on screen is 0 through 7 and 8. There's probably 10,000, 12,000 samples in this list, but this is the actual data that we sampled. Sometimes if the data's not being illuminated in the call tree view, you can plow through the detail view yourself and find interesting information.
We can also show you your source code in this view. So you can double click on a call tree, click on the focus icon, and we'll actually show you the code. What this window is showing us is that G-Lunar texture was 75% of the samples in this particular case.
And if that was a problem for you, you'd know that that was the line of code to rework or replace or find something different. This is the extended detail view. It sometimes shows information that's more succinct or more precise. And you can see here that for this particular sample, 75% of the time was at the value texture line, and the other 25% was spent doing sprintf into a buffer.
We can also show you the disassembly. There are some cases where you have handwritten assembly code or you have instruction scheduling issues, and you can use Instruments to find out what instructions are taking most of the time. This is a system call view with a detail pane. This is showing us that Mach timer arm is taking 57% of our time.
And you can see that the timeline is radically different. Each of the samples on the timeline above with the little telephone represents a system call. And I can actually get a back trace of what my code was doing when it made that system call. If you're getting a stall with a system call, you could actually see what part of your program was causing that.
Next slide. There we go. And this is just another example of the variability in Instruments display and the flexibility in that we can show the activity monitor, which provides an excellent summary. In this case, the Galaxy's program was the lion's share of the CPU time, and we have user load, system load, and various size and VM size information up on screen.
So Instruments is very flexible, very versatile. We can show you data in a lot of different ways. And at this time, I'd like to bring Joe Grzywacz on stage and quit talking about screenshots and actually show you how to use Instruments to optimize an actual application. Thank you.
Thank you, Dave. Hello, my name is Joe Grzywacz. I'm an engineer on the Performance Tools team. And today I want to take you through using the Time Profiler template in order to optimize an example application we wrote for you today. Okay, so we're going to get right to it, launch our project in Xcode.
Now, the first thing we're going to do when we're running with Time Profiler is make sure you're targeting your device. Running on the simulator will show completely different performance characteristics as you're running on your local host instead of the device. So make sure we have our iPad here targeted. Click and hold on Run. Choose Profile. That'll build our application in release mode. That's important. We want to make sure we're optimizing the release bits that we're going to be giving to our customers.
That'll sync to the device, and up will pop the Instruments template chooser. Here we're going to be selecting the Time Profiler template, as I have some performance problems I want to investigate. So we'll choose Profile. Up immediately will pop Instruments, and it's recording data in real time for us and updating the timeline, so as I interact with the application, I get data feedback right away.
I've stopped this recording because I want to actually take this time to switch a recording option. You can do this by selecting, in the Time Profiler instrument, this little "i" icon, which is your inspector. Open this up, we see some recording options. We're going to select Record Waiting Threads, and we'll get to why we did that in just a little while. For now, since we haven't made any code changes, you can actually just use the record button right from Instruments. It'll launch your application once again.
First, I would like to go to the iPad device. We can see that. And now I will press record within Instruments. And... All right. So we see our app came up. Once again, I'm going to tap on the Top Paid ocean, and we're going to have to refetch that RSS feed, as we do that each time.
Instruments is providing me with real-time data here that I can see and you'll see in a moment. And while the application is running, let me describe to you what it's doing. So it downloaded the RSS feed. It's going to start drawing some icons across the screen. In order to do that, it's going to go out to the internet, download the icons.
And you can see we're actually running here at a pretty miserable frame rate, something like five frames per second. Now I'm expecting 60 frames per second. When I wrote the application, I created a timer set to fire 60 times per second. It should call into my main thread, wake it up, move all the icons across the screen, and then draw the screen and go back to sleep. We can see here we have some icons moving across the screen. There's some schools of icons. There's parents with their children following behind them. And we're trying to collect enough data here that we can actually analyze this problem.
And so we're getting to a point here where actually, OK, we got to another phase where we're actually running a little bit more smoothly, but still only at 12.5 frames per second. And that's what we want to actually analyze at this point. So, returning now to Instruments. Here's what Instruments has been recording this entire time. I'm going to stop the recording, as I have enough data.
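To put the frame rates from the demo in perspective, the time available per frame is just the reciprocal of the frame rate. The following is a small arithmetic sketch, not part of the session; the helper name `frame_budget_ms` is made up for illustration.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / fps

# Frame rates mentioned in the demo: the 60 fps target and the two observed rates.
for fps in (60, 12.5, 5):
    print(f"{fps:>5} fps -> {frame_budget_ms(fps):6.1f} ms per frame")
```

At the 60 fps target, all of the moving and drawing has to fit in under 17 ms, which is why long main-thread work shows up so quickly as a dropped frame rate.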
[Transcript missing]
For now, I'm only interested in what's taking all this time. And so I'm going to, on the left side, choose this running sample times option in the sample perspective area. And what that will do is temporarily just kind of put aside all those background samples we got. So now we have something like 3,300 samples.
Now, what is taking all of our time? Well, if you click and hold in the ruler view up here, we can see that we're using 100% of our CPU, 80% of our CPU. This little inspection head, it's called, will actually tell you how much CPU time you're using at the current moment.
Okay, so we seem to be using a lot of our CPU. Our frame rate's low. I'm going to, you know, conjecture that I have a frame rate problem caused by, you know, overuse of my CPU. So where's all that time going? Well, the answer is here in the sample list, but it's kind of completely unusable for humans, which is why we provide you the call tree view, which is the default view you get to because that's where you're going to want to spend your time. Now, we have a call tree for each thread. I'm going to go ahead and open up a couple of these entries.
So what exactly is a call tree? Well, we took all those samples we took and all those backtraces and found all the unique paths through those backtraces, combined them together into a single tree structure. So we can see here our main thread called down into the symbol main. The gray text to the side is either your program name or the system library you're in. Here we started in my AppOcean program. And now when we generated that call tree, we also generated how much time was spent in those sections of the call tree.
Here we can see we spent 31,190 milliseconds, or 94% of our total samples were in our main symbol and its children. And so we can continue moving down here. Main called into UIApplicationMain. That was in the UIKit framework. UIApplicationMain actually went in and called two different symbols, GSEventRunModal, as well as -[UIApplication _run].
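The way backtraces are merged into a call tree can be sketched in a few lines. This is an illustrative sketch only, not how Instruments is implemented: the sample data and the symbols `drawBackground` and `handleTouch` are hypothetical, and it assumes each sample stands for a fixed slice of time (`ms_per_sample`).

```python
from collections import defaultdict

def build_call_tree(backtraces, ms_per_sample=1):
    """Merge root-first backtraces into a tree of inclusive times.

    Returns a dict mapping each unique call path (a tuple of symbols)
    to the total time spent in that frame and everything it called.
    """
    inclusive_ms = defaultdict(int)
    for trace in backtraces:
        # Every prefix of a backtrace is a node that was "on the stack".
        for depth in range(1, len(trace) + 1):
            inclusive_ms[tuple(trace[:depth])] += ms_per_sample
    return dict(inclusive_ms)

# Hypothetical samples, one backtrace per sample, outermost frame first.
samples = [
    ["main", "UIApplicationMain", "drawRect", "drawIcons"],
    ["main", "UIApplicationMain", "drawRect", "drawIcons"],
    ["main", "UIApplicationMain", "drawRect", "drawBackground"],
    ["main", "UIApplicationMain", "handleTouch"],
]

tree = build_call_tree(samples)
total_ms = len(samples)  # one millisecond per sample in this sketch
for path, ms in sorted(tree.items()):
    indent = "  " * (len(path) - 1)
    print(f"{indent}{path[-1]}: {ms} ms ({100 * ms / total_ms:.0f}%)")
```

Each number is inclusive: a frame's time covers itself and all of its children, which is why `main` accounts for nearly everything and the interesting information is further down.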
So how do we know where we should continue drilling down now that we've actually hit one of these branch points? Well, simply just follow the big number. Here we have 86% of our time went down this portion of the call tree, and so we're going to keep on opening up each of these and continue following down. Now, there's a quicker way to do that, and that's in the toolbar: opening up that extended detail view again. This time, it has the heaviest stack trace for us.
What we see here is the exact same thing we were seeing in the main window, the main area in the detail pane, that main called UIApplicationMain, et cetera. The gray text here is system libraries, the black text is your code. A great way to go through this quickly is just to go ahead and scroll all the way to the bottom, and then kind of scan your way back up until you start seeing the black text again, which is your code. Here I'm going to go ahead and click on OceanView drawIcons. And that highlighted that symbol in our call tree area over here.
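"Follow the big number" amounts to walking the tree and picking the heaviest child at every branch point. A minimal sketch of that walk follows; the tree shape, the sample counts, and the symbols `eventLoopBranch`, `otherBranch`, and `handleTimer` are hypothetical, not values from the trace shown in the session.

```python
def heaviest_stack(node):
    """Follow the child with the largest sample count at each level.

    `node` is a (symbol, samples, children) tuple; returns the list of
    symbols along the heaviest path, outermost frame first.
    """
    symbol, _samples, children = node
    path = [symbol]
    while children:
        symbol, _samples, children = max(children, key=lambda child: child[1])
        path.append(symbol)
    return path

# Hypothetical call tree: (symbol, sample count, children).
tree = ("main", 94, [
    ("UIApplicationMain", 94, [
        ("eventLoopBranch", 86, [
            ("drawRect", 80, []),
            ("handleTimer", 6, []),
        ]),
        ("otherBranch", 8, []),
    ]),
])

print(" > ".join(heaviest_stack(tree)))
```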
And what can we see? Well, UIView's drawLayer:inContext: called my code, AppOcean's code, drawIcons as well as drawRect. And between these two guys, we spent almost 80% of our time. So I know if we optimize this code, we should see potentially a very large speedup. You can go ahead and see your source code right within Instruments. Double-click the symbol, and your source code will show up right inside the detail pane of Instruments for quick reference.
And here, if you're particularly eagle-eyed, you may have noticed something interesting. I have my drawRect routine, which calls into both drawBackground and drawIcons. However, back in my call tree view, we saw that it looked like that drawIcons routine was actually a child of this drawLayer:inContext:. It looks like it was called directly. That's not the case. What you're seeing is that, because we ran in release mode, the compiler is performing a lot of optimizations.
One of them is called tail call elimination. So sometimes you'll encounter this where it may look like your frames are slightly out of order. That's a compiler optimization. So I know that, in fact, my drawRect routine actually did use 80% of the time. So I can double-click on the symbol again to go see that source code. And let's say I want to go change it. In the top section here of the detail pane, there's a tiny Xcode icon. Click on him.
Xcode will come up and it'll go actually right to our source code for us. You can go ahead and start making your changes. So this is the first point where I actually would want to make my change. I know that this drawRect routine and all of these methods above it that it's calling are really slow, and so I'm going to get rid of them.
The fix here, if you read a bunch of documentation, is to use another technology called Core Animation. Now, we don't have time to get into all the details of Core Animation. It's a really cool technology, but it's going to move all this redundant CPU calculation off to the GPU, which is really good at it. And so we'll deal with actually fixing that up later, but all that old code is going to be gone. So, returning to Instruments.
There was one other thing I wanted to look at before I made any of those code changes. And, well, that was, we actually seem to have two different kind of regimes in this application here, where at the end we saw a pretty solid purple bar, and then kind of in the middle we see lots of peaks and valleys, peaks and valleys. And so I want to know what's going on there.
Underneath this Instruments list, there's a widget called the Track Scale Slider. You can drag it to the left, it'll zoom out slowly. Drag it back to the right, it'll zoom in slowly. And so right now I'm interested in why do I have all of these gaps? Now, normally, when you have a gap, that might be okay.
If you have nothing to do on your CPU and it's idle, it's a good thing. You're saving power. Here, however, I have this 60 hertz timer firing, and I'm expecting my main thread to wake up 60 times a second and do some work. We see here between 41 and 42 seconds. We actually see only maybe it looks like one peak. We only had one thing wake up.
So the question is, well, what was my main thread doing at that time? And Instruments provides a great way to actually answer that question by going over here to the top left of Instruments and switching away from the default view, which is the Instruments strategy, and selecting the third icon here, which is the Thread strategy.
What happens now is that list of instruments has been replaced with a list of all the threads in your application. We have a bunch of them here. Some of that background RSS feeding and whatnot. We're interested in what my main thread was doing, and I've highlighted it here.
And now, as Dave mentioned, there's a whole bunch of these little icons going across the screen. Those are the samples that we took. Some are darker and some are lighter than others. The darker ones are when we took a sample of something that was running on the CPU at that time, which is the default mode for Time Profiler.
We have these semi-transparent icons, and those represent the samples of your application when it was not running on any CPU at that time. We got those because we enabled that record waiting threads option at the beginning of the demo. Now, I did that to save us a couple minutes of demo time. If you didn't have that option enabled, you'd see gaps in these regions instead of those semi-transparent icons. And what you would do is just go back, enable that option, do another recording. Not a big deal.
And so now what I should be able to do is click on one of these particular semi-transparent guys. And now with one click, I've answered the question of, well, what was my main thread doing at this time? Because we can see we took an idle call stack sample. Our CPU ID is, well, not running on a CPU because it was in the background. And we have the call stack here.
And if we just scroll down it, we see the black text is my code. Some sort of icon downloader start download. That went and called into NSURLConnection sendSynchronousRequest. And so right there, I know the problem. I did a synchronous request out to the Internet on my main thread, and that's going to block everything from happening. Your user can't interact.
There's no responsiveness to the user for touches. The screen won't be animating, and you won't even get timer firings until this data comes back. And so it's obviously a performance issue, but it could be worse than that. If you make this Internet connection and it doesn't come back within 10 seconds, the system's going to terminate your application because you took too long, and your user's going to see what appears to be a crash.
So make sure to move all of these synchronous requests off to a background thread, or use one of the asynchronous APIs we provide. Okay, so that's two problems and two fixes. There's one final thing I want to look at, and that is going back to the Instruments Strategy View. We can go ahead and quickly zoom back out using the View > Snap Track to Fit menu item.
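The shape of that fix can be sketched outside of iOS. The following is a Python analogy only, not the app's code and not the NSURLConnection API: `download_icon` is a hypothetical stand-in that sleeps instead of touching the network, and the "frame loop" is a plain loop on the calling thread.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def download_icon(url):
    """Stand-in for a blocking network request (hypothetical)."""
    time.sleep(0.05)           # pretend the network is slow
    return f"icon-bytes-for:{url}"

def run_frames_while_downloading(urls, frames=5):
    """Keep 'drawing frames' on the calling thread while icons download."""
    frames_drawn = 0
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(download_icon, u) for u in urls]
        # The "main thread" keeps running its frame loop instead of blocking.
        while not all(f.done() for f in futures) or frames_drawn < frames:
            frames_drawn += 1
            time.sleep(0.005)  # stand-in for one frame of work
        icons = [f.result() for f in futures]
    return frames_drawn, icons

frames, icons = run_frames_while_downloading(["a.png", "b.png", "c.png"])
print(frames, "frames drawn while", len(icons), "icons downloaded")
```

Had `download_icon` been called directly inside the loop, no frame would have been drawn until each request returned, which is the stall visible in the thread strategy view.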
And what that'll do is go ahead and, you know, bring all the samples back into view really quickly. And now I want to focus in on this startup time here at the beginning. And so using a shift and mouse drag gesture, I'll quickly just zoom in on that region. And so now what we're looking at is my startup time. We can go ahead and drag this a little bit larger.
And so what I want to do now is option-drag over this region, and that's going to do two things. This little popover here is telling me that it's taking 2.85 seconds to do my startup routine. And when I let go, I've now applied a time filter. Now, temporarily, all the samples outside of this highlighted blue range are just being ignored. It's very easy to clear it there, using that inspection range control in the toolbar. You can go ahead again and set it to any ranges you want really quickly, pretty easy.
And so now what we've done is highlighted this region, and now I want to go back and see what was actually taking our time again. So using the jump bar, we're going to click on call tree again. This has now been updated with just the samples that are in this particular range. And as we did before, very quickly, go to the extended detail view.
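Conceptually, a time filter keeps only the samples whose timestamps fall inside the selected range before the call tree is rebuilt. A minimal sketch, with hypothetical timestamps and symbol names (only the 2.85-second range comes from the demo):

```python
def samples_in_range(samples, start_s, end_s):
    """Keep only samples whose timestamp falls inside [start_s, end_s]."""
    return [s for s in samples if start_s <= s[0] <= end_s]

# Hypothetical (timestamp in seconds, symbol) samples.
samples = [
    (0.40, "loadImages"),
    (1.20, "compositeImages"),
    (2.60, "compositeImages"),
    (3.10, "drawRect"),
    (41.5, "drawRect"),
]

startup = samples_in_range(samples, 0.0, 2.85)
print(len(startup), "of", len(samples), "samples fall in the startup range")
```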
And we have a very long call tree here. And what we see here at the bottom is some of my routines at the very bottom. I'm going to click on that. And now it's highlighted here in the call tree for us. Let's get this up a little bit.
And what are we seeing here? The view loaded, and I did some compositing of images that took 70% of my time. There was some loading of images that took another 27% of my time. So once again, I have about 98% of my time was spent in these functions. And you'd want to go through the same routine of, well, let's try to optimize that.
Unfortunately, let's say I tried that, I couldn't get out any more time, and so the code is what it is. I can't make it any more efficient. Or can I? Well, the last thing we want to look at is the third and final strategy, which is the CPU strategy. When we click on that, now our list of instruments is replaced with the list of cores in my device. This is an iPad 2, so there's two cores. And what we can see now is these blue sample bars.
Each blue sample bar represents a different CPU. And so we took a sample on that particular core. The height of it represents the depth of the backtrace. And if you take a really quick glance, you may say, yes, we actually have some parallelism going on here. In fact, we don't.
We can grab that same inspection head as before to help us draw a vertical line and see that at any time there's a blue bar on one core or the other core, but never on both at the same time. So we're actually using only 50% of our CPU resources, and we can make this much more efficient by doing some work in parallel.
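The "draw a vertical line" check can be expressed as a simple overlap measure between the two cores. This is a sketch under stated assumptions: time is cut into equal slots, each core is described by the set of slots in which it had a sample, and the slot data below is invented for illustration.

```python
def concurrent_fraction(core0_busy, core1_busy):
    """Fraction of busy time slots in which both cores ran at once.

    Each argument is a set of time-slot indices (for example, one slot
    per millisecond) during which that core had a sample.
    """
    busy_any = core0_busy | core1_busy
    if not busy_any:
        return 0.0
    return len(core0_busy & core1_busy) / len(busy_any)

# Hypothetical traces: work bounces between cores but never overlaps...
serial = concurrent_fraction({0, 1, 4, 5}, {2, 3, 6, 7})
# ...versus work that runs on both cores in the same slots.
parallel = concurrent_fraction({0, 1, 2, 3}, {0, 1, 2, 3})
print(serial, parallel)
```

A value near zero with both cores showing activity is the pattern described above: the work migrates between cores but is still effectively serial.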
So the code change that I need to make is to take that daytime and nighttime loading of images and do them at the same time. And then break up those images into a bunch of tiles and use GCD, Grand Central Dispatch, to actually create a whole bunch of threads in the background to work on those tiles in parallel and then display them to the user.
Now, I made one other final enhancement, which was that as soon as the daytime image loads, I can display that to the user immediately so they can begin interacting with the application as quickly as possible. Meanwhile, all those computations will be happening in the background and will show up for the user as they complete. So now, returning to Xcode. As promised, let's start making some of these coding changes.
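The tile-based approach can be outlined as follows. This is a structural sketch in Python, not the GCD code from the demo: the "images" are flat lists of numbers, `composite_tile` is a made-up stand-in for the real compositing work, and a thread pool plays the role of the background queues. In CPython, threads will not actually run this pure-Python arithmetic on two cores at once, so the sketch shows the structure rather than the speedup.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def split_into_tiles(pixels, tile_size):
    """Cut a flat list of pixel values into (index, tile) pairs."""
    return [(i // tile_size, pixels[i:i + tile_size])
            for i in range(0, len(pixels), tile_size)]

def composite_tile(day_tile, night_tile):
    """Stand-in for the per-tile compositing work: average two tiles."""
    return [(d + n) // 2 for d, n in zip(day_tile, night_tile)]

def composite_in_tiles(day, night, tile_size=4, workers=4):
    """Composite two images tile by tile on a pool of worker threads."""
    day_tiles = split_into_tiles(day, tile_size)
    night_tiles = dict(split_into_tiles(night, tile_size))
    finished = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(composite_tile, tile, night_tiles[idx]): idx
                   for idx, tile in day_tiles}
        for future in as_completed(futures):   # tiles arrive as they finish
            finished[futures[future]] = future.result()
    # Reassemble the tiles in their original order.
    return [p for idx in sorted(finished) for p in finished[idx]]

day = list(range(0, 160, 10))   # 16 hypothetical pixel values
night = [0] * 16
result = composite_in_tiles(day, night)
print(result)
```

Because each tile is independent, the loop over `as_completed` is also the natural place to hand finished tiles to the display as they arrive, which mirrors the "show up for the user as they complete" behavior described above.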
Except I'm going to take the cooking show approach and cheat a little bit. These changes take a little while to make, so I've prepared some #defines ahead of time. First one is let's go ahead and use Core Animation. That removes all our drawRect code. It enables us to use the CALayer-type technology.
Again, I would recommend you go see a Core Animation talk to learn more about that. The second fix was let's do all of our icon downloading asynchronously, get it off the main thread. And third and finally, let's rewrite that initial map loading algorithm to actually do things more in parallel and use GCD.
So save those changes. And this time I want to make sure to select Profile from within Xcode. Since we have coding changes, we need to make sure we sync the latest bits over to the device. So before I do that, let's go back to the iPad. Okay. So now I'll select Profile again within Xcode. It's compiling, syncing it over to the device.
And this time we see the daytime image comes up really quickly. The nighttime image blends in the background. The entire time, the user interface was interactive for the user. I can tap on the top paid ocean. And again, it's going to go out, load that RSS feed again for us.
And after it does that, the icon should start downloading. And now we can see my frames per second counter is actually running at 60 frames this entire time. I don't see any random stutters. The children are following their parents quite nicely. And everything looks great. I'm pretty sure I've verified that I've actually fixed the problem. But what does Instruments have to say about it? Returning to Instruments. Let me see what the graph looks like now.
Our CPU usage is much lower. And so that looks great. We can drag the inspection head even while it's running and see that 30%, 20%, much lower CPU usage. Great. Stop this recording. The last thing I wanted to look at was that startup time. Shift-drag to zoom in on that.
I can go ahead now and once again do the Option-drag, which will time filter this region as well as let me measure it. The duration now is down to about 1.7 seconds, so we took off a little over a second. And did we do this more efficiently? This time we can see in our CPU strategy view that, yes, indeed, we actually have samples now running on both of our cores at the same time. And that's where we got that big speedup, as we were using both of our cores at top efficiency.
The last thing I'd like to show you is the all-new threads and process picker that we have this year. We open that up. The left box actually lists all the processes that we recorded. We only recorded our one application, so there's just one. The right panel shows us all the different threads that were actually involved in our process this time. There's a bunch here, and there are some quick shortcuts, so if I want to focus in on my main thread, I click the main thread filter, check that box, and now we'll be focusing in on just our main thread.
So when I return to this CPU strategy view, we can see the blue samples now are our main thread. All the gray samples were all that background code, all the compositing logic that GCD is handling for us. And if I Option-drag over this region, I'll see the duration was 413 milliseconds. So the user was able to start tapping, interacting with my application in under half a second versus the almost three seconds before. And with that, I'll turn to slides and summarize what we just went through.
Thank you. So the first thing we wanted to do was analyze where we were spending all our time. And that's what the call tree view is really good for. It took us a matter of a minute to just find these two methods, in this case, drawIcons and drawRect, that were taking most of our time. We went and optimized those away.
Then we had a responsiveness problem where our main thread wasn't responding. It was hanging a little bit. And we went to the thread strategy view, which very quickly allowed us to see that I was actually doing this blocking synchronous call on my main thread. And that's where those little hangs were coming from.
And third, we went and used the CPU Strategy View. That told us how efficiently we were doing our work. Here, we weren't actually doing it efficiently at all. We were only using half of our resources. So we went back in, made our code more parallel, used GCD, and then we used all of our CPU cores. And with that, I'd like to introduce Victor Hernandez, who's going to take you through a memory profiling optimization on the same application. Thank you.
Thanks, Joe. My name is Victor Hernandez, and I'll be showing you how to use Instruments to fix memory issues in your app. It's really important that you do this because memory is a shared, constrained resource on the device. And fortunately, Instruments provides tools that make this easier to do. I'll be going over two tools, the Leaks instrument and the Allocations instrument. So let me go to the demo machine. Here we go. I'm going to bring this up in Xcode. Okay.
So I have it running on the iPad. That's great. I'm going to pick the Profile action here. And unlike Joe, I'm not going to be using the Time Profiler template, but instead I'm going to be choosing right here the Leaks template. So let me choose that and click on Profile. This launches Instruments, and it's already started recording. So I'm going to start using the app, just bringing up some oceans. Here I'm bringing up the first ocean.
I'm going to wait until a few apps are swimming around, and I basically want to exercise the app like your user would be doing. All right, I have a few oceans swimming around. That's great. Let me swipe away and bring up another ocean. And you'll see it's still collecting more information. And I'm waiting for that ocean to be filled with apps.
Let's see, it's taking a little while to download our RSS feed. Oh, wait a second. I should stop this right away. If you'll notice right here, the reason I stopped immediately is that the Leaks instrument is already showing me that I have some leaks. What this means is that there is some memory that is no longer accessible to my app, yet has not been freed. This is an issue that needs to be resolved as soon as possible, and I want to start there.
So how I go about doing this is first I need to find out what the information that it's showing me is. If I go over here on the left-hand side, you will see it says right here that the Leaks instrument is doing automatic snapshotting every 10 seconds. What that means is that this is the first time it detected any leaks, and it detected 220 leaks, and about 10 seconds later, it then detected 304 leaks. All right, so I'm leaking about 15 kilobytes, and I need to take care of that. Now, the way I go about doing that is right here in the detail view.
I have the different categories of objects that are being leaked. If I organize this, I should start by looking at objects from my own app. The ones I notice here are app child and app parent. As Joe mentioned, when the apps are swimming around in the oceans, we have some of them that are leading other apps. Those are the parents, and they're leading their little school of fish, and that's the app child.
Well, it looks like some of these are being leaked. In fact, I have 79 app children being leaked and 28 app parents being leaked. Now, I'm rather surprised that they're being leaked because I wrote this code using ARC. Aren't I supposed to avoid these sorts of issues? Well, even though ARC makes it much easier to write apps without doing retain and release yourself, you can still fall into the same pitfalls that you did before with that kind of technology.
Fortunately, Instruments makes it pretty easy to identify one of the common sources of leaked memory, which is a retain cycle. And the way you find this information is you go right here in the jump bar and you choose Cycles & Roots. And so what you have here, what Instruments has recorded for you, are all of the different cycles and roots.
Specifically, I want to go to my App Parent, the objects from my own code, and it brings up this graph. What we're seeing here is a reference graph for the App Parent. The App Parent has this red arrow, and the red arrow indicates that there is a strong ARC reference from the App Parent to a mutable array for its children property.
That mutable array has a blue dotted non-ARC reference to a buffer, and then that buffer has another non-ARC reference to an app child. And finally, this app child has a red strong reference back to its parent that's stored in its parent property. Well, that's really surprising. I didn't expect this to be a strong reference at all.
The other thing I want to point out is, as I told you earlier, this memory isn't accessible from the rest of your app, and sure enough, there are no references to this App Parent from anywhere else. They're only from the app child. So I need to find out why this is a strong reference, even though I didn't expect it to be.
So if I double-click on that arrow, it takes me to Xcode, and it'll actually take me to the definition of app child. Here we go. Here is the line where I'm setting the parent property in app child. Nothing looks wrong about that. So let me actually go to the header and see what's going on.
Now if I scroll down to the property declaration, aha, I see what's going on. I forgot that by default, properties are declared to have strong references, and I in fact need to explicitly state that this is a weak reference. So I think that should fix my leak, and now I need to run again in Instruments to verify that I've in fact fixed the leak. In the interest of time, I'm going to be loading a pre-saved trace. So let me go ahead and save this one. And here's my pre-saved trace. And so this is now Instruments recording data running through multiple oceans in my app.
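The cycle in that reference graph, and the effect of the weak back-reference, can be reproduced in a small sketch. The class names below are hypothetical and the demo app's actual declarations are not shown in the session; Swift is used here only for illustration, and its properties are also strong by default. A deallocation counter shows which object graph actually gets freed.

```swift
// Counts deallocations so the sketch can show which object graphs are freed.
var deallocations = 0

// Leaky version: the child holds its parent strongly (the default for properties).
final class LeakyParent {
    var children: [LeakyChild] = []
    deinit { deallocations += 1 }
}
final class LeakyChild {
    var parent: LeakyParent?            // strong by default: completes the cycle
    deinit { deallocations += 1 }
}

// Fixed version: the back-reference is declared weak.
final class Parent {
    var children: [Child] = []
    deinit { deallocations += 1 }
}
final class Child {
    weak var parent: Parent?            // weak: does not keep the parent alive
    deinit { deallocations += 1 }
}

func makeLeakyFamily() {
    let parent = LeakyParent()
    let child = LeakyChild()
    child.parent = parent
    parent.children.append(child)
}   // both objects become unreachable here, but neither is freed

func makeFixedFamily() {
    let parent = Parent()
    let child = Child()
    child.parent = parent
    parent.children.append(child)
}   // the parent is freed, which releases the array and then the child

makeLeakyFamily()
let afterLeaky = deallocations
makeFixedFamily()
let afterFixed = deallocations
print(afterLeaky, afterFixed)   // prints "0 2"
```

The leaky pair is exactly the situation Leaks reports: nothing else references either object, yet each keeps the other's reference count above zero.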
In this pre-saved trace, recorded after I fixed the leak, the first thing I notice is that the leaks timeline shows no leaks at all. So I think I fixed the problem. That's great. Now what's the next thing I need to do? Well, I need to go to the allocations timeline and see if it's telling me that there's something that I need to fix.
Before I do that, I want to show you that I've added these flags throughout the timeline. These are really useful for correlating behavior in the app to data that's been collected inside of Instruments. You'll see right here, I've labeled this flag to say map of the world, and in fact, it's the point in time when the app has brought up the map of the world.
My next flag identifies that at this point in time, the top paid ocean has been displayed once, and there's a whole bunch of apps swimming in there. And then I have one for the new paid app ocean, and then finally, I brought up the top paid app ocean again.
It's pretty easy to add these flags to your trace. You just move the inspection head anywhere along the timeline and go to edit, add flag. It adds it, and then if you double click on it, you can go ahead and put a label in there, which you can reference later. I'm going to go ahead and remove this since we're not actually going to use that one.
But this is really useful, especially if you save your trace and you want to access that information later. So let me go to view, snap track to fit to make this thing take up the whole screen. Okay, great. Now that we know what our app is doing over time, we need to see what its memory profile is. So the allocations instrument in the timeline shows you the amount of memory used by all of the malloc objects in your heap.
Initially, when the map of the world was shown, I'm using about 910 kilobytes. And by the time I bring up the first ocean, I'm using 3 megabytes. And on the second ocean, I'm using 4.56 megabytes. And finally, I'm up to 6.38 megabytes. Well, that's pretty bad, because that makes it sound like if I keep on bringing up more oceans, I'm eventually going to just increase memory usage, and eventually my app is going to get terminated, and my users are going to see this as a crash. I definitely have to fix this.
On top of that, this doesn't match my expectations for my app. I know that I'm using the same number of objects to display each of the oceans. In fact, I expect this to be flat, and it seems like there are objects being used for earlier oceans that are sticking around when I'm displaying later oceans. Well, I definitely need to fix that. But how do I go about doing that? Well, that's where the rest of the Allocations instrument's data comes into play.
What the Allocations instrument is doing is recording every single memory event for every single object allocated during that trace. It's recording its allocation, all the retains and releases, and then eventually its free. It's a lot of data. That's why the app runs a bit slower on your iPad when you're recording with the Allocations instrument turned on.
Down here it displays the different categories of objects that have been allocated. And to show you what's in here, let me filter it. I go to the search field and I type in string. And this shows me all the different types of strings that I've allocated. If we look specifically at immutable CFStrings, we'll see that 20,202 of them have been allocated and that at the end of the trace, there are 3,525 that are still living. Those are the key numbers you should pay attention to. Also, if I focus in right here and press this, I get a list of all of those instances.
To know which of the instances I've allocated are being shown, I look over here at the allocation lifespan on the left-hand side, where I currently have created and still living selected. So these are all the immutable CFStrings that were created during my complete trace and are still living at the end of the trace. If I instead click on created and destroyed, the list will be updated to show me those instances that were created but had already been destroyed before the trace ended. It went by so quickly that I can't tell. There we go.
It updated. There's a lot of them. Okay. Well, for my analysis, I'm worried about objects that were created and that I expected to be destroyed, but that unfortunately are still living. So I'm going to switch it back to created and still living. And I'm going to go back to my object summary.
And I'm no longer going to search for strings. And now I'm presented with all these different types of objects again. Now where do I go from here? Well, this can really feel like trying to find a needle in a haystack because we're presenting so much information. The best way to go about using this is to pick an object in your own code and check to see if the data recorded about its lifespan matches your own expectations. So I'm going to do that specifically for the OceanView class in my app. Great. There it is.
All right, and now this is what's presented to me. I create an ocean view every time an ocean is displayed on the iPad screen, and then when I swipe away, it should go away. So let's see what it's telling me. It's telling me that I've created three of them overall. That makes sense.
That's what I expected. But it's unfortunately telling me that there are three living at the end of the trace. Well, that's not what I expected. In fact, what I expected was what we're seeing for OceanViewController right above, that I had three total, but at the end of the trace, I have one that's still living. So now I need to figure out why these ocean views are still alive.
Let me click on here to get to there. And you'll see that now I have a list of all my OceanViews. If I scroll through these, you'll see that the inspection head moves to the point in time when each of them got allocated. Well, now I need to click on the first instance again. And now this brings up the complete memory history for the OceanView instance.
Starting from its initial allocation, which gives me a ref count of one, it's followed by retains and releases that raise and lower the ref count. That's the key number here in this column. And eventually, this should get all the way down to zero, and the object will get freed. But it's not getting freed because it never gets to zero.
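How that ref count column evolves can be illustrated with a tiny model. This is only a sketch of the bookkeeping: the `MemoryEvent` type and the two event sequences are made up for illustration and are not data from the trace.

```swift
// One recorded memory event for a single object (hypothetical model).
enum MemoryEvent {
    case malloc, retain, release, free
}

// Returns the reference count after each event, like a running ref count column.
func refCountHistory(_ events: [MemoryEvent]) -> [Int] {
    var count = 0
    return events.map { event -> Int in
        switch event {
        case .malloc:  count = 1     // the initial allocation starts at one
        case .retain:  count += 1
        case .release: count -= 1
        case .free:    count = 0     // the object is gone
        }
        return count
    }
}

// A healthy object: retains and releases balance, so it reaches zero and is freed.
let healthy: [MemoryEvent] = [.malloc, .retain, .release, .release, .free]
// A stuck object: one retain is never balanced, so the count never reaches zero.
let stuck: [MemoryEvent] = [.malloc, .retain, .retain, .release]

print(refCountHistory(healthy))   // prints "[1, 2, 1, 0, 0]"
print(refCountHistory(stuck))     // prints "[1, 2, 3, 2]"
```

Reading a history like the second one, the question becomes which retain was never matched by a release.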
So who's holding a reference to this? Well, the best way to find out is to move the inspection head along in the timeline to a point in time where I expect that ocean view to have already been freed. How about right here where the second ocean is displayed? At that point in time, the memory events being recorded are retain and releases from NSFireTimer.
Well, that's the timer that Joe was mentioning, the one telling the ocean view to animate itself. That's weird. I don't expect this to be happening. So for some reason, the timer hasn't released its reference to my ocean view. I need to go back to the very beginning and find out when that timer got its initial reference to my ocean view.
Well, that seems like it happens right here. If I bring up the extended detail view, I'll see that that is -- okay. Well, basically, what's happening here is that this is being called from startAnimation in my own code in the ocean view. So now I'm going to go back to Xcode and take a look at that code.
And I'm going to look for startAnimation. I see that in startAnimation, I am creating the timer and passing self as the target. And stopAnimation right down here is correctly invalidating the animation timer. So I need to find out why I'm not calling stopAnimation at the right time. If I look for calls to it, I see that I'm actually not calling it at all. I have a declaration, a definition, and no call to it. So that's wrong.
So I now know that I need to match my stopAnimation call with where I call startAnimation, and it should be right here in viewWillDisappear, which gets called when I swipe away the ocean. So let me fix that, and I'm going to run this again in Instruments to verify that I've made the correct fixes. So let me pick the Leaks profile, and I'm going to be recording here again. Okay, let me bring it up. I'm going to highlight OceanView.
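The ownership chain behind this bug can be modeled in a few lines. To keep the sketch self-contained and deterministic, `FakeRunLoop` and `FakeTimer` are simplified, hypothetical stand-ins rather than the real timer and run loop APIs. They only mirror the relevant behavior described above: a scheduled timer is kept alive by the run loop, and the timer holds its target strongly until it is invalidated.

```swift
// Simplified model of the ownership involved; these are NOT the real
// timer/run loop APIs, only hypothetical stand-ins with the same shape.
final class FakeRunLoop {
    var timers: [FakeTimer] = []        // a run loop keeps scheduled timers alive
}
final class FakeTimer {
    let target: AnyObject               // a timer holds its target strongly
    init(target: AnyObject) { self.target = target }
}

var liveOceanViews = 0
let runLoop = FakeRunLoop()

final class OceanView {
    private weak var animationTimer: FakeTimer?
    init() { liveOceanViews += 1 }
    deinit { liveOceanViews -= 1 }

    func startAnimation() {
        let timer = FakeTimer(target: self)   // the timer now references self
        runLoop.timers.append(timer)
        animationTimer = timer
    }

    // Models invalidation: unscheduling drops the timer and its reference to self.
    func stopAnimation() {
        runLoop.timers.removeAll { $0 === animationTimer }
        animationTimer = nil
    }
}

// Shows one ocean and then "swipes away" when the function returns.
func showOcean(stopOnDisappear: Bool) {
    let view = OceanView()
    view.startAnimation()
    if stopOnDisappear { view.stopAnimation() }   // what the disappear hook should do
}

showOcean(stopOnDisappear: true)
let afterFixed = liveOceanViews
showOcean(stopOnDisappear: false)
let afterLeaky = liveOceanViews
print(afterFixed, afterLeaky)   // prints "0 1"
```

Without the matching stop call, the view stays reachable through the timer, which is why Leaks does not flag it and why it takes the Allocations lifespan data to spot it.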
And sure enough, there are no ocean views being recorded yet. And I went to an ocean, and now I do have an ocean view. So that's good. Now let me see if it goes away when I swipe away from the ocean. All right, it did. Good. I think I fixed something. Let me do it again. Well, actually, I don't have time for that, so I'm going to stop this. The other thing to note is that it looks like as I bring up more oceans, the amount of memory I'm using is not going up.
You'll see that I'm hovering it around 1.12 megabytes, so that's really good. This is an iterative process, and this is only the beginning. I really need to verify all of the object lifespans for all my different classes to figure out how best to fix all of my memory issues. But that's basically the leaks instrument and the allocations instrument.
Applause So what have I shown you? I showed you how to use the Leaks instrument and specifically Cycles & Roots to identify retain cycles in your code. And secondly, I used Allocations to determine why memory was increasing over time, specifically by checking my assumptions about the lifespan of my objects. And now I'd like to return the floor back to Dave. Thank you, Dave.
So what you saw on stage was Victor and Joe following the profile process. They reproduced a problem, they measured it with Instruments, came up with a theory about what changes they could make to actually cause a change, made those changes, and then went back and reverified that they'd actually fixed the problem. This is a critical process. If you follow it, you'll have results.
The Instruments templates we used were Time Profiler, with which Joe found three improvements fairly easily: a startup time improvement, a drawing animation improvement, and a fix for a synchronous call that was blocking the main thread. And Victor managed to find two memory improvements by using the Leaks template. This is by no means the end of Instruments.
We can profile virtually anything on the operating system. And if you read the documentation or look at the Library, we have over 35 instruments that can profile different aspects, core profile, Core Animation, GCD queues. Come see us in the labs. If you are having a particular problem with a particular subsystem, there's probably an instrument that will illuminate the data and what your application is doing with it.
If you follow the profile process, these are the reviews that you're going to get on your app on the App Store. Fantastic and fast. Must buy this app. And the ratings are going to be radically different. If you need more information, I encourage you to contact Michael Jurewitz. Instruments documentation is included with Xcode and, of course, there are the Apple Developer Forums. There's a vibrant community discussing Instruments on the Apple Developer Forums. I appreciate everyone's time and have an excellent conference.