OpenGL ES Tools and Techniques - WWDC 2012

Graphics, Media, and Games • iOS • 42:08

The iOS SDK includes powerful tools for analyzing the behavior and optimizing the performance of OpenGL ES apps. Learn key practices to identify bottlenecks, find rendering errors, tune performance hot-spots, and maintain high frame rates. See how the tools can pinpoint problems for you and even provide specific advice for the best way to correct them.

Speakers: Michael Mayers, Seth Sowerby

Unlisted on Apple Developer site

Transcript

This transcript was generated using Whisper, it has known transcription errors. We are working on an improved version.

Good afternoon, everybody, and welcome to our session on OpenGL ES Tools and Techniques. I'll be taking you through the great new tools we have for developing games on iOS. Now, as you know, iOS is a great platform for developing games. And with OpenGL ES, you have great technology for powering these games. Today, we're going to be talking about the OpenGL ES tools we give you to streamline that and make sure that you're spending your time developing games, not finding issues in your graphics.

So what we're going to cover today. Obviously, we'll talk about what OpenGL ES tools are available to you, why you need these tools, and what new features are available to you within Xcode 4.5 and the iOS 6.0 SDK. Finally, we'll talk a little about some workflows and methods to help you get the most from these tools. So first, let's kind of talk about what tools are there for you within the shipping SDKs to catch everybody up.

So the first tool I want to talk about is the OpenGL ES Frame Debugger. Now, this is a great tool that's built right into Xcode. You just launch your app as you normally would, trigger a frame capture, and then you can start diving straight away into your OpenGL ES code.

Within the debug navigator, we show you all the GL calls from your frame. And you can select any of those calls to instantly see what the state of GL is at that point. We also provide a debug marker extension that helps you annotate the frame itself. Using this API, you can group calls into your different rendering passes, the different objects you're drawing, or however you choose. It's a really powerful API to let you choose how you want your frame to appear within that debug navigator.

As well as being able to see all your calls within there, you can disclose any of these calls and we'll show you what the CPU call stack was at that point. So you can dive straight back to your CPU code from your GL draw calls or state calls. Within the main render buffer editor view, we show you your frame buffer. So we'll show you what the state of that frame buffer is at whatever call you're at. Within this, we'll show you the color buffer, depth buffer, stencil buffer, whatever buffers you happen to have enabled.

We'll also, by default, highlight the current draw call in wireframe. So it makes it really easy to see what you're drawing as you scrub through the frame. In the assistant window, we show you all of your GL objects. So you can investigate the resources you're using. By default, we'll show you what the bound resources are, i.e. the textures and programs that you're drawing with at that current draw call.

But there's also options to filter to view all of the objects. So if you think, I should be drawing with this texture. But it's not bound. Well, let's dive in and look. I can look through all the textures, see if I actually loaded it or not. So I'm going to go ahead and do that.

As well, you can double-click on any of these resources and we'll give you a detailed view of that object, be it a texture, a vertex array object, or the GLSL code within your programs. We also have viewers for buffers and all the other GL objects. Similar to the debug marker extension I mentioned for annotating your frame trace, we also have a debug label extension, which is great for labeling your resources.

It's easy to tell which texture is which from the thumbnails, but obviously with your programs and your vertex array objects, without this you've just got the GL name, which is an integer. But with this, you can give a text label to all your resources, and you can instantly see, oh, this is the mesh for drawing a teapot, this is the mesh for drawing a cat, whatever.

Within the variables view area, where you're used to seeing your CPU variables, we'll show you your GL state, and we'll show you all your context state, your object state, and the renderer capabilities as well, so things like what GL extensions are available. We even summarize the state, so we'll group related state into a category.

We'll say, for instance, in this example, depth, so that if you're quickly scanning through all your state, you don't need to individually pick out, oh, state depth this, depth that. We'll give you it in a single line. And as you step through, we'll highlight what state has changed since the last draw call.

So as you're stepping through, you can see how you're building your state vector up and quickly spot where you're setting something that you weren't intending. The next tool I briefly want to cover is the OpenGL ES Analyzer instrument. Now, you should all be well familiar with Instruments for all your profiling needs.

And the Analyzer instrument is another rich instrument within that toolkit, giving you a graphical representation of GL performance data. And like all the other instruments, you can use it in any combination with the others to investigate the issue you're looking at. So you can have the Analyzer instrument alongside CPU profiling, network access, whatever you choose. And as you can see here, we've got all the normal great graphs and visualization options within Instruments.

Even more interesting, though, are some of the detailed views that we give you. So, for instance, within the OpenGL ES API Statistics window, we'll give you a list of all the GL calls you make, how many times you make them, and how much time you spent inside them.

This is great for diagnosing whether you're spending a lot of time in state calls or, as you'd more likely expect, in drawing calls. And we'll even tell you how much time you're spending in each one on average, so you can check whether certain calls are taking an inordinate amount of time.

In this example, for instance, you can see that glGenerateMipmap does take a long time, but then you should only be calling that during your initialization code anyway. An even more interesting view within this window is the OpenGL ES Expert. The OpenGL ES Expert is like having an expert in your Mac for you, diagnosing your GL ES app. It runs a rich set of heuristics to detect a plethora of GL errors and departures from GL best practices. So it'll tell you if you're passing, you know, illegal enums to GL.

But it'll also tell you if you're doing things that might not be particularly performant, be it using an inefficient texture format or inefficient vertex formats, or making successive redundant state calls, things like that. It's great for giving you a list of all the things that you can and should fix within your GL code.
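The successive redundant state calls mentioned here can also be filtered out before they ever reach GL. What follows is only a minimal sketch in plain C, assuming a hypothetical `StateCache` that shadows a few enable flags; the capability indices and function names are made up for illustration and are not part of OpenGL ES. The idea is that the app would issue the real `glEnable` or `glDisable` only when `state_cache_set` returns true.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shadow copy of GL enable state; one bit per capability. */
typedef struct {
    uint32_t known;   /* bit set: a value has been recorded for this capability */
    uint32_t enabled; /* bit set: the capability was last set to enabled */
} StateCache;

/* Illustrative capability indices (these are NOT GL enums). */
enum { CAP_BLEND = 0, CAP_DEPTH_TEST = 1, CAP_CULL_FACE = 2 };

/* Returns true when the caller must issue the real GL call, and false
 * when the request repeats the cached value and can be skipped. */
static bool state_cache_set(StateCache *cache, unsigned cap, bool enable)
{
    uint32_t bit = UINT32_C(1) << cap;
    bool is_known = (cache->known & bit) != 0;
    bool is_enabled = (cache->enabled & bit) != 0;

    if (is_known && is_enabled == enable) {
        return false; /* redundant: the state already has this value */
    }
    cache->known |= bit;
    if (enable) {
        cache->enabled |= bit;
    } else {
        cache->enabled &= ~bit;
    }
    return true;
}
```

A cache like this is only valid per context, and only if every state change goes through it, so it is a design choice rather than a free win.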

The last tool I want to recap is the OpenGL ES Performance Detective. The Performance Detective is a standalone app giving you an easy way to do quick profiling of your OpenGL ES app. To use it, it's very simple. Just select the app you want to launch, and it will launch it on your device.

And then you want to wait until the point where the performance on your app is not what you want. So you probably want to get through your menu code, through the loading of your scene, and then dive in. And, you know, look at the frame rate, see when it's not running as quickly as you want, and trigger an investigation.

At that point, the performance detective takes over and captures a frame of your app's OpenGL ES and all the textures that you're using, all the state, et cetera. We'll then replay that frame back on the device thousands of times, each time reconfiguring the graphics pipeline in a different way. This lets us isolate particular possible bottlenecks. It's very similar to the steps an experienced OpenGL ES developer might do over the course of an afternoon. But with the OpenGL ES performance detective, it's done for you in just a couple of minutes.

Once done, it'll come back with a report telling you what the predominant performance bottlenecks in your OpenGL ES code are. In this example, it's telling us that we're fragment shading bound, and it's giving us a set of advice as to how we could address that. But I suspect the real reason you're here today is to find out what's new for OpenGL ES debugging in Xcode 4.5 and the iOS 6.0 SDK. Well, we've got a great list of features, and let's skip through the list and just start by diving into the first one.

Shader Edit and Continue. Now, you might have seen a preview of this during the graphics and games kickoff yesterday. It's a really, really great feature. Now, when you've done a frame capture within the OpenGL ES frame debugger, you can dive straight in and start editing your GLSL code.

Then you just hit the Update button, and we'll send that GLSL code back down to the device, compile it, and find out if there are any errors. You'll get the errors reported inline in your GLSL code, just like you would for your CPU code when you compile that. Hopefully, though, you won't get any errors, and instead, you'll see the effects of the shader straight away within the main render buffer view.

Now, obviously, you're often doing a sequence of changes as you're trying to tweak performance or tweak the effect you're looking for, so we save all these changes to a shader edit log, which you can find in the log navigator. At that point, you can always scan back to find a version that you'd made partway through when you think, oh, I'd really nailed it, and then, oh, I didn't like those later changes. And when you resume your app, you will see the shaders inserted back into your live running app. So you can tweak, resume, tweak, resume, instantly iterating to get the performance and the shader quality that you want.

Our second new feature that we're talking about today is the integrated OpenGL ES Expert. Now, as I was mentioning with the OpenGL ES Analyzer, the Expert is a fantastic tool for finding errors in your GL code. Well, now you don't need to step out and jump into Instruments to get the benefit of this technology. We run it for you every time you capture a frame in the debugger.

This is great for helping you catch bugs early and often. Hit frame capture, wait for the capture to complete, and any issues that existed will now appear in Xcode's issue navigator, giving you the familiar method of navigating any issues and letting you jump straight to the code. There's no need to have glGetError calls dotted all the way through your code, trying to track errors yourself. We call that every single time we capture a GL command. So you get the benefit of us automatically calling glGetError every place.

And, as I mentioned, because it's in the issue navigator, you simply click on the issue and it'll take you straight to your CPU code where you're calling GL in a way you shouldn't. In this case, for instance, we can see that it's calling glTexImage2D with some wrong parameters. It's a great way of finding errors quickly, fixing them, and getting back to what you were meaning to be working on.

Our third new feature is multi-context debugging. Now, we've heard from a lot of you that you're starting to bring the use of multiple GL contexts into your apps, particularly for having a background context for streaming in resources, predominantly textures. Well, we've now brought support for showing you these context switches back into the debug navigator. So as you see in this example, we can see the switches of context highlighted in between the commands they're calling.

It's a great way of quickly diagnosing multi-context issues and threading issues in your code. As well as that, we've extended EAGLContext, adding a new debug label property, which lets you set a text label on your context, similar to how you can already set a text label on your thread. And if you do that, we'll pick up that label and use that within the debug navigator. So you can quickly say, ah, that's my render context, that's my background texture loader.

The fourth new feature we're talking about is Save and Load Captured Frames. This is another one that a lot of you have been asking us for. Now, when you're in the frame debugger, if you want to save your capture, you can quickly do that. Save your frame capture and pick it up later. It means that, you know, if you're in a rush and you don't have time to look at this particular issue, you've got this other thing to look at, you save it for later. Or you can save it and load it back on a different device.

Now, in that case, obviously, it needs to be a compatible device. If the GL extensions aren't available on the device you're trying to replay that were when captured, it won't work. But we'll tell you about that, and it'll give you a warning to tell you, not this device, sorry. But at least you can, with any compatible device, load it and start debugging where you left off.

You can also, if you choose, send the captures to somebody else. So if you think you've found a bug and it's not in your code but in the code of, say, the guy in the office next door, you can send it to him. Or if you think the artists are using ridiculously big meshes for something that's far away, and you want to send them a frame capture to prove it to them, you can send that.

Fifth up is performance monitoring. Now, with games, performance is a first-class bug. So we've made it easy for you to always see how quick your app is running. You don't need to do anything to your app. You don't need to do anything to the settings in Xcode. Now you get an FPS tray if you're using OpenGL ES.

And we'll even highlight in red if you're running slow. And we pick up the frame rate you're telling CADisplayLink that you're trying to run. So if you're running along nicely at 30 and that's what you're aiming for, we're not going to tell you that's bad. We're not going to show that in red. You'll be good.

Best of all, if you click on the performance tray, we'll show you in the main editor window some extra GPU utilization and frame timing information. So in this case, we can see the sample I captured fully maxing out the GPU almost, 95% utilization, reflected as well in the fact that it's taking 32 milliseconds to draw each frame on the GPU.

But we can also see that we're only actually using 6 milliseconds per frame on the CPU. So we could add some really cool physics or whatever we chose to at that point. On the flip side, we also know that there's no point tuning our CPU at this point. It's not going to get us any performance improvement. We really need to start working on the GPU performance.
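The arithmetic behind that reading can be sketched in a few lines of C. This is a rough model, not how Xcode computes its numbers: it assumes CPU and GPU work for successive frames overlap, that frames are only presented on display refresh boundaries, and that the display refreshes at 60 Hz (the refresh rate is an assumption, the 32 ms and 6 ms figures are from the demo). The function names are hypothetical.

```c
#include <stdint.h>

/* Assumption: CPU and GPU work for successive frames overlap, so the
 * slower of the two sets the frame time, and frames are presented only
 * on display refresh boundaries. Times are in microseconds. */
static double vsync_limited_fps(uint32_t cpu_us, uint32_t gpu_us,
                                uint32_t refresh_hz)
{
    uint64_t frame_us = cpu_us > gpu_us ? cpu_us : gpu_us;
    /* Refresh intervals consumed per frame, rounded up (at least one). */
    uint64_t intervals = (frame_us * refresh_hz + 999999u) / 1000000u;
    if (intervals == 0) {
        intervals = 1;
    }
    return (double)refresh_hz / (double)intervals;
}

/* Returns 1 when the GPU is the longer of the two stages, else 0. */
static int is_gpu_bound(uint32_t cpu_us, uint32_t gpu_us)
{
    return gpu_us > cpu_us;
}
```

With 6 ms of CPU time and 32 ms of GPU time at 60 Hz, each frame occupies two refresh intervals, which gives 30 frames a second, and shaving the CPU time further changes nothing until the GPU time drops below one interval.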

Which leads us into our next new feature, the integrated OpenGL ES Performance Detective. We've taken the standalone OpenGL ES Performance Detective app and instead brought it fully into Xcode. So now, when you capture a frame using the frame debugger, you just need to press a button and we'll do all the performance work that was formerly performed within the Performance Detective.

As well as that, once you've got the performance report from the performance detective, you're now in the debugger with the frame for which it's given you the performance information, and you can start diving in and query it, looking at, oh, you know, it says I'm texture-bound. Well, yes, this texture is 2K by 2K, and I'm sampling it in a weird way. That's probably the issue. Or, hey, it says I'm fragment shader-bound. Let's look at my shaders. You're right there where you want to be once you've got the information.

Furthermore, we have this feature we're calling OpenGL ES Performance Analysis 2.0. We've radically reworked how the performance analysis engine behind the detective works, and now we can detect many, many more bottlenecks. A particular area of improvement is that we can now detect CPU bottlenecks within OpenGL ES. So if you're doing things within your GL commands that are causing the driver to spend a lot of time processing state or uploading textures, we'll detect that.

And we'll tell you, hey, your state validation time is maxing you out, or you're spending too much time uploading textures. This should be particularly valuable for those of you who might be using some of these 2D sprite engine middlewares, which quite often tend to be heavy on swapping state.

If it tells you there that you're state validation time bound, maybe it's time to start looking at whether you can sort by state to minimize the number of state changes and minimize the number of draw calls. We'll also, as I said, detect more GPU bottlenecks. And again, we'll detect your target frame rate from CADisplayLink to tailor the recommendations the analysis engine gives you. We're not going to tell you, hey, you're bottlenecked by F. if you're in fact running at your target frame rate.
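Sorting by state can be illustrated without any GL at all. The sketch below is an assumption about how an app might organize its own draw queue, not something the tools do for you: `DrawItem`, `state_id`, and `mesh_id` are hypothetical names, with `state_id` standing in for whatever combination of program, textures, and blend state a draw needs.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical draw record: state_id stands for a program/texture/blend
 * combination, mesh_id for the geometry drawn with it. */
typedef struct {
    int state_id;
    int mesh_id;
} DrawItem;

/* qsort comparator ordering items by ascending state_id. */
static int compare_by_state(const void *a, const void *b)
{
    const DrawItem *x = a;
    const DrawItem *y = b;
    return (x->state_id > y->state_id) - (x->state_id < y->state_id);
}

/* Counts how many times state would have to be bound when the items
 * are submitted in array order (the first item always costs one bind). */
static size_t count_state_binds(const DrawItem *items, size_t count)
{
    size_t binds = 0;
    for (size_t i = 0; i < count; ++i) {
        if (i == 0 || items[i].state_id != items[i - 1].state_id) {
            ++binds;
        }
    }
    return binds;
}

/* Reorders the queue so that items sharing a state are adjacent. */
static void sort_by_state(DrawItem *items, size_t count)
{
    qsort(items, count, sizeof items[0], compare_by_state);
}
```

A queue that alternates between two states six times costs six binds as submitted and two after sorting. Reordering like this is only safe where draw order does not affect the result, which typically means opaque, depth-tested geometry rather than blended sprites.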

And another advantage of bringing the performance analysis into Xcode is we can join it up with the GL Expert that we brought into Xcode. So now, if there are commands in your GL stream or issues in your GL command usage that are relevant to the bottlenecks that performance analysis finds, we can tell you about it.

So if it tells you you're fragment shader bound, we can say, hey, this shader is using dependent texture reads. You might want to look at that. This shader is using dynamic flow control. That might be slowing you down. You don't have to hunt for what your issue is now. It all joins together.

And that leads us on to our eighth new feature, find your slow shaders. Now, if you're shader-bound, we'll run a set of additional experiments on each of your fragment shaders to determine which is your slow shader. From there, you can jump straight to the GLSL code for that shader just by clicking on the link. And from there, you've got the Shader Edit and Continue feature that you already know. So you can start tweaking your performance, changing the shader, and seeing how that affects your performance, how it changes your scene instantly.

At this point, let's skip from talking and jump to a demo. I'll give you a tour of some of these new features. Okay, so our app's just launching on our device. Let's flip to that. As you can see here, I'm showing the light pre-pass demo that you might have seen a little during the games and graphics demo. This is a really cool deferred lighting app with the great sort of rotating fairy lights giving some really cool lighting effects.

And if I flip back to Xcode and flip to the debug navigator, you'll see here that we have our FPS. In this case, I'm running a nice, smooth 30 frames a second. And if I flip to that, you can see that at this point, I'm maxing out the device, but at 30 frames a second, so it's nicely balanced. And also see, yeah, we're hitting 32 milliseconds per frame on the GPU and 6 milliseconds per frame on the CPU.

So at this point, if we wanted to dive in more, we need to take a frame capture. So I'll just quickly flip back to the app. And there are a couple of ways I want to show you for triggering frame captures. One really great way, particularly if you're wanting to trigger at a particular point, is to add a breakpoint.

And then if you Option-click and edit the breakpoint, you can set a debug action to trigger a frame capture. And you can even use a conditional breakpoint there if you want to trigger at a very, very specific point. This is great if you're wanting to get a frame capture at a transition from one scene to another, or when you're starting some particle effect or something like that. For this part of the demo, we'll instead use the simplest method of just clicking the frame capture button here on the debug bar.

So at this point, we've paused the application, and we're pulling down all the resources from the device and recording all of the calls that we make during the sequence of the frame. And then in just a brief second, you'll see we pop up within the frame debugger. Now, the first thing I want to highlight is if we look up here in the activity monitor, it shows us that there's a couple of issues, a couple of errors, in fact. Well, let's take a look at those straight away. Errors are bad. Let's look at them. So now you see in the navigator, we've switched to the issue navigator, and it's listing two GL errors. And if I click on that, it takes me straight to the code.

And as I showed before, you know, glTexImage2D with some wrong parameters. So at this point, I could just edit the code straight away, hit stop, resume. But for now, we'll just keep going with the tour, and I'll leave the bugs for somebody else to fix.

If I flip back to the debug navigator, you'll see I can highlight the new context markers within the frame trace. There's the render context, then the background texture loader, and then a switch back to the render context. And it's also highlighting what thread I'm calling on. So in this case, you can see I'm failing with respect to my background texture loader by doing it on the same main thread as the render context. A great way of doing the background texture loading with minimal effort is to use the GLKit API for asynchronous texture loading.

Now, if I disclose any of these groups, you can see that within the groups, there are more GL calls. This is because I've used the glPushGroupMarkerEXT and glPopGroupMarkerEXT API that I mentioned earlier. It's a great way of giving that semantic detail to your frame trace.

And as I mentioned before, if I disclose any of these calls, I can jump back to the CPU code where I made any of the GL calls. This is really great for sort of, you know, you highlight, you find your error using the GL debugger, then you can dive straight back to where that bug is in your code and fix it there and then.

So down here in the variables view area, we have, as I mentioned earlier, the GL state. And we can see at this point that the depth state has been changed since the last call. And if I disclose, I can see that, for instance, the thing that has changed since the last draw call was the depth write mask.

This highlighting of what's changed since the last call is really great for figuring out, oh, this should be just the same as the last call, but this one thing's different. That could be what's wrong. And we do the same status coloring within the bound GL objects as well. So if I disclose, for instance, the vertex array object, I could go in and see what's changed about my VAO.

And I can see that I've switched to a different VAO since the last call. But in this case, it's telling me that the real difference is I've switched to a different VAO. Now, up here in the main render view, we show, as I mentioned, what the current buffers are at that point. And I can toggle between hiding and showing wireframe.

But I need to be at a draw call where I'm drawing something for that to have much of an effect. Now, if I zoom in, you can see that as I zoom in, both the color and the depth buffer zoom in in synchronization. And if I pan, both buffers pan in synchronization.

This means that if you're inspecting a part of your color buffer, trying to figure out why is this drawn here, why is that not drawn there, you're looking at the same part of each buffer in unison. I can also just flip to just show the color buffer as desired.

Now, over here in the Assistant area, as I mentioned, we show you the bound GL objects by default. But you can also flip to all GL objects, and you can even choose to filter to show only the programs. And as I mentioned before, it really makes it a lot easier if you use the debug label extension to label your programs as you wish. A great first step, for instance, for labeling your textures is just to label them with the file name. But if you want to get smarter, you can use whatever sort of semantic labels you want.

Because we're harnessing the power of the Assistant view within Xcode, you can open as many Assistant windows as you want until you run out of screen space. But as I'm sure you're all going to be picking up the new Retina MacBook Pro, you're going to have a lot of screen space to play with.

Now, a great additional view we have within the assistant is the stack view. This lets us view the CPU code at the current draw call and tracks automatically as you step through. So if I start stepping through the frame using the step markers, you can see the wireframe showing me where I am, and you can see the CPU code tracking where I am. This is great for knowing what you're doing at any point. Anyway, let's close that for now and get some more screen space back and flip back to the bound GL objects.

If I double-click on the texture view, I can see the texture in detail. And you can see in this case, I've got a texture with a lot of mipmaps. And I can choose to view a particular level as I wish. And as before, there's zoom and flip and rotate. It's really great for flipping your texture whichever way you want to see how it's coming in.

You can also directly navigate between resources using the navbar. That's a great way to flip quickly from one resource to another. If I flip, for instance, to the VAO, you can see that we've got a nice, simple VAO, in this case, drawing the fairies with just position and color bound.

The last editor I want to show you in this section is the Program Editor, where we can jump to either the fragment shader or the vertex shader. Now let's just resize this to give us a little bit more room as we're using the super big presentation font. Here you can see the program I'm using for this particular draw call, where I'm drawing the fairy lights.

If I flip back to the vertex shader view, I can start editing the shader code. And in this case, because we're drawing point sprites, you can see that here we're setting the point sprite size in GLSL code. So I can simply edit that to make the point sprites 10 times the size and update the program.

Well, here we can see instantly that I don't really know what I'm doing. I've got an error in my GLSL code. Well, it's telling me exactly what it is. Divide does not operate on int and float. So let's fix that by putting a point zero on the end to make it float by float and update again.

Now, let's flip the wireframe off. We can see our scene is instantly updated with the change from that shader. Now, the point sprites are much bigger. And I like that effect. It's kind of cool, but I think we want to make it a little smaller because it's a bit too psychedelic.

So let's flip it down to half that size and update again. That's much nicer. Well, now, if I resume the app and switch back to the iPad, you can see that those bigger fairies are in our live running app. Now I could start navigating through my app as I chose, and the effects of those shader edits are there for me. And you can edit as many different shaders as you want from change to change. It's really great because now you don't need to go through that long make-an-edit, recompile, redeploy, test cycle for shaders. You can tweak them quickly in place. Really, really quick option.

So at this point, I've exhausted the level of my manager coding ability, and it's time to turn things over to a real engineer who's going to take you through fixing some real live issues within a version of Light Prepass that I've broken for him. So at this point, I'd like to welcome on stage Michael Mayers.

Now that we have taken a tour of the features of these tools, let's go ahead and put them to good use and crush some bugs. So I'm going to go ahead and open up the problem project. In this case, we have a hobbled version of Light Prepass like we have seen before.

And let's go ahead and boot her up. I am now going to switch over to the device. So, immediately we can see we have a few issues. We see that the light orbs have a strange blending artifact and the temple's geometry is severely lacking in detail. We can also see that the frame rate is a little bit too low.

So let's go ahead and switch back to the debugger and initiate a frame capture. So I'm going to go ahead and click the frame capture button down here. So right now the debugger is retrieving the sequence of GL commands and set of resources that compose the captured frame. And once that is complete, we will be shown the frame buffer that is bound at the end of that frame.

So here we see that. Let's go ahead and set that to auto. We see that we have a depth buffer bound and it looks just fine and the color buffer looks as messed up as it does on the device. So now the question is, what next? Well, when in doubt, head over to the issue navigator, which is over here, and we see a laundry list of issues to fix and two of them stand out in red. So let's go ahead and fix one of those.

All right. It is telling us that we are passing an invalid enumeration into glEnable. And those of you that have been around the block with GL before know that this should instead be GL_BLEND. So let's go ahead and fix that. And move on to the next issue. So next we will tackle the temple geometry's missing detail. And to do that, I'm going to go back to the debug navigator.

I'm also going to bring up the Assistant view. And now we will scrub to the draw call where the temple's geometry is produced. And I'm going to employ a binary search method when I'm scrubbing. So I'm going to go ahead and scrub to the halfway mark. And now the quarter mark. Now the eighth mark. So we're pretty close. Now I'm going to step--there we go. So let's go ahead and kill the wireframe.
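The halving approach used while scrubbing is an ordinary binary search over the call index. Here is a minimal sketch in C, assuming you can answer one yes-or-no question at any scrub position: has the object already been drawn by this call? The function names and the stand-in predicate are hypothetical; in the debugger the "predicate" is simply you looking at the render buffer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Caller-supplied check: has the object already been drawn once the
 * frame is replayed up to and including call `index`? */
typedef bool (*DrawnByFn)(size_t index, void *context);

/* Returns the index of the first call for which `drawn_by` is true, or
 * `call_count` if it never is. Assumes the predicate is monotonic:
 * false for every call before the object appears, true from then on. */
static size_t first_call_drawing(size_t call_count, DrawnByFn drawn_by,
                                 void *context)
{
    size_t lo = 0;
    size_t hi = call_count; /* search window is [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (drawn_by(mid, context)) {
            hi = mid;     /* object already visible: look earlier */
        } else {
            lo = mid + 1; /* not visible yet: look later */
        }
    }
    return lo;
}

/* Stand-in predicate for experimenting: pretends the object appears at
 * the call index stored in `context`. */
static bool appears_at(size_t index, void *context)
{
    const size_t *target = context;
    return index >= *target;
}
```

For a frame of a few hundred calls this needs fewer than ten looks at the render buffer, which is why a handful of scrubs followed by a step or two is enough.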

And reveal the draw call in the debug navigator. And I'm also going to hide the depth and stencil. So I see that the engineer that programmed LightPrePass used the group marker APIs, and that is allowing me to see that the current draw call is within the Gbuffer pass. And this means we should be laying down some normals. And what I see here does not look like normals to me. It looks more like a colorized depth image. So I want to go to the Assistant Editor and bring up the Currently Bound Resources.

So I see that we have a normal map bound, which seems relevant to this normal pass. So let's go ahead and open that up and make sure it's okay. So I see that there is plenty of detail present in these normal maps and all the mipmaps are present, but I don't see any of these details present in the render buffer. So staying late in the pipeline, I'm going to switch over to the fragment shader and give myself some more room.

Okay, so nothing immediately stands out, so I'm going to go old school and use some printf-style shader debugging. And I do this by overwriting gl_FragColor with a variable of interest. And right now I'm interested in v_normal, so let's go ahead and do that. And extend it to four dimensions. And I will now update the shader.

So we see in the render buffer view that it has updated itself to reflect the changes we made in the shader. But what I see is just an unbiased version of what we had before. So I'm going to move on to the next variable and update once more. And there is no change. And this is highly suspicious. So I'm going to go ahead and set the shader back to what it was and collect some more evidence. We are going to go to the VAO view and check out the vertex stream feeding the vertex shader.

So I see that we have one, two, three, four, five vertex attributes, each having three components. And if we look a little bit closer, we see that we have the same data in each. I see 9-0-5, 9-0-5, 9-0-5. Well, this is definitely wrong. And to investigate further, I want to go down to the Variable View and open up the same VAO. I see that we have an element buffer bound and five vertex attribs, as expected. Let's go ahead and open up two of these vertex attribs.

So I see that both these attribs have the same exact setup, and this suggests that they're sourcing the same data, which is what we can see elsewhere. I can also note that, since the stride is 60 and the buffer binding is 1, we are using a VBO and these are intended to be interleaved vertices.

So I now have enough information to see what is wrong. It is the vertex attrib setup. So I'm going to go ahead and search the code for all vertex attrib setup. So I'm going to go to the search navigator and type in glVertexAttrib. We have several hits, but the one at the top stands out in particular because it has the name Temple in it. So let's go ahead and select that and give myself some more room.

Okay, so at first glance, this glVertexAttribPointer call looks just fine. I'm going to go ahead and scan around a bit. Okay, I have found a problem. I see that we are setting the vertex stride twice but not the vertex offset, and this is because of a copy and paste bug. Well, I'm going to go ahead and fix this copy and paste bug with another copy and paste.
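
To make the stride-versus-offset distinction concrete, here is a minimal sketch of an interleaved layout with five three-component float attributes, which gives the 60-byte stride seen in the demo. The struct, the attribute names, and the helper are all hypothetical; the demo's actual source is not shown in the session. Every attribute shares the same stride, and only the offset differs, so a missing offset makes all five read the same data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interleaved vertex: 5 attributes x 3 floats x 4 bytes = 60. */
typedef struct {
    float position[3];
    float normal[3];
    float tangent[3];
    float bitangent[3];
    float color[3];
} TempleVertex;

/* What each attribute needs: the values that would be passed as the size,
   stride, and pointer arguments of glVertexAttribPointer when a VBO is bound. */
typedef struct {
    int    components;
    size_t stride;
    size_t offset;
} AttribLayout;

enum { kTempleAttribCount = 5 };

static void temple_attrib_layout(AttribLayout out[kTempleAttribCount])
{
    const size_t offsets[kTempleAttribCount] = {
        offsetof(TempleVertex, position),
        offsetof(TempleVertex, normal),
        offsetof(TempleVertex, tangent),
        offsetof(TempleVertex, bitangent),
        offsetof(TempleVertex, color),
    };
    assert(out != NULL);
    assert(sizeof(TempleVertex) == 60);
    for (int i = 0; i < kTempleAttribCount; i++) {
        out[i].components = 3;
        out[i].stride = sizeof(TempleVertex); /* same for every attribute */
        out[i].offset = offsets[i];           /* unique per attribute */
    }
}
```

In a real app each entry would feed a call of the form `glVertexAttribPointer(i, 3, GL_FLOAT, GL_FALSE, (GLsizei)stride, (const GLvoid *)offset)`. Deriving offsets with `offsetof` rather than typing them by hand removes one opportunity for the copy and paste mistake.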

And now that we have fixed two issues, I think it's time to rerun the application and see where we stand. I'm going to go ahead and switch over to the device. So, we have cleansed the artifacts, but the frame rate is still running a bit slow. So let's go ahead and switch back to the debugger and start another frame capture.

So again, the debugger is retrieving the sequence of commands and set of resources that make up the frame. And that last bug was a little bit tough, so this time I think we're going to go to the issue navigator and pick off something that stands out. So I am liking the logical buffer load. I'm going to go ahead and click on that.

And it is telling us that we are not clearing a portion of the frame buffer. And this means that we are incurring some additional memory traffic because the hardware's tile renderer needs to reload the previous contents of a given tile. If you instead clear that, it can initialize it with a constant value.

This is happening, of course, because of another copy and paste bug. We are clearing color, depth, and depth, not color, depth, and stencil. So let's go ahead and fix that. And we will rerun the application once more. Let's switch over to the device. Okay, we did not quite get back the frame rate I was hoping for with that. So let's go ahead and switch back to the debugger and break out the big gun.
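
The clear-mask mistake is easy to see in isolation. The sketch below is an illustration rather than the demo's code: the three bit values match the OpenGL ES headers and are redefined only so the example builds without the SDK, and the helper names are hypothetical.

```c
#include <assert.h>

/* Same values as the OpenGL ES headers; defined here only so the
   sketch compiles without the SDK. */
#ifndef GL_COLOR_BUFFER_BIT
#define GL_DEPTH_BUFFER_BIT   0x00000100
#define GL_STENCIL_BUFFER_BIT 0x00000400
#define GL_COLOR_BUFFER_BIT   0x00004000
#endif

typedef unsigned int ClearMask;

/* The copy and paste bug: depth is listed twice and stencil is missing. */
static ClearMask buggy_clear_mask(void)
{
    return GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT | GL_DEPTH_BUFFER_BIT;
}

/* The fix: every attachment of the framebuffer is cleared. */
static ClearMask fixed_clear_mask(void)
{
    return GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT | GL_STENCIL_BUFFER_BIT;
}

/* Returns the attachments in `attached` that `mask` leaves uncleared,
   i.e. the ones a tile-based GPU would have to load back from memory. */
static ClearMask uncleared_bits(ClearMask attached, ClearMask mask)
{
    return attached & ~mask;
}
```

Because OR-ing the same bit twice is harmless, the buggy call compiles and runs without any GL error, which is why it takes a tool like the issue navigator to notice it. In an app, the fixed mask would simply be passed to `glClear`.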

So I am going to go over to the debug navigator. And we see that with the application running live that the debug navigator will have this performance tray, which is currently showing us the frame rate. And if I click on that, we can see a coarse breakdown of where the application is spending its time. And down here, we will see the big gun, the Analyze Performance button. So let's go ahead and click on that.

So like the debugger, the performance analyzer needs to retrieve the frame. But once that is complete, it will run an extensive battery of experiments on that frame repeatedly so that you don't have to. And with each iteration, it will modify the experiment slightly to collect another data point. Once all those data points have been collected, it will run them through a series of heuristics modeled after an experienced graphics developer.

And that in turn will produce a list of issues that are affecting your performance. So let's go ahead and give the analyzer a second. Okay, so we have some results now. It is telling us that the performance is limited by the cost of fragment shader processing. And this could mean that we have too many fragments to shade or that the cost of each fragment is simply too great.

Give it one more second. Okay. The analysis is complete and it's telling us that we are bound by the fragment shader bound to program number three. And if I click on that, we will be taken to the offending shader. And I see here that we are using a software shadow filtering fallback path, and we should really be using hardware since we know that this device supports it.

Now I'd like to make a note that ideally this is something that should be done at startup dynamically based on some device selection method. And if you need help with that, head on down to the OpenGL ES lab and an Apple engineer will be happy to show you how. For now, let's go ahead and update this program. And the performance analyzer is telling me that the analysis it just ran is now invalidated since we've modified the shader. And that to get new analysis, we should rerun the analyzer.
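
One possible shape for that startup check is to test the extension string once and pick the shader path accordingly. This is a sketch under assumptions: the session does not name the extension, so `GL_EXT_shadow_samplers` is a guess at the relevant one, and `has_extension` is a hypothetical helper. It matches whole tokens so that one extension name cannot match as a prefix of another.

```c
#include <assert.h>
#include <string.h>

/* Returns nonzero if `name` appears as a complete, space-delimited
   token in the extension list `extensions`. */
static int has_extension(const char *extensions, const char *name)
{
    assert(extensions != NULL && name != NULL);
    size_t len = strlen(name);
    if (len == 0)
        return 0;

    const char *p = extensions;
    while ((p = strstr(p, name)) != NULL) {
        int starts_token = (p == extensions) || (p[-1] == ' ');
        int ends_token   = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return 1;
        p += len;  /* partial match only: keep scanning */
    }
    return 0;
}
```

In an app the list would come from `(const char *)glGetString(GL_EXTENSIONS)` after the context is current, for example `has_extension(list, "GL_EXT_shadow_samplers")`, with the result cached and used to choose between the hardware and software shadow shaders.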

But we're not going to do that right now. We're just going to go ahead and continue to see where we stand. Okay. And we are at 30 frames per second. The artifacts have been cleansed. And with that, my job is complete. So I'm going to go ahead and hand the torch back to Seth.

So finally, just a few workflow recommendations to help you get the best out of these tools. Firstly, always have a spare USB cable handy.

Secondly, the issue navigator, as we showed you, will find all the GL issues for you. A plethora of issues, both errors and warnings. Fix them early, just like you would do compiler errors and compiler warnings. And remember as well, any GL errors can leave GL in an undefined state. So you might think that you've got great performance, but then when you go and fix the one GL error that's causing some rendering artifacts, you might suddenly be running at half the speed.

So, you know, it pays to fix your GL issues early. Remember to annotate your frame with the push group marker and pop group marker calls. It makes it a lot easier to navigate your frame within the debug navigator. Really, you know, the investment you put in, just putting a few simple annotations in there, probably putting whatever your mesh names are, which you probably have from your scene graph, into the debug navigator makes it so much easier to traverse. And similarly, label your resources with glLabelObjectEXT.

Shader edit and continue, as we talked about earlier, is great for letting you tweak your shaders in place. So it's really great for making those performance versus quality tradeoffs.
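
A minimal sketch of how a frame might be annotated follows. The two marker functions here are stand-ins so the example builds without the iOS SDK; they mirror the signatures of `glPushGroupMarkerEXT(GLsizei length, const GLchar *marker)` and `glPopGroupMarkerEXT(void)` from the GL_EXT_debug_marker extension and only record nesting. The pass and mesh names are illustrative.

```c
#include <assert.h>
#include <string.h>

/* Stand-in state so the nesting can be inspected without a GL context. */
static int  marker_depth = 0;
static int  marker_max_depth = 0;
static char marker_trace[128] = "";

/* Stand-in for glPushGroupMarkerEXT: records the marker name. */
static void push_group_marker(int length, const char *marker)
{
    (void)length;  /* 0 means the marker string is null-terminated */
    strncat(marker_trace, marker, sizeof marker_trace - strlen(marker_trace) - 1);
    strncat(marker_trace, ">", sizeof marker_trace - strlen(marker_trace) - 1);
    marker_depth++;
    if (marker_depth > marker_max_depth)
        marker_max_depth = marker_depth;
}

/* Stand-in for glPopGroupMarkerEXT: every push needs a matching pop. */
static void pop_group_marker(void)
{
    assert(marker_depth > 0);
    marker_depth--;
}

/* Hypothetical frame: one group per pass, one nested group per mesh. */
static void render_frame(void)
{
    push_group_marker(0, "G-buffer pass");
    push_group_marker(0, "Temple");
    /* draw calls for the temple mesh would go here */
    pop_group_marker();
    pop_group_marker();

    push_group_marker(0, "Lighting pass");
    /* light volume draw calls would go here */
    pop_group_marker();
}
```

In a real app the stand-ins would be replaced by the EXT calls directly, and resources could be named in the same spirit with `glLabelObjectEXT` from GL_EXT_debug_label.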

Once you've got the effect you want looking well, you can use it to quickly tweak precision level or how many taps you're taking if you're using some funky filtering or whatever. So, yeah, really useful for letting you do that in place without having to continually rerun. And finally, use the integrated performance analysis to find your bottlenecks. You know, do that.

Don't just assume you know why you're slow. There's this great tool built right in Xcode now that will tell you what your GPU or GL performance bottlenecks are. And remember to think about performance early. Now, you know, premature optimization is the root of all evil. But on the flip side, if you're trying to tune gameplay in a game and it's running at 10 frames a second, it's really hard to get the gameplay balance you want. So, you know, tune your performance early.

For more information on the tools, ping the technology evangelists. Both Allan Schaffer and Mike Jurewitz can help you. And remember, the developer forums are a great source of information as well. So, in summary, great OpenGL ES developer tools that just got a lot, lot better in Xcode 4.5. So please harness them to make great games. Thank you.