Graphics, Media, and Games • iOS • 54:12
OpenGL ES provides access to the stunning graphics power of iPhone, iPad, and iPod touch. See how you can tap into the latest advances in OpenGL ES for iOS 5 and harness the programmable pipeline enabled by OpenGL ES 2.0. Get introduced to the new GL Kit framework and learn how your apps can take advantage of its built-in features and effects.
Speakers: Gokhan Avkarogullari, Eric Sunalp
Unlisted on Apple Developer site
Transcript
This transcript was generated using Whisper; it may contain transcription errors.
Welcome to the Advances in OpenGL ES for iOS 5 session. My name is Gokhan Avkarogullari, and Eric Sunalp will join me on stage to talk about some cool new features. Every year we release a new version of iOS, and with that release we make new features available. Two years ago we made OpenGL ES 2.0 available on the iOS platform. Last year we did it with MSAA, discard, depth textures, and such. But sometimes releasing new hardware makes things possible. This year we released iPad 2, and with iPad 2 you get a lot more graphics horsepower, a lot more CPU horsepower, and a lot more memory bandwidth, which enables you to do new things, or do old things in a new and improved way.
I'm going to talk about an application on the App Store that took advantage of iPad 2's processing power. They basically changed their pipeline to do something specific on iPad 2 that makes it look much nicer relative to iPad 1 and other devices. That's Real Racing 2 HD from Firemint. I'm going to walk through a few examples of what they did on iPad 2.
The dashboard in this scene, as you can see, takes up almost one third of the screen space, and it's very visible to the player. What they did on the dashboard is basically use normal maps and per-pixel lighting. That takes advantage of the compute power of iPad 2, and the memory bandwidth as well.
On the asphalt, they used multiple textures -- light maps, gloss maps, specular maps -- basically taking advantage of iPad 2's very large memory bandwidth. And they implemented dynamic shadows on the dashboard that depend on where the sun is and where the car is.
And MSAA. If you could change only one thing in your app on iPad 2, you should probably enable MSAA and see how it works. It should probably just work fine, and you get an instant image quality enhancement. You should do more, of course, but if you can do only one thing, do MSAA.
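For intuition about what enabling MSAA buys you, here's a small conceptual sketch (Python, not GL code) of what a 4x multisample resolve does: each pixel's coverage samples are averaged, which is what smooths polygon edges.

```python
# Conceptual sketch (not GL code): a 4x MSAA resolve averages the
# per-pixel coverage samples, smoothing polygon edges.

def resolve_msaa(samples):
    """Average a pixel's sub-samples; each sample is an (r, g, b) tuple."""
    n = len(samples)
    return tuple(sum(channel) / n for channel in zip(*samples))

# An edge pixel half-covered by a white triangle over a black background:
edge_pixel = [(1.0, 1.0, 1.0), (1.0, 1.0, 1.0),
              (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
print(resolve_msaa(edge_pixel))  # a 50% grey: (0.5, 0.5, 0.5)
```

The hardware does this resolve for you; the sketch only shows why edges come out blended rather than jagged.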
And finally, with iPad 2's processing power, they were able to push a lot more polygons to the device. They're using much higher-poly models in the game. Okay, so that was what the hardware gives you. But we also added some software features. We're going to talk about GLKit, a new framework that we are introducing with iOS 5, and the new features that we enabled through OpenGL ES extensions. So let's start with GLKit.
Before I start with GLKit, I want to talk about using OpenGL ES 2.0 in your games, and I want to use an example again here. Brad Larson has an application called Molecules that has been on the App Store for some time and was using ES 1.1 as its rendering backend. This was the kind of rendering he was getting using ES 1.1. When he set out to improve it, he decided to use OpenGL ES 2.0 and the power that its programmability gives you. The end result, as you can see on the right side, is a much better looking model -- in this case a molecule visualization, and visualization is very important for this application. Through multiple passes and the programmability of ES 2.0, he gets ambient occlusion, per-pixel lighting, and great speculars, as you can see. So these are the kinds of things you can do with OpenGL ES 2.0, and when we set out to build GLKit, that was one of our goals: to make these kinds of things possible. We had two primary goals. The first is making life easier for you, the developers. The best way to do that is to find problems that are common to every one of you and then provide solutions for those. The second goal was to encourage a unique look for each graphics application.
The ES 1.1 fixed-function pipeline is very powerful, but it's limited in the ways it can do rendering. It does Blinn-Phong shading, for example, and textures and such. You can differentiate through textures, but it kind of limits what you can do. So the idea is to allow you to use shaders, but we also understand that you have a lot of investment in ES 1.1,
and we wanted to help you port your ES 1.1 applications to 2.0 and make it easier. So these are the two goals of GLKit: making life easier through common solutions, and allowing you to easily port your games from ES 1.1 to 2.0. It has four subparts. I'm going to talk first about the texture loader, just briefly. The idea of the texture loader is that you give us a reference -- whether it's on the network or on the file system -- and you get an OpenGL ES texture object back. So you don't need to deal with Image I/O, libpng, and such.
The second subpart is the view and view controller. We looked at our template, and there are hundreds of lines of code to set up FBOs, with multisampling and without, and to set up the display links and such. We thought we could make it much simpler by giving you a UIKit-style view and view controller that also work really well with the UIKit hierarchy.
The third part is a 3D graphics math library. OpenGL ES 2.0 lacks the transform API, the matrix stack API, of ES 1.1, and we've also seen a lot of people implementing vector and matrix libraries themselves. The GLKMath library gives you all of that: the functionality of OpenGL ES 1.1 that you can reuse on OpenGL ES 2.0, plus vector types and matrix stacks.
And finally, GLK effects. Basically, GLK effects are fixed-function pipeline features implemented in an ES 2.0 context. I'm going to have Eric come over here and talk about some of these. Thank you. Thanks, Gokhan. All right, hi, everyone. So we're going to start with the GLKTextureLoader. Like Gokhan said, we want to make texture loading as simple as possible. You give us a file reference, we decode the image, load it, and create a GL texture out of it. There's pretty much just one API call to get this thing loaded, so you don't have to worry about setting all your OpenGL state. We want to make this as simple as possible. And like he said, there are a number of common formats we support -- PNG, JPEG, TIFF --
pretty much every format that Image I/O supports. So no longer do you have to load a CGImage, decode it, blit it into a CGImage context, and get that data back. It's pretty much just one API call now. One of the biggest problems developers have had is dealing with non-premultiplied data. Using our library, you can guarantee the image won't be premultiplied, which is a huge thing. We have texture 2D support, cube map support, and a number of convenient loading options. If you want to force premultiplication, that's an option you can set. We'll Y-flip the image for you if you want, to put it in the native GL orientation, and you can turn mipmap generation on or off.
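To illustrate the premultiplied-alpha issue just mentioned: premultiplying scales each color channel by the alpha value up front, which also changes how the data must be blended (roughly, a source blend factor of GL_ONE instead of GL_SRC_ALPHA). A tiny sketch:

```python
# Sketch of what "premultiplied alpha" means: each color channel is
# scaled by the alpha value up front, which changes how you must blend.

def premultiply(r, g, b, a):
    return (r * a, g * a, b * a, a)

# A half-transparent pure red in straight (non-premultiplied) form:
straight = (1.0, 0.0, 0.0, 0.5)
print(premultiply(*straight))  # (0.5, 0.0, 0.0, 0.5)
```

Knowing which form you have matters because blending premultiplied data with a straight-alpha blend function (or vice versa) gives visibly wrong edges.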
Basic usage is really simple. You set your EAGLContext current and make our single call to one of the class's variant loading methods, and you get back this GLKTextureInfo object, which has all the pertinent information you need to do your rendering: most importantly the texture name that you can bind with, the width and height of the image, the alpha state (whether there was alpha, and whether it was premultiplied or not), the original origin of the image, and whether we mipmapped it for you or not.
Let's run through a little code example. As you can see, it's just a couple of lines of code. We set the context current and load a path -- a pretty canonical way of loading your image. Here I'm setting just one option entry in the dictionary, which is generate mipmaps, probably pretty common. Then we make our single class method call to load the 2D texture. If that call fails, it will return nil, and you also get back an error you can check to see what went wrong. Then we assume you do some work, and there you go: there's a texture name to bind with, and that's pretty much all it takes to load a texture.
In addition to the synchronous usage case, we have an asynchronous usage case, to take advantage of multi-core devices like iPad 2. It's a very similar usage pattern, but in this case, instead of using one of the class methods, you actually allocate a GLKTextureLoader object and give us your context's sharegroup. Then you call one of the instance methods and provide us with a GCD queue and a completion block. The GCD queue is where that completion block will be called. Through the completion block, you get back a GLKTextureInfo just like in the synchronous case, and an error if there was one while loading. The code example for that case is the same thing: load the path. Here we get the sharegroup for your context and give that to us when we create the GLKTextureLoader object.
And there it is. Just make a single call, give us the path and whatever your completion block handler is. In this example I'm just informing my app, "Hey, this thing has completed, and here's the texture info I'm going to need." So, really simple.
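The asynchronous completion-block pattern described above can be sketched, transposed to Python. `load_texture`, `TextureInfo`, and the callback shape here are stand-ins for illustration, not the GLKit API:

```python
# Sketch of the asynchronous-loading pattern: a worker decodes the
# "texture" off the calling thread, and a completion callback receives
# either the result or an error. All names here are illustrative.
from concurrent.futures import ThreadPoolExecutor

class TextureInfo:
    def __init__(self, name, width, height):
        self.name, self.width, self.height = name, width, height

def load_texture(path):
    # Stand-in for decoding an image and creating a GL texture.
    return TextureInfo(name=1, width=256, height=256)

def load_texture_async(path, completion):
    pool = ThreadPoolExecutor(max_workers=1)
    def work():
        try:
            completion(load_texture(path), None)
        except Exception as error:
            completion(None, error)
    pool.submit(work)
    pool.shutdown(wait=True)  # in real code you would not block here

results = []
load_texture_async("brick.png", lambda info, err: results.append((info, err)))
print(results[0][0].width)  # 256
```

The GLKit version differs in that the completion block is dispatched to the GCD queue you supplied, and the loader shares GL objects via the sharegroup you passed in.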
The next thing we're really proud of is the GLKView and GLKViewController. Like Gokhan said, you saw this on our template -- it was probably a pain to deal with. Hundreds of lines of code. I don't think MSAA was there by default; it was something you had to code for, among other things.
So it's pretty much everything you need to get OpenGL ES in a view and on screen. It's a UIView subclass, so it fits into the UIKit model really well. It responds to setNeedsDisplay if you want it to, and behaves pretty much like a UIView in all of its associated methods. The drawing is a little bit different: you can set a delegate, or you can subclass the class for drawing. There are a number of things it handles automatically, which were previously only partially done in the template for you: creation and deletion of the color, depth, and stencil buffers, and turning MSAA on or off. And as a performance win, we'll do the discard for you automatically. Pretty much every time you draw, before you draw, you always set your context and your FBO current, and present the drawable afterwards -- so we also do that for you. This is even more similar to UIView in that if you subclass this thing, you implement drawRect:, and setting the context, setting the FBO current, and presenting afterwards are things you don't have to think about, just like in UIView. It's all done for you. Additionally, we have snapshot support that returns a UIImage if you want to use that. All right, the next thing is the GLKViewController, a subclass of UIViewController, so it fits into the view controller model nicely. The main thing here is that it's coupled very tightly with the GLKView, and it handles the redrawing of the view. The main value you'd probably set most often, unless you want the default, is the preferred frames per second.
The preferred frames per second specifies a value, maybe 30 or 60, that we'll try to get as close to as possible on the display your view currently resides on, without going over the display's refresh rate or your preferred value. Additionally, you can pause and resume any time you like. When your app goes into the background, we'll automatically pause for you, and when it comes into the foreground, it will resume for you -- and that can be disabled if you want to do your own custom work. Also, to help separate your scene logic, or maybe your physics logic, we provide an update method that stays in sync with draw, so you can keep those two things separated. Basically, your update gets called, and then your draw gets called right afterwards. And we provide a number of statistics you can query, such as the number of frames displayed. All right, here's a code example. This is the simplest case, just using the delegate. Here I've got a view controller that I assume I loaded from a XIB file.
Here I'm going to query for the GLKView out of that. If one wasn't set up in the XIB file, it will be automatically created for you, just like with all other view controllers. So we get back the GLKView. And here I have this third class called Game, which I'm assuming does all my scene rendering. I basically just need to assign it to the GLKView as a delegate and give the GLKView my context. Then I have a number of drawable formats I'm setting as an example. RGBA 8888 is basically the default, so if you didn't want depth and you didn't want multisample, you really wouldn't have to set any of these things -- you would just have to set your delegate and context. Next I set the same Game object as the view controller's delegate because I want to make use of the update method, and I set a preferred frames per second of 30, which again is also the default.
Okay. So here, this would be the Game class. Pretty much all I would have to implement in that class is glkViewControllerUpdate: and glkView:drawInRect:. Next we'll look at the subclassing case -- and, really quickly: when you get drawInRect:, and this is also true for the subclass case, the rect will be in points, not pixels, so you can use your content scale factor to figure that out. For the subclassing case, you just override the common method you're used to, drawRect:. Even if you set this thing as a delegate, the subclass's drawRect: is the one that gets called, and then you'd still implement the view controller update.
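One plausible way to read the "as close as possible without going over" behavior of preferred frames per second is snapping to whole vsync intervals of the display's refresh rate. The exact policy isn't spelled out in the session, so this sketch is only illustrative:

```python
# Hedged sketch of how a preferred frame rate might be snapped to the
# display's refresh rate without going over it. The real GLKit policy
# may differ; this just illustrates the idea of whole vsync intervals.
import math

def actual_fps(preferred, refresh=60):
    interval = max(1, math.ceil(refresh / preferred))  # whole vsync intervals
    return refresh / interval

print(actual_fps(30))  # 30.0
print(actual_fps(45))  # 30.0  (45 isn't reachable without exceeding it)
print(actual_fps(60))  # 60.0
```

The point is that on a 60 Hz display you effectively get 60, 30, 20, 15, ... -- whichever divisor gets closest to your preference without exceeding it.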
All right, next is GLKMath. This is a 3D graphics math library. We have over 175 functions that you don't have to implement. We have a number of common types: 4x4 and 3x3 matrices, 4-, 3-, and 2-component vector types, and even a quaternion type. The idea is that you don't have to write this for ES 2.0. But when we were looking at what you would have to do to port from ES 1.1 to 2.0, you have a lot of existing code that already deals with the ES 1.1 matrix stacks and such. So we additionally added matrix stack functionality, so that your port would be almost one-to-one with all the equivalent ES 1.1 math functions. And we tried to make it as high performance as possible: functions are inlined where possible, and we make use of all the great hardware in our devices. So here's a slide to impress upon you the number of functions we have. It should really help out -- and it's kind of hard to see.
So here's one example. This would be maybe some simple ES 1.1 code you'd have, and this is what it would look like in GLKit. These are Core Foundation types: I'm creating a projection stack and a model view stack. I'm using a GLKMatrix4 type to create a frustum matrix and loading that onto my projection matrix stack. Then, on my model view stack, I load an identity and do a number of equivalent ES 1.1 transforms: translate, rotate, and scale. And yeah, pretty simple. That's it for the math. I'll hand it over to Gokhan now to talk about GLK effects.
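The push/load-identity/translate flow of an ES 1.1-style matrix stack, as in the example above, can be sketched like this. This is a minimal Python model of the idea, nothing like the optimized GLKMath implementation, and only translate is implemented:

```python
# Minimal sketch of a matrix stack like the one GLKMath provides.
# Matrices are 4x4 row-major lists of lists; only translate is shown.

def identity():
    return [[float(i == j) for j in range(4)] for i in range(4)]

def multiply(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(tx, ty, tz):
    m = identity()
    m[0][3], m[1][3], m[2][3] = tx, ty, tz
    return m

class MatrixStack:
    def __init__(self):
        self.stack = [identity()]
    def push(self):  # duplicate the top, like glPushMatrix
        self.stack.append([row[:] for row in self.stack[-1]])
    def pop(self):
        self.stack.pop()
    def translate(self, tx, ty, tz):
        self.stack[-1] = multiply(self.stack[-1], translation(tx, ty, tz))
    def top(self):
        return self.stack[-1]

stack = MatrixStack()
stack.push()
stack.translate(1.0, 2.0, 3.0)
print(stack.top()[0][3], stack.top()[1][3], stack.top()[2][3])  # 1.0 2.0 3.0
stack.pop()
print(stack.top() == identity())  # True
```

Push/pop restoring the previous transform is the property that makes porting hierarchical ES 1.1 scene code nearly one-to-one.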
Thanks, Eric. So GLK effects, as I said before, are basically a re-implementation of fixed-function pipeline features within the ES 2.0 context. You get great visual effects with minimal effort: you don't have to write shaders for those, and you don't have to create probably hundreds of different shaders depending on your state. It's a great way to go from 1.1 to 2.0. And a main goal is that it's interoperable with custom OpenGL ES 2.0 shaders. You can do part of your rendering with GLK effects, and if you have something specific to your application that uses a special kind of shader, like a particle effect or a GPU-skinned object, you can still use your own ES 2.0 shaders and basically mix and match them. We came up with three named effect classes: the base effect, the reflection map effect, and the skybox effect, and I'll talk about each one of them in a second. The general architecture and usage is very simple. You configure your vertex state -- this is standard OpenGL ES 2.0 vertex state setup. The only thing here is that you use the predefined GLKVertexAttrib names so that we can find out which vertex attributes you specified and what they correspond to in our shaders. Then you allocate your effect class instance -- the base effect or the reflection map effect -- and configure the effect parameters.
You set some parameters, and once you're done configuring your effect, you call prepareToDraw. What we're doing here is not generating internal shaders every time you change a state; we want all your state to be set up, all your finalization to be done. At that point we'll set the GL state, generate or reuse the necessary shaders, and then we're ready to go for your rendering. You bind your VAO -- as a best practice, you should be using VAOs -- and then you make a draw call, either glDrawArrays or glDrawElements.
So that's how it works. It's very simple. Let's talk about the base effect. The base effect captures most of the ES 1.1 functionality. It does things like lighting with multiple lights, material properties and light properties you can set for each light, multi-texturing, fog, constant color, and the transformations you can do on your objects. The things it doesn't cover from ES 1.1 are basically texture combiners and clip planes.
And here's an example. This part is how you set up your vertex state; it's no different from any other ES 2.0 vertex state setup. In this case, you're generating a VAO and then your VBOs, and putting data into your VBOs. The only point here is that when you pass a vertex attribute pointer, you use the names that are specific to GLKit -- in this case, GLKVertexAttribPosition and GLKVertexAttribNormal when you set up your position and normal arrays.
The second part is allocating and initializing your effect. You get a base effect here, and then you start setting your parameters. In this example, we actually went a little further than what ES 1.1 did: we also support per-pixel lighting. So you set the lighting type to per-pixel and enable a light. We tried to keep as close as possible to ES 1.1, so all our light parameters and material parameters are very similar to ES 1.1, and so are the default values for those parameters. If you don't set anything, you basically get the default ES 1.1 state. But if you're going to change it, this is the place to change it. You set your material properties. In this case, we enabled the light, set a diffuse color, and set a shininess. Then we're ready to draw, so we let the effect know by calling the prepareToDraw API. At this point, all the state is settled and all the necessary shaders are generated. We try to generate optimal shaders, so if things are not enabled, there's no shader code running for the non-enabled stuff. You bind your VAO and make a draw call, and you get something drawn, on the right side here -- per-pixel lit. The skybox effect is a little different. It provides a scene skybox for your application.
You can think of it as a 360-degree background image. It has very few parameters: you can change the center of your skybox or resize it. Since it's a skybox, there's no actual vertex array state initialization -- there's no requirement for that. Here's a code example. We allocate and initialize our skybox. We need to give it a cube map, and here we're actually using the texture loader to load the cube map, so that's just one line of code. We call prepareToDraw so our skybox is ready to be rendered, and then we can call the draw API. The second difference is that the skybox has its own draw API, because we didn't set up the vertex state. So you have to call the draw API on the skybox.
The reflection map effect builds on the base effect -- we added reflections to the base effect. Everything you had in the base effect, like lighting, materials, and multi-texturing, is still available to you; it's basically a subclass of the base effect, and you can set those as you did before. On top of that, it adds reflection mapping: cube-map-style environment mapping. We basically followed the desktop OpenGL 2.1 specification, tried to stay as close as possible to it, and used a cube map for the reflection samples.
The initialization of the vertex array object is no different from the base effect, so in this example we're not covering it. And we create our effect. The difference here from the base effect, when we set the parameters, is that you also need to assign a cube map texture for the reflections. Now, a reflection on an object is independent of the position of the viewer. So if you pass only the model view matrix to the reflection effect, it won't be able to figure out how to create the reflections. If you think about a shiny car next to another car, it doesn't really matter from where you're looking at the shiny car --
the reflections are always in the same place on the car. Therefore, when you construct your model view matrix, you need to decompose it into the model part and the view part. For the base effect part of the reflection map effect, you pass the entire model view matrix to the base effect implementation. But you also pass the transpose of the model matrix to the reflection map effect, so that it can calculate the reflections correctly.
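The reflection lookup the effect ultimately performs is standard vector math: reflect the incident (eye-to-surface) direction about the surface normal, and use the resulting direction to sample the cube map. A sketch, with illustrative names:

```python
# Sketch of a reflection-map lookup direction: reflect the incident
# vector about the surface normal; the result indexes the cube map.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def reflect(incident, normal):
    """R = I - 2(N.I)N, with normal assumed unit-length."""
    d = dot(normal, incident)
    return tuple(i - 2.0 * d * n for i, n in zip(incident, normal))

# Looking straight down at an upward-facing surface bounces straight up:
print(reflect((0.0, -1.0, 0.0), (0.0, 1.0, 0.0)))  # (0.0, 1.0, 0.0)
```

This is why the effect needs the model matrix separately: the reflected direction has to be expressed in the cube map's (world) space, not in eye space.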
As with the base effect, all you need to do, once you've set the parameters, passed the cube map, and given the transpose of the model matrix to the reflection map, is call prepareToDraw. Everything is ready to go; you bind your VAO and then call glDrawArrays or glDrawElements, and you get your rendering. So we have a technical demo for this. I say technical because there's not much artistic stuff in here, as you can see. We have a background image -- a background color, really -- and a sphere that doesn't look like a sphere because it's not lit here. We got this skybox from Emil Persson, also known as Humus. You can see it's a 360-degree background image, if you can think of it that way.
The skybox here is loaded by the texture loader and rendered by the skybox effect. The object in the front is a sphere, and we can texture it -- you can see that it is a sphere when it has a texture; it just shows it more. And we can do per-vertex lighting. This is basically the base effect: all you do is set the light position and light parameters, enable the light, and then prepareToDraw and draw.
Or we can use the reflection map effect and add reflections to it. You can see that it is reflecting the skybox that is around the sphere. And it's basically just two lines of code, really: setting the same skybox cube map as the cube map for the reflection map effect, then setting the transpose of the model matrix on the reflection map, and going from there. One final thing: both the reflection map effect and the base effect support per-pixel lighting, so we can enable per-pixel lighting. It's very hard to see in this one because there are no strong speculars, but you do get per-pixel lighting. Okay. So that's it for our demo.
So as you can see, GLKit tries to make your life easier. The texture loader library is a way of loading your textures without dealing with the lower-level operations of loading files or going to the network. It can load from a CGImage; it can actually load from NSData in memory. And it can work asynchronously, so on iPad 2 you can take advantage of the second core and load two textures at the same time. The view and view controller are first-class citizens of the UIKit hierarchy -- they work with tab controllers, navigation controllers, and such -- and they hide all the complexity of FBOs from your application. You can easily enable multisampling, for example, with a single line of code, or disable it depending on performance. And the math library is a 3D graphics math library; you can even use it on ES 1.1 for vector and matrix operations.
Or if you're using ES 2.0, you don't have to rewrite it -- it's a pretty optimized implementation. And finally, the effects allow you to port your game from ES 1.1 to ES 2.0. You have the same kind of capabilities: you can port your game using GLK effects, use your own special shaders in ES 2.0 for whatever you want to do -- particle effects and things like that -- and get much better visuals without having to go and create an entire system of shaders and shader stitching. Okay. So that was GLKit. We also have other features that we implemented through OpenGL ES extensions, and we identified three categories. First, we got a lot of requests for enhancing the OpenGL ES and AV Foundation interaction, so we tried to solve the issues with that usage model. The second set of extensions helps you enhance image quality in your graphics applications. And the third category is the extensions and new features that help you improve the performance of your graphics application.
So let's start with enhancing the OpenGL ES-AV Foundation interaction. What we set out to do was create a direct path from the video subsystem to graphics, and the opposite as well, from graphics to video. What I mean by a direct path is better understood when contrasted with iOS 4. In iOS 4, to get your captured video data into OpenGL -- and in this example, I'm assuming you're trying to convert it to ARGB -- you would have your capture session, with its AV Foundation-created buffer pools, and you would capture your data into that, using image buffers. Then you would transform from YUV space to ARGB space. And then, to pass your data along, you would call glTexImage2D, and we would copy your data at that point from your storage to our storage inside the OpenGL API. So you would have this extra copy happening on the OpenGL side. The same applies for YUV data: you could use the two different planes and create a texture from each one of them, but at the end of the day, both of those textures have to be copied over.
On iOS 5, we're eliminating that second memcpy. If it is YUV data, you can get each one of the textures without any copies, and if it's ARGB-transformed data, you can get the ARGB into OpenGL without any copies as well. That improves your performance, and it makes the CPU available to do something else, something interesting. The way we did it was with two changes. On the OpenGL side, we introduced R and RG -- one-component and two-component -- textures and render targets.
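When a shader samples the luma (R) and interleaved chroma (RG) planes directly, it has to do the YUV-to-RGB conversion itself. Here's a sketch using BT.601 full-range coefficients; the actual matrix depends on the pixel format your capture session delivers, so treat the numbers as one common choice, not the definitive one:

```python
# Sketch of a per-pixel YUV-to-RGB conversion a shader would perform
# when sampling luma (R) and interleaved chroma (RG) planes directly.
# BT.601 full-range coefficients; other video formats use other matrices.

def yuv_to_rgb(y, u, v):
    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    clamp = lambda x: max(0, min(255, round(x)))
    return clamp(r), clamp(g), clamp(b)

print(yuv_to_rgb(255, 128, 128))  # white: (255, 255, 255)
print(yuv_to_rgb(0, 128, 128))    # black: (0, 0, 0)
```

Doing this in the fragment shader is what makes the zero-copy two-plane path viable: the GPU absorbs the color conversion instead of the CPU.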
And on the AV Foundation-OpenGL side, we added a new API called CVOpenGLESTextureCache -- actually, there are two APIs within AV Foundation that help you implement this change. Let's look at the R and RG textures. There's the APPLE texture RG extension. It adds the one-component red texture type, which is basically for luma processing: if you were using a single plane from a CVImageBuffer before, you had to deal with packing and unpacking and those kinds of operations, and this one helps with that. And the two-component, red-green type is for interleaved chroma. They can be used both for texturing -- getting the data into OpenGL ES -- and for rendering, so you can get OpenGL ES data out in two-plane format and then go to AV Foundation to do, for example, encoding. There's a hardware requirement to implement this, and it's only available on iPad 2 right now. Here's an example of using the CVOpenGLESTextureCache. I'm assuming we're using ARGB data here. Basically, what you do is create a texture cache.
Behind the scenes, AV Foundation creates a buffer pool, adds buffers to that pool as they come in, and does the lifetime management and such. So you get your CVImageBuffer from your capture session, in this example, and then you give it to OpenGL by first creating a texture out of that CVImageBuffer, calling CVOpenGLESTextureCacheCreateTextureFromImage -- a long API name. At this point, you can see there's no call to glTexImage2D. There's no copying going on: we took your data that was already stored in the CVImageBuffer and told OpenGL that it can use that data directly. There's no need for copying. You need to bind that texture to use it for your rendering, and you'll probably also want to set some of the texture parameters, like wrapping properties and such, so you need to get the name. You can use the CVOpenGLESTextureGetName API for that. And once you're done, you can use it and render with the texture.
I have a small demo of this as well. This is the glTexImage2D path, without the texture cache. When I move the camera, it's a little choppy, because it's not running at 60 -- it's trying to run at 60, but it's running at 48, because there are a lot of copies going on. When we turn the texture cache on, it's much smoother, and it says 60. This is basically the video data, as a texture, put on a highly tessellated quad, with a ripple effect that's changing the vertex positions. I'm sure you're going to use it for something much more interesting, like augmented reality maybe, or doing barcode scanning, or getting your OpenGL images out and encoding them on a video encoder. Okay, that was the AV Foundation-OpenGL ES interaction.
The second category we identified was enhancing image quality, and here we implemented the infrastructure for high dynamic range rendering. High dynamic range rendering is widely used in modern games. It uses a render target that is high dynamic range, which allows you to capture a much wider brightness range within the same render target: you can have much darker and much brighter areas within the same scene, and you can adjust your exposure to choose which part of that range you want to capture. The display is still low dynamic range -- it can display only 8 bits per color channel. So what you do is render into a high dynamic range buffer, then read it back and do tone mapping -- a very well-known technique -- and you end up with the wider brightness range you captured, which you can then send to the display or do something else with. High dynamic range rendering enables effects like going from indoors to outdoors: your eye has to adjust to the exposure difference, so you don't see anything at first, and then you slowly start seeing the detail. You can capture those kinds of effects with high dynamic range rendering. You can have very nice blooms when you have HDR rendering capabilities. And since it's a deeper buffer, you can do accumulation and have better motion blurs and things like that. It's a 16-bit float buffer, so you can even do scientific computation if that's what you need to do. The extension that enables this is the Apple color buffer half float extension. It adds 16-bit float render targets. Last year, we had 16-bit and 32-bit float texturing, and this year, we're adding the render targets as well.
And it can be a one-component, two-component, three-component, or four-component render target. The regular clamping restrictions for render targets are loosened, basically relaxed, for this extension; you should check out the extension spec. Again, it requires a feature that's only available on iPad 2, so it's an iPad 2-only feature. And multisampling is not supported with half-float render targets.
I have a demo for this one. So here we have basically a skybox that is high dynamic range, a 16-bit float skybox. And so the first effect I talked about was how you go from indoors to outdoors and come back, and when you go out, the exposure changes significantly, and your eyes have to adjust to it. So here's an example of that. So... As you can see, this is possible to implement with high dynamic range targets. Going back.
I hope it's visible to you as much as it's visible to me over here. So the other thing we talked about was -- basically, let me change the exposure. So you can see, you can actually set your target exposure point anywhere, and you get whatever part of the exposure range you want to capture. And this image in the back actually has a very high exposure range; it's, I think, nine different exposure levels that the image was taken with. Okay. The other thing I talked about is that you can change the bloom in the scene.
You can see, if I turn off the bloom, the light doesn't really interact with our semi-transparent bunny, which refracts and reflects light, but there's no bloom around it. In real life, when you have an object with a strong light behind it, you have this light coming around that object, surrounding it. So you can capture those kinds of effects with high dynamic range, and you can go crazy with it if you want, as you can see. Let's put it back to some sane point. And I talked about tone mapping. So tone mapping takes a 16-bit value and maps it into an 8-bit value in a smart way. The not-so-smart way of doing it is basically just mapping and clamping it -- just mapping the lower dynamic range and clamping the rest. And if I turn off the tone mapping here, you can see our mapping is just basically clamping the values to the zero-to-one range.
So in the overexposed parts you cannot even tell the detail. And if you go to the darker part, without tone mapping, just this part shows up, but the overexposed parts do not show up. With tone mapping, basically, within the same render target, you can capture both kinds of exposure levels. Okay, that was the demo for high dynamic range rendering.
OK, another image quality enhancement we added in iOS 5 is improving the quality of the dynamic shadows you can have in your games. So here's a real-life example. There's a big window, and the sunlight, which is a planar light in this case, is being occluded by the big columns within the window. And you can see the light and the shadows. And there's a part that doesn't get any light at all, except for the global ambient light. It's called the umbra, basically; the column entirely blocks the sunlight for that part.
So that's easy to capture with the shadow mapping technique in your games. But there's this part where part of the sunlight is blocked, but since it's not an exact point light, some other part of the light is leaking in and creating this soft edge on the shadow. So this new extension that we have, called Apple Shadow Samplers, enables us and you to simulate those kinds of softer edges on the shadows. It uses a technique called percentage closer filtering. So if you think about shadow mapping: shadow mapping is capturing the scene from the perspective of the light, and then, when you're rendering from the perspective of the viewer, projecting that onto the view, comparing the depths, and finding out whether that pixel on the screen is seeing any contribution from that light or not. But it is a binary test when you're doing it on a per-pixel basis. So percentage closer filtering basically moves it from a binary test to a four-sample test that allows you to generate any value between zero and one. Basically, you do four depth texture tests, compare, and return a weighted average. And that tells you how much light contribution you got from that light, which also tells you how much shadow that pixel gets. And again, it's a feature that's only available on iPad 2. So here's an example where the depth texture from the shadow map is mapped onto the depth values from the viewer's perspective. And you can see on the right side, basically, there's no contribution, there's no light leaking in from that light for that particular pixel. And the blue one over here is a pixel whose projection onto the shadow map corresponds to four values from that shadow map, and you can see that it is partially getting light and partially not, so the light will be attenuated in this case.
Okay, so we added a new GLSL define called GL_APPLE_shadow_samplers, a new sampler type, sampler2DShadow, and two new GLSL functions, shadow2DAPPLE and shadow2DProjAPPLE.
And here's a fragment shader example. So what we're doing here is: our sampler2DShadow is getting the depth texture from the perspective of the light, and we're also passing an attenuation factor from our vertex shader. And in the first line, we're basically comparing, using the light's texture coordinates, how much of it is passing these comparisons -- these four samples that are taken. And then, using that value, we're finding out how much light contribution we're getting. And the final fragment color is the original color multiplied with that light contribution. Now, this gives you soft shadows, and if you have significant compute bandwidth available to you, if you're not doing anything else, you can actually jitter the texture coordinates of the shadow map, and you can do multiple tests and get a larger coverage area.
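The weighted-average comparison at the heart of percentage closer filtering can be sketched on the CPU. This is a sketch, assuming a plain 2x2 tap with equal weights; in the actual extension the four depth comparisons and the average happen in hardware when you sample a sampler2DShadow, and the function name here is illustrative:

```c
/* CPU sketch of a four-sample percentage closer filter.
 * tap[] holds the four neighboring depths read from the shadow map;
 * fragment_depth is this fragment's depth from the light's view. */
static float pcf4(const float tap[4], float fragment_depth)
{
    float lit = 0.0f;
    for (int i = 0; i < 4; i++) {
        /* the binary shadow-map test, per tap: is the fragment at
         * least as close to the light as the stored occluder? */
        if (fragment_depth <= tap[i])
            lit += 1.0f;
    }
    return lit / 4.0f;   /* 0.0 = fully shadowed .. 1.0 = fully lit */
}
```

When all four taps agree you get 0 or 1, exactly the hard-shadow case; when they disagree you get an intermediate value, which is what produces the soft edge.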
Okay, we have a demo for this one as well. So, let's start with no shadows. So, this is an interesting demo. It basically uses a technique known as light prepass, a deferred lighting technique. It's doing four passes, one shadow pass, and three rendering passes to end up with 64 point lights and one sunlight, and this huge model, normal maps, per-pixel lighting everywhere.
So, it's very interesting, and I think we're going to talk about this as well at the best practices session at, I think, 4:30. So I want to show you first the shadow map that I talked about, the depth values from the perspective of the light. And you can compare it to the depth values on the right here, from the perspective of the viewer. So these are mapped onto the one on the right to basically do the comparisons. And without shadows, our scene looks really nice, but it doesn't look realistic. When you add hard shadows, it becomes more realistic, but you can see on the edges of the columns that the shadows are per-pixel, and they show up as being per-pixel. And when it actually moves, it's more visible, because the per-pixel operation moves up and down and jitters. And with percentage closer filtering, it is much softer. And we're doing just four samples here, so you can turn it on and off a few times to see the details. Let me animate it. So this is all real-time lighting from both point lights and sunlight, with real-time shadows.
And the sun is gone now, so we don't have any shadows. When it comes back up, in just a few seconds, I think, then we can basically go to hard and soft shadows. Okay. So these are the things that you can do by taking advantage of our new extension. Okay.
So the third category we had is basically enhancing the performance of your games. And we'll start with the separate shader objects extension. The separate shader objects extension is trying to solve the problem of having multiple sets of shaders. In many games, you would have this case: multiple vertex shaders that are run with multiple fragment shaders and paired. If you have N vertex shaders that can be paired with M fragment shaders, you would have the cross pairing of all of them. An example would be: you have a vertex shader that just does transforms and another vertex shader that does transforms plus skeletal animation, and you have fragment shaders, one of them doing just texturing, another one doing texturing plus lighting, and a third one doing texturing plus lighting plus fog. Depending on the object type, you might use, say, the first vertex shader with the second fragment shader; all six combinations of them are possible. Actually, we've seen games that have like five vertex shaders that are being paired with ten fragment shaders. So in that case, you would basically end up compiling your vertex shaders 50 times, your fragment shaders 50 times, and linking them together 50 times.
So, Apple Separate Shader Objects tries to fix this problem, or help with it. It creates a new object type called the program pipeline object, and it gives you the possibility of a mix-and-match strategy. So what it does is, instead of having a vertex shader, you basically have a vertex program. For every stage of the pipeline -- and we have two stages, the vertex stage and the fragment stage -- you create a program for that stage, and then you pair them later. So if you have five vertex shaders and ten fragment shaders, it means that you need to create five vertex programs and ten fragment programs, and then you can pair them as much as you want. So instead of 50 compilations on the vertex side, 50 compilations on the fragment side, and 50 links, you basically do a lot less work on the CPU side.
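The arithmetic above can be written down as a toy calculation. The helper names here are illustrative, not part of any API; this just restates the N-times-M versus N-plus-M argument:

```c
/* With classic linked programs, each of the N*M vertex/fragment
 * pairings recompiles both of its stages: 2*N*M compilations
 * (plus N*M links, not counted here). */
static int linked_program_compiles(int n_vertex, int m_fragment)
{
    return 2 * n_vertex * m_fragment;
}

/* With separate shader objects, each stage program is compiled
 * and linked exactly once, and pairing later is free. */
static int separate_program_compiles(int n_vertex, int m_fragment)
{
    return n_vertex + m_fragment;
}
```

For the five-by-ten case from the talk, that's 100 compilations (50 vertex plus 50 fragment) down to 15.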
which means that you end up with shorter compilation and linking times. So you can take advantage of it in two different ways. If you were keeping the number of shaders lower than you wanted because your boot time or level load time was getting long, you can add those shaders back, keep the same amount of time on your level loading, and have a greater variety of shaders and better effects.
Or, alternatively, you can just take advantage of it and have faster boot times or faster level load times in your game. One disadvantage of this extension: when the compiler can see both the vertex stage and the fragment stage together, it can do a certain set of optimizations. It can do some dead code elimination.
It can do elimination of unused varyings; it doesn't go and interpolate varyings that are not used. So if your shader performance strictly depends on the kind of optimizations the compiler does when it can see both stages, you probably need to change your shaders, or think about how much gain you're getting from this. But it is very rare that you would be getting that kind of benefit from the compiler, so that is just something to know about. So here's an example. In our vertex program, we have a vertex shader just like you had on iOS 4. You would basically compile and link that vertex shader and create a vertex program. You would do the same for your fragment shaders and create fragment programs out of those as well. When it's time to use them, you can basically say, hey, I want to use this program for this stage and this other program for the second stage. There's no relinking or recompiling happening here; it's really fast. You can basically pair your first vertex program with the second fragment program here without causing any new compilations or links. Same here: you can pair them up in any way you want.
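The slide code isn't in the transcript, so here is a sketch of the call sequence he walks through, assuming the EXT-suffixed entry points this extension exposes on iOS. It needs a live OpenGL ES 2.0 context, and the shader source strings `vs_src` and `fs_src` are stand-ins that aren't shown:

```c
/* Sketch of the mix-and-match flow; assumes a live OpenGL ES 2.0
 * context and GLSL sources vs_src / fs_src (not shown). */
GLuint vert_prog = glCreateShaderProgramvEXT(GL_VERTEX_SHADER, 1, &vs_src);
GLuint frag_prog = glCreateShaderProgramvEXT(GL_FRAGMENT_SHADER, 1, &fs_src);

GLuint pipeline;
glGenProgramPipelinesEXT(1, &pipeline);
glBindProgramPipelineEXT(pipeline);

/* pair any vertex program with any fragment program -- no relink,
 * no recompile, just attaching stages to the pipeline object */
glUseProgramStagesEXT(pipeline, GL_VERTEX_SHADER_BIT_EXT, vert_prog);
glUseProgramStagesEXT(pipeline, GL_FRAGMENT_SHADER_BIT_EXT, frag_prog);
```

Swapping either stage later is just another glUseProgramStagesEXT call on the same pipeline, which is the attach-and-detach usage model he describes next.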
So it actually has two usage models. You can create a program pipeline object for each combination: you basically create your vertex programs and fragment programs and pair them together at the very beginning, instead of attaching and detaching them, which would be faster at runtime but a little slower at boot time. Or you can use the same program pipeline object, just like in my example, and attach and detach program stages on the fly, which would be a little slower at runtime, because you have to attach and detach them and there's some work going on on the OpenGL ES side, but you will have a faster boot time. So you can use either model, depending on what your needs are. If you're familiar with the ARB_separate_shader_objects extension and the ARB_explicit_attrib_location extension from the desktop, you already know how this thing works, because we basically took those two, removed some of the things that the ES constraints don't allow -- like the geometry stage, for example, or the tessellation stage -- and came up with this Apple Separate Shader Objects extension.
OK, that's separate shader objects. The final performance optimization extension is the occlusion query extension. So in your games, you probably have some visibility determination system to find out what needs to be drawn and what doesn't. Frustum culling is an example, and you might have something more complicated. But there are times you would have an object that's within your frustum but might be occluded entirely by another object in front of it. And so this extension, Apple Occlusion Query Boolean, basically allows you to do a binary test to find out if any of the samples from this object that might be occluded are reaching your render target. The advantage is, basically, if your object has tens of thousands of vertices, you use a bounding box instead, and you draw it without affecting the color and depth buffers. And then the next frame, you can check if it was actually drawn -- if any of the samples made it to the render target. If it did, it means that your object is now visible and you can render it. If it didn't, it means that it was actually entirely occluded by another object, so you didn't have to push all that data into the rendering system.
And you basically repeat the same query over and over every frame. The only minor disadvantage is that your object might end up on the screen one frame later. Again, it's an iPad 2-only feature due to the hardware requirements. So the benefit is that it enhances performance by avoiding large draw calls. Our GPUs are tile-based deferred renderers, so they basically don't have much overdraw. But if you're pushing 10,000 vertices and none of them are ending up in your render target, we are still running the vertex program on all of those vertices, and we're pushing data in and out for all of them. So you avoid doing that vertex processing; that's what you gain.
A code example here: this frame is our current frame. We set up a query by calling glBeginQuery. And basically we draw the bounding box here. If you have, say, a car, you probably have two rectangular boxes that will cover the car as your bounding box, and you're passing eight vertices each, instead of all the details of your car, into the scene to find out if any of the samples from the car are making it to the render target. And in the next frame, you basically check the results of your test. If the result is positive, it means that some of the samples from your car are showing up on the screen, so it's time to draw it.
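The two-frame pattern he describes can be sketched like this, assuming a live ES 2.0 context on iPad 2 and the EXT-suffixed entry points this extension ships with; `draw_bounding_box` and `draw_car` are stand-ins for your own draw calls:

```c
GLuint query;
glGenQueriesEXT(1, &query);

/* frame N: draw the cheap proxy without touching color or depth */
glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
glDepthMask(GL_FALSE);
glBeginQueryEXT(GL_ANY_SAMPLES_PASSED_EXT, query);
draw_bounding_box();                 /* eight vertices, not the whole car */
glEndQueryEXT(GL_ANY_SAMPLES_PASSED_EXT);
glDepthMask(GL_TRUE);
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);

/* frame N+1: read back LAST frame's result -- reading it in the
 * same frame forces the equivalent of a glFlush and stalls the pipe */
GLuint any_samples = 0;
glGetQueryObjectuivEXT(query, GL_QUERY_RESULT_EXT, &any_samples);
if (any_samples)
    draw_car();                      /* the box was visible, draw for real */
```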
One word of caution, though: if you set up a test and check the results within the same frame, the only way for the GL engine to find out what the results are is basically to do the equivalent of a glFlush. It has to render everything, get the results, and return them, which means that you created a pipeline stall. You basically lost all the performance enhancements that the pipelining of the CPU, GPU, and the different stages of the GPU gives you. So you should set up a query in one frame, but you should not check it in the same frame; check the result of the query in the next frame. So these are the new OpenGL ES extensions. I talked about texture RG, which is one- and two-component textures that work really well with the new AV Foundation-related API, the CVOpenGLESTextureCache. We talked about color buffer half float, 16-bit float render targets that you can use for high dynamic range rendering, motion blur, and scientific computations. Shadow samplers: sampling of your shadows so that you can have soft edges. Separate shader objects gives you the possibility of keeping your shader stages independent, and gives you better compilation times. And occlusion query boolean gives you a visibility test.
And we have two debugging-related extensions, Apple debug label and debug marker, that you add to your code, and they are basically used by the OpenGL ES frame debugger, which we're going to talk about in the next session after this one. So use them, take advantage of them, and get a lot of help out of that in your debugging. We have three more here that we didn't talk about. Element index uint adds 32-bit indices to the API. In the past, we supported 8- and 16-bit indices, so if you had a mesh with more than 64K vertices, you had to partition it into multiple draw calls. You don't have to anymore; you can go up to, I think, 4 billion. And when we added float textures to the system last year in iOS 4, we only supported nearest sampling on those. This year we're adding linear sampling as well, so you can actually use linear sampling when you bring a float texture into your rendering operation. So there are two related sessions today. In the next one, we're going to talk about the tools that you can use. Primarily, we're introducing a new tool, the OpenGL ES frame debugger. It's an awesome tool.
We actually used it for our demos, to find issues and fix issues through that tool. So you should definitely go attend that session. And then we're going to talk about best practices for OpenGL ES on iOS: things like how you can do multi-threading and improve the performance of your game or graphics application by offloading some of the operations onto the second CPU. Actually, multi-threading helps even on a single CPU, by using the CPU's cache better and basically separating your blocking points within the pipeline. We're going to talk about view controllers and how to use them, and also, in that session, how to scale your games to take advantage of iPad 2 -- what you can do and how you can do it. Allan Schaffer is our graphics and game technologies evangelist. We have great documentation; we worked with the documentation team to give you the information about best practices and such. Thanks for coming to the session.