Graphics, Media, and Games • iOS • 54:12
OpenGL ES provides access to the stunning graphics power of iPhone, iPad, and iPod touch. See how you can tap into the latest advances in OpenGL ES for iOS 5 and harness the programmable pipeline enabled by OpenGL ES 2.0. Get introduced to the new GL Kit framework and learn how your apps can take advantage of its built-in features and effects.
Speakers: Gokhan Avkarogullari, Eric Sunalp
Unlisted on Apple Developer site
Transcript
This transcript was generated using Whisper; it may contain transcription errors.
Welcome to the Advances in OpenGL ES for iOS 5 session. My name is Gokhan Avkarogullari, and Eric Sunalp will join me on stage to talk about some cool new features. Every year we release a new version of iOS, and with that release we make new features available. Two years ago we made OpenGL ES 2.0 available on the iOS platform. Last year we did it with MSAA, discard, depth textures, and such. But sometimes releasing new hardware makes things possible. This year we released iPad 2, and with iPad 2 you get a lot more graphics horsepower, a lot more CPU horsepower, and a lot more memory bandwidth, which enables you to do new things, or do old things in a new and improved way.
I'm going to talk about an application on the App Store that took advantage of iPad 2's processing power. They basically changed their pipeline to do something specific on iPad 2 that makes it look much nicer relative to iPad 1 and other devices. That's Real Racing 2 HD from Firemint. I'm going to walk through a few examples of what they did on iPad 2.
The dashboard in this scene, as you can see, takes up almost one third of the screen space, and it's very visible to the player. What they did on the dashboard is basically use normal maps and per-pixel lighting. That takes advantage of the compute power of iPad 2, and the memory bandwidth as well.
On the asphalt, they used multiple textures -- light maps, gloss maps, specular maps -- basically taking advantage of iPad 2's very large memory bandwidth. And they implemented dynamic shadows on the dashboard that depend on where the sun is and where the car is.
And MSAA. If you could change only one thing in your app on iPad 2, you should probably enable MSAA and see how it works. It should probably just work fine, and you get an instant image quality enhancement. You should do more, of course, but if you can do only one thing, do MSAA.
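For intuition about what enabling MSAA buys you, here's a small conceptual sketch (Python, not GL code) of what a 4x multisample resolve does: each pixel's coverage samples are averaged, which is what smooths polygon edges.

```python
# Conceptual sketch (not GL code): a 4x MSAA resolve averages the
# per-pixel coverage samples, smoothing polygon edges.

def resolve_msaa(samples):
    """Average a pixel's sub-samples; each sample is an (r, g, b) tuple."""
    n = len(samples)
    return tuple(sum(channel) / n for channel in zip(*samples))

# An edge pixel half-covered by a white triangle over a black background:
edge_pixel = [(1.0, 1.0, 1.0), (1.0, 1.0, 1.0),
              (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
print(resolve_msaa(edge_pixel))  # a 50% grey: (0.5, 0.5, 0.5)
```

The hardware does this resolve for you; the sketch only shows why edges come out blended rather than jagged.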
And finally, with iPad 2's processing power, they were able to push a lot more polygons to the device. They're using much higher-poly models in the game. Okay, so that was what the hardware gives you. But we also added some software features. We're going to talk about GLKit, a new framework that we are introducing with iOS 5, and the new features that we enabled through OpenGL ES extensions. So let's start with GLKit.
Before I start with GLKit, I want to talk about using OpenGL ES 2.0 in your games, and I want to use an example again here. Brad Larson has an application called Molecules that has been on the App Store for some time and was using ES 1.1 as its rendering backend. This was the kind of rendering he was getting using ES 1.1. When he set out to improve it, he decided to use OpenGL ES 2.0 and the power that its programmability gives you. The end result, as you can see on the right side, is a much better looking model -- in this case a molecule visualization, and visualization is very important for this application. Through multiple passes and the programmability of ES 2.0, he gets ambient occlusion, per-pixel lighting, and great speculars, as you can see. So these are the kinds of things you can do with OpenGL ES 2.0, and when we set out to build GLKit, that was one of our goals: to make these kinds of things possible. We had two primary goals. The first is making life easier for you, the developers. The best way to do that is to find problems that are common to every one of you and then provide solutions for those. The second goal was to encourage a unique look for each graphics application.
The ES 1.1 fixed-function pipeline is very powerful, but it's limited in the ways it can do rendering. It does Blinn-Phong shading, for example, and textures and such. You can differentiate through textures, but it kind of limits what you can do. So the idea is to allow you to use shaders, but we also understand that you have a lot of investment in ES 1.1,
and we wanted to help you port your ES 1.1 applications to 2.0 and make it easier. So these are the two goals of GLKit: making life easier through common solutions, and allowing you to easily port your games from ES 1.1 to 2.0. It has four subparts. I'm going to talk first about the texture loader, just briefly. The idea of the texture loader is that you give us a reference -- whether it's on the network or on the file system -- and you get an OpenGL ES texture object back. So you don't need to deal with Image I/O, libpng, and such.
The second subpart is the view and view controller. We looked at our template, and there are hundreds of lines of code to set up FBOs, with multisampling and without, and to set up the display links and such. We thought we could make it much simpler by giving you a UIKit-style view and view controller that also work really well with the UIKit hierarchy.
The third part is a 3D graphics math library. OpenGL ES 2.0 lacks the transform API, the matrix stack API, of ES 1.1, and we've also seen a lot of people implementing vector and matrix libraries themselves. The GLKMath library gives you all of that: the functionality of OpenGL ES 1.1 that you can reuse on OpenGL ES 2.0, plus vector types and matrix stacks.
And finally, GLK effects. Basically, GLK effects are fixed-function pipeline features implemented in an ES 2.0 context. I'm going to have Eric come over here and talk about some of these. Thank you. Thanks, Gokhan. All right, hi, everyone. So we're going to start with the GLKTextureLoader. Like Gokhan said, we want to make texture loading as simple as possible. You give us a file reference, we decode the image, load it, and create a GL texture out of it. There's pretty much just one API call to get this thing loaded, so you don't have to worry about setting all your OpenGL state. We want to make this as simple as possible. And like he said, there are a number of common formats we support -- PNG, JPEG, TIFF --
pretty much every format that Image I/O supports. So no longer do you have to load a CGImage, decode it, blit it into a CGImage context, and get that data back. It's pretty much just one API call now. One of the biggest problems developers have had is dealing with non-premultiplied data. Using our library, you can guarantee the image won't be premultiplied, which is a huge thing. We have texture 2D support, cube map support, and a number of convenient loading options. If you want to force premultiplication, that's an option you can set. We'll Y-flip the image for you if you want, to put it in the native GL orientation, and you can turn mipmap generation on or off.
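To illustrate the premultiplied-alpha issue just mentioned: premultiplying scales each color channel by the alpha value up front, which also changes how the data must be blended (roughly, a source blend factor of GL_ONE instead of GL_SRC_ALPHA). A tiny sketch:

```python
# Sketch of what "premultiplied alpha" means: each color channel is
# scaled by the alpha value up front, which changes how you must blend.

def premultiply(r, g, b, a):
    return (r * a, g * a, b * a, a)

# A half-transparent pure red in straight (non-premultiplied) form:
straight = (1.0, 0.0, 0.0, 0.5)
print(premultiply(*straight))  # (0.5, 0.0, 0.0, 0.5)
```

Knowing which form you have matters because blending premultiplied data with a straight-alpha blend function (or vice versa) gives visibly wrong edges.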
Basic usage is really simple. You set your EAGLContext current and make our single call to one of the class's variant loading methods, and you get back this GLKTextureInfo object, which has all the pertinent information you need to do your rendering: most importantly the texture name that you can bind with, the width and height of the image, the alpha state (whether there was alpha, and whether it was premultiplied or not), the original origin of the image, and whether we mipmapped it for you or not.
Let's run through a little code example. As you can see, it's just a couple of lines of code. We set the context current and load a path -- a pretty canonical way of loading your image. Here I'm setting just one option entry in the dictionary, which is generate mipmaps, probably pretty common. Then we make our single class method call to load the 2D texture. If that call fails, it will return nil, and you also get back an error you can check to see what went wrong. Then we assume you do some work, and there you go: there's a texture name to bind with, and that's pretty much all it takes to load a texture.
In addition to the synchronous usage case, we have an asynchronous usage case, to take advantage of multi-core devices like iPad 2. It's a very similar usage pattern, but in this case, instead of using one of the class methods, you actually allocate a GLKTextureLoader object and give us your context's sharegroup. Then you call one of the instance methods and provide us with a GCD queue and a completion block. The GCD queue is where that completion block will be called. Through the completion block, you get back a GLKTextureInfo just like in the synchronous case, and an error if there was one while loading. The code example for that case is the same thing: load the path. Here we get the sharegroup for your context and give that to us when we create the GLKTextureLoader object.
And there it is. Just make a single call, give us the path and whatever your completion block handler is. In this example I'm just informing my app, "Hey, this thing has completed, and here's the texture info I'm going to need." So, really simple.
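The asynchronous completion-block pattern described above can be sketched, transposed to Python. `load_texture`, `TextureInfo`, and the callback shape here are stand-ins for illustration, not the GLKit API:

```python
# Sketch of the asynchronous-loading pattern: a worker decodes the
# "texture" off the calling thread, and a completion callback receives
# either the result or an error. All names here are illustrative.
from concurrent.futures import ThreadPoolExecutor

class TextureInfo:
    def __init__(self, name, width, height):
        self.name, self.width, self.height = name, width, height

def load_texture(path):
    # Stand-in for decoding an image and creating a GL texture.
    return TextureInfo(name=1, width=256, height=256)

def load_texture_async(path, completion):
    pool = ThreadPoolExecutor(max_workers=1)
    def work():
        try:
            completion(load_texture(path), None)
        except Exception as error:
            completion(None, error)
    pool.submit(work)
    pool.shutdown(wait=True)  # in real code you would not block here

results = []
load_texture_async("brick.png", lambda info, err: results.append((info, err)))
print(results[0][0].width)  # 256
```

The GLKit version differs in that the completion block is dispatched to the GCD queue you supplied, and the loader shares GL objects via the sharegroup you passed in.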
The next thing we're really proud of is the GLKView and GLKViewController. Like Gokhan said, you saw this on our template -- it was probably a pain to deal with. Hundreds of lines of code. I don't think MSAA was there by default; it was something you had to code for, among other things.
So it's pretty much everything you need to get OpenGL ES in a view and on screen. It's a UIView subclass, so it fits into the UIKit model really well. It responds to setNeedsDisplay if you want it to, and behaves pretty much like a UIView in all of its associated methods. The drawing is a little bit different: you can set a delegate, or you can subclass the class for drawing. There are a number of things it handles automatically, which were previously only partially done in the template for you: creation and deletion of the color, depth, and stencil buffers, and turning MSAA on or off. And as a performance win, we'll do the discard for you automatically. Pretty much every time you draw, before you draw, you always set your context and your FBO current, and present the drawable afterwards -- so we also do that for you. This is even more similar to UIView in that if you subclass this thing, you implement drawRect:, and setting the context, setting the FBO current, and presenting afterwards are things you don't have to think about, just like in UIView. It's all done for you. Additionally, we have snapshot support that returns a UIImage if you want to use that. All right, the next thing is the GLKViewController, a subclass of UIViewController, so it fits into the view controller model nicely. The main thing here is that it's coupled very tightly with the GLKView, and it handles the redrawing of the view. The main value you'd probably set most often, unless you want the default, is the preferred frames per second.
The preferred frames per second specifies a value, maybe 30 or 60, that we'll try to get as close to as possible on the display your view currently resides on, without going over the display's refresh rate or your preferred value. Additionally, you can pause and resume any time you like. When your app goes into the background, we'll automatically pause for you, and when it comes into the foreground, it will resume for you -- and that can be disabled if you want to do your own custom work. Also, to help separate your scene logic, or maybe your physics logic, we provide an update method that stays in sync with draw, so you can keep those two things separated. Basically, your update gets called, and then your draw gets called right afterwards. And we provide a number of statistics you can query, such as the number of frames displayed. All right, here's a code example. This is the simplest case, just using the delegate. Here I've got a view controller that I assume I loaded from a XIB file.
Here I'm going to query for the GLKView out of that. If one wasn't set up in the XIB file, it will be automatically created for you, just like with all other view controllers. So we get back the GLKView. And here I have this third class called Game, which I'm assuming does all my scene rendering. I basically just need to assign it to the GLKView as a delegate and give the GLKView my context. Then I have a number of drawable formats I'm setting as an example. RGBA 8888 is basically the default, so if you didn't want depth and you didn't want multisample, you really wouldn't have to set any of these things -- you would just have to set your delegate and context. Next I set the same Game object as the view controller's delegate because I want to make use of the update method, and I set a preferred frames per second of 30, which again is also the default.
Okay. So here, this would be the Game class. Pretty much all I would have to implement in that class is glkViewControllerUpdate: and glkView:drawInRect:. Next we'll look at the subclassing case -- and, really quickly: when you get drawInRect:, and this is also true for the subclass case, the rect will be in points, not pixels, so you can use your content scale factor to figure that out. For the subclassing case, you just override the common method you're used to, drawRect:. Even if you set this thing as a delegate, the subclass's drawRect: is the one that gets called, and then you'd still implement the view controller update.
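One plausible way to read the "as close as possible without going over" behavior of preferred frames per second is snapping to whole vsync intervals of the display's refresh rate. The exact policy isn't spelled out in the session, so this sketch is only illustrative:

```python
# Hedged sketch of how a preferred frame rate might be snapped to the
# display's refresh rate without going over it. The real GLKit policy
# may differ; this just illustrates the idea of whole vsync intervals.
import math

def actual_fps(preferred, refresh=60):
    interval = max(1, math.ceil(refresh / preferred))  # whole vsync intervals
    return refresh / interval

print(actual_fps(30))  # 30.0
print(actual_fps(45))  # 30.0  (45 isn't reachable without exceeding it)
print(actual_fps(60))  # 60.0
```

The point is that on a 60 Hz display you effectively get 60, 30, 20, 15, ... -- whichever divisor gets closest to your preference without exceeding it.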
All right, next is GLKMath. This is a 3D graphics math library. We have over 175 functions that you don't have to implement. We have a number of common types: 4x4 and 3x3 matrices, 4-, 3-, and 2-component vector types, and even a quaternion type. The idea is that you don't have to write this for ES 2.0. But when we were looking at what you would have to do to port from ES 1.1 to 2.0, you have a lot of existing code that already deals with the ES 1.1 matrix stacks and such. So we additionally added matrix stack functionality, so that your port would be almost one-to-one with all the equivalent ES 1.1 math functions. And we tried to make it as high performance as possible: functions are inlined where possible, and we make use of all the great hardware in our devices. So here's a slide to impress upon you the number of functions we have. It should really help out -- and it's kind of hard to see.
So here's one example. This would be maybe some simple ES 1.1 code you'd have, and this is what it would look like in GLKit. These are Core Foundation types: I'm creating a projection stack and a model view stack. I'm using a GLKMatrix4 type to create a frustum matrix and loading that onto my projection matrix stack. Then, on my model view stack, I load an identity and do a number of equivalent ES 1.1 transforms: translate, rotate, and scale. And yeah, pretty simple. That's it for the math. I'll hand it over to Gokhan now to talk about GLK effects.
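The push/load-identity/translate flow of an ES 1.1-style matrix stack, as in the example above, can be sketched like this. This is a minimal Python model of the idea, nothing like the optimized GLKMath implementation, and only translate is implemented:

```python
# Minimal sketch of a matrix stack like the one GLKMath provides.
# Matrices are 4x4 row-major lists of lists; only translate is shown.

def identity():
    return [[float(i == j) for j in range(4)] for i in range(4)]

def multiply(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(tx, ty, tz):
    m = identity()
    m[0][3], m[1][3], m[2][3] = tx, ty, tz
    return m

class MatrixStack:
    def __init__(self):
        self.stack = [identity()]
    def push(self):  # duplicate the top, like glPushMatrix
        self.stack.append([row[:] for row in self.stack[-1]])
    def pop(self):
        self.stack.pop()
    def translate(self, tx, ty, tz):
        self.stack[-1] = multiply(self.stack[-1], translation(tx, ty, tz))
    def top(self):
        return self.stack[-1]

stack = MatrixStack()
stack.push()
stack.translate(1.0, 2.0, 3.0)
print(stack.top()[0][3], stack.top()[1][3], stack.top()[2][3])  # 1.0 2.0 3.0
stack.pop()
print(stack.top() == identity())  # True
```

Push/pop restoring the previous transform is the property that makes porting hierarchical ES 1.1 scene code nearly one-to-one.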
Thanks, Eric. So GLK effects, as I said before, are basically a re-implementation of fixed-function pipeline features within the ES 2.0 context. You get great visual effects with minimal effort: you don't have to write shaders for those, and you don't have to create probably hundreds of different shaders depending on your state. It's a great way to go from 1.1 to 2.0. And a main goal is that it's interoperable with custom OpenGL ES 2.0 shaders. You can do part of your rendering with GLK effects, and if you have something specific to your application that uses a special kind of shader, like a particle effect or a GPU-skinned object, you can still use your own ES 2.0 shaders and basically mix and match them. We came up with three named effect classes: the base effect, the reflection map effect, and the skybox effect, and I'll talk about each one of them in a second. The general architecture and usage is very simple. You configure your vertex state -- this is standard OpenGL ES 2.0 vertex state setup. The only thing here is that you use the predefined GLKVertexAttrib names so that we can find out which vertex attributes you specified and what they correspond to in our shaders. Then you allocate your effect class instance -- the base effect or the reflection map effect -- and configure the effect parameters.
You set some parameters, and once you're done configuring your effect, you call prepareToDraw. What we're doing here is not generating internal shaders every time you change a state; we want all your state to be set up, all your finalization to be done. At that point we'll set the GL state, generate or reuse the necessary shaders, and then we're ready to go for your rendering. You bind your VAO -- as a best practice, you should be using VAOs -- and then you make a draw call, either glDrawArrays or glDrawElements.
So that's how it works. It's very simple. Let's talk about the base effect. The base effect captures most of the ES 1.1 functionality. It does things like lighting with multiple lights, material properties and light properties you can set for each light, multi-texturing, fog, constant color, and the transformations you can do on your objects. The things it doesn't cover from ES 1.1 are basically texture combiners and clip planes.
And here's an example. This part is how you set up your vertex state; it's no different from any other ES 2.0 vertex state setup. In this case, you're generating a VAO and then your VBOs, and putting data into your VBOs. The only point here is that when you pass a vertex attribute pointer, you use the names that are specific to GLKit -- in this case, GLKVertexAttribPosition and GLKVertexAttribNormal when you set up your position and normal arrays.
The second part is allocating and initializing your effect. You get a base effect here, and then you start setting your parameters. In this example, we actually went a little further than what ES 1.1 did: we also support per-pixel lighting. So you set the lighting type to per-pixel and enable a light. We tried to keep as close as possible to ES 1.1, so all our light parameters and material parameters are very similar to ES 1.1, and so are the default values for those parameters. If you don't set anything, you basically get the default ES 1.1 state. But if you're going to change it, this is the place to change it. You set your material properties. In this case, we enabled the light, set a diffuse color, and set a shininess. Then we're ready to draw, so we let the effect know by calling the prepareToDraw API. At this point, all the state is settled and all the necessary shaders are generated. We try to generate optimal shaders, so if things are not enabled, there's no shader code running for the non-enabled stuff. You bind your VAO and make a draw call, and you get something drawn, on the right side here -- per-pixel lit. The skybox effect is a little different. It provides a scene skybox for your application.
You can think of it as a 360-degree background image. It has very few parameters: you can change the center of your skybox or resize it. Since it's a skybox, there's no actual vertex array state initialization -- there's no requirement for that. Here's a code example. We allocate and initialize our skybox. We need to give it a cube map, and here we're actually using the texture loader to load the cube map, so that's just one line of code. We call prepareToDraw so our skybox is ready to be rendered, and then we can call the draw API. The second difference is that the skybox has its own draw API, because we didn't set up the vertex state. So you have to call the draw API on the skybox.
The reflection map effect builds on the base effect -- we added reflections to the base effect. Everything you had in the base effect, like lighting, materials, and multi-texturing, is still available to you; it's basically a subclass of the base effect, and you can set those as you did before. On top of that, it adds reflection mapping: cube-map-style environment mapping. We basically followed the desktop OpenGL 2.1 specification, tried to stay as close as possible to it, and used a cube map for the reflection samples.
The initialization of the vertex array object is no different from the base effect, so in this example we're not covering it. And we create our effect. The difference here from the base effect, when we set the parameters, is that you also need to assign a cube map texture for the reflections. Now, a reflection on an object is independent of the position of the viewer. So if you pass only the model view matrix to the reflection effect, it won't be able to figure out how to create the reflections. If you think about a shiny car next to another car, it doesn't really matter from where you're looking at the shiny car --
the reflections are always in the same place on the car. Therefore, when you construct your model view matrix, you need to decompose it into the model part and the view part. For the base effect part of the reflection map effect, you pass the entire model view matrix to the base effect implementation. But you also pass the transpose of the model matrix to the reflection map effect, so that it can calculate the reflections correctly.
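The reflection lookup the effect ultimately performs is standard vector math: reflect the incident (eye-to-surface) direction about the surface normal, and use the resulting direction to sample the cube map. A sketch, with illustrative names:

```python
# Sketch of a reflection-map lookup direction: reflect the incident
# vector about the surface normal; the result indexes the cube map.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def reflect(incident, normal):
    """R = I - 2(N.I)N, with normal assumed unit-length."""
    d = dot(normal, incident)
    return tuple(i - 2.0 * d * n for i, n in zip(incident, normal))

# Looking straight down at an upward-facing surface bounces straight up:
print(reflect((0.0, -1.0, 0.0), (0.0, 1.0, 0.0)))  # (0.0, 1.0, 0.0)
```

This is why the effect needs the model matrix separately: the reflected direction has to be expressed in the cube map's (world) space, not in eye space.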
As with the base effect, all you need to do, once you've set the parameters, passed the cube map, and given the transpose of the model matrix to the reflection map, is call prepareToDraw. Everything is ready to go; you bind your VAO and then call glDrawArrays or glDrawElements, and you get your rendering. So we have a technical demo for this. I say technical because there's not much artistic stuff in here, as you can see. We have a background image -- a background color, really -- and a sphere that doesn't look like a sphere because it's not lit here. We got this skybox from Emil Persson, also known as Humus. You can see it's a 360-degree background image, if you can think of it that way.
The skybox here is loaded by the texture loader and rendered by the skybox effect. The object in the front is a sphere, and we can texture it -- you can see that it is a sphere when it has a texture; it just shows it more. And we can do per-vertex lighting. This is basically the base effect: all you do is set the light position and light parameters, enable the light, and then prepareToDraw and draw.
Or we can use the reflection map effect and add reflections to it. You can see that it is reflecting the skybox that is around the sphere. And it's basically just two lines of code, really: setting the same skybox cube map as the cube map for the reflection map effect, then setting the transpose of the model matrix on the reflection map, and going from there. One final thing: both the reflection map effect and the base effect support per-pixel lighting, so we can enable per-pixel lighting. It's very hard to see in this one because there are no strong speculars, but you do get per-pixel lighting. Okay. So that's it for our demo.
So as you can see, GLKit tries to make your life easier. The texture loader library is a way of loading your textures without dealing with the lower-level operations of loading files or going to the network. It can load from a CGImage; it can actually load from NSData in memory. And it can work asynchronously, so on iPad 2 you can take advantage of the second core and load two textures at the same time. The view and view controller are first-class citizens of the UIKit hierarchy -- they work with tab controllers, navigation controllers, and such -- and they hide all the complexity of FBOs from your application. You can easily enable multisampling, for example, with a single line of code, or disable it depending on performance. And the math library is a 3D graphics math library; you can even use it on ES 1.1 for vector and matrix operations.
Or if you're using ES 2.0, you don't have to rewrite it -- it's a pretty optimized implementation. And finally, the effects allow you to port your game from ES 1.1 to ES 2.0. You have the same kind of capabilities: you can port your game using GLK effects, use your own special shaders in ES 2.0 for whatever you want to do -- particle effects and things like that -- and get much better visuals without having to go and create an entire system of shaders and shader stitching. Okay. So that was GLKit. We also have other features that we implemented through OpenGL ES extensions, and we identified three categories. First, we got a lot of requests for enhancing the OpenGL ES and AV Foundation interaction, so we tried to solve the issues with that usage model. The second set of extensions helps you enhance image quality in your graphics applications. And the third category is the extensions and new features that help you improve the performance of your graphics application.
So let's start with enhancing the OpenGL ES-AV Foundation interaction. What we set out to do was create a direct path from the video subsystem to graphics, and the opposite as well, from graphics to video. What I mean by a direct path is better understood when contrasted with iOS 4. In iOS 4, to get your captured video data into OpenGL -- and in this example, I'm assuming you're trying to convert it to ARGB -- you would have your capture session, with its AV Foundation-created buffer pools, and you would capture your data into that, using image buffers. Then you would transform from YUV space to ARGB space. And then, to pass your data along, you would call glTexImage2D, and we would copy your data at that point from your storage to our storage inside the OpenGL API. So you would have this extra copy happening on the OpenGL side. The same applies for YUV data: you could use the two different planes and create a texture from each one of them, but at the end of the day, both of those textures have to be copied over.
On iOS 5, we're eliminating that second memcpy. If it is YUV data, you can get each one of the textures without any copies, and if it's ARGB-transformed data, you can get the ARGB into OpenGL without any copies as well. That improves your performance, and it makes the CPU available to do something else, something interesting. The way we did it was with two changes. On the OpenGL side, we introduced R and RG -- one-component and two-component -- textures and render targets.
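When a shader samples the luma (R) and interleaved chroma (RG) planes directly, it has to do the YUV-to-RGB conversion itself. Here's a sketch using BT.601 full-range coefficients; the actual matrix depends on the pixel format your capture session delivers, so treat the numbers as one common choice, not the definitive one:

```python
# Sketch of a per-pixel YUV-to-RGB conversion a shader would perform
# when sampling luma (R) and interleaved chroma (RG) planes directly.
# BT.601 full-range coefficients; other video formats use other matrices.

def yuv_to_rgb(y, u, v):
    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    clamp = lambda x: max(0, min(255, round(x)))
    return clamp(r), clamp(g), clamp(b)

print(yuv_to_rgb(255, 128, 128))  # white: (255, 255, 255)
print(yuv_to_rgb(0, 128, 128))    # black: (0, 0, 0)
```

Doing this in the fragment shader is what makes the zero-copy two-plane path viable: the GPU absorbs the color conversion instead of the CPU.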
And on the AV Foundation-OpenGL side, we added a new API called CVOpenGLESTextureCache -- actually, there are two APIs within AV Foundation that help you implement this change. Let's look at the R and RG textures. There's the APPLE texture RG extension. It adds the one-component red texture type, which is basically for luma processing: if you were using a single plane from a CVImageBuffer before, you had to deal with packing and unpacking and those kinds of operations, and this one helps with that. And the two-component, red-green type is for interleaved chroma. They can be used both for texturing -- getting the data into OpenGL ES -- and for rendering, so you can get OpenGL ES data out in two-plane format and then go to AV Foundation to do, for example, encoding. There's a hardware requirement to implement this, and it's only available on iPad 2 right now. Here's an example of using the CVOpenGLESTextureCache. I'm assuming we're using ARGB data here. Basically, what you do is create a texture cache.
Behind the scenes, AV Foundation creates a buffer pool, adds buffers to that pool as they come in, and does the lifetime management and such. So you get your CVImageBuffer from your capture session, in this example, and then you give it to OpenGL by first creating a texture out of that CVImageBuffer, calling CVOpenGLESTextureCacheCreateTextureFromImage -- a long API name. At this point, you can see there's no call to glTexImage2D. There's no copying going on: we took your data that was already stored in the CVImageBuffer and told OpenGL that it can use that data directly. There's no need for copying. You need to bind that texture to use it for your rendering, and you'll probably also want to set some of the texture parameters, like wrapping properties and such, so you need to get the name. You can use the CVOpenGLESTextureGetName API for that. And once you're done, you can use it and render with the texture.
I have a small demo of this as well. This is the glTexImage2D path, without the texture cache. When I move the camera, it's a little choppy, because it's not running at 60 -- it's trying to run at 60, but it's running at 48, because there are a lot of copies going on. When we turn the texture cache on, it's much smoother, and it says 60. This is basically the video data, as a texture, put on a highly tessellated quad, with a ripple effect that's changing the vertex positions. I'm sure you're going to use it for something much more interesting, like augmented reality maybe, or doing barcode scanning, or getting your OpenGL images out and encoding them on a video encoder. Okay, that was the AV Foundation-OpenGL ES interaction.
The second category we identified was enhancing image quality, and here we implemented the infrastructure for high dynamic range rendering. High dynamic range rendering is widely used in modern games. It uses a render target that is high dynamic range, which allows you to capture a much wider brightness range within the same render target: you can have much darker and much brighter areas within the same scene, and you can adjust your exposure to choose which part of that range you want to capture. The display is still low dynamic range -- it can display only 8 bits per color channel. So what you do is render into a high dynamic range buffer, then read it back and do tone mapping -- a very well-known technique -- and you end up with the wider brightness range you captured, which you can then send to the display or do something else with. High dynamic range rendering enables effects like going from indoors to outdoors: your eye has to adjust to the exposure difference, so you don't see anything at first, and then you slowly start seeing the detail. You can capture those kinds of effects with high dynamic range rendering. You can have very nice blooms when you have HDR rendering capabilities. And since it's a deeper buffer, you can do accumulation and have better motion blurs and things like that. It's a 16-bit float buffer, so you can even do scientific computation if that's what you need to do. The extension that enables this is the Apple color buffer half float extension. It adds 16-bit float render targets. Last year, we had 16-bit and 32-bit float texturing, and this year, we're adding the render targets as well.
And it can be a one-component, two-component, three-component, or four-component render target. The regular clamping restrictions for render targets are loosened, basically relaxed, for this extension; you should check out the extension spec. Again, it requires a feature that's only available on iPad 2, so it's an iPad 2-only feature. And multisampling is not supported with half-float render targets.
I have a demo for this one. So here we have basically a skybox that is high dynamic range, a 16-bit float skybox. And so the first effect I talked about was how you go from indoors to outdoors and come back, and when you go out, the exposure changes significantly, and your eyes have to adjust to it. So here's an example of that. So... As you can see, this is possible to implement with high dynamic range targets. Going back.
I hope it's visible to you as much as it's visible to me over here. So the other thing we talked about was -- basically, let me change the exposure. So you can see, you can actually set your target exposure point anywhere, and you get whatever part of the exposure range you want to capture. And this image in the back actually has a very high exposure range; it's, I think, nine different exposure levels that the image was taken with. Okay. The other thing I talked about is that you can change the bloom in the scene.
You can see, if I turn off the bloom, the light doesn't really interact with our semi-transparent bunny, which refracts and reflects light, but there's no bloom around it. In real life, when you have an object with a strong light behind it, you have this light coming around that object, surrounding it. So you can capture those kinds of effects with high dynamic range, and you can go crazy with it if you want, as you can see. Let's put it back to some sane point. And I talked about tone mapping. So tone mapping takes a 16-bit value and maps it into an 8-bit value in a smart way. The not-so-smart way of doing it is basically just mapping and clamping it -- just mapping the lower dynamic range and clamping the rest. And if I turn off the tone mapping here, you can see our mapping is just basically clamping the values to the zero-to-one range.
So in the overexposed parts you cannot even tell the detail. And if you go to the darker part, without tone mapping, just this part shows up, but the overexposed parts do not show up. With tone mapping, basically, within the same render target, you can capture both kinds of exposure levels. Okay, that was the demo for high dynamic range rendering.
OK, another image quality enhancement we added in iOS 5 is improving the quality of the dynamic shadows you can have in your games. So here's a real-life example. There's a big window, and the sunlight, which is a planar light in this case, is being occluded by the big columns within the window. And you can see the light and the shadows. And there's a part that doesn't get any light at all, except for the global ambient light. It's called the umbra, basically; the column entirely blocks the sunlight for that part.
So that's easy to capture with the shadow mapping technique in your games. But there's this part where part of the sunlight is blocked, but since it's not an exact point light, some other part of the light is leaking in and creating this soft edge on the shadow. So this new extension that we have, called Apple Shadow Samplers, enables us and you to simulate those kinds of softer edges on the shadows. It uses a technique called percentage closer filtering. So if you think about shadow mapping: shadow mapping is capturing the scene from the perspective of the light, and then, when you're rendering from the perspective of the viewer, projecting that onto the view, comparing the depths, and finding out whether that pixel on the screen is seeing any contribution from that light or not. But it is a binary test when you're doing it on a per-pixel basis. So percentage closer filtering basically moves it from a binary test to a four-sample test that allows you to generate any value between zero and one. Basically, you do four depth texture tests, compare, and return a weighted average. And that tells you how much light contribution you got from that light, which also tells you how much shadow that pixel gets. And again, it's a feature that's only available on iPad 2. So here's an example where the depth texture from the shadow map is mapped onto the depth values from the viewer's perspective. And you can see on the right side, basically, there's no contribution, there's no light leaking in from that light for that particular pixel. And the blue one over here is a pixel whose projection onto the shadow map corresponds to four values from that shadow map, and you can see that it is partially getting light and partially not, so the light will be attenuated in this case.
Okay, so we added a new GLSL define called GL_APPLE_shadow_samplers, a new sampler type, sampler2DShadow, and two new GLSL functions, shadow2DAPPLE and shadow2DProjAPPLE.
And here's a fragment shader example. So what we're doing here is: our sampler2DShadow is getting the depth texture from the perspective of the light, and we're also passing an attenuation factor from our vertex shader. And in the first line, we're basically comparing, using the light's texture coordinates, how much of it is passing these comparisons -- these four samples that are taken. And then, using that value, we're finding out how much light contribution we're getting. And the final fragment color is the original color multiplied with that light contribution. Now, this gives you soft shadows, and if you have significant compute bandwidth available to you, if you're not doing anything else, you can actually jitter the texture coordinates of the shadow map, and you can do multiple tests and get a larger coverage area.
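The weighted-average comparison at the heart of percentage closer filtering can be sketched on the CPU. This is a sketch, assuming a plain 2x2 tap with equal weights; in the actual extension the four depth comparisons and the average happen in hardware when you sample a sampler2DShadow, and the function name here is illustrative:

```c
/* CPU sketch of a four-sample percentage closer filter.
 * tap[] holds the four neighboring depths read from the shadow map;
 * fragment_depth is this fragment's depth from the light's view. */
static float pcf4(const float tap[4], float fragment_depth)
{
    float lit = 0.0f;
    for (int i = 0; i < 4; i++) {
        /* the binary shadow-map test, per tap: is the fragment at
         * least as close to the light as the stored occluder? */
        if (fragment_depth <= tap[i])
            lit += 1.0f;
    }
    return lit / 4.0f;   /* 0.0 = fully shadowed .. 1.0 = fully lit */
}
```

When all four taps agree you get 0 or 1, exactly the hard-shadow case; when they disagree you get an intermediate value, which is what produces the soft edge.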
Okay, we have a demo for this one as well. So, let's start with no shadows. So, this is an interesting demo. It basically uses a technique known as light prepass, a deferred lighting technique. It's doing four passes, one shadow pass, and three rendering passes to end up with 64 point lights and one sunlight, and this huge model, normal maps, per-pixel lighting everywhere.
So, it's very interesting, and I think we're going to talk about this as well at the best practices session at, I think, 4:30. So I want to show you first the shadow map that I talked about, the depth values from the perspective of the light. And you can compare it to the depth values on the right here, from the perspective of the viewer. So these are mapped onto the one on the right to basically do the comparisons. And without shadows, our scene looks really nice, but it doesn't look realistic. When you add hard shadows, it becomes more realistic, but you can see on the edges of the columns that the shadows are per-pixel, and they show up as being per-pixel. And when it actually moves, it's more visible, because the per-pixel operation moves up and down and jitters. And with percentage closer filtering, it is much softer. And we're doing just four samples here, so you can turn it on and off a few times to see the details. Let me animate it. So this is all real-time lighting from both point lights and sunlight, with real-time shadows.
And the sun is gone now, so we don't have any shadows. When it comes back up, in just a few seconds, I think, then we can basically go to hard and soft shadows. Okay. So these are the things that you can do by taking advantage of our new extension. Okay.
So the third category we had is basically enhancing the performance of your games. And we'll start with the separate shader objects extension. The separate shader objects extension is trying to solve the problem of having multiple sets of shaders. In many games, you would have this case: multiple vertex shaders that are run with multiple fragment shaders and paired. If you have N vertex shaders that can be paired with M fragment shaders, you would have the cross pairing of all of them. An example would be: you have a vertex shader that just does transforms and another vertex shader that does transforms plus skeletal animation, and you have fragment shaders, one of them doing just texturing, another one doing texturing plus lighting, and a third one doing texturing plus lighting plus fog. Depending on the object type, you might use, say, the first vertex shader with the second fragment shader; all six combinations of them are possible. Actually, we've seen games that have like five vertex shaders that are being paired with ten fragment shaders. So in that case, you would basically end up compiling your vertex shaders 50 times, your fragment shaders 50 times, and linking them together 50 times.
So, Apple Separate Shader Objects tries to fix this problem, or help with it. It creates a new object type called the program pipeline object, and it gives you the possibility of a mix-and-match strategy. So what it does is, instead of having a vertex shader, you basically have a vertex program. For every stage of the pipeline -- and we have two stages, the vertex stage and the fragment stage -- you create a program for that stage, and then you pair them later. So if you have five vertex shaders and ten fragment shaders, it means that you need to create five vertex programs and ten fragment programs, and then you can pair them as much as you want. So instead of 50 compilations on the vertex side, 50 compilations on the fragment side, and 50 links, you basically do a lot less work on the CPU side.
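The arithmetic above can be written down as a toy calculation. The helper names here are illustrative, not part of any API; this just restates the N-times-M versus N-plus-M argument:

```c
/* With classic linked programs, each of the N*M vertex/fragment
 * pairings recompiles both of its stages: 2*N*M compilations
 * (plus N*M links, not counted here). */
static int linked_program_compiles(int n_vertex, int m_fragment)
{
    return 2 * n_vertex * m_fragment;
}

/* With separate shader objects, each stage program is compiled
 * and linked exactly once, and pairing later is free. */
static int separate_program_compiles(int n_vertex, int m_fragment)
{
    return n_vertex + m_fragment;
}
```

For the five-by-ten case from the talk, that's 100 compilations (50 vertex plus 50 fragment) down to 15.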
which means that you end up with shorter compilation and linking times. So you can take advantage of it in two different ways. If you were keeping the number of shaders lower than you wanted because your boot time or level load time was getting long, you can add those shaders back, keep the same amount of time on your level loading, and have a greater variety of shaders and better effects.
Or, alternatively, you can just take advantage of it and have faster boot times or faster level load times in your game. One disadvantage of this extension: when the compiler can see both the vertex stage and the fragment stage together, it can do a certain set of optimizations. It can do some dead code elimination.
It can do elimination of unused varyings; it doesn't go and interpolate varyings that are not used. So if your shader performance strictly depends on the kind of optimizations the compiler does when it can see both stages, you probably need to change your shaders, or think about how much gain you're getting from this. But it is very rare that you would be getting that kind of benefit from the compiler, so that is just something to know about. So here's an example. In our vertex program, we have a vertex shader just like you had on iOS 4. You would basically compile and link that vertex shader and create a vertex program. You would do the same for your fragment shaders and create fragment programs out of those as well. When it's time to use them, you can basically say, hey, I want to use this program for this stage and this other program for the second stage. There's no relinking or recompiling happening here; it's really fast. You can basically pair your first vertex program with the second fragment program here without causing any new compilations or links. Same here: you can pair them up in any way you want.
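The slide code isn't in the transcript, so here is a sketch of the call sequence he walks through, assuming the EXT-suffixed entry points this extension exposes on iOS. It needs a live OpenGL ES 2.0 context, and the shader source strings `vs_src` and `fs_src` are stand-ins that aren't shown:

```c
/* Sketch of the mix-and-match flow; assumes a live OpenGL ES 2.0
 * context and GLSL sources vs_src / fs_src (not shown). */
GLuint vert_prog = glCreateShaderProgramvEXT(GL_VERTEX_SHADER, 1, &vs_src);
GLuint frag_prog = glCreateShaderProgramvEXT(GL_FRAGMENT_SHADER, 1, &fs_src);

GLuint pipeline;
glGenProgramPipelinesEXT(1, &pipeline);
glBindProgramPipelineEXT(pipeline);

/* pair any vertex program with any fragment program -- no relink,
 * no recompile, just attaching stages to the pipeline object */
glUseProgramStagesEXT(pipeline, GL_VERTEX_SHADER_BIT_EXT, vert_prog);
glUseProgramStagesEXT(pipeline, GL_FRAGMENT_SHADER_BIT_EXT, frag_prog);
```

Swapping either stage later is just another glUseProgramStagesEXT call on the same pipeline, which is the attach-and-detach usage model he describes next.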
So it actually has two usage models. You can create a program pipeline object for each combination: you basically create your vertex programs and fragment programs and pair them together at the very beginning, instead of attaching and detaching them, which would be faster at runtime but a little slower at boot time. Or you can use the same program pipeline object, just like in my example, and attach and detach program stages on the fly, which would be a little slower at runtime, because you have to attach and detach them and there's some work going on on the OpenGL ES side, but you will have a faster boot time. So you can use either model, depending on what your needs are. If you're familiar with the ARB_separate_shader_objects extension and the ARB_explicit_attrib_location extension from the desktop, you already know how this thing works, because we basically took those two, removed some of the things that the ES constraints don't allow -- like the geometry stage, for example, or the tessellation stage -- and came up with this Apple Separate Shader Objects extension.
OK, that's separate shader objects. The final performance optimization extension is the occlusion query extension. So in your games, you probably have some visibility determination system to find out what needs to be drawn and what doesn't. Frustum culling is an example, and you might have something more complicated. But there are times you would have an object that's within your frustum but might be occluded entirely by another object in front of it. And so this extension, Apple Occlusion Query Boolean, basically allows you to do a binary test to find out if any of the samples from this object that might be occluded are reaching your render target. The advantage is, basically, if your object has tens of thousands of vertices, you use a bounding box instead, and you draw it without affecting the color and depth buffers. And then the next frame, you can check if it was actually drawn -- if any of the samples made it to the render target. If it did, it means that your object is now visible and you can render it. If it didn't, it means that it was actually entirely occluded by another object, so you didn't have to push all that data into the rendering system.
And you basically repeat the same query over and over every frame. The only minor disadvantage is that your object might end up on the screen one frame later. Again, it's an iPad 2-only feature due to the hardware requirements. So the benefit is that it enhances performance by avoiding large draw calls. Our GPUs are tile-based deferred renderers, so they basically don't have much overdraw. But if you're pushing 10,000 vertices and none of them are ending up in your render target, we are still running the vertex program on all of those vertices, and we're pushing data in and out for all of them. So you avoid doing that vertex processing; that's what you gain.
A code example here: this frame is our current frame. We set up a query by calling glBeginQuery. And basically we draw the bounding box here. If you have, say, a car, you probably have two rectangular boxes that will cover the car as your bounding box, and you're passing eight vertices each, instead of all the details of your car, into the scene to find out if any of the samples from the car are making it to the render target. And in the next frame, you basically check the results of your test. If the result is positive, it means that some of the samples from your car are showing up on the screen, so it's time to draw it.
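The two-frame pattern he describes can be sketched like this, assuming a live ES 2.0 context on iPad 2 and the EXT-suffixed entry points this extension ships with; `draw_bounding_box` and `draw_car` are stand-ins for your own draw calls:

```c
GLuint query;
glGenQueriesEXT(1, &query);

/* frame N: draw the cheap proxy without touching color or depth */
glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
glDepthMask(GL_FALSE);
glBeginQueryEXT(GL_ANY_SAMPLES_PASSED_EXT, query);
draw_bounding_box();                 /* eight vertices, not the whole car */
glEndQueryEXT(GL_ANY_SAMPLES_PASSED_EXT);
glDepthMask(GL_TRUE);
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);

/* frame N+1: read back LAST frame's result -- reading it in the
 * same frame forces the equivalent of a glFlush and stalls the pipe */
GLuint any_samples = 0;
glGetQueryObjectuivEXT(query, GL_QUERY_RESULT_EXT, &any_samples);
if (any_samples)
    draw_car();                      /* the box was visible, draw for real */
```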
One word of caution, though: if you set up a test and check the results within the same frame, the only way for the GL engine to find out what the results are is basically to do the equivalent of a glFlush. It has to render everything, get the results, and return them, which means that you created a pipeline stall. You basically lost all the performance enhancements that the pipelining of the CPU, GPU, and the different stages of the GPU gives you. So you should set up a query in one frame, but you should not check it in the same frame; check the result of the query in the next frame. So these are the new OpenGL ES extensions. I talked about texture RG, which is one- and two-component textures that work really well with the new AV Foundation-related API, the CVOpenGLESTextureCache. We talked about color buffer half float, 16-bit float render targets that you can use for high dynamic range rendering, motion blur, and scientific computations. Shadow samplers: sampling of your shadows so that you can have soft edges. Separate shader objects gives you the possibility of keeping your shader stages independent, and gives you better compilation times. And occlusion query boolean gives you a visibility test.
And we have two debugging-related extensions, Apple debug label and debug marker, that you add to your code, and they are basically used by the OpenGL ES frame debugger, which we're going to talk about in the next session after this one. So use them, take advantage of them, and get a lot of help out of that in your debugging. We have three more here that we didn't talk about. Element index uint adds 32-bit indices to the API. In the past, we supported 8- and 16-bit indices, so if you had a mesh with more than 64K vertices, you had to partition it into multiple draw calls. You don't have to anymore; you can go up to, I think, 4 billion. And when we added float textures to the system last year in iOS 4, we only supported nearest sampling on those. This year we're adding linear sampling as well, so you can actually use linear sampling when you bring a float texture into your rendering operation. So there are two related sessions today. In the next one, we're going to talk about the tools that you can use. Primarily, we're introducing a new tool, the OpenGL ES frame debugger. It's an awesome tool.
We actually used it for our demos, to find issues and fix issues through that tool. So you should definitely go attend that session. And then we're going to talk about best practices for OpenGL ES on iOS: things like how you can do multi-threading and improve the performance of your game or graphics application by offloading some of the operations onto the second CPU. Actually, multi-threading helps even on a single CPU, by using the CPU's cache better and basically separating your blocking points within the pipeline. We're going to talk about view controllers and how to use them, and also, in that session, how to scale your games to take advantage of iPad 2 -- what you can do and how you can do it. Allan Schaffer is our graphics and game technologies evangelist. We have great documentation; we worked with the documentation team to give you the information about best practices and such. Thanks for coming to the session.