Graphics and Games • OS X • 57:32
OpenGL is the foundation for GPU-accelerated graphics on OS X, enabling a broad range of applications including games, animation software, and imaging solutions. See how your apps can deliver incredible visuals and high performance using the OpenGL 4.1 Core Profile. Learn how to take advantage of multiple GPUs and access the computational capabilities of OpenCL.
Speaker: Chris Niederauer
Transcript
[Applause] Welcome, hello. So I'm Chris Niederauer. I work on the GPU Software Team and I'm here today to talk about what's new in OpenGL for OS X. So during this talk I'm going to go into a feature support update -- what's new in OS X Mavericks in particular.
And then after that I'm going to go into a few of the key features that we think you're probably going to want to be using in your applications. After that we're going to talk about doing compute alongside OpenGL using OpenCL. And then finally, towards the end, I'm going to do a quick run-through of getting your applications onto the OpenGL Core Profile so that you can take advantage of all these new features.
So let's start with the features. In OpenGL, as of Lion, we've had support for Core Profile, and this has a bunch of features that are pretty much guaranteed if you ask for a Core Profile context: framebuffer objects, vertex array objects, the usual. New in Mavericks, we've got a whole bunch of new extensions like texture swizzle, sampler objects, and texture storage, and then additionally on modern GPUs we have support for tessellation shaders and all the other OpenGL 4.1 features, plus a little bit more. So that's an update on where we are. And these are some of the features I'm going to go into -- I'll explain how you can get your applications to take advantage of them. So let's start with probably the big new feature in Mavericks: tessellation shaders.
So what tessellation shaders do is they allow you to use the GPU to generate geometry -- to specify how coarse or how fine you want your geometry to be -- and you generate it on the fly where you need it to be, where you want to spend your vertices. It's defined using shaders, and the benefit is that it allows you to dynamically increase your polygon density. So you're able to decrease your vertex bandwidth because you're only uploading a coarse mesh.
And then you get to decide how to refine that mesh. It's used often for displacement mapping, terrain rendering, and higher-order surfaces like NURBS. And it's available on the modern GPUs, so all the hardware we ship today. So you'll need to make sure to check the extension string using glGetStringi for GL_ARB_tessellation_shader.
So we have here an application that's out on the Mac called Unigine Heaven, and it's actually been updated to take advantage of tessellation shaders. So we have sort of a before and after. The stairs are just a flat polygon there, and on the dragon's neck there are little points where there should be spikes but there are no spikes there. With tessellation shaders you see how it generates geometry: it uses a displacement map, it creates spikes on the dragon, and the stairs suddenly become actually stair shaped.
So to get a little bit into -- I'm going to describe how tessellation shaders work. So I also want to give this example screenshot where you can see that the application is choosing to dynamically tessellate this geometry based on the distance to the camera. So you're using the vertex bandwidth only where you need to, you know, only where you think this thing needs it. So closer objects you'll probably tessellate a lot more than objects in the distance.
So we start with a new primitive type called a patch. And we set up our patches -- we say how many vertices there are going to be in each patch. So in this case, with a triangle patch, we have 3 vertices, and then when we draw, we call DrawArrays.
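As a rough sketch of that setup in C -- assuming a Core Profile context where GL_ARB_tessellation_shader is available, with placeholder names like patchVAO and patchVertexCount:

#include <OpenGL/gl3.h>

// Each patch in the vertex stream has 3 control points.
glPatchParameteri(GL_PATCH_VERTICES, 3);

// Draw the coarse mesh as patches; the tessellation control and
// evaluation shaders in the bound program decide how finely to refine it.
glBindVertexArray(patchVAO);                    // placeholder VAO holding the control points
glDrawArrays(GL_PATCHES, 0, patchVertexCount);  // placeholder vertex count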
So in the shaders there are two parts. In the first part of the shader we're going to control how tessellated we want it to be. So first, in our control shader, we're going to be setting the outer levels, and we get to pick, per side of this triangle, how tessellated we want it to be. And so we see that on the left side of the triangle we did a little bit more tessellation than we did on the bottom and the right. Additionally you get to control the inner tessellation levels. So we're adding some geometry there.
And then once we have this data, it gets output to an evaluation shader, and that shader gets the control points that you originally had, and we evaluate where those positions should be. So we have tessellation coordinates, and in this case, because we had a triangle, we have three different TessCoords, and you can see how it's barycentric -- each of those is weighted towards the original control points. And so using this in our evaluation shader we can now figure out where those points should be and push them out with a displacement map, or do whatever you want at that point.
So the OpenGL 4 Pipeline; you're probably aware of what it looks like before. We have vertex shaders on the left, fragment shaders on the right and then tessellation and geometry shaders are both optional. So tessellation fits right in the middle there and it's actually made out of two different shaders that are -- that I will go into. So as I was saying there's a control shader and an evaluation shader so you get to control how tessellated it is and then you evaluate where each of those vertices should be put.
So the control shader. It takes as inputs the control points from the patch -- basically the array of control points from the original patch. And the outputs set how much we should tessellate each of the edges, and then we also have inner tessellation levels as well.
And a tip for when you do have patches that are touching each other. You want to make sure that you have the same amount of tessellation on those touching edges. So let's get onto looking at an actual control shader with this triangle example. We have -- we set up the layout with the vertices saying that there's three control points per patch, and we're going to have our input vertex position from the original patch points and we're going to output control positions.
So first, for every input, we're going to copy the vertex position to the control position. And then additionally, once per patch -- this InvocationID check is how we make sure we only do this once per patch -- we're going to calculate what those tessellation outer levels should be, and it will generate TessCoords from there.
And then the evaluation shader. That takes the output from the control shader and evaluates where it should be, to pass on to the geometry or fragment stage. So it takes the original patch and the tessellation coordinates, sort of weighting towards each of those original patch coordinates. And then it outputs your position, your texture coordinates, and any other attributes that you may have.
So here is an example with a triangle evaluation shader. We've got the control points as input and we've also got a model view projection matrix. And basically we're treating the TessCoords -- we have three TessCoord inputs which are barycentric weights towards each of those original control points. And so we're doing a barycentric multiply here to get the output of where those should be. So doing just this simple math here gets us the evenly distributed points, as we specified in our control shader.
And then after we've done that, here we have a model view projection matrix multiplying by our position that we calculated, passed into a custom function which is doing our displacement. So we started with our triangle patch, just three points originally. We controlled how tessellated it was, and then we evaluated exactly where those positions should be to pass on to the fragment shader.
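To make the two stages concrete, here is a minimal sketch of what a triangle control and evaluation shader pair along these lines might look like, written here as GLSL source strings in C. The fixed level values, the attribute names, and the omitted displacement call are placeholders, not the shaders from the demo:

// Hypothetical GLSL 4.1 tessellation shaders, stored as C strings.
// The vertex shader is assumed to declare: out vec4 vertPosition;
static const char *kControlShader =
    "#version 410 core\n"
    "layout(vertices = 3) out;\n"                        // 3 control points per output patch
    "in  vec4 vertPosition[];\n"
    "out vec4 controlPosition[];\n"
    "void main() {\n"
    "    controlPosition[gl_InvocationID] = vertPosition[gl_InvocationID];\n"
    "    if (gl_InvocationID == 0) {\n"                  // set the levels once per patch
    "        gl_TessLevelOuter[0] = 4.0;\n"
    "        gl_TessLevelOuter[1] = 4.0;\n"
    "        gl_TessLevelOuter[2] = 4.0;\n"
    "        gl_TessLevelInner[0] = 2.0;\n"
    "    }\n"
    "}\n";

static const char *kEvalShader =
    "#version 410 core\n"
    "layout(triangles, equal_spacing, ccw) in;\n"
    "in  vec4 controlPosition[];\n"
    "uniform mat4 modelViewProjection;\n"
    "void main() {\n"
    "    vec4 p = gl_TessCoord.x * controlPosition[0]\n"  // barycentric mix of the control points
    "           + gl_TessCoord.y * controlPosition[1]\n"
    "           + gl_TessCoord.z * controlPosition[2];\n"
    "    gl_Position = modelViewProjection * p;\n"        // a displacement lookup would go here
    "}\n";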
The quad, just as a further example -- the control shader is very similar to the triangle one, except this time we're specifying the vertices as four. We also call glPatchParameteri to set the number of control points to four. And then similarly we're passing through the original vertex positions as the new control positions and then calculating our tessellation levels. And in this case we have more inner levels and outer levels to calculate because we have four sides to the quad, and additionally for the inner levels we're controlling how it should be split horizontally and vertically.
Then in the evaluation shader taking the control points in again, there's four of them this time, and because we have a quad we're actually able to treat those barycentric weights as just UV coordinates basically within our quad. So we just -- we're just doing a simple mix here to figure out what those positions should be for each of the points. And then again we pass that calculated position into the custom displacement map to get our position out.
So with our quad we started with just the four points. We tessellated it and then evaluated where each of those positions should be. So in summary, tessellation shaders allow you to add detail where you need it in your scene, and you can start with triangles, quads, arbitrary geometry -- even isolines.
And it generates this data on the GPU instead of you having to submit it. So you might have a low-resolution model that you only tessellate as it gets closer to the camera, or just simply add displacement or extra geometry to make a character realistic in your scenes.
And again, it's available on modern hardware, so you'll need to check for the existence of GL_ARB_tessellation_shader with glGetStringi, and be sure to match your outer levels on touching patches, because otherwise you may have cracks in between them. So another feature I want to go over is instancing.
And instancing is allowing you to draw many instances of an object with only a single draw call. And it's a big performance boost because each draw call does take a little bit of overhead in order to get that to the GPU. So instead we're just passing it all down at once and allowing the GPU to do that work as a single draw call.
And so each of these instances can have their own unique parameters -- the offset of one instance from another, colors, skeletal attributes -- and you define all these in external buffers. And this is actually guaranteed: when you ask for a Core Profile on OS X you have support for this extension. And there's two forms of instancing.
There's instanced arrays, using ARB_instanced_arrays, where you get to have a divisor that says how often your attributes should advance -- per instance rather than per vertex. So if you wanted to submit an attribute for every one instance, you would pass a vertex attrib divisor of one.
If you wanted a different attribute every two instances, two, and so forth. Also, draw instanced provides a shader variable, the instance ID. So from your vertex shader you can see which instance you are in, and you can do an offset into a buffer, for instance, based on that instance ID.
And I'll go into some examples of doing that in a short bit. But I'm not actually going to go into this too deeply right now, because we actually announced support for both of these features in iOS 7, and my colleague Dan gave a talk this morning, Advances in OpenGL ES, where he went into depth on instancing.
So that was instancing. Let's go on to how we pass up some data. Uniform buffer objects are one way to get data up to the shaders. Basically it's a buffer object to store your uniform data. It's faster than using glUniform, you can share a single uniform buffer object among different GLSL shaders, and you can quickly switch between sets.
If you have some prebaked sets you can just choose on the fly which one you're using. And additionally because it's a buffer object as all buffer objects you can generate your data on the GPU and use the output from that in your UBO, in the shader without having to do a read back.
And it's used for skinning, character animation, instancing -- pretty much whatever you want to use it for. So we have a shader example here. What a uniform buffer object is, basically, is a C-struct-like interface defined in the shader. We have here the layout qualifier std140, which defines how our vectors and variables are packed. And we've called this uniform block MyUBO. And this MyUBO, which we've named MyBlock underneath, we're able to access from our shader similar to how a C struct would work.
Pretty straightforward. And for setting that up in the API, we would have just created a buffer object of type uniform buffer and set up its size, and then we have to get the location. Instead of glGetUniformLocation we're calling glGetUniformBlockIndex for this UBO structure of type MyUBO. And then we just bind that to one of our binding indexes so that the data is provided to the proper UBO in the shader.
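As a hedged sketch of that API setup in C -- the block name MyUBO follows the shader above, while MyUBOData and myUniforms are placeholder host-side names:

// Hypothetical setup for the MyUBO block described above.
GLuint ubo;
glGenBuffers(1, &ubo);
glBindBuffer(GL_UNIFORM_BUFFER, ubo);
glBufferData(GL_UNIFORM_BUFFER, sizeof(MyUBOData), NULL, GL_DYNAMIC_DRAW);  // MyUBOData is a matching C struct

// Find the block in the program and tie it to binding point 0.
GLuint blockIndex = glGetUniformBlockIndex(program, "MyUBO");
glUniformBlockBinding(program, blockIndex, 0);
glBindBufferBase(GL_UNIFORM_BUFFER, 0, ubo);

// Later, update the whole block at once (or orphan/double-buffer as discussed below).
glBufferSubData(GL_UNIFORM_BUFFER, 0, sizeof(MyUBOData), &myUniforms);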
So in summary, you can upload all of your uniform values at once. However, you want to make sure that you're not modifying your UBOs every single time. So if you have, for instance, some variables that you're updating a lot more often than other variables, you should split those into two UBOs. So one of your UBOs may be static, and another may be updated once a frame.
And then additionally, if you are trying to modify the data even more often than that, you could orphan your buffer objects by calling glBufferData with a NULL pointer, or you could double buffer or triple buffer et cetera your UBOs to ensure that you're able to pass data to the GPU and still modify something on the CPU to pass to the next call that the GPU will execute. And one key point I want to make note of is that each of the UBOs is limited to a size of about 64KB.
So as an alternative to UBOs there's also texture buffer objects. A texture buffer object is a buffer object that allows you to store a 1D array of data as texels. And again, like all buffer objects, it gives you access to GPU-generated data, and beyond UBOs it also gives you access to a large amount of data within a shader -- so you can effectively have a very large buffer. And additionally it takes advantage of the GPU's texture cache. So just like UBOs, TBOs are also used for skinning, character animation, instancing, whatever you want to be using them for, really.
And here's an example shader that's using a TBO. So we basically have a new sampler type. Here we have samplerBuffer; there's also isamplerBuffer and usamplerBuffer. And we're naming our texture buffer object reference myTBO. And it's as simple as just doing a texelFetch from it, passing our offset into there.
And since this really is just raw data, I modified the shader now to read back four values from texelFetch, and as a result, with a single texture object, I'm able to actually read back the equivalent of what I would have had to do with four vertex attributes before.
So in the API, to set this up we're just making a buffer object again, as you normally would. And then we set it up similar to a texture, where we're going to attach that buffer object to the texture object using glTexBuffer, and we're specifying here that it's of type GL_RGBA32F.
But you can use whatever format you want that's supported by textures. And then finally we do glGetUniformLocation for it and set it just like a texture sampler would be set up. So it gives you access to a lot more data than you would have gotten with a UBO -- 64MB or more.
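A minimal sketch of that setup in C, assuming placeholder names like data and dataSize, and the myTBO sampler from the shader above:

// Hypothetical setup matching the myTBO samplerBuffer described above.
GLuint tbo, tboTexture;
glGenBuffers(1, &tbo);
glBindBuffer(GL_TEXTURE_BUFFER, tbo);
glBufferData(GL_TEXTURE_BUFFER, dataSize, data, GL_STATIC_DRAW);   // raw vec4 data

glGenTextures(1, &tboTexture);
glBindTexture(GL_TEXTURE_BUFFER, tboTexture);
glTexBuffer(GL_TEXTURE_BUFFER, GL_RGBA32F, tbo);                   // interpret the buffer as RGBA32F texels

// Bind it like any other sampler uniform.
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_BUFFER, tboTexture);
glUniform1i(glGetUniformLocation(program, "myTBO"), 0);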
And it's very useful for instancing, where you do have a lot of data that you need to pass down that you may not have had enough vertex attributes for, or any complicated things like that. And again, just like UBOs and buffer objects in general, try not to modify a TBO while it's being used to draw on the GPU. So again, double buffering or orphaning those TBOs will help you ensure you're not stalling your CPU waiting for the GPU to complete its work.
So another feature that's new in Mavericks is draw indirect. And draw indirect allows you to specify the arguments to your draw calls from a buffer object. So for instance, normally a DrawArrays will take a count, and then there's DrawArraysInstanced where you have an instanceCount, and you also specify first and baseVertex. I'll show you an example in a second.
So when you've generated data, for instance with OpenCL, you no longer need to know how many vertices you may have generated there. In those cases you'll be able to bind the buffer object as a GL_DRAW_INDIRECT_BUFFER and avoid the round trip of having to read back what those variables should have been that you were going to pass into your draw call. And this is similar to tessellation shaders -- it's available on all modern hardware -- so check for the extension string with glGetStringi, looking for GL_ARB_draw_indirect.
So here's an example of using glDrawArraysIndirect. First I've got a comment at the top showing the general structure of what I need to be outputting into this buffer object to ensure that the GPU is able to know what those arguments are. So we're going to match this: it's got a count, an instanceCount, and a first, because we're going to be calling DrawArraysInstanced, and then finally a reservedMustBeZero variable after that.
So first we generate our data with OpenCL and write into our indirect buffer object. And then in OpenGL we're going to bind to that indirect buffer object, and then we're also going to set up all our vertex attribute pointers with our BindVertexArray call.
And then finally we just call draw indirect. It still takes a mode, so you're still saying GL_TRIANGLES, GL_POINTS, whatever you're using. And then for the indirect offset, you pass in an offset into your indirect buffer object -- so that's where you put that data: the count, the instanceCount, and the first.
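Here is a hedged sketch in C of that structure and call; indirectBuffer and vao are placeholder names, and in this example the command is assumed to sit at offset zero:

// Layout that gets written into the GL_DRAW_INDIRECT_BUFFER.
typedef struct {
    GLuint count;
    GLuint instanceCount;
    GLuint first;
    GLuint reservedMustBeZero;
} DrawArraysIndirectCommand;

// OpenCL (or anything else) has already written one command at offset 0.
glBindBuffer(GL_DRAW_INDIRECT_BUFFER, indirectBuffer);   // placeholder buffer name
glBindVertexArray(vao);                                  // vertex attribute setup
glDrawArraysIndirect(GL_TRIANGLES, (const GLvoid *)0);   // offset into the indirect buffer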
DrawElements is very similar. I've highlighted what's different here: we've got a firstIndex instead of a first, and then additionally we have a baseVertex offset for each of the elements in your element array. And then -- so we've created our data using OpenCL and OpenGL, and we then bind the location of that indirect buffer, the buffer we were writing into.
Again we bind the vertex array object so we have all the vertex attributes set up, and then we additionally have to set our buffer binding for where the elements are going to be read from, so we're also doing that binding. And then finally we call DrawElementsIndirect. Instead of passing down the count, the instanceCount, the baseVertex, and the firstIndex -- that's all being read out of your indirect buffer object -- we only have to pass down the mode, like GL_TRIANGLES again, and the element type, saying that our elements in the element array buffer are of type unsigned int, unsigned short, whatever you may want to be using there.
So let's go over just a few more extensions. We've got some new extensions here. Separate shader objects are available on all hardware on Mavericks. It enables you to mix and match your GLSL shaders. So if you have one vertex shader that you're using with five fragment shaders, you no longer have to recompile that vertex shader five times -- you can link really quickly. You don't have to redo that work for that vertex shader five times in order to use it with five fragment shaders. Instead you just create that program once and then link it with all five of those shaders pretty easily.
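A minimal sketch of what that looks like with program pipelines, assuming placeholder source strings vertexSource and fragmentSourceA:

// Hypothetical sketch: one vertex program reused with several fragment programs.
GLuint vertProg  = glCreateShaderProgramv(GL_VERTEX_SHADER,   1, &vertexSource);
GLuint fragProgA = glCreateShaderProgramv(GL_FRAGMENT_SHADER, 1, &fragmentSourceA);

GLuint pipeline;
glGenProgramPipelines(1, &pipeline);
glBindProgramPipeline(pipeline);

// Mix and match stages without relinking the vertex program.
glUseProgramStages(pipeline, GL_VERTEX_SHADER_BIT,   vertProg);
glUseProgramStages(pipeline, GL_FRAGMENT_SHADER_BIT, fragProgA);  // swap in another fragment program later with the same call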
Additionally we've got ES2 compatibility and this is probably interesting if you have an iOS application that you're trying to port to OS X. It allows you to use version 100 of GLSL on the desktop. So on OS X. However, you are limited to the functionality that GLSL 100 specifies. So you aren't able for instance to take advantage of tessellation shaders if you'll be using this version of GLSL.
Another nifty new extension in Mavericks is NV_texture_barrier, and it allows you to bind the same texture as both the render target and the texture source. It's similar to shader framebuffer fetch on iOS, where you can do programmable blending. However, this is limited to cases where there is no overlap within a single draw call -- where the depth complexity of your scene is one.
So basically it saves you a little bit of VRAM by not having to create a copy of your texture if you're trying to ping-pong back and forth between two buffers. You might be able to update your application to just render right back into itself and use itself as a texture source. And then additionally, we actually added texture swizzle in 10.8.3, and this is a handy extension for supporting older applications which might have been using a format like GL_LUMINANCE.
And so instead of having to modify all of your shaders to be able to take in RGBA data and LUMINANCE data, you can specify up front that LUMINANCE should be interpreted as red, red, red, one. Or LUMINANCE_ALPHA would be red, red, red, green -- a red-green texture pretending to be LUMINANCE_ALPHA.
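As a small sketch of that swizzle in C -- assuming a single-channel texture named luminanceTexture stored with a red-only internal format:

// Hypothetical sketch: make a single-channel texture behave like legacy GL_LUMINANCE.
glBindTexture(GL_TEXTURE_2D, luminanceTexture);   // stored as GL_RED internally
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_R, GL_RED);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_G, GL_RED);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_B, GL_RED);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_A, GL_ONE);   // LUMINANCE_ALPHA would use GL_GREEN here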
So those are some of the features in Mavericks that I wanted to go over, and now I want to go into using OpenCL for compute with OpenGL. So OpenGL and OpenCL on our platform were created together and use the same infrastructure in order to talk to the GPUs.
And as a result of that cooperation you're able to share things like buffer objects and textures between OpenGL and OpenCL. There's no need to read data back and then upload it again just to switch between APIs. It's a very simple integration into your render loop, and I'm going to go into that right now.
So, some of the use cases for compute. We may use OpenCL here, for instance, to generate or modify geometry data. So I'm going to go into generating a teapot in OpenCL and drawing that in OpenGL. After that I'm also going to go into post-processing an image that you may have generated with OpenGL and then displaying that.
So first, starting out with filling up a VBO with vertex data using OpenCL and then rendering that with OpenGL. We're going to have our one-time setup of setting up our OpenGL and OpenCL contexts to share. Then we create that vertex buffer object in OpenGL, and we'll specify that we can share that object with OpenCL in order to fill it in there. And then every frame we enqueue our CL commands in order to create the data that goes into that VBO.
Then after that we'll flush the CL enqueued commands to ensure that when we use that data with OpenGL it has been pushed to the GPU to be filled in -- so you're not using data that has not yet been generated.
So after that's been done you just draw it with OpenGL. So let's start out looking a little bit closer at that. So first we're going to create our OpenGL context and we're setting up the pixel format for NSOpenGL and here we're adding a new PFA, pixel format attribute of NSOpenGLPFAAcceleratedCompute. And that specifies that I want to have an OpenGL context that is capable of accessing the GPUs that have OpenCL support.
Then after I've created my context, I'm going to get the share group from that context, and with that share group I can use the CL APIs to get the matching CL device IDs that line up with my OpenGL virtual screen list, so I'll be able to share between these two contexts.
And then I'll create my CL context from that device list, and back in OpenGL we're creating a vertex buffer object here to fill in with the teapot. We specify the size, we flush that to the GPU, and then we just use this API, clCreateFromGLBuffer, and now we have a CL mem object that points exactly to where that VBO in OpenGL is.
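Here is a hedged sketch in C of that one-time setup -- the buffer size is a placeholder, and it assumes the OpenGL context was created with NSOpenGLPFAAcceleratedCompute as described above:

#include <OpenGL/OpenGL.h>
#include <OpenGL/gl3.h>
#include <OpenCL/opencl.h>

// Get the CGL share group from the current GL context
// (from Objective-C you could use [myNSOpenGLContext CGLContextObj] instead).
CGLContextObj    cglContext = CGLGetCurrentContext();
CGLShareGroupObj shareGroup = CGLGetShareGroup(cglContext);

// Create a CL context that shares with that GL share group.
cl_context_properties props[] = {
    CL_CONTEXT_PROPERTY_USE_CGL_SHAREGROUP_APPLE,
    (cl_context_properties)shareGroup,
    0
};
cl_int err = 0;
cl_context clContext = clCreateContext(props, 0, NULL, NULL, NULL, &err);

// Create the VBO in OpenGL, flush, then wrap it as a CL mem object.
GLuint vbo;
glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, vboSize, NULL, GL_STREAM_DRAW);   // vboSize is a placeholder
glFlush();
cl_mem vboMem = clCreateFromGLBuffer(clContext, CL_MEM_WRITE_ONLY, vbo, &err);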
So that's a one-time setup. And then every time we're going to be drawing, you can check CL_CGL_DEVICE_FOR_CURRENT_VIRTUAL_SCREEN_APPLE -- it didn't quite fit there. By looking that up you can see what virtual screen OpenGL is currently using to do its rendering.
So if you do have, say, multiple GPUs in your system -- in this case I'm using this query to check which GPU OpenGL is on, and I'm having OpenCL follow it so that it can do its computations and not have to copy any of the data that I'm going to be creating here back to the other GPU. So I've now picked the CL device ID that I want to submit my work to, and I enqueue an OpenCL kernel, and this OpenCL kernel will be generating our vertex data that we're going to then consume with OpenGL.
So after that's been done, we flush that to the GPU. The GPU starts creating this data for us, and I have a barrier here just to say that if you are doing some more complicated things -- if you're doing some threading and doing things in OpenGL and OpenCL that may not be interacting with each other -- you can continue doing some of that work in the in-between time, but you need to make sure that this flush here is done before we try to use that data in OpenGL. So we're using here the new function glDrawElementsIndirect, and that allowed us to draw this teapot without even knowing how many vertices were in the model. And so that's how you get a vertex buffer object from OpenCL.
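A minimal per-frame sketch of that flow in C, continuing from the setup above; the command queue, kernel, work size, and the indirect buffer the kernel writes into are placeholder names, and the acquire/release calls around the shared object follow the standard CL/GL interop pattern rather than the slides:

// Hypothetical per-frame flow: generate vertex data in CL, draw it in GL.
// queue was created earlier with clCreateCommandQueue on the chosen device.
clEnqueueAcquireGLObjects(queue, 1, &vboMem, 0, NULL, NULL);
clEnqueueNDRangeKernel(queue, generateTeapotKernel, 1, NULL,
                       &globalWorkSize, NULL, 0, NULL, NULL);
clEnqueueReleaseGLObjects(queue, 1, &vboMem, 0, NULL, NULL);
clFlush(queue);   // make sure the CL work is submitted before GL consumes the data

// The VAO also has the element array and attribute bindings set up;
// the kernel wrote the draw arguments into indirectBuffer.
glBindBuffer(GL_DRAW_INDIRECT_BUFFER, indirectBuffer);
glBindVertexArray(teapotVAO);
glDrawElementsIndirect(GL_TRIANGLES, GL_UNSIGNED_INT, (const GLvoid *)0);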
So the other side of the pipeline: you can do image processing with OpenCL as well. For this case we have a similar one-time setup. We're going to set up OpenGL and OpenCL to share just like we did before; however, this time we're going to create a texture object and share that between OpenGL and OpenCL.
And so every frame that we want to do this post-processing, we'll draw to that texture using OpenGL, flush OpenGL's command buffer to the GPU by calling glFlushRenderAPPLE, and then in OpenCL we can enqueue the commands to process that texture. And then finally, if you want to display that back on the screen, you flush the OpenCL commands and then you can blit or swap to the screen.
So here, again, we're picking a pixel format that has access to the GPUs which have compute capabilities with OpenCL. We get the share group, from which we're going to get CL device IDs that match up with the OpenGL context that we've created, and then we'll create a context from that. And this time we have a texture that we're going to be sharing with OpenCL, so we'll bind to that texture and set up how big it is using glTexImage2D. And then glFlushRenderAPPLE, after which you can create a CL mem object from that GL texture.
So now every frame we're going to be doing our normal drawing in OpenGL. We'll render to the texture -- say DrawElements and other OpenGL draw calls. We'll then flush that data to the GPU to start processing it. We happen to be drawing the teapot that we already generated with OpenCL here.
Again the barrier: only after that glFlushRenderAPPLE should you start doing your work in OpenCL that depends on the results of the draw calls that were modifying the texture that we're sharing. So here, again, we're going to check which virtual screen OpenGL is on and match the CL device ID there, so OpenCL is processing the data on the same GPU that OpenGL created it on.
And then we're going to enqueue our post-process kernel there and do our calculations, such as edge detection or blurring. And then finally, if we want to pass that data back to GL, we flush those results using clFlush, and in OpenGL we'll then be able to just bind to that texture and blit it to the screen if we want to.
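And a per-frame sketch of that texture path in C, again with placeholder names (sceneFBO, sharedTexture, postProcessKernel) and the usual acquire/release calls added around the shared texture:

// Hypothetical per-frame flow for the post-processing path.
glBindFramebuffer(GL_FRAMEBUFFER, sceneFBO);      // placeholder FBO with sharedTexture attached
/* ... normal OpenGL draw calls into the texture ... */
glFlushRenderAPPLE();                             // push the GL rendering to the GPU first

clEnqueueAcquireGLObjects(queue, 1, &textureMem, 0, NULL, NULL);
clEnqueueNDRangeKernel(queue, postProcessKernel, 2, NULL,
                       globalWorkSize2D, NULL, 0, NULL, NULL);
clEnqueueReleaseGLObjects(queue, 1, &textureMem, 0, NULL, NULL);
clFlush(queue);

// Back in OpenGL: draw or blit the processed texture to the screen.
glBindFramebuffer(GL_FRAMEBUFFER, 0);
glBindTexture(GL_TEXTURE_2D, sharedTexture);
/* ... draw a fullscreen quad, then swap buffers ... */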
So that's passing data back and forth between OpenGL and OpenCL. You get the best of both worlds: a graphics API with full compute capabilities, and sharing in between with very little cost. And additionally, I wanted to reiterate that if you are creating data but you don't know how big it is -- for vertex geometry -- then it's great to use with ARB_draw_indirect.
And after this talk we're going to be talking about OpenCL actually and how to use that a little bit more explicitly and even going into some OpenGL/OpenCL sharing there, as well. So if you're not familiar with OpenCL yet and you are hoping to do Compute in your projects I recommend you stay right after for that talk. So we've gone over Compute so let's now talk about how to get your context into Core Profile in order to take advantage of all these new features that we're supporting in Mavericks.
So first, to start out, OpenGL Core Profile is a profile that gives you access to the latest GPU features. The API -- I don't know if you're familiar with OpenGL ES 2 versus OpenGL ES 1 -- it's very akin to that transition, where the API was trimmed down to be streamlined and high-performance, and it's more in tune with how the GPU actually works.
So there's no matrix math or things that the GPU wouldn't have actually been doing in the first place. And it gives you full control over the rendering pipeline, so we're specifying everything as shaders. And as I was saying, it's similar to OpenGL ES 2 -- so much so that it's pretty easy to port back and forth between OpenGL ES 2 and OpenGL Core Profile.
So let's go over conceptual overview of what you're going to need to do in your applications in order to get them working with Core Profile. And many of these are actually things that you want to do to your application even if you're not trying to get it in Core Profile. If you're not doing it for the reason of going into Core Profile it's at least an enhancement to your application to take -- to switch over to these new ways of doing your existing applications.
So for instance, immediate mode vertex drawing should be replaced with vertex array objects and vertex buffer objects. And there's an obvious increase in performance just by switching your application to use vertex buffer objects instead of having to provide the data each frame, every single time that you're drawing with it. Fixed-function state gets replaced with GLSL shaders.
So you're going to have to specify a vertex shader and a fragment shader, and then optionally the tessellation shaders and geometry shaders as well. And the matrix math is now no longer part of OpenGL, so you do have to provide your own custom matrix math. And then older shaders which are in, say, version 110 or 120 need to be updated to version 150 or above.
However, on our platform in Mountain Lion we also introduced support for GLKit, and GLKit actually solves a couple of these transition steps. You're still going to have to update your application to take advantage of vertex array objects and vertex buffer objects. But for your fixed-function state -- for an application that you made but are just trying to get to work in Core Profile -- you can use GLKBaseEffect to achieve the same effects that you would have been trying to do with your lighting and so forth in your legacy OpenGL application.
And then additionally, the matrix math that I said was gone in OpenGL is now fully replaced by the math libraries that GLKMath provides. So let's talk about creating that Core Profile context. All we do is, in the pixel format attribute list that we're going to use to create our context, we pass in NSOpenGLPFAOpenGLProfile, and in this attribute list we're passing down NSOpenGLProfileVersion3_2Core.
And this gets you access to all the new features of OpenGL 4.1, and OpenGL 4.1 is fully backwards compatible with OpenGL 3.2 -- that's why picking this enables you to take advantage of all those new features. And so now that we've picked that pixel format, we're going to create our context from that.
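As a hedged sketch, an attribute list along those lines might look like this (it lives in an Objective-C source file and gets passed to NSOpenGLPixelFormat's initWithAttributes:; the color and depth sizes are just illustrative):

// Hypothetical pixel format attributes for a Core Profile context.
NSOpenGLPixelFormatAttribute attrs[] = {
    NSOpenGLPFAOpenGLProfile, NSOpenGLProfileVersion3_2Core,
    NSOpenGLPFAColorSize,     24,
    NSOpenGLPFADepthSize,     24,
    NSOpenGLPFADoubleBuffer,
    NSOpenGLPFAAccelerated,
    0
};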
So, to go over briefly how to get your application switched over -- what you're looking for and what you're replacing it with [laughter]. Basically, you need to cache all your vertex data in vertex buffer objects. And additionally you're going to need to encapsulate those objects into vertex array objects that point to where all the attributes are coming from for your shaders.
So we have code on the left which is what would have been in your application -- glBegin(GL_TRIANGLES), glEnd, or glCallList with display lists -- and in Core Profile on the right we can now change all those calls to just two calls: we call glBindVertexArray and then glDrawArrays, or glDrawElements if you so choose to use elements. glBitmap and glDrawPixels are subsumed by uploading a texture -- so glTexSubImage in this case -- and then we can just draw with that, or call glBlitFramebuffer to do something similar to what the legacy profile would have been doing for you anyways.
And additionally, the pointers that used to exist -- VertexPointer, TexCoordPointer, ColorPointer -- those are all subsumed by generic vertex attribute pointers. So instead we're going to call glVertexAttribPointer, and then we bind each of our attributes by name. So we have myVerts in this example, and in my GLSL shader I would have an input myVerts, and I can bind that attribute using glBindAttribLocation. And then finally, glEnableClientState is no longer used, because we're no longer dealing with those color arrays, normal arrays and so forth. Instead we're just enabling the vertex attrib array by the index that it's being passed up to the GLSL shader with.
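Here is a small hedged sketch in C of that replacement; the vertices array, the attribute index 0, and the myVerts name are illustrative:

// Hypothetical sketch of the Core Profile replacement: cache vertices in a VBO,
// wrap the attribute setup in a VAO, and bind the attribute by name.
GLuint vao, vbo;
glGenVertexArrays(1, &vao);
glBindVertexArray(vao);

glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, sizeof(vertices), vertices, GL_STATIC_DRAW);

// "myVerts" is the attribute name in the GLSL vertex shader.
glBindAttribLocation(program, 0, "myVerts");
glLinkProgram(program);                      // attribute bindings take effect at link time
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, (const GLvoid *)0);
glEnableVertexAttribArray(0);

// Per frame, the old glBegin/glEnd block becomes:
glBindVertexArray(vao);
glDrawArrays(GL_TRIANGLES, 0, vertexCount);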
For your math portions -- for matrix math -- just use GLKMath. It's got the ability to replace all the built-in transformations that OpenGL would have provided for you. And of course you can use your own matrix math library if you already have it, but if you don't, this is not a bad place to start. So for instance, our translate, rotate, and scale functions have been replaced by GLKMatrix4MakeTranslation, MakeRotation, and MakeScale. And the first function there, GLKMatrix4MakeTranslation -- the 'Make' there means it's equivalent to calling glLoadIdentity followed by glTranslate.
Additionally, for perspective we can call GLKMatrix4MakePerspective; similarly, that's a glLoadIdentity followed by what gluPerspective would have done. And GLKMath also provides you with matrix stacks, so you can push and pop your stacks. But when you've finally gotten the value out of your matrix, you're no longer going to use LoadMatrix to pass your data up to your GLSL shaders; you're going to upload those as generic uniforms. So you use glUniformMatrix4fv in that case.
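A minimal sketch of that flow in C, with made-up values and a modelViewProjection uniform name as the assumption:

// Hypothetical sketch using GLKMath in place of the fixed-function matrix calls.
#include <GLKit/GLKMath.h>

GLKMatrix4 projection = GLKMatrix4MakePerspective(GLKMathDegreesToRadians(60.0f),
                                                  aspect, 0.1f, 100.0f);
GLKMatrix4 modelView  = GLKMatrix4MakeTranslation(0.0f, 0.0f, -5.0f);
modelView = GLKMatrix4Rotate(modelView, GLKMathDegreesToRadians(angle), 0.0f, 1.0f, 0.0f);

GLKMatrix4 mvp = GLKMatrix4Multiply(projection, modelView);

// Instead of glLoadMatrix, upload it as a generic uniform.
glUniformMatrix4fv(glGetUniformLocation(program, "modelViewProjection"),
                   1, GL_FALSE, mvp.m);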
So here is a list of some of the functions in GLKMath -- I hope you brought your binoculars. But basically, I want to say it provides everything that you need for all your OpenGL Core Profile needs. It even has support for quaternions.
So now on to the next part of your application that you need to update: the fixed-function state that you may have in your app. Let's talk about using GLKBaseEffect and GLKit to update your application to work with Core Profile. So you would have had fixed-function lighting, fixed-function materials, fixed-function texturing. That's no longer available. Instead we're passing everything up as shaders, and GLKBaseEffect provides you with base shaders that you can treat similarly to how your code may have interacted with your legacy OpenGL context.
So here we have our lights that we're passing up for instance. We can pass up a position, diffuse, specular values for our lights. Instead with the GLKBaseEffect we're just setting the light0.position, diffuse color and specular color as you would expect. Pretty straight forward. Additionally you can go enable it with an enabled bit and then we even have materials. So it's a very close correlation to how you would have been using fixed-function state before.
So afterwards -- some of you may already have shaders written using GLSL 110 or 120, and there are some slight differences in getting them to work with 140, 150, 330 and 410. So again, as I was already saying, our client state enables are now switched to generic attributes.
So we're just going to be enabling our attributes by the index that we're passing in. Matrices: we're no longer loading those matrices as a built-in matrix like gl_ModelViewProjectionMatrix and so forth. They're instead going to be generic uniforms that we pass into our shaders, and so we're going to be uploading those with glUniformMatrix4fv.
Additionally some of the current state that you would have set like glColor4fv you can set those either using similarly glVertexAttrib4fv for constant values or glUniform4fv as well for values that are not changing very often. And then additionally all the pointer calls get replaced with a generic VertexAttribPointer call.
So looking at the actual GLSL language itself, there are some slight differences here, where the ins and outs are now very explicit in GLSL 150. And then additionally, similarly to how the built-ins are removed for the fixed-function state and so forth, the frag data output is replaced with an out variable that we're going to be writing to, which tells us which one of our draw buffers to provide a result to.
So we have up here the attributes that would have been attributes in GLSL 110. In 150 those become in, because the attributes are going into your vertex shader -- we're just passing in vec4 data here. And then the varyings that we would have produced with our vertex shader and then consumed with our fragment shader are no longer called varyings. They're more explicitly named: out from the vertex shader, and in in the fragment shader. So our texture coordinates, for instance -- we're outputting those from the vertex shader here and then inputting them to the fragment shader.
And then finally, as I was mentioning just a moment ago, gl_FragColor is replaced with a binding of your choice. So we've made an out vec4 here called myColor, and prior to linking our GLSL program I made sure to call BindFragDataLocation for myColor, so that I'm specifying it to be writing to draw buffer zero with myColor.
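Here is a minimal hedged GLSL 150 pair showing those renamed qualifiers, stored as C strings; the attribute and uniform names are placeholders:

static const char *kVertexShader150 =
    "#version 150\n"
    "in  vec4 vertPos;\n"                    // was: attribute vec4 vertPos;
    "in  vec2 inTexCoord;\n"
    "out vec2 texCoord;\n"                   // was: varying vec2 texCoord;
    "uniform mat4 modelViewProjection;\n"    // was: gl_ModelViewProjectionMatrix
    "void main() {\n"
    "    texCoord    = inTexCoord;\n"
    "    gl_Position = modelViewProjection * vertPos;\n"
    "}\n";

static const char *kFragmentShader150 =
    "#version 150\n"
    "in  vec2 texCoord;\n"
    "out vec4 myColor;\n"                    // was: gl_FragColor
    "uniform sampler2D diffuseMap;\n"
    "void main() {\n"
    "    myColor = texture(diffuseMap, texCoord);\n"   // texture2D() becomes texture()
    "}\n";
// Before linking: glBindFragDataLocation(program, 0, "myColor");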
Some additional changes we've got in GLSL version 150 over 110: the GLSL version directive is now required. With 110 it would have been implicit which version you were using; however, in Core Profile you're required to say #version 150, 330, or 410 at the top of your shaders.
And then some examples here of the built-ins that have been removed: gl_Vertex, gl_Normal, gl_MultiTexCoord are replaced with your own generic vertex attributes -- vertPos, inNormal, texCoord -- that we've named ourselves and upload as vertex attributes. And then additionally, some of the uniform variables like gl_ModelViewProjectionMatrix and gl_NormalMatrix we upload as glUniforms here.
And then finally small change; texture2D, texture3D are replaced by just a simple texture call. And the sampler type overloads how that texture call should be sampled from. So now that we've got our GLSL shaders pretty much working with Core Profile let's go over a little bit more of the other API differences here.
We've got, of course, different headers in OpenGL 3, and so if you modify your code to only include gl3.h and gl3ext.h, you can be sure that your code is building cleanly, and as a result that you're not calling any calls that may have been removed from Core Profile. And if you were, for instance, to call glCallList in Core Profile, that would throw an invalid operation error. So instead of having to figure out at runtime where you may have errors, just getting rid of gl.h and glext.h in your file can let you know at compile time which functionality needs to be replaced.
Additionally, getting extension strings is slightly different. Instead of getting one huge string like you would have in the Legacy Profile, it's now split up into indexed strings, where you have to get the number of extensions that are available and then go through that loop and get each of the extensions one by one.
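A minimal sketch of that loop in C, here checking for the tessellation extension mentioned earlier (string.h is assumed for strcmp):

#include <string.h>

// Hypothetical Core Profile extension check.
GLboolean hasTessellation = GL_FALSE;
GLint numExtensions = 0;
glGetIntegerv(GL_NUM_EXTENSIONS, &numExtensions);
for (GLint i = 0; i < numExtensions; i++) {
    const char *ext = (const char *)glGetStringi(GL_EXTENSIONS, i);
    if (strcmp(ext, "GL_ARB_tessellation_shader") == 0) {
        hasTessellation = GL_TRUE;
        break;
    }
}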
And then finally, APPLE fences are replaced by sync objects. So glSetFenceAPPLE becomes glFenceSync, glTestFenceAPPLE gets replaced with glWaitSync, and then some of the functions like the vertex array object ones are replaced by the Core equivalents -- so you'll call glGenVertexArrays instead of glGenVertexArraysAPPLE, and so forth.
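A small hedged sketch of the sync object pattern in C; the 16 ms timeout is just an example value:

// Hypothetical replacement for APPLE fences using Core Profile sync objects.
GLsync fence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
glFlush();   // make sure the fence is submitted to the GPU

// Later: block (with a timeout) until the GPU has passed the fence.
GLenum result = glClientWaitSync(fence, GL_SYNC_FLUSH_COMMANDS_BIT,
                                 16 * 1000 * 1000);   // timeout in nanoseconds
if (result == GL_TIMEOUT_EXPIRED) {
    /* the GPU is still working; try again later */
}
glDeleteSync(fence);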
So of course a lot of you may have somewhat larger applications where you can't just go and switch immediately from Legacy to Core Profile context. For you guys I'm suggesting a more piecemeal approach. So really you can do any of these operations by themselves and not affect the rest of your code.
So while still staying on Legacy Profile we're going to switch our application first to wherever we have some of the older draw calls. We can replace them with drawing with vertex buffer objects and vertex array objects. And this can be done to multiple pieces of code at your own timing. And you don't need to switch to Core Profile to start using vertex arrays and vertex buffer objects.
Secondly, replacing the math. You can actually use GLKMath with a legacy OpenGL context because it's profile agnostic -- it just gives you back raw data. So with that raw data, instead of calling gluPerspective for instance for the projection matrix, I would have my projection matrix as just a variable that I've calculated using GLKMath.
And then, for instance, on the CPU I may multiply that by the ModelViewMatrix as well to get my ModelViewProjectionMatrix, and once I have that result -- because I don't need to do that multiply every single time in the vertex shader -- I would pass that result in just using LoadMatrix.
And so using LoadMatrix you can pass in those original matrices. Then for updating your existing shaders: if you have 110 shaders, while staying in 110 you can already replace your attributes with generic attributes and use uniforms for any of the built-ins, and by doing so you're getting rid of your dependency on the built-ins that were very specific, because you can use the generic attributes with Legacy and Core Profile alike. And then additionally, EXT_gpu_shader4 enables you to have your out color specified using BindFragDataLocation just like in Core Profile.
So in doing this we can create our shaders in a way that even uses, say, #define to define attribute to in in the vertex shader and varying to out, and do those #defines such that you could easily switch your shaders from 110 to 150 when you do make the switch to Core Profile.
And then finally, for places where you may have fixed-function use right now, you can make those into GLSL 110 shaders and do similar things to what you were doing with your existing shaders just before. And GLKBaseEffect unfortunately depends on Core Profile, so to do this piecemeal you will have to be replacing your fixed-function with shaders.
And so you can do any of these steps above, but one at a time, and check that when you touch this one file and replace it with vertex buffer objects and vertex array objects, you're getting the expected result just like you used to get. And so we're able to debug our application on a more piece-by-piece basis, instead of doing one big switch all at once. So after you've made all these changes, you switch to Core Profile by specifying Core Profile in your pixel format attributes and then update your shader versions, ideally by replacing those #defines with in, out, and so forth.
And just a tip for large code bases where you may have a bunch of code that's using legacy OpenGL context calls: you can do a grep for some of the strings and tokens that you have that are referencing things like glBegin, glEnd, glLight, and using that grep and just doing a line count, you can track how many lines of code you have left to switch to Core Profile and track that adoption over time.
So to summarize what we went over today we went over a bunch of new features in Mavericks and just how to get access to those by using the Core Profile. I also wanted to throw in a little mention about OpenGL Profiler here. It allows you to break on OpenGL errors for instance which is very useful for when doing a transition to Core Profile.
And so you no longer have to put glGetError in your code in every single place to find out where that error's coming from. And you should never have glGetError in shipping code anyways for release mode. So instead you can just use OpenGL Profiler and break on GL error as the screenshot there shows.
And so finally, we did go over how to use OpenCL and OpenGL together in order to do compute and solve your compute needs. So if you have any questions, contact Allan Schaffer, our Graphics and Games Technology Evangelist. We've got some great documentation at developer.apple.com/opengl, and then of course you can interact with each other at devforums.apple.com. And the related sessions to this -- we had, again, the one this morning.