Graphics, Media, and Games • iOS, OS X • 49:10
Quartz 2D and Core Animation provide professional-strength graphics features and the layer-based animation system that powers the user experience of iOS and OS X. Walk through the process of optimizing a drawing app to take advantage of the Retina display while maintaining peak performance. Learn about enhancements that accelerate Quartz 2D and enable efficient screen capture.
Speakers: Mike Funk, Tim Oriol
Unlisted on Apple Developer site
Transcript
This transcript was generated using Whisper; it has known transcription errors. We are working on an improved version.
Good afternoon, everyone, and welcome to Optimizing 2D Graphics and Animation Performance. I'm Tim Oriol, and I'm going to be joined a little bit later on by Mike Funk. So let's take a look at our schedule for today. We're going to talk about how you can support Retina display both in your OS X apps as well as your iOS apps.
We're going to talk a little bit about optimizing 2D drawing performance using some techniques with both Core Animation as well as Quartz 2D. And then we're going to point out some of the more common Retina display pitfalls and give you the information and guidance you need to navigate around those. And finally, we're going to introduce the new CG Display Stream API that will allow you to get real-time display updates.
A couple of prerequisites that we're expecting coming into this session: knowledge of the Core Animation framework, knowledge of Quartz 2D drawing techniques, and basic knowledge of NSView and UIView and how to lay out your content. So what changes with Retina displays? Well, now we have four times the pixels. We've doubled the dimensions in both width and height.
And so we have four times the total number of pixels in the same amount of space. So now how do you lay your stuff out? Are you going to use points? Are you going to use pixels? What do we do? So if you start reading our developer documentation, you are no doubt going to encounter the term points.
So let's take a minute now and clear up any confusion between points and pixels and figure out what we need to do in our app. Points here have nothing to do with font points. For our session, points are logical coordinates.
So points are what you want to use to lay out your buttons and decide what size everything is going to be; set up your entire app in this logical coordinate space of points. Pixels, on the other side of the spectrum, are the actual device display units when it's rendered on, say, the new iPad or even an old iPhone.
And the whole point is that one point is not always equal to one pixel. And so we're going to be using the term scale factor in this session. And what we mean when we use scale factor is the number of pixels per point that's in one dimension. So if we doubled both dimensions, our scale factor is going to be two.
And the whole idea is for you to be able to use points when you do your Quartz 2D, Core Animation, UIKit, and AppKit layout, and then you supply the scale factor to us, and we will automatically render it in the crisp, clean resolution that's perfect for the device it's being run on.
Okay, so how do you get that scale factor to us, and where do you set it? If you're working with CALayers, then you're going to want to look at the contentsScale property on the layer, and you want to set this on any layer that you want to provide high-resolution content.
So if you have CATextLayers, if you have CAShapeLayers, if you have your own layer that you're doing some Quartz 2D drawing in, these are all great candidates where you want to set that scale factor, and you're going to get an automatic scale-up to the correct resolution for the display. For any layers where you've supplied images for the contents, if you supply the high-resolution images, then you're going to want to set the scale factor.
If you don't supply them, then don't set it. You have to do both of these in tandem. If you leave your original 1x artwork there and you set the scale factor to match the screen, you're telling us you have artwork for the screen scale, but you don't actually have it.
You may end up with some unexpected behavior, like maybe your image only takes up the bottom left-hand corner of the layer, because you've set it to be laid out down there and we don't have enough image data to fill it. So if you're going to leave your 1x artwork in there, leave the contentsScale at 1.
And as always, any layers that were created by AppKit or UIKit, you shouldn't really mess with. They're going to set their content scale appropriately. You don't need to worry about that. You should really only be using these layers as a point where you can add your own custom sublayers.
If you're on iOS, you can get the scale factor of the main screen right here on the slide; the UIScreen mainScreen scale property will get that for you. And that is most likely what you'd want to set for the contentsScale of your layer. I should mention now that for the course of this presentation I'll be using iOS and UIKit terminology, but there are parallel APIs on the desktop side.
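For a concrete picture, here's a minimal sketch (in Swift, though the demos in this session were written in Objective-C) of a hypothetical custom layer, CanvasLayer, being given the screen's scale factor:

```swift
import UIKit

// A minimal sketch: set contentsScale on any layer you draw into yourself so its
// Quartz 2D content is rendered at the display's native resolution.
class CanvasLayer: CALayer {                 // hypothetical custom layer that draws with Quartz 2D
    override func draw(in ctx: CGContext) {
        ctx.setFillColor(UIColor.white.cgColor)
        ctx.fill(bounds)                     // drawing is still specified in points
    }
}

let canvasLayer = CanvasLayer()
canvasLayer.frame = CGRect(x: 0, y: 0, width: 200, height: 200)   // points, not pixels
canvasLayer.contentsScale = UIScreen.main.scale                    // e.g. 2.0 on a Retina iPad
canvasLayer.setNeedsDisplay()                                      // redraw at the new scale
```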
So that was if you're working with CALayers; if you're working with views, that gets set for you. If you're working with CGContexts, and you've already told us the scale factor for your layer or view, by the time you get the CGContext in drawRect: or drawInContext:, that's going to be set up for you. You've already told us once; we'll take care of the rest.
The caveat comes in when you create a bitmap context, or any CGContext that's not directly tied to the view. You need to recognize that this is going to be specified in pixel dimensions, and you will have to adapt your drawing to match the scale that you intend to render at.
If you are on iOS, you're in luck: we have a helper function for you. The UIGraphics function here, UIGraphicsBeginImageContextWithOptions, will allow you to specify the size in points. You can also let us know whether you need transparency in your bitmap context. As always, if you can do opaque rendering, that's a lot faster for us to render. So if you are doing opaque rendering, opt in and let us know.
And then finally, the scale. That should be the scale factor equivalent of what you would set on a layer. And a neat little trick is if you pass in zero for the scale, we're going to automatically detect what the scale is on the main screen, and we'll set that up for your context.
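Put together, a sketch of that helper might look like this; the drawing inside is just illustrative:

```swift
import UIKit

// Size is in points, opaque rendering is opted into explicitly, and a scale of 0
// means "use the main screen's scale factor for me".
UIGraphicsBeginImageContextWithOptions(CGSize(width: 100, height: 100), true, 0)
if let ctx = UIGraphicsGetCurrentContext() {
    ctx.setFillColor(UIColor.red.cgColor)
    ctx.fill(CGRect(x: 0, y: 0, width: 100, height: 100))   // still drawn in points
}
let image = UIGraphicsGetImageFromCurrentImageContext()      // backed by 200x200 pixels at 2x
UIGraphicsEndImageContext()
```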
So to recap, your Quartz 2D and Core Animation based drawing should be scaled using this scale factor that you provide to us. This is all your lines, your paths, your shapes, and any high-resolution images that you've provided. If you are going to supply high-resolution artwork, one way you can do this really nicely is to leave your existing 1x artwork there and make another file with an @2x suffix. If you do this, when you load up your assets using UIImage or NSImage, we're going to automatically look for the resolution that's appropriate for your device, so we don't waste memory and we always get pixel-perfect results.
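As a small sketch of that convention, assuming hypothetical asset files brush.png and brush@2x.png:

```swift
import UIKit

// Ship both files and load by base name; the framework picks the right one.
let brush = UIImage(named: "brush")      // loads brush@2x.png automatically on a Retina device
print(brush?.scale ?? 1)                 // 2.0 when the @2x asset was chosen
```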
The cost of this. So now that we have all these extra pixels and we're using them, we have four times the pixels. This is going to magnify any performance issues that we may have had in our app before. We may have been able to get away with it before. Now the hard truth is we simply cannot afford not to optimize our drawing code.
So where do we start? First thing you want to do is collect some data, do some profiling. Let's see if we can get a good starting point. And one of the great instruments that I like to use is the Core Animation instrument. It's got a number of useful debug options you can see in the lower corner there. A lot of people don't know they're there. And so I just want to take a few minutes and we're going to go through each one of those and see what they do.
So I'm in Instruments here. If you don't have the panel up, Command-D, and that'll bring it right up. It doesn't happen by default. And so we're going to go through each one of these options here, and I'm going to switch you into -- so you can see the iPad while I do this. And we'll just see how each one of those affects the display on the iPad.
So throughout the session today, we're going to be using this finger-painting app for some of our examples. We just track the user's touch points. We have a custom CALayer subclass here that we're going to do Quartz 2D drawing in; we're just going to stroke a path that follows through the points that were put in.
So I'm going to go ahead and enable the first option, which is color blended layers. So when I turn this on, you notice that half the screen is green, half the screen is red. Well, ideally more than half the screen would be green. Because anywhere that's green is marked as opaque, and that's a region that we can composite a lot faster. So you see we've done the right thing with our layer here. We know that when we're doing our finger painting, we have an opaque background.
We're going to be putting opaque strokes on top, so we know that our rendering is always going to be opaque, and we've let Core Animation know that, yes, we are opaque. You'll notice that some of the UIKit elements do have this red tint on them, and that's not necessarily a bad thing. I'm sure you would like, if you put a label on top of some of your animating content, for it to actually show through underneath. So labels, or anything that has a little transparency around the borders, will have this red tint by default.
So now I'm going to turn that one off, and we're going to move to Color Offscreen-Rendered Yellow. So that's enabled, and we don't see anything. That's because we don't have anything being rendered off-screen. A neat trick you can do is the four-finger pinch gesture to dismiss an app; you can see that now our entire app has been colored yellow. What we do here is, whenever you do this pinch to dismiss, we save an off-screen image of your entire app in its current state, and we use that to do the scaling and fading animation as you dismiss the app.
And when we get back to our normal state, you notice the yellow tint goes away because we're back in our app and we're rendering normally. A related one, so I'm going to turn this one off and turn on a related one, is Color Hits Green and Misses Red. What does this mean? This applies if we do have to render an off-screen cache of the state of one of your layers or your app.
Let's do this again. You'll notice that it is green, and what green means is we were able to reuse that cache. So if we were not able to reuse that cache, let's say your app is changing a lot, let me bring up one that has more dynamic content here.
If I was to do it on here, you'll notice that this is red. So if you have a game or something like that that does change every frame, we're going to cache the bitmap so we can use it for the animation. But since it changes every frame, we're going to have to redo that every time. So in general, you want to see lots of green here as well.
Turn that off. The next one is Color OpenGL Fast Path Blue. This is super important for your OpenGL apps, but we aren't dealing with that today, so we're going to skip it. And then we have Flash Updated Regions. Flash Updated Regions is going to paint yellow over any part of the screen that we actually update.
So you'll notice now we're being good. We're not actually updating any part of the screen when we're not doing anything. If I start drawing in the canvas, you'll notice that we do -- we are updating the canvas, and you notice that the performance metrics I've put in the bottom here, our frames per second and time spent in our draw call are being updated as well. So if I do a clear, we update the button state as well as the canvas.
If I move one of these controls for the size, we're going to redraw that control as well as the preview. What you don't want to see is updating the entire app every time you change one little thing. And we do a lot of this for you in AppKit and UIKit, but it's up to you to do that for your own elements.
We'll turn that one off, and we'll go to the most important one for this topic: Color Misaligned Images. What Color Misaligned Images does is a couple of things. The first thing it's going to do is put a yellow tint over any content that is being scaled, that doesn't have the native scale factor of the display.
So these are things that we haven't set up properly for retina display. You'll notice we have not yet enabled this for our painting canvas, so we still have the default scale of 1. And we're going to go ahead and change this. You will also notice this on the UI elements and you may think you want to go in and set those.
You don't. We've already talked about this before: any layer that UIKit or AppKit creates, they're going to set that correctly. We do this intentionally; we scale images for the buttons so they can be any length, and we design them so that they will be shown appropriately. The other thing that Color Misaligned Images will do, let me do this again, is put a magenta tint on any content that isn't pixel aligned.
So this is the right scale, but if it's not pixel aligned, you will get slight artifacting or aliasing when your images are drawn. And so you notice as these app icons are animating in, they do have that magenta tint, but when we let them finish, that goes away. So you want to finish on pixel boundaries, but it's perfectly normal to be in between pixel boundaries while animating.
Turn that off, and then I've put in a handy little switch here so we can set our contentsScale correctly. Let me just turn on Color Misaligned Images again so we can see that. So we have that yellow tint now. If we do set our contentsScale to match the screen, in this case 2, and then I update my painting canvas, you notice that tint has gone away, and we can see through to the content.
You might also notice, as I'm painting here, I don't know if you can see, there's a little bit of ghosting, or the last path element looks kind of like it's fading in. And if you watch when we clear it, it kind of fades out instead of immediately going away.
So what we've encountered here is a feature of CA Layer, and you'll know if you've worked with them before. Whenever you change one of the properties on a layer, we're going to implicitly animate from the old property to the new one. And this is great if you want to move your layer around or flip it or fade it.
But what's happening here is, whenever we invalidate our layer and we redraw via drawInContext:, Core Animation assumes, okay, the layer's contents have changed. What are we going to do? We're going to animate from the old image to the new image. That isn't really what we want in the finger painting app, and we're also going to pay a performance price for it as well. So let me show you how to avoid that. We'll switch into Xcode for a sec.
This is what you want to do right here. Every CALayer, whenever you change a property, is going to call this actionForKey: method. This returns the default animation or action to perform when that key has changed. So what we're going to do is filter out any time the contents change, and we're going to return nil. Nil means go directly to the new value; don't perform any animations or transitions. And we're going to just defer to super for the rest of our keys, so we get the implicit animations that we want for the other properties.
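Here's a minimal Swift sketch of that pattern on the same hypothetical CanvasLayer:

```swift
import UIKit

// Suppress the implicit contents cross-fade, but keep the default actions for
// every other animatable key.
class CanvasLayer: CALayer {
    override func action(forKey event: String) -> CAAction? {
        if event == "contents" {
            return nil                       // jump straight to the new contents, no animation
        }
        return super.action(forKey: event)   // keep implicit animations for position, opacity, etc.
    }
}
```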
So now if we go back to our app, this is what we have now with that sort of fading in. And now if we turn that on, you'll notice we get nice, smooth strokes there. The strokes are shorter because we're able to sample faster, and we don't have the ghosting when clearing or drawing.
Let's go back to slides. Okay. So that was a number of things we found using the Instruments tool, and we fixed a couple of them. The Core Animation instrument is great for your iOS apps. If you are writing an OS X app, Quartz Debug has some equivalent functionality: it will let you flash the updated regions, and it can tint scaled artwork as well.
This used to be part of the Xcode suite, but it no longer ships with Xcode. You can get it by going into the Xcode menu bar and choosing Open Developer Tool, More Developer Tools. This will bring you to a website, and you just download the Graphics Tools for Xcode package, and you'll have all the tools, including Quartz Debug.
So what can we do additionally within Quartz 2D to optimize some of this drawing that we're doing? First of all, the golden rule for all graphics is never draw more than you actually need to. It's a very simple rule, but it's very easy to ignore. So if we look at our finger-painting app here, and in this scenario, the user is painting this long path.
Maybe they're making a large drawing, and it's fine in the beginning, but after we get a few thousand points on our lines, we don't really need to be redrawing the entire path every time. We don't need to be updating our entire layer every time something little is added to the end of the path, especially not at Retina resolution.
So what do we need to do? setNeedsDisplayInRect: is not designed for you to pass in the full rect of your view or your layer. It's actually for you to calculate the dirty region of your layer or view and pass that in. If you just call setNeedsDisplay, we're going to invalidate the entire view. If you pass in a rect, the benefit you get is that we automatically set up your CGContext to clip to that area. And then when you submit your draw calls, you don't have to change anything in your drawRect:.
We're going to automatically clip out anything that was submitted outside of that region, and we will only draw to the area that you've marked dirty. So you get a nice big win without changing any of your drawing code, just by keeping track of what area has actually changed.
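As a tiny sketch of the difference, with a hypothetical canvas view and changed area:

```swift
import UIKit

// Invalidate only the dirty region; Core Graphics clips your unchanged drawRect: code to it.
let canvasView = UIView(frame: CGRect(x: 0, y: 0, width: 320, height: 480))   // stand-in for the painting view
let dirtyRect = CGRect(x: 40, y: 40, width: 24, height: 24)                    // hypothetical changed area
canvasView.setNeedsDisplay(dirtyRect)   // redraws (and clips to) just this rect
// canvasView.setNeedsDisplay()         // would invalidate and redraw the entire view
```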
Another thing we can do is set up once and reuse. Inside our drawRect:, we don't want to be querying UI elements to see what is selected. It's better to create a variable for these, store the value when it changes, and then use it when we draw.
We don't want to be creating color spaces, then creating a color from that color space, and then using that to do our drawing. This is something that we can set up once and reuse. The best would be to set it up on initialization and then reuse it every time we draw.
So this would be colors, paths, clip shapes: set up once, reuse. Even if you know it's going to change, like our drawing color here, it is going to change; people are going to want to paint with more than one color. But this is something that we should update when they actually change the color, and not query every time within our drawRect:.
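A minimal sketch of that idea, with illustrative property names rather than the real demo's code:

```swift
import UIKit

// Cache the expensive objects when they change, not inside drawRect:.
class PaintingView: UIView {
    private var strokeColor = UIColor.black.cgColor   // updated only when the user picks a color
    private var lineWidth: CGFloat = 8                 // updated only when the slider moves

    func colorPicked(_ color: UIColor) {
        strokeColor = color.cgColor                    // create the CGColor once, here
    }

    override func draw(_ rect: CGRect) {
        guard let ctx = UIGraphicsGetCurrentContext() else { return }
        ctx.setStrokeColor(strokeColor)                // just reuse; no color space or color creation here
        ctx.setLineWidth(lineWidth)
        // ... stroke the path ...
    }
}
```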
Another thing we can do is utilize off-screen buffers, or bitmap contexts, to flatten some of our drawing into an image. A CGPath is generally very quick to draw, but once we add a few thousand points, like in a finger painting demo like this, it becomes a little slower.
If we're really only updating this part of the path, wouldn't it be great if we could just save the rest of it and only draw the part that actually changed? What we can do is create a bitmap context, draw into there, and then use that image to draw into our view, and then on top of that we'll just draw what's changed.
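Here's one hedged sketch of that flattening step, assuming illustrative names like flattenedImage and currentPath:

```swift
import UIKit

// Bake everything drawn so far into an opaque, Retina-scaled image once,
// then only stroke the newest path segments on top of it each frame.
var flattenedImage: UIImage?
let currentPath = UIBezierPath()

func flatten(size: CGSize, scale: CGFloat) {
    UIGraphicsBeginImageContextWithOptions(size, true, scale)    // opaque bitmap at the screen scale
    UIColor.white.setFill()
    UIBezierPath(rect: CGRect(origin: .zero, size: size)).fill() // background for the opaque bitmap
    flattenedImage?.draw(at: .zero)                              // previous flattened content, if any
    UIColor.black.setStroke()
    currentPath.stroke()                                         // bake the accumulated path in
    flattenedImage = UIGraphicsGetImageFromCurrentImageContext()
    UIGraphicsEndImageContext()
    currentPath.removeAllPoints()                                // start a fresh, short path
}
```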
So let's take a look at a few of these optimizations and put them into the app. Let me just turn off that tinting. Okay. So first of all, we'll just take the initial behavior that we have here. If I just start going and drawing, we start at 60 frames per second. And if I keep going, we're going to drop. And it doesn't actually bottom out. We're going to keep going down and down and down until it becomes pretty much unusable. It's not even slow anymore. It's just unusable.
So what we can do is we can flatten these paths once we get over a certain number of points. We'll pick like 100, 200 points. Once we get that many points, we can flatten our current drawing into an image and then use that as our basis for the next segment of path that we're going to draw on top. So if I enable that option, you'll notice as I'm going, we're still going to drop. We're still going to drop a lot.
But we should stay above 10. So what happens there is every once in a while, we're going to flatten all of our content into the bitmap and then use that. So we're never actually drawing more than a couple hundred points. And so this is going to give us a consistent frame rate, and we're never going to go below here, no matter how many more points we add to the path.
So consistent frame rates are great if they're not 10 frames per second, right? So what are we going to do? What's going wrong? We're flattening our content. Why is it taking so long to draw? Well, we need to look at what we're actually drawing every frame, and maybe we're still drawing too much.
Because what we're doing right now is we're just calling setNeedsDisplay. So even though we have this image for our flattened content, we're drawing this entire Retina-resolution image into the view every single frame. Earlier I mentioned the Instruments option where you can tint the areas that you're updating.
That works great for most cases, but it doesn't actually respect the clip rect of the CGContext. So we have a quick little fix we can put in: I'm just going to draw a one-point red rectangle around the current clip rect whenever I get a draw call. So if I turn that on and I start drawing, I don't know if you can see it, but it's around the entire view there, and it doesn't go anywhere else. So every single time we draw, we're going to draw the entire view. Let's take a look at how we can fix that and go into Xcode.
Okay, so this is our draw function here. If we have a bitmap image from before, we're going to draw that. If we don't have one yet, then we're just going to fill with our background color. And then if we have any additional path points to draw on top of that, we're going to add our path, use the line width that we've already set once the user selected it, use the stroke color that we've already created once the user selected a new color, and use those to draw our path.
This here is for when we have a single point at the end; we can't draw a path with a single point, so we're just going to draw an ellipse at the point where the finger is currently down. And this is what I added just to give us a little visual feedback for what area we're actually drawing: we get the current clip rect, and we're going to draw a red rectangle around it.
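Reconstructed as a hedged sketch (the property names are illustrative, not the actual demo source), the draw method might look like this:

```swift
import UIKit

// Reuse the flattened image, stroke only the recent points, and outline the
// clip rect for debugging.
class CanvasLayer: CALayer {
    var flattenedImage: CGImage?
    var recentPath = CGMutablePath()
    var strokeColor = UIColor.black.cgColor
    var lineWidth: CGFloat = 8
    var showClipRect = true

    override func draw(in ctx: CGContext) {
        if let image = flattenedImage {
            ctx.draw(image, in: bounds)                 // everything drawn before the last flatten
        } else {
            ctx.setFillColor(UIColor.white.cgColor)     // background color
            ctx.fill(bounds)
        }
        ctx.addPath(recentPath)                         // only the points added since the flatten
        ctx.setLineWidth(lineWidth)
        ctx.setStrokeColor(strokeColor)
        ctx.strokePath()

        if showClipRect {                               // visual feedback: what area are we actually drawing?
            ctx.setStrokeColor(UIColor.red.cgColor)
            ctx.setLineWidth(1)
            ctx.stroke(ctx.boundingBoxOfClipPath)
        }
    }
}
```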
Now let's take a look at what we're doing when we add a new point to the path. The first thing we do in here, and this is another way you can optimize a little bit, is get our current point and check: is that point a significant distance away from the last one we've drawn? If we know that these are almost exactly on the same point, and we're just getting some very high-resolution touch information in from the system, we can skip this point and wait for one that's further away.
Then we keep track of our current point and our previous point, and we add the new point to our path. If we have too many points in our path, we go ahead and flatten all of our current content into that bitmap image, the one we saw in the drawRect: where we first check for it and draw from there.
And then this is really what we should be doing, instead of what we're doing now, which is just calling setNeedsDisplay. We need to see a lot less of that. This is like five lines for this particular case, and we're going to get an enormous benefit from doing it.
We just take the current point and the previous point, figure out the bounding rectangle for them, and compensate for the line width that we're going to be drawing along the path, so we include all of the drawing. And then we call setNeedsDisplayInRect: with that instead of just doing setNeedsDisplay. So let's turn some of these options on within the app.
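A sketch of that dirty-rect calculation, under the same assumptions:

```swift
import UIKit

// Invalidate only the bounding box of the newest segment, padded by the line width.
// (Optionally skip points that are too close to the previous one before calling this.)
func invalidateSegment(on view: UIView, from previousPoint: CGPoint, to currentPoint: CGPoint, lineWidth: CGFloat) {
    var dirty = CGRect(x: min(previousPoint.x, currentPoint.x),
                       y: min(previousPoint.y, currentPoint.y),
                       width: abs(currentPoint.x - previousPoint.x),
                       height: abs(currentPoint.y - previousPoint.y))
    dirty = dirty.insetBy(dx: -lineWidth, dy: -lineWidth)   // compensate for the stroke width
    view.setNeedsDisplay(dirty)                             // instead of view.setNeedsDisplay()
}
```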
So this is what we have currently, drawing the whole thing. And now, if we turn this on, you notice all these tiny little rectangles we get along the path. And that's the only area we're actually updating during that call. And we haven't changed our draw code one bit.
So we got a nice boost there. And if you've been watching the frame rate, you'll notice that we got a nice performance boost as well. We're staying at 60 now. You do see a little hiccup once in a while when we render to the bitmap, but in general, we can stay at a pretty responsive rate there. Okay. Go back to our slides.
So those were some great optimizations. We already got a lot of speedup just by looking at some of the stuff we can do with Quartz 2D and by correctly specifying our setNeedsDisplayInRect:. Now we're going to take a look at a few other things we can do within Core Animation to provide a little benefit as well.
So the first thing would be an alternative to the bitmap context that we're using. What you can do if you're working with views and you know you have some content that's static, it's not going to change much, you can separate that all out into another view, put your canvas on top.
In this case, you would have to do transparent rendering for that. Let's say we wanted to render onto an actual image of a canvas sheet instead of a gray square. This would be a great candidate for that. And what Core Animation will automatically do for you is it will take any views in your hierarchy and will maintain a bitmap cache of those views and will composite them together in hardware when it's time to display. So this is something you can get for free. You don't even have to go down to the Core Animation level. You can just do this by separating out static content into static views.
If you are working with layers and you want fine-tuned control on a per-layer basis, you can use the shouldRasterize property on CALayer. This is the same effect that we get with the views, but you can specify it on any layer in your layer tree. What this is going to do is composite that layer and all of its children together into a bitmap cache, and then we'll draw from that again whenever it needs to be displayed.
So if you have this layer rasterized and you scale it or rotate it or fade it, that's great; we can redraw again from the cache. If you have a layer whose contents change a lot, like a movie or a particle system or something like that, this is not a good candidate, because this is running into the case we were looking at earlier with those hits and misses. Open up Instruments and check whether you're getting red there. If you're changing your content every frame, we have to regenerate the bitmap each time.
That's actually a little bit of extra work. When you do rasterize your layer, this is going to inherently lock it to a specific size: we create a bitmap cache of a specific size to rasterize your layer into, and then it's fixed to that size. So how do you specify what size you want? You use the rasterizationScale property, and you want to set that whenever you set shouldRasterize on your layer. In general, if you've set your contentsScale, you'd also want to set your rasterizationScale to match it.
If you know that you're going to be scaling your layer up to a much larger size, you may even want to set this to 4 or higher, so that the extra data is there for you when you render it at the larger size.
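A minimal sketch of those two properties together, on a hypothetical mostly-static layer:

```swift
import UIKit

// Rasterize a mostly-static layer subtree into a bitmap cache, and match the
// cache's resolution to the screen (or higher if you plan to scale the layer up).
let badgeLayer = CALayer()                                 // hypothetical static subtree
badgeLayer.shouldRasterize = true
badgeLayer.rasterizationScale = UIScreen.main.scale        // usually the same as contentsScale
// badgeLayer.rasterizationScale = 4                       // e.g. if you'll scale this layer up a lot
```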
So let's take a little bit more in-depth look about how this works. So this is our standard layer tree. We have a blue square layer. We have an image layer as a child, and we have a text layer as a child as well. So normally when we render this to the screen, we draw the parent, and then we draw the image underneath, and then we draw the text underneath as well. If I go and enable should rasterize on the blue layer here, something a little different is going to happen.
The first time we render, we're going to create this cache buffer. We're going to render everything into there, and then we render from there to the screen. So that's the extra step I was talking about. If you're going to be doing this a lot, sometimes it can actually be a little bit slower than not doing it at all.
And the benefit for this is when we need to draw again, we can redraw right from the cache. So if I was to change the scale on this layer to be a quarter instead of a half, and then Core Animation is going to make this nice animation for me, every frame of that animation is going to redraw right from the cache.
We don't have to go to each individual layer. So this is a big benefit if you have a large layer tree that doesn't need to be recomposited often. If you have a complex subtree like that, you can save it off as an image and just redraw from there.
So a couple of caveats. This rasterization occurs before the mask is applied, so if you have a mask on your layer and you're looking to speed that up, this probably isn't the right option. And like I said, caching and not reusing is more expensive than not caching at all. And these off-screen caches do take up memory, so you can't just go around and set this on all of your layers; you're going to gobble up a ton of RAM.
Alpha blending we've touched on a few times: if you can do it opaque, do it opaque. And if you're supplying image content to any of your views or layers, you want to make sure your images are opaque as well. If they need to be transparent, if that's what you need for the visual effect, then go for it.
But if you think that you're providing an opaque image, make sure you're actually providing an opaque image. A lot of times you've designed your image to be opaque within whatever image editor you use, and for whatever reason it has included an alpha channel in the image. So in this case, we have an image that has an alpha channel that is all ones.
You want to check and make sure the image info says alpha channel: no, like we have here. If we do get an image from you and it has an alpha channel, we have to assume that it has some transparency, and we are going to take the slower path.
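If you'd rather check at runtime, here's a small sketch using CGImage's alphaInfo; the asset name is hypothetical:

```swift
import UIKit

// Verify that an image you believe is opaque really has no alpha channel,
// so the compositor can take the faster opaque path.
if let cgImage = UIImage(named: "background")?.cgImage {   // "background" is a hypothetical asset
    switch cgImage.alphaInfo {
    case .none, .noneSkipFirst, .noneSkipLast:
        print("opaque image, fast path")
    default:
        print("alpha channel present; re-export without alpha if it isn't needed")
    }
}
```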
Drop shadows look great, but they are expensive. There are a couple of ways you can mitigate this cost. If you need drop shadows on your element and you know the general opaque shape of your layer, you can specify a shadowPath, which is a CGPath, to define that region of your layer.
Then we can generate the shadow based on that path instead of inspecting every pixel of your output and generating the shadow from there. The other option is the one we just talked about: you can use CALayer's shouldRasterize to include the shadow. It does get included in the rasterized copy, so you can generate it once and then reuse it.
And if you do supply the shadowPath and you scale or move your layer, we'll scale and move the shadow as well, so we can keep regenerating it based on the path you originally supplied. So if I have an image like this and I know I want a nice big drop shadow behind it, instead of just turning on the shadow, I can supply this circle here and we can use that to generate it instead.
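A sketch of that, assuming a hypothetical circular layer:

```swift
import UIKit

// Tell Core Animation the opaque shape of the layer so the shadow can be
// generated from a path instead of inspecting every rendered pixel.
let discLayer = CALayer()                                  // hypothetical circular image layer
discLayer.frame = CGRect(x: 0, y: 0, width: 120, height: 120)
discLayer.shadowOpacity = 0.8
discLayer.shadowRadius = 10
discLayer.shadowPath = UIBezierPath(ovalIn: discLayer.bounds).cgPath   // the circle shape
```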
Next is a brand-new API we've added to Core Animation called drawsAsynchronously. What this does is introduce a second method of submitting your Core Graphics drawing calls whenever you override the drawInContext: or drawRect: methods to do your Quartz drawing. In the normal drawing mode, if you submit a call, we're going to block, perform all of that rendering, and then return control back to you, and you go on about your business doing whatever useful work you're going to do for your users. If you have enabled drawsAsynchronously, we can take all those commands, execute the rendering in parallel, and return immediately to your app so you can do some processing at the same time as we're rendering the graphics.
To take a look at this a little better: this is our normal drawing mode. I have my custom CALayer subclass here, like, for instance, my canvas layer in the painting app. I get my drawInContext: call, I say, okay, draw this image, Quartz goes and draws that image, and then comes back to me, and I get to do whatever other work I'm going to do.
In the second scenario, if I've turned on drawsAsynchronously, we get the drawInContext: call, then I can submit any number of drawing calls that I want, and I'm going to return immediately to my process and be able to finish doing whatever work I want while Quartz is rendering.
First we should point out that this is not always a win, hence it's disabled by default. It's generally helpful when you're filling large areas of your context with images, rectangles, or shadings, especially any non-opaque content if you are doing alpha blending within your context. And it really is a case-by-case basis.
There's no clear-cut way to say whether you want this on or not. So this is something you're going to want to open up Instruments on, do some profiling, get some frame rate statistics, and figure out whether this is actually a win for you. It might actually prove to be less performant than the original rendering was. So it's really something you need to measure, measure, measure.
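Opting in is a one-line change; here's a sketch, with the measuring caveat repeated in the comments:

```swift
import UIKit

// Whether deferred rendering helps is workload-dependent, so measure with
// Instruments with this both on and off.
let canvasLayer = CALayer()                 // stand-in for the painting canvas layer
canvasLayer.drawsAsynchronously = true      // Quartz calls return immediately and render in parallel
canvasLayer.setNeedsDisplay()
```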
So now we'll take a look at a few of these and see if they actually do anything for our app. First bring up an app here that is doing rendering in a CG context as well. We have a lot of squares around here. It's going pretty well. We have about 60 frames per second on here.
If I start drawing non-opaque colors, you notice we take a little hit there. We're down to about 30. This also happens if I start drawing a lot of images as well and we're updating every frame. We're going down to about four frames per second in this case with the normal rendering. If I do turn on this asynchronous rendering for our layer, you notice we immediately speed back up. We can do the images.
We can do transparent colors. We can even bump up the number of objects on the screen to be a ridiculous amount. And then see what happens if we switch this back off. We kind of just crawl to a halt there. So this is the optimal case. It's a big win here. But it is something you want to test case-by-case basis.
So in light of that -- Let's get back to where we were in here. And this was our previous performance. We're holding 60. This was great. If you take a look at how much time we are spending in our draw call, it's about one millisecond, maybe a little bit more.
If I get rid of this and we turn on drawsAsynchronously, well, we've already done so much optimization that we're still at 60, so it hasn't made a huge impact here. But what you will notice is that if you look at the time we're spending in our draw call now, we're down to about 5% of what we were spending before. So you can use that time to do some other useful work for your app.
Okay, well, we have taken an underperforming app that didn't support Retina display, and we've added a bunch of enhancements. We now fully support Retina display, and we provide a nice, fluid, smooth interaction experience for our users. And this concludes the portion of our session dealing with graphics optimization. And now I'd like to bring up Mike Funk to tell you a little bit about CG Display Stream.
Thank you, Tim. So I'm going to be talking about CG Display Stream. This is a new API for high-performance screen capture that we're introducing in Mountain Lion. So far, this is desktop-specific. Now, when I say screen capture, what I'm talking about is basically taking screenshots, which you can use for a lot of different things. Maybe you want to do a remote desktop or remote display type of an application, or maybe you just want to take screenshots to record to a movie or to save to a file for later.
So why is taking a screen capture a performance issue? Well, frequently you're being funneled down a very low-performance bottleneck. For one thing, you end up doing round-trip copies between VRAM and RAM and back to VRAM; that's a very expensive operation. Every time you cross the boundary between VRAM and RAM, it ends up being very expensive. And of course, now we have four times as many pixels, which makes it a lot worse.
Ideally, what you want to do is you want to do the screen capture, have it stay in VRAM, and then immediately do whatever you need to do with the GPU while it's still there, and then pull it out to RAM to stream over the network or whatever you were going to do with it.
So what does that pipeline look like? To illustrate this, this is the traditional display capture situation. Everything above the line is VRAM, everything below is RAM. Your frame buffer contents start in VRAM by definition. Now, usually the first thing that ends up happening is it gets copied over into RAM.
And again, this is a very expensive operation. But you're not done with it: you want to scale it, or you want to do some sort of compression or encoding or color space conversion. You want to do that on the GPU, so you end up copying it back into VRAM.
Then you can do your operations, and finally you pull it back over to RAM, where it's ready for use by your application. So this is the traditional display capture pipeline. What CGDisplayStream offers you is the high-performance variation on this. Again, you start out in VRAM.
However, you do the capture right away and it stays there, and you can process the data right away. 90% of the things that you want to do with the GPU you can do before you ever have to pull it into RAM. So you're pretty much all done with it in VRAM. You copy it back out and now it is available to you.
So again, this is the traditional display capture pipeline, and this is the high-performance one, which is much more desirable. Before we go any further, let's look at what some of your options already are for doing display capture. The simplest one is CGDisplayCreateImage. It's very simple, and if it does what you need it to do, continue using it.
Basically, you just provide it with the display ID of the display you're interested in, and it gives you back a CGImage that you can do whatever you want with. This is ideal if you just want to do a one-shot capture. There's also a screen recording API in AVFoundation. It's very simple and very effective: you start recording,
you record your data, you tell it when to stop, and it will save to a QuickTime file. Again, if that's all you want to do, then this is perfect. But for a lot of applications, you need more functionality than that. Finally, one thing people have resorted to is raw frame buffer access. This is very hacky and kludgy, it's very difficult to do, and we don't offer any supported API for it right now, so if this is what you're doing, please don't.
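As a quick sketch of that simplest option, CGDisplayCreateImage:

```swift
import CoreGraphics

// Grab a single screenshot of the main display as a CGImage
// (pixel dimensions, so 2x the point size on a Retina display).
if let screenshot = CGDisplayCreateImage(CGMainDisplayID()) {
    print("Captured \(screenshot.width) x \(screenshot.height) pixels")
}
```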
Finally, that brings us to CGDisplayStream. Again, it is a real-time display capture API, and it is Mountain Lion only. You can use it for the non-interactive applications: if you want to take individual screenshots or do a screen recording, you can use it for that. But unlike those other APIs we had before, you can also use it for interactive real-time applications. So if you wanted to write a VNC server, this would be perfect for that. If you wanted to do remote display, this would be perfect for that.
There's also classes of display devices, like USB projectors, for example, which need to get the contents of the frame buffer to display, but they're not traditional display devices. They don't plug into a display port. So you have to have some mechanism for streaming them out over USB. This is perfect for that as well.
So, when do you want to use it? If you need to do real-time processing of screen updates, obviously that's pretty much your only choice. It is integrated with CFRunLoop and Dispatch Queues. You can use either one of those. It gives you GPU-based image scaling and color space conversion, among other things.
Another very handy feature is that it gives you update rects for each new capture. Whenever we give you a new capture, we'll give you a set of rects that tells you exactly what has changed since the previous one. So, for example, if you're doing a remote desktop type of application, you only want to send over the network what has changed since the previous update; you want to minimize your network bandwidth, and this allows you to do that.
The API itself is very simple. First thing you have to do is create a display stream. This is one of two functions that you can use to do that. This is what you would use if you wanted to use this with CFRunLoop. There's another one that's very similar for if you wanted something you can use with dispatch queues.
So you just provide the display ID, the width and the height of the image that you want to capture. If it's different from what you are capturing, we will scale it on the GPU for you. And also the pixel format that you want it in. There's also a properties dictionary where you can specify a lot of different options. And then finally you provide a handler function that's going to be invoked every single time there's another screen capture for you.
Some of the properties here, you can specify a source rect, so if you're not interested in capturing the whole display, just a subset of it, you can do that. You can tell us that you're not interested in preserving the aspect ratio, for example. Right now, if the bounds that you give us for the capture are not the same aspect ratio as what you're capturing from, we'll put in black bars to avoid stretching. But if you don't want us to do that, we won't do that with this option. You can specify a color space for your output. You can also specify a queue depth.
And basically, this is, if you wanted to operate on these captures in parallel, you can do that. This is how many captures we keep around in our buffer at once for you to work with. It defaults to three. In practice, you probably should not go beyond eight, simply because at some point, you start using up a tremendous amount of memory and VRAM, and it's no longer a performance win for you.
So once you have your display stream object, these three functions are all there is to managing it. The first one allows you to get a run loop source object out of your display stream object; you just put that into your run loop. It will not immediately start capturing. For that, you use CGDisplayStreamStart, and you basically just use these start and stop functions. You can start and stop as many times as you want; it's not a one-shot thing.
This is the prototype for the handler that you install when you create the display stream. You'll notice this is a block; it's not just a function pointer, but an arbitrary block. And the parameters for this block are the information that we provide to you with each one of these updates. There's a status flag, which is generally just going to say we have a new capture for you. If you had stopped the stream, it might indicate that the stream is now stopped and you shouldn't expect anything more from us.
There is a timestamp. The timestamp is in Mach absolute time, so that is a very accurate, very high-resolution timestamp. We're giving you the capture itself in the form of an IOSurface, which is something that I'll explain in just a moment. But essentially, an IOSurface is an abstract object that can represent an image that's stored in VRAM or RAM, or it could be something that's synchronized across both.
If something has to live in both places at the same time, it does the right thing as far as synchronizing between them; it's very high performance. And then finally, you get an update object, which is something you can query to get more information about the update that you just received.
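Here's a hedged sketch of creating and starting a stream using the dispatch queue variant (the slide shows the CFRunLoop one); the Swift parameter labels shown are my assumption about how the C API is imported:

```swift
import CoreGraphics
import CoreVideo
import Dispatch
import IOSurface

// Create a stream for the main display, handle each frame on a queue, and start capturing.
func startCapturing() -> CGDisplayStream? {
    let displayID = CGMainDisplayID()
    let queue = DispatchQueue(label: "display-stream")       // hypothetical queue label

    let stream = CGDisplayStream(dispatchQueueDisplay: displayID,
                                 outputWidth: CGDisplayPixelsWide(displayID),
                                 outputHeight: CGDisplayPixelsHigh(displayID),
                                 pixelFormat: Int32(kCVPixelFormatType_32BGRA),   // 'BGRA'
                                 properties: nil,             // defaults; source rect, queue depth, etc. go here
                                 queue: queue) { status, displayTime, frameSurface, update in
        guard status == .frameComplete, let surface = frameSurface else { return }
        // `surface` is the IOSurface for this capture; `displayTime` is in Mach absolute time.
        // Query `update` (CGDisplayStreamUpdateGetRects) for the dirty rects if you only
        // want to process what changed since the last frame.
        _ = surface
    }
    _ = stream?.start()                                       // begin delivering frames; stop() to pause
    return stream
}
```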
One of the things that you can do with this update object is query it to get the update rects. Again, this is just telling you what has changed in this update since the last one. If you have multiple updates that you want to coalesce into a single update, the second function will do that for you. So, for example, if you do have a remote desktop server and you're streaming your screen captures out over the network, you can coalesce the updates if you know that you don't have the bandwidth to send the intervening updates to your client.
Now, I mentioned IOSurfaces. This is something that has been around for a long time in OS X; it has long been a part of the internals of OS X, and it became API in Snow Leopard. It's just a very high-performance representation of a bitmap, which can migrate back and forth between VRAM and main memory. One of the nice things about it is that you can share it between processes.
Each one has an ID, which you can designate as being global, hand off to somebody else, and then another process can use that ID to look it up and access the same surface. So if you wanted to do a recording in one process and actually deal with the data in another one, this would allow you to do that.
IOSurfaces are interoperable with OpenGL, OpenCL, Core Image, and Core Video. They all have convenience functions that allow you to take an IOSurface and import it as whatever the best representation of that surface is in that API.
So, for example, in OpenGL you can use CGLTexImageIOSurface2D to initialize an OpenGL texture with an IOSurface. Once you do that, you just use it like any other OpenGL texture. So I'll do a quick demo of this so you can see what I'm talking about.
Okay, basically what's going on here is that the window on the left is capturing data from the screen on the right. So whatever you see on the left is being captured, represented as an IOSurface, and then we're storing it as an OpenGL texture and blitting that back out into the window. And of course, because it's an OpenGL texture, anything that you would normally do with a texture you can do with a display stream.
So, for example, you can use fragment shaders if you wanted to do a grayscale transformation or sepia. If you wanted to get fancy, you could do edge detection. This specific effect may or may not be useful to you, but it does demonstrate that you can use all of your standard toolbox of shaders and OpenGL tips and tricks to do anything you need to do with display streams.
And that concludes our talk. For more information, you can contact Alan Schaefer. He's our graphics and game technologies evangelist. There's a mailing list for Quartz. The URL listed there is a hub for all of the graphics documentation online. Specifically, there's a new document, the high-resolution guidelines for OS X. So if you're concerned about making sure your app is ready for Retina displays, that should contain all of the information you need right there. And of course, there's always the Apple Developer Forums.
There are some related sessions. If you're interested in the Retina display topics, I highly recommend Introduction to High Resolution on OS X; it's by the same people who prepared that document I just mentioned. There are also some other sessions related to getting the best performance out of AppKit and Core Animation and otherwise dealing with the higher resolution displays.