
WWDC13 • Session 509

Core Image Effects and Techniques

Graphics and Games • iOS, OS X • 54:43

Core Image lets you create incredible visual effects in your photo and video apps. Learn how to harness the new filters added in iOS 7 and OS X 10.9. Check out the seamless integration with OpenGL and OpenCL on the Mac. Understand recommended practices for using Core Image efficiently and see how to maximize its powerful features.

Speakers: David Hayward, Alexandre Naaman

Unlisted on Apple Developer site

Transcript


Thank you all for coming to one of the last sessions of this week. I hope we've all had a great week here. Today, in our session, we're going to be talking about Core Image effects and techniques on iOS and Mac OS. So, Core Image. In a nutshell, Core Image is a foundational image processing framework on both iOS and Mac OS, and it's used in a variety of great applications from Photo Booth to iPhoto, on both desktop and embedded products.

It's also used in the new Photos app for some of its image effects, the new filter effects that are available in iOS 7. And it's also in a wide variety of very successful App Store apps as well. So, it's a great foundational technology. We spend a lot of time making sure you get the best performance out of it, and we want to tell you all about it today.

So, the key concepts we're going to be talking about today are how to get started using the Core Image API, how to leverage the built-in filters that we provide, how to provide input images to these filters, how to render the output of those effects, and lastly, we have a great demo of how to bridge the Core Image and OpenCL technologies together.

So, the key concepts of Core Image. It's actually a very simple concept and it's very simple to code. The idea is you have filters that allow you to perform per-pixel operations on an image. So, in a very simple example, we have an input image, an original image, a picture of boats, and we want to apply a sepia tone filter to it, and the result is a new image.

But we actually have lots of filters, and you can combine them into either chains or graphs. And by combining these multiple filters, you can create very complex effects. In this slightly more complex example, we're taking an image, running it through sepia tone, then running it through a hue adjustment filter to make it into a blue-toned image, and then we're adding some contrast by using the color controls filter.

Now, while you can conceptually think of there being an intermediate image between every filter, internally, to improve performance, Core Image will concatenate these filters into a combined program in order to get the best possible performance, and this is achieved by eliminating intermediate buffers, which is a big benefit.

And then we also do additional runtime optimizations on the filter graph. For example, both the hue adjustment and the contrast in our example are matrix operations, and if you have sequential matrix operations in your filter graph, then Core Image will combine those into a single matrix, which will further improve both performance and quality.

So, let me give you a real quick example of this working in action. So, if I bring this up here, we have an application which we first used a little bit last year at WWDC, and now we actually have a fully fledged version of the application. So, the idea here is you bring up the filters pop-up, and that allows you to add either input sources or filters to your rendering graph. In this case, I just want to bring in the video from the video feed, and hopefully you can see that OK.

Once we have that, we can then add on additional adjustments. For example, we can go and find another effect in here, and with color controls, we can increase saturation or contrast, and we can do these effects live. We can also delete them. If we want to do a slightly more complex effect, we can do a pattern here. We can go to dot screen.

And dot screen, hopefully you can see this, turns the video into a newsprint-type dot pattern. We can adjust the size of the dots and the angle of the dot pattern. Now, let's say this doesn't quite suit our desires right now. This is a black and white pattern. We'd like to combine this halftone pattern with the original color of the image. We can actually represent graphs in this list here. What we can do is add another instance of the input video.

So, now we've got two operations on this stack of filters, and then we can combine those with another combining filter. Yeah, here we go, [inaudible]. So, now, hopefully you can see it on the projector, but we've got both the halftone pattern and the color from the original image shining through, all right.

So, let me pop this off, delete. That was the first demo of the Fun House application. Let me go back to my slides, and the great news is the source code for this app is now available. So, this has been a much requested feature. We showed this a little bit last year, and it should--

[ Applause ]

be in really great shape for you guys to look at this application and see how we did all this fun stuff. So, once you've looked at the code, you can see very quickly that there are really three basic classes that you need to understand to use Core Image. The first class is the CIFilter class, and this is a mutable object that represents an effect that you want to apply. A filter has image or numeric input parameters, and it also has an output image. At the time you ask for the output image, it will return an object that represents the output based on the current state of the input parameters.

The second key object type that you need to understand is the CIImage object, and this is an immutable object that represents the recipe for an image. And there are basically two types of images: there's an image that comes directly from a file or an input of some sort, or you can have a CIImage that comes from the output of a CIFilter.

The third key data type that you need to be aware of is a CIContext, and a CIContext is the object through which Core Image will render its result. It can be based on either a CPU renderer or a GPU renderer, and it's really important to distinguish between those two; I'll talk about that a little bit later in the presentation.

So, as I mentioned in the intro, Core Image is available on both iOS and Mac OS, and for the most part it's very, very similar between the two platforms, but there are a few platform specifics that you might want to be aware of. First of all, in terms of built-in filters, on iOS, Core Image has over a hundred built-in filters now, and we've added some more in iOS 7 as well. On Mac OS X, we have over 150 built-in filters, and we also have the ability for your application to provide its own filters.

The core API is very similar between the two platforms. The key classes I mentioned earlier, CIFilter, CIImage, and CIContext, are available on both, and they're largely identical APIs. On OS X, there are a few additional classes such as CIKernel and CIFilterShape, which are useful if you're creating your own custom filters.

On both platforms, we have render-time optimizations, and while the optimizations are slightly different due to the differing natures of the platforms, the idea is the same: Core Image will take care of doing the best render-time optimizations that are possible to render your requested graph. There are a few similarities and differences regarding color management, which is also something to be aware of. On iOS, Core Image supports either sRGB content or a non-color-managed workflow, if you decide that's what's best for your application.

On OS X, you can either have a non-color-managed workflow or you can support any ICC-based color profile using a CGColorSpace object. In both cases, on both iOS and Mac OS, the internal working space that Core Image uses for its filters is unclamped linear data, and this is useful to produce high quality, predictable results across a variety of different color spaces.

Lastly, there are some important differences in terms of the rendering architecture that's used. On iOS, we have a CPU rendering path, and we also have a GPU-based rendering path that's based on OpenGL ES 2.0. And on OS X, we also have a CPU and a GPU-based rendering path.

Our CPU rendering path is built on top of OpenCL, using its CPU rendering. And also, new in Mavericks, Core Image will also use OpenCL on the GPU, and I'd like to give you a little demo of that today 'cause we got some great benefits out of that.

For a wide variety of operations in Core Image, we get very, very high performance due to the fact that we leverage the GPU. For example, we can be adjusting this slider in real time on this 3K image, or I think it's 3.5K by something, and we're getting very, very fluid results on the slider, and that's because these are relatively simple operations.

One way we like to think about this, however, is how does this performance change as we start to do more complex operations, and how do we make sure that the interactive behavior of Core Image is as fluid as possible. So, we've been spending a lot of time on this in Mavericks, and we came up with this demo application to help demonstrate performance.

One thing that really makes it easier to see the performance, instead of trying to subjectively judge a slider, is this little test mode where it will do 50 renders in rapid succession, as quickly as possible. What it will do is take these filter operations that we've set up, prepend an exposure adjustment to the beginning of that filter chain, and adjust that exposure and render it 50 times with 50 different exposures. And that will force Core Image to re-render everything after it in the filter graph each time.

So, if we go through here, it will do a quick sweep of the image and you can see we're getting 0.83 seconds, and that's an interesting number: it turns out that's how long it takes to render 50 frames if you're limited by a 60 frames per second display. So, that's good, that means we're hitting 60 frames per second, or maybe we're actually even faster but we're limited by the frame rate, right.

The question is, however, what starts to happen as we start to do more complex operations, and obviously we can start throwing in very complex operations like highlights and shadows adjustments and, more importantly, very large blurs. This blur is actually even more than the 50-by-50 value that you're seeing in that slider. It's actually hundreds of pixels wide and hundreds of pixels tall, and that requires a lot of fetching from an image. So, obviously in this case, when we do a sweep, we're getting not quite real-time performance.

And while it's, you know, impressive, we could do better, and this is one of the reasons we spent a lot of time in Mavericks changing the internals of Core Image so that it would use OpenCL instead. As you see, as we turn on the OpenCL GPU path, we're now back down to 60 frames per second on this complex rendering operation. So, we're really pleased with these results. The great thing about this performance is that it particularly benefits operations that are complex, where we're doing large, complex render graphs. So that was the demonstration of OpenCL on the GPU on OS X Mavericks.

So, as I've talked about today, we have a lot of built-in filters, and I'd like to give you a little bit more detail on those built-in filters and some we've added, and give you some more information on how to use filters in your application. So, we have a ton of useful filters, and it's probably barely even readable to see them all here, so I just want to highlight some today.

So, first of all, the filters fall under different categories. We have a whole bunch of filters for doing color effects and color adjustments. In my slides earlier, I called out three as an example: color controls, hue adjustment, and sepia tone. The other ones work similarly. They take an input image, have parameters, and produce an output image.

We've also added some new ones in both iOS 7 and Mavericks that we think will be useful for a variety of different uses. We have, for example, color polynomial and color cross polynomial, which allow you to do polynomial operations that combine the red, green, and blue channels in interesting ways. You can actually do some really interesting color effects with these. We also have a class of filters which fall into either geometry adjustments or distortion effects.

For example, one of these is a fun effect called twirl distortion, and we can actually demo that real quickly here. You can see this adjusting a twirl on an image, and this is actually running in the presentation right now using Core Image, applied to a recorded movie.

We also have several blur and sharpen effects, and I mention blur and sharpen because blurs in particular are one of the most foundational types of image processing you can perform. Gaussian blur, for example, is used as the basis of a whole variety of different effects such as sharpening, edge detection, and the like.

We've also added some new blur and convolution effects to iOS 7 and Mavericks, and we've picked some that would be particularly general so that they can be used in a variety of applications. It's very, very common to use either 3x3 or 5x5 convolutions, and we've implemented those and optimized the heck out of them so you'll get really good performance. We've also added horizontal and vertical convolutions, which are useful if your convolution is a separable operation, and again, we've optimized the heck out of these.

We also have a class of filters called generators, and these are filters that don't take an input image but will produce an output image. These are things for effects like starbursts and random textures and checkerboard patterns, but we've added a new one in both iOS 7 and Mavericks called QR code generator, and this is a filter that takes a string as an input parameter, and also a correction-level setting, and will produce as its output a barcode image. So that can be useful in a lot of interesting applications as well.
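
For reference, a rough sketch of driving this filter from code; the message string and correction level here are just illustrative values:

    NSData *message = [@"https://example.com" dataUsingEncoding:NSUTF8StringEncoding];
    CIFilter *qrFilter = [CIFilter filterWithName:@"CIQRCodeGenerator"];
    [qrFilter setValue:message forKey:@"inputMessage"];
    [qrFilter setValue:@"M" forKey:@"inputCorrectionLevel"];   // L, M, Q, or H
    CIImage *qrImage = qrFilter.outputImage;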

We also have a class called the face detector, and this is not exactly a filter per se, but you can think of it as a filter in the sense that it takes input images and produces output data, and we've had this for a couple of releases now. The great thing is, starting now in iOS 7 and Mavericks, we've made some enhancements to it. In the past, you could give it a face and it would return the bounding rect for the face, and it would also return the coordinates for the eyes and mouth.

But starting in Mavericks and iOS 7, there are flags you can pass in that will also return information like whether a smile is present or whether an eye is blinking, so that's another nice new enhancement.
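
As a rough sketch of what that looks like in code (the image variable here is illustrative; the keys are the CIDetector option constants):

    CIDetector *detector = [CIDetector detectorOfType:CIDetectorTypeFace
                                              context:nil
                                              options:@{CIDetectorAccuracy: CIDetectorAccuracyHigh}];
    NSArray *faces = [detector featuresInImage:image
                                       options:@{CIDetectorSmile: @YES,
                                                 CIDetectorEyeBlink: @YES}];
    for (CIFaceFeature *face in faces) {
        NSLog(@"smile: %d, left eye closed: %d, right eye closed: %d",
              face.hasSmile, face.leftEyeClosed, face.rightEyeClosed);
    }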

So there's a brief overview of our 100-plus filters. One question we're commonly asked is how we choose which filters to add, or whether we can add this or that filter, and I wanted to just talk a moment about our process on that. We consider two key criteria. One is that a filter must be broadly usable: we want to make sure that we add filters like convolutions, which are useful in a wide variety of usages, so that we can implement them in a robust way and have them be useful to a wide variety of client needs. And also we want to make sure we choose the type of operations that can be well implemented and performant on our target platforms.

So, as I mentioned in my brief introduction at the very beginning of the presentation, you can chain together multiple filters, and I wanted to give you an idea in code of how easy this is to do. You start out with an input image, you create a filter object by saying "I'd like the filter with this name," and you specify a filter name like CISepiaTone, and at the same time, you specify the parameters such as the input image and the intensity amount, and once you have the filter, you can ask it for its output image.

And that's basically one line of code that will apply a filter to an image. If we want to apply a second filter, it's just the same idea, slightly different. What we're going to be doing here is picking a different filter. We'll pick hue adjustment in this case. And the key difference is that the input image, in this case, is the output image of the previous filter. So it's very, very simple: two lines of code and we've applied multiple filters to an image.
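
Sketched in code, that two-filter chain might look like this (given an NSURL; the parameter values are illustrative):

    CIImage *image = [CIImage imageWithContentsOfURL:url];

    CIFilter *sepia = [CIFilter filterWithName:@"CISepiaTone"
                                 keysAndValues:kCIInputImageKey, image,
                                               kCIInputIntensityKey, @0.8, nil];

    CIFilter *hue = [CIFilter filterWithName:@"CIHueAdjust"
                               keysAndValues:kCIInputImageKey, sepia.outputImage,
                                             kCIInputAngleKey, @2.09, nil];

    CIImage *result = hue.outputImage;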

The other thing that's important and great to keep in mind is that at the time you're building up the render graph here, the filter graph, there's no actual work being performed. This is all very fast and can be done very quickly. The actual work of rendering the image is deferred until we actually get a request to render it, and it's at that time that we can make our render-time optimizations to get the best possible performance on the destination context.

So another thing is that you can create your own custom filters, and you can actually do this on both iOS and OS X. We have over a hundred built-in filters, and on iOS 7, while you cannot create your own custom kernels, you can create your own custom filters by building up filters out of other built-in filters. And this is a very effective way to create new and interesting effects. And again, we've chosen some of the new filters we've added in iOS 7 to be particularly useful for this goal.

So how does this work? The idea is you want to create a CIFilter subclass, and in that filter, you want to wrap a set of other filters. So there's a set of things that you need to do. One is you need to declare properties for your filter subclass that declare what its input parameters are.

For example, you might have an input image or other numeric parameters. You want to override setDefaults so that the default values for your filter are set up appropriately if the calling code doesn't specify anything else. And lastly and most importantly, you're going to override outputImage, and it's in this method that you will return your filter graph.

And internally, Core Image actually uses this technique for some of its own built-in filters. As an example, there's a built-in filter called CIColorInvert which inverts all the colors in an image. And if you think about it, really, that's just a special case of a color matrix operation. So, if you look at our source code for color invert, all it does in its outputImage method is create an instance of the CIColorMatrix filter, passing in the appropriate parameters for the red, green, blue, and bias vectors.
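
A minimal sketch of that kind of wrapper subclass, along the lines of the color invert example (the class name is illustrative):

    @interface MyColorInvert : CIFilter
    @property (retain, nonatomic) CIImage *inputImage;
    @end

    @implementation MyColorInvert
    - (CIImage *)outputImage
    {
        // Invert each channel by scaling it by -1 and adding a bias of 1.
        return [CIFilter filterWithName:@"CIColorMatrix"
                          keysAndValues:kCIInputImageKey, self.inputImage,
                                        @"inputRVector", [CIVector vectorWithX:-1 Y:0 Z:0 W:0],
                                        @"inputGVector", [CIVector vectorWithX:0 Y:-1 Z:0 W:0],
                                        @"inputBVector", [CIVector vectorWithX:0 Y:0 Z:-1 W:0],
                                        @"inputBiasVector", [CIVector vectorWithX:1 Y:1 Z:1 W:0],
                                        nil].outputImage;
    }
    @end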

You can also do this kind of thing to build up other really interesting image effects. For example, let's say you wanted to do a Sobel edge detector in your application. Well, a Sobel edge detector is really just a special case of a 3x3 convolution. In fact, it's a very simple convolution: depending on whether you're doing a horizontal Sobel or a vertical Sobel, you're going to have some pattern of ones, twos, and zeros in your 3x3 convolution weights.

One thing to keep in mind, especially on iOS, is that we want to add a bias term to this convolution. The idea here is that we want to produce an output image that is grey where the image is flat, and black and white where there are edges. That's particularly important because on iOS, our intermediate buffers are 8-bit buffers for these types of operations, and they can only represent values between black and white.

One thing to keep in mind, however, is that by adding a bias, you are actually producing an infinite image, because outside the image, where the image is flat and clear, you're going to have grey as the output of this Sobel detector. So let me just give a little demo of that in action. In this particular demo, so it's a little bit more interesting to look at on stage, I'm recoloring the image so the flat areas look black and the edges look colorful.

So again, I've got an input video source and I can go and add a filter to it, and I'm going to add a custom filter that we've implemented called Sobel Edge Detector. So as we can see here, hopefully that shows up on the display, OK. You can see my glasses, and as I tilt my head, you can see the stripes in my shirt.

And in this case, the image is being recolored so that the flat areas look black and the edges look colorful. And one thing you'll see is that these circles above my head are actually colorful, and that's because the Sobel edge detector is working on each color plane separately.

And if there's a color fringe in the image, it will show up as a colorful edge in the Sobel edge detector. If we wanted to get rid of those colorful fringes, all we have to do is append another filter. We can add in color controls and then we can desaturate that. And now we've got [inaudible] monochrome edge detector, all right.
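
For reference, a Sobel filter along those lines can be built on the built-in 3x3 convolution, roughly like this (a sketch; inputImage stands for whatever source image you're filtering, the weights are the standard horizontal Sobel kernel, and the 0.5 bias keeps flat areas at mid-grey):

    CIVector *sobelX = [CIVector vectorWithValues:(CGFloat[]){ -1, 0, 1,
                                                               -2, 0, 2,
                                                               -1, 0, 1 }
                                            count:9];
    CIFilter *edges = [CIFilter filterWithName:@"CIConvolution3X3"
                                 keysAndValues:kCIInputImageKey, inputImage,
                                               @"inputWeights", sobelX,
                                               @"inputBias", @0.5, nil];
    CIImage *edgeImage = edges.outputImage;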

All right, so back to slides. And again, as I mentioned before, the source code for that filter is all available in the Core Image Fun House application. All right, so there's another great use for creating your own CIFilter subclasses, and that's if you want to use Core Image in combination with the new Sprite Kit API.

So it's a great new API, the Sprite Kit API, and one of the things it supports is the ability to associate a CIFilter with several objects in your Sprite Kit application. For example, you can set a filter on an effect node, you can set a filter on a texture, or you can set a filter on a transition.

And it's a great API, but one of the caveats is that you can only associate one filter. So if you actually want to have a more complicated render graph associated with either a transition or an object in your Sprite Kit world, then you can create a CIFilter subclass.

And what you need to do in that subclass is make sure that your filter has an input image parameter, and if you're writing a transition effect, you want to make sure it has an input time parameter. You can have other inputs, but you want to specify them at setup time, before you pass the filter to Sprite Kit.
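
As a quick sketch of hooking such a filter up in Sprite Kit (MyCustomFilter is a hypothetical CIFilter subclass like the ones described above, and the scene variable is illustrative):

    SKEffectNode *effectNode = [SKEffectNode node];
    effectNode.filter = [[MyCustomFilter alloc] init];   // hypothetical subclass with inputImage
    effectNode.shouldEnableEffects = YES;
    [scene addChild:effectNode];

    // Or, for a scene transition driven by a filter:
    SKTransition *transition = [SKTransition transitionWithCIFilter:[[MyCustomFilter alloc] init]
                                                           duration:1.0];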

So let me go on to the next section in my presentation today to talk about input images. So, the input to the filters is input images, and we have a wide variety of different ways of getting images into your filters. One of the most commonly requested is to use images from a file, and that's very, very easy to do. It's one line of code: create a CIImage from a URL.
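
In code, that's roughly:

    NSURL *url = [NSURL fileURLWithPath:@"/path/to/photo.jpg"];   // illustrative path
    CIImage *image = [CIImage imageWithContentsOfURL:url];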

Another common source is bringing data in from your photo library, and you can do that by just asking the ALAssetsLibrary class for a default representation. And then, once you have that, you can ask for the full screen image and that will return a CGImage, and once you have a CGImage, you can create a CIImage from that CGImage.
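
A sketch of that flow, assuming you already have an ALAsset in hand:

    ALAssetRepresentation *representation = [asset defaultRepresentation];   // asset is an ALAsset
    CGImageRef cgImage = [representation fullScreenImage];
    CIImage *image = [CIImage imageWithCGImage:cgImage];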

Another example is bringing in data from a live video stream, and this is the case that we use inside the Fun House application. In this case, you'll get a callback method to process a frame of video, and that will give you a sample buffer object. Once you have the sample buffer object, you can call CMSampleBufferGetImageBuffer and that will return a CVImageBuffer object. And in this case the CVImageBuffer is really just a CVPixelBuffer object, which you can use to create a CIImage.
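
In the capture delegate callback, that looks roughly like this:

    - (void)captureOutput:(AVCaptureOutput *)captureOutput
    didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
           fromConnection:(AVCaptureConnection *)connection
    {
        CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
        CIImage *frame = [CIImage imageWithCVPixelBuffer:pixelBuffer];
        // ...apply filters to frame and render it...
    }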

At the same time I'm talking about creating CIImages, we should also talk a little bit about image metadata, which is a very important thing about images these days. You can ask an image for its properties, and that will return a dictionary of metadata associated with that image. It will return a dictionary containing the same key-value pairs that would be present if you called the API CGImageSourceCopyPropertiesAtIndex. It contains, for some images, hundreds of properties.

The one that I want to call out today is the orientation property, kCGImagePropertyOrientation. And this is really important because, as we all know with our cameras today, the camera can be held in any orientation, and the image that's saved into the camera roll has metadata associated with it that says what orientation it was in.

So, if you want to present that image to your user in the correct way, you need to read the orientation property and apply the appropriate transform. The great thing is that metadata is all set up for you automatically if you use the imageWithContentsOfURL or imageWithData APIs. If you're using other methods to instantiate an image, you can specify the metadata using the kCIImageProperties option.
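
Reading that property back out is straightforward; applying the matching transform is up to your drawing code:

    NSDictionary *properties = image.properties;
    NSNumber *orientation = properties[(__bridge NSString *)kCGImagePropertyOrientation];
    // 1 means "up"; other values describe the rotation or mirroring to apply.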

Another thing we've added in both Mavericks and iOS 7 is much more robust support for YCbCr images. A CIImage can be based on bi-planar YCC 420 data, and this is a great way to get good performance out of video. On OS X, you want to use an IOSurface object to represent this data, and on iOS, you want to use a CVPixelBuffer to represent it.

The great thing is Core Image takes care of all the hard work for you: it will take this bi-planar data and combine the full-res Y channel and the subsampled CbCr plane into a full image, and it will also apply the appropriate 3x4 color matrix to convert the YCC values into RGB values. If you are curious about all the math involved in this, I highly recommend the book by Poynton, "Digital Video and HD: Algorithms and Interfaces", which goes over in great detail all the matrix math that you need to understand to correctly process YCC data.

The other thing you might want to keep in mind is that if you're working on 420 data, you might be working in a video-type workflow, and in that case, on Mac OS, you might want to tell Core Image to use the Rec. 709 linear working space rather than its default, which is generic RGB, and this can prevent some clipping errors due to the color matrix operations. The third section I want to talk about today is rendering Core Image output.

If you have an image and you've applied a filter, there are several ways to render the output using Core Image. One of the most common is rendering an image to your photo library, and again, this is very easy to do. One thing you want to be aware of is that when you're saving images to your photo library, you could quite easily be working on a very high resolution image, 5 megapixels for example, and resolutions of this size are actually bigger than the GPU limits that are supported on some of our devices. So, in order to render this image with Core Image, you want to tell Core Image to use a software renderer.

This also has the advantage that if you're doing a bunch of exports in the background, you can do this while your app is in the background, which you can't do if you use our GPU renderer. And once you've created the CIContext, there's an assets library method which we can use to write the JPEG into the camera roll. The key API to call is for Core Image to create a CGImage from a CIImage.
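
Putting those pieces together, a sketch of that export path might look like this (the filtered image variable is illustrative):

    // Use a CPU-based context so very large images and background exports work.
    CIContext *cpuContext = [CIContext contextWithOptions:
                                @{kCIContextUseSoftwareRenderer: @YES}];

    CGImageRef cgImage = [cpuContext createCGImage:filteredImage
                                          fromRect:filteredImage.extent];

    ALAssetsLibrary *library = [[ALAssetsLibrary alloc] init];
    [library writeImageToSavedPhotosAlbum:cgImage
                                 metadata:filteredImage.properties
                          completionBlock:^(NSURL *assetURL, NSError *error) {
                              CGImageRelease(cgImage);
                          }];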

Another common way of rendering an image is to render it into a UIImageView using a UIImage. And the code for this is actually very, very simple. UIImage supports CIImage, so you can create a UIImage from the output of a filter, and then you can just tell an image view to use that UIImage.
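
In code, that's just:

    UIImage *uiImage = [UIImage imageWithCIImage:filter.outputImage];
    imageView.image = uiImage;   // imageView is a UIImageView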

And this is very, very easy to code, but it's actually not the best from a performance perspective, and let me talk a little bit about that. Internally, what UIImage is doing is asking Core Image to render and turn the result into a CGImage, but what happens here is that when Core Image is rendering, it will upload the image to the GPU, it will perform the filter effect that's desired, and because the image is being read back into a CGImage, it's being read back into CPU memory. And then when it comes time to actually display it in the UI view, it goes back to being rendered on the GPU using Core Animation.

And while this is effective and simple, it means that we're making several trips across the boundary between the CPU and the GPU, and that's not ideal, so we'd like to avoid that. A much better approach is to take an image, upload it once to the GPU, and have CI do all the rendering directly to the display.

And that's actually quite easy to do in your application if you have a CAEAGLLayer, for example. At the time that you're instantiating your object, you want to create a CIContext at the same time: we create an EAGLContext of type OpenGL ES 2, and then we tell CI to create a context from that EAGLContext.

Then when it comes time to update the display in your update-screen method, we're going to do a couple of things here. We're going to ask our model object to create a CIImage to render. We're then going to set up the GL blend mode, let's say in this case, to source over.

This is actually a subtle thing that changed between iOS 6 and iOS 7. On iOS 6, we would always blend with the source-over blend mode. But there are a lot of interesting cases where you might want to use a different blend mode. So now, if your app is linked on or after iOS 7, you have the ability to specify your own blend mode.

And then once we set up the blend mode, we tell Core Image to draw the image into the context, which is based on the EAGLContext, and then lastly, to actually present the image to the user, we bind the render buffer and present it.
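
A sketch of that GPU path, roughly along the lines just described (the model call, destination rect, and renderbuffer variable are illustrative, and the usual GL view setup is omitted):

    // Once, at setup time:
    EAGLContext *eaglContext = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES2];
    CIContext *ciContext = [CIContext contextWithEAGLContext:eaglContext];

    // Each time the screen is updated:
    CIImage *image = [self.model imageToRender];     // illustrative model call
    glEnable(GL_BLEND);
    glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA);     // source-over for premultiplied alpha
    [ciContext drawImage:image inRect:destRect fromRect:image.extent];
    glBindRenderbuffer(GL_RENDERBUFFER, colorRenderbuffer);
    [eaglContext presentRenderbuffer:GL_RENDERBUFFER];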

The next thing I'd like to talk about is rendering to a CVPixelBufferRef, and this is another interesting thing that we talk about and show in the Fun House application, where you may want to be applying a filter to a video and saving it to disk. Now, if you want to make that a little bit more interesting a problem, while you're saving the video to disk, you may also want to present it to the user in a view so they can see what's being recorded. So this is actually an interesting example, and I want to talk a little bit about the code we have here.

All right, so again, all this code is available in the Core Image Fun House. What we have here is a scenario where we want to record video and also display it to the user at the same time. When our view object gets instantiated, when the app launches, we're going to be creating an EAGLContext as I demonstrated in the slide, and at the same time, we'll also create a CIContext with that EAGLContext.

Later on, when it comes time to render, we have a capture-output callback method, and again we're going to be given a sample buffer here. If we look further down in this code, we're doing some basic rectangle math to make sure we render into the correct place. We're going to take the source image that we get from the sample buffer and we're going to apply our filters to it.

Now we have this output image and we want to render that. There are two scenarios here. One is when the app is running and it's just displaying a live preview. In that case, in this code here, all we're doing is setting up our blend mode and rendering the filtered image directly to the context of the display with the appropriate rectangle.

In the case when we're recording, we want to do two things. First of all, we [inaudible] start up our video writer object. And then we're going to ask Core Video for a pixel buffer to render into, out of its pool. Then we will ask Core Image to render the filtered image into that buffer, and that will apply our filters to it.

Now that we have that rendered buffer, we're going to do two things with it. One is we're going to draw that image to the display with the appropriate rectangle, and then we're also going to tell the writer object that we want to append this rendered buffer, with the appropriate time stamp, to the stream. And that's pretty much all there is to it; when you run the Fun House application, if you try it after the presentation, you can both record into your camera roll and preview at the same time.
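
A condensed sketch of that recording branch (the context, writer adaptor, rectangles, and timestamp are illustrative and come from your own capture and AVAssetWriter setup):

    CVPixelBufferRef renderedBuffer = NULL;
    CVPixelBufferPoolCreatePixelBuffer(kCFAllocatorDefault,
                                       writerInputAdaptor.pixelBufferPool,
                                       &renderedBuffer);

    // Render the filtered image into the pool buffer...
    [ciContext render:filteredImage toCVPixelBuffer:renderedBuffer];

    // ...draw the same image to the display...
    [ciContext drawImage:filteredImage inRect:previewRect fromRect:filteredImage.extent];

    // ...and append the rendered buffer to the movie with its timestamp.
    [writerInputAdaptor appendPixelBuffer:renderedBuffer withPresentationTime:frameTime];
    CVPixelBufferRelease(renderedBuffer);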

So, just a few last-minute tips for best performance. Keep in mind that CIImage and CIFilter objects are autoreleased objects, so if you're not using ARC in your application, you'll want to use autorelease pools to reduce memory pressure. Also, it's a very good idea not to create a CIContext every time you render; it's much better to create it once and reuse it.

Also, be aware that Core Animation and Core Image both make extensive use of the GPU. So if you want your Core Animation to be smooth, you want to either stagger your Core Image operations or use a CPU CIContext so that they don't put pressure on the GPU at the same time. A couple of other tips and practices: be aware that the GPU context on iOS has limited dimensions; there's an input maximum image size and an output maximum image size, and there are APIs that you can call to query those.
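
On iOS, you can query those limits directly on the context, for example:

    CGSize maxInputSize = [gpuContext inputImageMaximumSize];     // gpuContext is a GPU-backed CIContext
    CGSize maxOutputSize = [gpuContext outputImageMaximumSize];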

It's always a good idea, from a performance perspective, to use as small an image as possible. The performance of Core Image is largely dictated by the complexity of your filter graph and by the number of pixels in your output image, so there are some great APIs that allow you to reduce your image size. One that we used in this example for Fun House is that when you ask for an image from your asset library, you can say "I want a full screen image" rather than a full resolution image, and that will return an appropriately sized image for your display.

So, the last section of our talk today is bridging Core Image and OpenCL. Early on in the presentation, I was talking about the great performance wins we're getting out of Core Image on Mavericks by using OpenCL. The great thing is we get improved performance due to advances in the OpenCL compiler, and also the fact that OpenCL has less state management. So, we got some great performance wins.

And the other great thing is there's nothing your application needs to change to get this; it happens automatically. And there's actually some great technology under the hood. All of the built-in kernels, and your custom kernels that are written in CI's kernel language, are automatically converted into OpenCL code. So, it's really some great stuff that we have to make all this work behind the scenes.

If we think about it a little bit, the Core Image kernel language has some really great advantages, though. With the CIKernel language, you can write a kernel once and it'll work across device classes and also across different image formats. And it also automatically supports great things like tiling of large images and concatenation of complex graphs. However, there are some very interesting image-processing operations out there that cannot be well expressed in Core Image's kernel language, due to the nature of the algorithm.

But some of those algorithms can be represented in OpenCL's language. And the question is, how can you bring the best of both of these together? How can you bridge Core Image and OpenCL together to get some really great image processing on Mavericks? To talk about that, I'm going to bring up Alexandre to talk about bridging Core Image and OpenCL; he has some great stuff to talk about.

Thank you, David. So, my name is Alexandre Naaman, and today I'm going to talk to you a little bit about how we can use Core Image and OpenCL together in order to create new and interesting effects which we wouldn't be able to do using just Core Image on its own.

So, we're going to start with an image that's pretty hazy that David took a little bit more than a year ago from an airship, and he said, "These pictures suck, how can we make them better? Can we get rid of the haze?" And for the sake of the demo, we did. So, if we look closely here, we're going to see a little animation of the desired result we're going to try to get: we're literally going to peel the haze off this image.

So, how are we going to do this? Well, the basic idea is that haze accumulates as a function of distance: the further away you get, the more haze there is. But if we were to look at any part of this image, so if we zoom into this little section here, there should be something that's black in this image, which is to say that if we were to look at this area under the arches, there should be either an object that's black or a really dark shadow. But because of the atmospheric haze that's accumulated, it's no longer black. It's colorful. And what we want to do is remove that color.

So, the question is, how are we going to find out what's black and apply that to a greater area so that we can eventually get a dehazed image. So, if we were to look at a pixel somewhere in the middle of this little area, and then search over a certain area in X, and another area in Y, we get a search rectangle.

Now, if we look at the search rectangle, we can see that there is going to be a local minimum that we can use. And once we know that that should have been black, we know how much haze has been added, because we know that it should have been black originally, and that amount of haze is probably going to be uniform over the entire rectangle.

So, if we look at this visually, we're going to compute a morphological min operation in the X direction first. So, we're going to search for the smallest value in the X direction, and then we're going to do the exact same thing in the Y direction. And then we get this kind of blocky pattern.

We're going to blur this result to some degree, and now we have a really good representation of how much haze there is, and we can subtract this from our original image using a difference blend mode. And if we do that, we get a beautifully dehazed image. In terms of workflow, the way we're going to do this is we're going to start with our input image, and we're going to perform a morphological min operation in the X direction first, where we search for a minimum value, and the values just kind of come together.

The next thing we're going to do is perform a morphological min operation in the Y direction. These are exaggerated a little bit, the search is a little bit larger than we would actually use in real life to get this effect, just so you can actually see something on these slides.

Once we've got the morphological min operation performed, we're then going to blur that result, and we get a nice gradient which we can then subtract from our original image, and we'll get our dehazed image. So if we take our input image and our Gaussian-blurred image and perform a difference blend, we'll get our desired result. The generation of the input image and the Gaussian blur can be done using Core Image.

But we want to use OpenCL to perform the morphological min in X and min in Y operations, because those are operations which would traditionally be difficult to do in Core Image's kernel language. The problem is that Core Image doesn't know about cl_mem objects and OpenCL doesn't know about CIImages. So, how are we going to do this? Well, we're going to use IOSurface.

And in OS X Mavericks, we've done a lot of work to improve how we use IOSurface and make sure that we stay on the GPU the whole time. And so we're going to go through all the steps today to show how we can do this, get maximum performance, and combine all these APIs together. So, let's take a look at our workflow to process this image.

So, we're going to start by asking Core Image to down-sample the image, because in order to generate the gradient which we're then going to use to perform the subtraction, we don't need to run at full resolution. The next thing we're going to do is ask Core Image to render into an IOSurface.

Traditionally speaking, most of the time you render to a buffer, as in writing a file to disk or directly displaying on screen, but you can also render to an IOSurface, which, as I mentioned, is something we've really improved in Mavericks. We're then going to use OpenCL to compute the minimum, using some kernels that we're going to go over in detail.

Once we've got the output from OpenCL, we're then going to take that IOSurface that was tied to the cl_mem object, and we're going to create a new CIImage. We're going to blur that result, perform a difference blend, and then we just render and we're done. So, let's take a look at all these steps in a little more detail.

So, first things first, we're going to import an image from a URL, so we just call CIImage imageWithContentsOfURL. We're then going to down-sample it so that we have fewer pixels to process. Again, in order to compute this gradient, as I mentioned earlier, we don't need the full resolution.

And then we're going to inset our rectangle a little bit, such that if we were to have generated an image that wasn't integral, we wouldn't end up with some pixels on the border of the image that have a little bit of alpha. We don't want to make our kernel more complicated than it needs to be, so we're just going to get rid of one pixel on the edge.

And then we're going to crop that image to get rid of that one pixel of border. Also, I should mention before I forget that the sample code for this will also be available for download on the session page at some point later on today. So, don't worry if you don't follow everything here.

But let's get to the next step of this process. The first thing we're going to do is create an IOSurface. In order to do that, we're going to specify a bunch of properties, including the bytes per row, bytes per element, width, height, and the pixel format OSType that we'll be using for our input surface. We then create an IOSurface using IOSurfaceCreate.

Once we've done that, we're going to want to create a CIContext, which again, as David mentioned earlier, we're going to want to hold on to, because if we perform this effect multiple times, it's good to hold on to all the resources tied to that context. We're going to make sure that we initialize our CIContext with the OpenGL context that we eventually plan on rendering with. And then we're going to actually ask Core Image to render our scaled-down image into the IOSurface, and we're going to make sure that our output color space is equal to the input color space of our original image.
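
A sketch of that Core Image side of the setup (the dimensions, CGL context, pixel format, color space, and scaled image are all illustrative, and error handling is omitted):

    // Describe and create the IOSurface that Core Image will render into.
    NSDictionary *surfaceProperties = @{
        (__bridge id)kIOSurfaceWidth:           @(width),
        (__bridge id)kIOSurfaceHeight:          @(height),
        (__bridge id)kIOSurfaceBytesPerElement: @4,
        (__bridge id)kIOSurfacePixelFormat:     @(kCVPixelFormatType_32BGRA)
    };
    IOSurfaceRef surface = IOSurfaceCreate((__bridge CFDictionaryRef)surfaceProperties);

    // Create (and hold on to) a CIContext tied to the CGL context we will render with.
    CIContext *ciContext = [CIContext contextWithCGLContext:cglContext
                                                pixelFormat:pixelFormat
                                                 colorSpace:colorSpace
                                                    options:nil];

    // Render the down-sampled image into the surface in its original color space.
    [ciContext render:scaledImage
          toIOSurface:surface
               bounds:scaledImage.extent
           colorSpace:colorSpace];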

So now, let's get into the nitty-gritty of what we'll be doing with OpenCL. First things first, we're going to want to create a CL context which is going to be able to share all the data from the IOSurfaces that we created with OpenGL. In order to do that, we're going to get the share group for the GL context that we plan on using, and we're going to create a CL context using that share group.

Once we've done that, we're then going to use a function called clCreateImageFromIOSurface2DAPPLE, which allows us to take an IOSurface and create a cl_mem object. Now, this is going to correspond to our input image, which is the down-sampled image that we asked CI to render initially.

But we also need an image to write out to, and our algorithm is going to be a two-pass, separable approach: we're going to perform our morphological min in the X direction and a morphological min in the Y direction. So, for the first pass, we're going to create an intermediate image, which is the output of the first kernel and which we'll then use as the input for the second kernel. So, we've got our intermediate image here. And then we're going to need one more IOSurface-based cl_mem object for our final output, which is what we're going to hand back to Core Image to do the final rendering.
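
The OpenCL side of that sharing setup looks roughly like this (the CGL context is the same one used above; the IOSurface-backed image creation uses the Apple extension named here, and its exact parameter list should be checked against cl_ext.h):

    // Create a CL context that shares resources with our CGL context.
    CGLShareGroupObj shareGroup = CGLGetShareGroup(cglContext);
    cl_context_properties properties[] = {
        CL_CONTEXT_PROPERTY_USE_CGL_SHAREGROUP_APPLE,
        (cl_context_properties)shareGroup,
        0
    };
    cl_int err = CL_SUCCESS;
    cl_context clContext = clCreateContext(properties, 0, NULL, NULL, NULL, &err);

    // Wrap each IOSurface (input, intermediate, output) as a cl_mem image, e.g.:
    //   cl_mem inputImage = clCreateImageFromIOSurface2DAPPLE(clContext, CL_MEM_READ_ONLY,
    //                                                         &format, width, height,
    //                                                         surface, &err);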

Let's take a look at what we want to do, conceptually. So, we've got a zoomed-in area of an image, and we're going to take a look at how the search is going to happen, and then eventually, we're going to go over each line of code that's involved in doing this. So, basically, we're looking for a minimum value, and we're going to initialize it to a very large value, which is to say 1, 1, 1, 1, so all white and opaque.

And we're going to look for a minimum value, and we're basically just going to start searching from our left and go to our right, and we're going to keep updating the value of that minV until we get the lowest value. Right now, we're looking at how this would work if we were operating in the X direction, and eventually we would do the exact same thing in the Y direction. That's going to keep animating, and I'm going to start talking a little bit about the source code.

So, that's the entire source code for the filter that we're going to want to create; it does the morphological min operation. So, first things first, we're going to have three parameters to this function. The first parameter is going to be our input image, the second parameter is going to be our output image, and the third parameter is going to tell us how far we need to search.

We're then going to create a sampler, and we're going to use unnormalized coordinates and clamp to edge, because the way this algorithm is designed, we will eventually search outside of the bounds of the image, and we want to make sure that we don't read black, but that we read the value of the pixel on the edge of the image, so that we don't bleed in black and then get incorrect results.

The next thing we're going to do is ask CL for the global ID in dimensions zero and one, and that's going to tell us effectively where we want to write our result out to, and we're also going to use this to determine where we should be reading from in our input image.

So, the next thing we do is initialize our minimum value to opaque white, and then we perform our for loop, which searches in this case from left to right. We're going to compute the new location where we're reading from, and we're going to do this span times two times, and we're going to offset the location by (0.5, 0.5) so that we're reading from the center of the pixel.

And we're also going to offset the X location by the value of i. And if we do that and then read from the image at that location, we'll get a new value, and we can compare that with our current minimum. And we just keep updating that, and when we're done, we write that value out to our output image, and we're done.

And this is going to get run for every pixel, in a very similar fashion to what you would do if you were writing a CIKernel. And although this may look like a relatively naive approach, due to texture caching on GPUs, this is about as optimal as it gets. So, you can actually perform this at really high speed.

And we've tried a bunch of other approaches, and this ended up being as fast as it gets. And if you were going to do this, you would also create a kernel that was very similar to this for the Y direction, and all you need to do is change the read location. Instead of incrementing i for X, you would increment i for Y, and the rest would remain the same.
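
As a sketch, the X-direction kernel just described could be written in OpenCL C along these lines (kernel and variable names are illustrative):

    __kernel void morphological_min_x(__read_only  image2d_t input,
                                      __write_only image2d_t output,
                                      int span)
    {
        // Unnormalized coordinates, clamped to the edge so we never read in black.
        const sampler_t smp = CLK_NORMALIZED_COORDS_FALSE |
                              CLK_ADDRESS_CLAMP_TO_EDGE |
                              CLK_FILTER_NEAREST;

        int x = get_global_id(0);
        int y = get_global_id(1);

        float4 minV = (float4)(1.0f, 1.0f, 1.0f, 1.0f);   // start at opaque white
        for (int i = -span; i <= span; i++) {
            // Offset by 0.5 to sample pixel centers; walk from left to right.
            float2 coord = (float2)(x + i + 0.5f, y + 0.5f);
            minV = min(minV, read_imagef(input, smp, coord));
        }
        write_imagef(output, (int2)(x, y), minV);
    }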

So let's take a look at what we need to do to actually run the CIKernels, I'm sorry, CL kernels. First things first, we're going to give it a character string with our code that we looked at earlier. We're then going to create a CL program from that code, with the context that we created earlier. We're then going to build that program for some number of devices.

And then we're going to create two kernels by looking into that program and asking it to look up the morphological min X and morphological min Y kernels. Once we have that, we're pretty much ready to go. All we need to do now is set up some parameters and ask OpenCL to run.

So, as we saw earlier, the CL kernel takes three parameters. The first parameter is the input image. The second parameter is our output image; in this case, when we're doing the morphological min operation in the X direction, it's going to be the intermediate image. And our third parameter is going to be the value of how far we want to search in the X direction.

Once we've done that, all we need to do is ask OpenCL to enqueue that kernel and run. And so here, we're going to say run the minimum X kernel, and we're going to give it some workgroup sizes; the math for figuring out the optimal workgroup size is in the source code that will be available later on today.

So once we've done our pass in the X direction, we're going to do the exact same thing, but instead of searching in X, we're going to search in Y, and we're going to run that. We just need to set our input image to the intermediate image, the output image to the output image, and a new span for Y; we call clEnqueueNDRangeKernel once again with the minimum Y kernel, and we're done.

The last thing we need to do before we hand this off back to Core Image is call clFlush, and the reason we do this is because we want to make sure that all the work from OpenCL has been submitted to the GPU before any additional work gets submitted, so that when we start using the IOSurface inside of Core Image, the data is valid. And so this is a really important step; otherwise you're going to see image corruption. And that's all we need to do with OpenCL.
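
Pulling those host-side steps together into a condensed sketch (error checks are omitted, the buffer and size variables are illustrative, and the workgroup size is left to the implementation here; the sample code computes an explicit one):

    const char *source = kernelSource;   // the OpenCL C source, as a C string
    cl_program program = clCreateProgramWithSource(clContext, 1, &source, NULL, &err);
    clBuildProgram(program, 0, NULL, NULL, NULL, NULL);

    cl_kernel minX = clCreateKernel(program, "morphological_min_x", &err);
    cl_kernel minY = clCreateKernel(program, "morphological_min_y", &err);
    cl_command_queue queue = clCreateCommandQueue(clContext, device, 0, &err);

    // First pass: input -> intermediate, searching in X.
    clSetKernelArg(minX, 0, sizeof(cl_mem), &inputImage);
    clSetKernelArg(minX, 1, sizeof(cl_mem), &intermediateImage);
    clSetKernelArg(minX, 2, sizeof(int), &spanX);
    size_t globalSize[2] = { width, height };
    clEnqueueNDRangeKernel(queue, minX, 2, NULL, globalSize, NULL, 0, NULL, NULL);

    // Second pass: intermediate -> output, searching in Y.
    clSetKernelArg(minY, 0, sizeof(cl_mem), &intermediateImage);
    clSetKernelArg(minY, 1, sizeof(cl_mem), &outputImage);
    clSetKernelArg(minY, 2, sizeof(int), &spanY);
    clEnqueueNDRangeKernel(queue, minY, 2, NULL, globalSize, NULL, 0, NULL, NULL);

    // Make sure all CL work is submitted before Core Image touches the IOSurface.
    clFlush(queue);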

The next thing we're going to do is, once we've got our output image from OpenCL, which corresponds to an IOSurface, we create a new CIImage from that IOSurface, and we specify the color space, which is identical to the color space that we used at the very beginning when we asked CI to render the down-sampled image. So, we're almost done.

The next thing we're going to do is blur the image, and in order to blur the image, the first thing we're going to do is perform an affine clamp, which is going to give us a very similar effect to what we did when we asked for clamp to edge, because we don't want to be reading in black pixels when we perform our blur.

So, we're going to do an affine clamp, we're then going to call CIGaussianBlur, specify a radius and ask for the output image, and then we're going to crop that back to the original size of the scaled image. So now we have the blurred image that we were looking for, which we can then use for the final difference blend.

Doing the difference blend is very simple. We just create a filter called CIDifferenceBlendMode. We set the input image to our original input image, which was scaled down in this case. We use the blurred image that we just created from the IOSurface as our background image, and in order to generate the final image, we just call valueForKey and ask for the output image.
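
A sketch of that final Core Image stage, assuming the OpenCL output IOSurface and the scaled-down input from earlier (the surface, color space, image variables, and blur radius are illustrative):

    // Wrap the OpenCL output IOSurface as a CIImage in the original color space.
    CIImage *minImage = [CIImage imageWithIOSurface:outputSurface
                                            options:@{kCIImageColorSpace: (__bridge id)colorSpace}];

    // Clamp the edges so the blur doesn't pull in black, then blur and crop back.
    CIFilter *clamp = [CIFilter filterWithName:@"CIAffineClamp"];
    [clamp setValue:minImage forKey:kCIInputImageKey];
    [clamp setValue:[NSAffineTransform transform] forKey:@"inputTransform"];   // identity

    CIFilter *blur = [CIFilter filterWithName:@"CIGaussianBlur"
                                keysAndValues:kCIInputImageKey, clamp.outputImage,
                                              kCIInputRadiusKey, @20.0, nil];
    CIImage *blurred = [blur.outputImage imageByCroppingToRect:scaledImage.extent];

    // Subtract the haze estimate from the scaled-down image with a difference blend.
    CIFilter *difference = [CIFilter filterWithName:@"CIDifferenceBlendMode"
                                      keysAndValues:kCIInputImageKey, scaledImage,
                                                    kCIInputBackgroundImageKey, blurred, nil];
    CIImage *finalImage = [difference valueForKey:kCIOutputImageKey];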

Once we've done that, the next thing we need to do is call CIContext drawImage with our final image at a certain location, and give it the bounds of what we would like to render, which in this case will be equal to the final image's extent. Now, this can get kind of complicated when you start generating a lot of effects, and in Xcode 5 on Mavericks, you can now hover over things in Xcode and get a description. And so in this case, you can see, as I'm hovering over an image, I can see the CIImage is based on an IOSurface and that it's got a certain size.

But we've also added something now on OS X Mavericks such that if you click on the Quick Look button, you'll actually get a preview of what your CIImage looks like, and we're hoping to have this for iOS 7 as well in the near future. So, this can really help when you're debugging your apps.

So, let's take a look at the effect once more in action, 'cause it was a lot of blood and sweat, so it's worth one more animation, I think. And in the meantime, we're going to talk about a few little caveats. One thing worth noting is that this algorithm performs really well at removing atmospheric haze, to the extent that if you were actually to run it on an image that had a lot of sky and didn't have any dark data or anything black, no shadows, no nothing, it would actually get rid of the sky.

So, it would look black, and that's not terribly interesting. So, you don't necessarily want to use this one wholesale, but there is a fair amount of literature out there about how this is implemented, and ours is actually pretty quick. You can get really good frame rates, and we're quite pleased with the results. The other thing is, because atmospheric haze basically accumulates exponentially with distance, if you take the logarithm of those values, you get effectively what corresponds to a depth map.

So once you have the depth map, you could do really interesting effects such as refocusing an image afterwards, which is the kind of thing where you can do fake tilt-shift effects, et cetera, and we talked at WWDC a few years ago about how you could do that with Core Image as well.

So, some additional information: Allan Schaffer is our graphics and game technologies evangelist, and you can reach him at [email protected]. There's documentation at developer.apple.com, and then of course you can always go to devforums.apple.com to talk to other developers and get in touch with us if you have any questions.

Related sessions and labs: there are a few additional sessions which you may also want to go back and look at later if you're curious about the technologies that we talked about earlier today. And on that note, I would like to thank you all once again for coming, and I hope you enjoy the rest of WWDC. Thank you.

[ Applause ]