Graphics, Media, and Games • iOS, OS X • 53:38
Core Image lets you perform sophisticated image processing operations and create stunning visual effects. Get introduced to the powerful capabilities of Core Image in iOS 5 for adjusting still images and enhancing live video. See how to harness new APIs for facial feature detection on iOS and Mac OS X, and get details about the latest filters.
Speakers: David Hayward, Piotr Maj, Chendi Zhang
Unlisted on Apple Developer site
Transcript
This transcript was generated using Whisper; it has known transcription errors. We are working on an improved version.
Good afternoon, everyone. Thank you so much for coming to today's session on Core Image. My name is David Hayward, and the mission of Core Image is to provide a powerful yet simple image processing framework for your applications to take advantage of. We'll be talking today about some additions we've made to Core Image on Mac OS X Lion, but even more so, it's my true pleasure today to talk about how you can take advantage of Core Image on iOS devices and in your iOS applications.
So, on with the slides. So what will we be talking about today? I'll be introducing you today to Core Image on iOS 5. We'll be talking about some of the key concepts, the basic architecture, the basic classes that you'll be using in Core Image, as well as some platform-specific details of Core Image on iOS.
After that, we'll go into deeper detail on how you can really use Core Image in your application, how to initialize Core Image data types, how to filter images, and how to render them. There's a lot of good detail there. And lastly, we'll be talking about some new additions to Core Image, which we're calling image analysis operations.
So, first off, introducing Core Image in iOS 5. Here's the basic concept of Core Image for those of you in the room who are maybe new to this technology. The basic premise is that you can use Core Image to filter an image to perform per pixel operations. In a very simple example, like the one I have presented here, we have an original image, and we're going to apply a filter to it, in this case the sepia tone filter, and produce a new result image.
So that's a very simple example, but as it turns out, you can use Core Image to chain together multiple filters. And this will allow you to produce much more complex and interesting effects. As an example here, we start with the image, we apply sepia tone to it, we then apply a hue adjustment filter to it, which turns it from a brownish tone image to a blue tone image.
And then we apply another filter, which will increase the contrast of the image to produce a more artistic effect. You can conceptually think of an intermediate image existing in between each of these filters. But in order to improve performance, what Core Image does is to concatenate these filters. Again, the key concept here is that we implicitly eliminate intermediate buffers to improve performance.
There are several other image-processing optimizations that we do on the filter chain. For example, both the hue adjustment and the contrast adjustment can be equally well represented as a color matrix in RGB space. So one of the optimizations Core Image can do is convert those filters to matrices, and since two matrices can be concatenated, we can combine them into one matrix. And again, this further improves performance and actually improves precision as well, by reducing the amount of intermediate math.
So now let me go into a little bit of detail about where Core Image fits into the rest of the operating system. The basic architecture is that we have applications at the top, and those applications are using image data types on the system, such as Core Graphics data structures, Core Video structures, or images that are returned from the Image I/O framework.
You can then input these into the Core Image framework. Core Image has a set of built-in filters, which you can then combine in new and interesting ways, and those are optimized by the Core Image runtime. The Core Image runtime has two basic ways of rendering. One is a GPU rendering path, and the other is a CPU rendering path. The GPU rendering path is based on OpenGL ES 2.0 on iOS, and our CPU rendering path is based on optimized CPU code, which will run using libdispatch to take advantage of multi-core devices.
So now that you have a basic understanding of the architecture of Core Image, let me talk about the classes that are available for your applications to use. There are three main classes that you can use in your application. The first is the CIFilter class. A CIFilter is a mutable object that represents an effect that you want to apply.
A CIFilter has one or more input parameters, and those input parameters can be either images or other numerical parameters. The output of a filter is a new image that's produced based on the current state of the filter at the time the output is requested.
The second key data type is a CIImage. A CIImage is an immutable object that represents the recipe for how to produce an image. It can represent either a file that's come directly from disk, or the output of a filter or chain of filters. And the third key data type is a CIContext. The CIContext is the object through which you render your images to your output. And as I alluded to earlier, a CIContext can be either CPU-based or GPU-based. We'll talk in some detail about that.
One thing to keep in mind is that both GPU and CPU contexts have their advantages; both have their uses in your application. For example, the CPU generally has improved fidelity because it's fully IEEE compliant. The GPU-based context has the advantage of performance in most situations. The CPU has the advantage that, when you're using Core Image to render on the CPU, you can do other operations in your foreground task while you do the rendering on a background thread. The GPU has the counter advantage that you can offload the work from the CPU, leaving it free to do other tasks. We'll give some examples later on in the presentation of when you would want to use the CPU and the GPU in actual use case scenarios.
So as I mentioned when I first came up on stage, it's very, very easy to do powerful image processing in Core Image. And just to show you how easy it is, I can show you in just four steps how to do rendering using Core Image. The first step is to create a CIImage object.
We call CIImage imageWithContentsOfURL:, and that instantiates a CIImage object. The next step is to create a filter object. We call CIFilter filterWithName:, and in this example we'll be creating a sepia tone filter. The next thing we do on that filter is set some parameters. We're going to set two parameters on this filter. The first one is the input image, and the second one is the amount of the sepia tone effect we want to apply.
The third thing we want to do is create a CIContext. We call contextWithOptions: passing nil to get the default context behavior. And fourth, we render the output image through that context. The first thing we do here is get the output image of the filter, and in this very simple example we create a new CGImage from the result of that filter by asking the context to create a CGImage. So that's how simple it is. We'll have some more interesting examples later on in the presentation.
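Putting those four steps together, here is a minimal Objective-C sketch of what the slide describes; the file path and intensity value are hypothetical placeholders.

```objc
#import <CoreImage/CoreImage.h>

// 1. Create a CIImage (hypothetical file path).
NSURL *url = [NSURL fileURLWithPath:@"/tmp/flower.jpg"];
CIImage *image = [CIImage imageWithContentsOfURL:url];

// 2. Create a filter and 3. set its input parameters.
CIFilter *filter = [CIFilter filterWithName:@"CISepiaTone"];
[filter setValue:image forKey:kCIInputImageKey];
[filter setValue:@0.8 forKey:@"inputIntensity"];

// 4. Create a context and render the filter's output to a CGImage.
CIContext *context = [CIContext contextWithOptions:nil];
CIImage *result = [filter outputImage];
CGImageRef cgImage = [context createCGImage:result fromRect:[result extent]];
// ...use cgImage, then CGImageRelease(cgImage) when you are done with it.
```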
As I mentioned before, Core Image has a set of built-in filters on iOS. In the seed we have today, these are the filters that are available. The filters fall into several different categories. We have filters for adjusting color values in images, such as hue adjust and sepia tone. We also have filters for adjusting the geometry of an image, such as applying an affine transform, cropping an image, or straightening an image. And we have a filter for doing compositing operations, which allows you to combine two images into a third image.
Next thing I'd like to talk about are some of the key differences between iOS and Mac OS. So in both cases, on iOS and Mac OS, we have a set of built-in filters. On iOS, in our current release, we have 16 built-in filters, and those are chosen to give you good performance and also good results for photo adjustment operations. On Mac OS, we actually have 130 built-in filters, plus the ability to add your own developer-provided filters.
The core API on iOS and Mac OS is very much the same. The three key data types that I mentioned earlier, CIFilter, CIImage, and CIContext, are the same on iOS and Mac OS. Mac OS does have some additional data types, such as CIKernel and CIFilterShape, which are useful if you're creating your own custom filters.
In terms of performance, there's also much in common between iOS and Mac OS. In both cases, we do render-time optimizations of your filter graph in order to produce the best possible performance. And lastly, on both iOS and Mac OS, Core Image supports both CPU- and GPU-based rendering.
There is a subtle difference in that on iOS, Core Image will use OpenGL ES 2.0, whereas on Mac OS, Core Image will use OpenGL. So now that we've given a basic introduction, I'd like to pass the stage over to Chendi, who will be giving a demonstration of how to use Core Image.
So this is an extremely simple demo app. Basically, it consists of a single UIImageView. And if I tap on the screen, I have another view controller that comes up as a popover. What happens is that every time I adjust one of these sliders, they're attached to the input of a particular filter. So if I adjust brightness up and down, the filter will render a new CGImage, wrap that in a UIImage, and set that as the contents of the UIImageView.
And so this happens all in real time, and it concatenates the filters. It runs pretty quickly. So I can also adjust it to be some setting I like, say like this, and hit Save. Then I can exit to the Photos app, and my photo is saved just like I saw it. And that's about it. Back to you, David.
So now that we have a tease, let's talk in a little more detail about how you can really take advantage of Core Image in your application. I'll be talking about three key areas: how to initialize a CIImage, how to filter a CIImage, and how to render an image. So the first step we'll be talking about is initializing a CIImage.
There are three basic ways to initialize a CIImage. The first is to initialize an image based on any of the Image I/O supported image file formats. You can do that by calling the CIImage class's initWithContentsOfURL: or imageWithData:. In both cases, you're passing either a URL or some data that contains JPEG, PNG, or other types of image content.
The second way to initialize a CIImage is with other key image data types that are available in our operating system. For example, you can initialize an image with a CGImage. On iOS, you can also initialize an image with a CVPixelBuffer, and this allows you to use Core Image in conjunction with a Core Video-based pipeline.
On Mac OS, there's a subtle difference in that the way to work with Core Video data types is to call imageWithCVImageBuffer:. They're conceptually very similar, just slightly different data types. The other option that's available on Mac OS is calling imageWithIOSurface:.
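As a quick sketch of these initialization paths on iOS, assuming a url, some jpegData, a cgImage, and a pixelBuffer already exist in your code:

```objc
// From a file URL or in-memory data (any Image I/O supported format):
CIImage *fromURL  = [CIImage imageWithContentsOfURL:url];
CIImage *fromData = [CIImage imageWithData:jpegData];

// From other image types already in your app:
CIImage *fromCG = [CIImage imageWithCGImage:cgImage];
CIImage *fromPB = [CIImage imageWithCVPixelBuffer:pixelBuffer];   // iOS, for Core Video pipelines
```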
The end result in any of these cases is a CIImage that you can then apply filters to. The third and last way that you can create a CIImage is with raw pixel data, and this is useful in many situations. You can call imageWithBitmapData:, passing in NSData that represents your pixel data, and you specify at that time what the bytes per row, the pixel format, and the color space are. So on the subject of color space, let me talk a little bit about color management in Core Image.
So on Mac OS, a CIImage can be tagged with any color space. This is provided using the Core Graphics data type CGColorSpace. If a CIImage is tagged with a color space, then all the pixels in that image are converted to a linear working space before filters are applied.
On iOS it is slightly different. A CIImage can be tagged with the device RGB color space. And if it is so tagged, then the pixels in the image are gamma corrected to linear space before filtering. One important option, if you saw in the previous slide that the initializing methods have an options parameter, is an option called kCIImageColorSpace. You can use this option to override the default color space on a CIImage.
One particular value that you may be interested in: you can set the value for this key to NSNull if you want to turn off color management. While in most situations you want to have color management turned on, or in the case of iOS have the gamma correction turned on, there are situations where you want to run your entire rendering pipeline without color management, perhaps to improve performance or to give a different effect.
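A minimal sketch of that override, assuming you already have a pixelBuffer (for example from a camera feed):

```objc
// Tell Core Image not to color manage / gamma correct this image.
NSDictionary *options = @{ kCIImageColorSpace : [NSNull null] };
CIImage *unmanaged = [CIImage imageWithCVPixelBuffer:pixelBuffer options:options];
```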
Another thing that's very interesting about CIImages on iOS is that we now have support for CIImage metadata. There's a new method on CIImage that allows you to get a metadata properties dictionary from a CIImage. The method is very obvious. It's just properties, and it returns an NSDictionary.
The contents of this dictionary are the same contents you would get if you were to call the Image I/O function CGImageSourceCopyPropertiesAtIndex. So inside that dictionary will be sub-dictionaries for TIFF metadata, EXIF metadata, and other properties that can come from an image file.
There's one piece of metadata that is particularly important that you should be aware of, and that is the property that's in this dictionary under the key kCGImagePropertyOrientation. All modern cameras support pictures being taken in any of the four possible orientations, and when they produce an image, embedded in the metadata of that image is a numerical value that tells you which orientation the camera was in.
In the vast majority of situations, your application will want to present the image with the desired direction of up. So you can use this orientation property to correctly display the image to the user. There are some other important aspects of the orientation which we'll discuss later in the presentation, so keep this in mind.
Another great feature of this new metadata support is that it's all fully automatic if you use the imageWithContentsOfURL: or imageWithData: initializers. If you are initializing your images using other data types, then you can specify the properties on your own by using the kCIImageProperties option.
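For example, reading the properties dictionary and pulling out the orientation might look like this; the URL is a hypothetical placeholder.

```objc
#import <ImageIO/ImageIO.h>

CIImage *image = [CIImage imageWithContentsOfURL:url];
NSDictionary *properties = [image properties];

// The same keys CGImageSourceCopyPropertiesAtIndex would return.
NSNumber *orientation = [properties objectForKey:(__bridge NSString *)kCGImagePropertyOrientation];
NSDictionary *exif    = [properties objectForKey:(__bridge NSString *)kCGImagePropertyExifDictionary];
NSLog(@"orientation = %@, EXIF = %@", orientation, exif);
```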
So now we know quite a bit about instantiating an image object. The next important step is applying filters to that image. So let's talk about that in a little bit more detail. So as I mentioned a few slides back, there's a different set of built-in filters on Mac OS and iOS. And in both cases, our set of filters that are supported will grow in the future.
Your application may want to query at runtime the set of filters that are available on that OS. This API is a class method on CIFilter called filterNamesInCategory:. In most cases, if you want all the filters that are built in, you would pass in the category kCICategoryBuiltIn. There are other categories if you just want to get the set of geometry filters or other subsets of filters.
The result of this call is an array of NSStrings, the names of all the filters that are built in. Once you have the name of a filter, you can use that name to instantiate a filter object. As I showed a few slides back, you can call filterWithName: with CISepiaTone as an example.
Once you have a filter instance, you may want to know what are the possible parameters that you can set on this filter. Well, you can go and look that up on our online documentation, but there's also some runtime documentation of each filter's parameters that is available to you.
There's a call you can make, the filter's attributes method, and this returns a data structure that gives you information about each of the input parameters on the filter. For example, it gives you the name of each input, and it gives you the expected type of each input, for example whether the expected parameter is an image, a vector, or a number.
And it also gives you, wherever possible, the common values that are important for that input. For example, it'll give you the default value for that input, or the identity value, or the minimum or the maximum value. When Chendi gave the demo a minute ago, he was showing you some sliders that were adjusting a filter. What you can do is use this API to determine what the minimum and maximum values for that slider should be.
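Here is a small sketch of querying the built-in filter names and using a filter's attributes to configure a slider; the attribute keys shown are the standard kCIAttribute constants.

```objc
// Discover the built-in filters at runtime.
NSArray *filterNames = [CIFilter filterNamesInCategory:kCICategoryBuiltIn];
NSLog(@"Built-in filters: %@", filterNames);

// Look up the slider range for CISepiaTone's inputIntensity parameter.
CIFilter *sepia = [CIFilter filterWithName:@"CISepiaTone"];
NSDictionary *intensityAttrs = [[sepia attributes] objectForKey:@"inputIntensity"];
NSNumber *sliderMin    = [intensityAttrs objectForKey:kCIAttributeSliderMin];
NSNumber *sliderMax    = [intensityAttrs objectForKey:kCIAttributeSliderMax];
NSNumber *defaultValue = [intensityAttrs objectForKey:kCIAttributeDefault];
// Use these to set a UISlider's minimumValue, maximumValue, and value.
```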
So now that we know a little bit about filters, let's talk about setting parameters on them. This is the next step. Input parameters on filters are specified using standard key-value conventions. So we can call setValue:forKey:. Most filters take an input image as one of their keys, so we have a common key for that called kCIInputImageKey. For other keys, you call setValue: with, let's say, a number. In the case of sepia tone, we want to set the amount, or intensity, of the sepia tone.
Once you've set all the input parameters, you can ask the filter for its output image. The way this is traditionally done on Mac OS is to call the filter's valueForKey: with the output image key, and this works on both platforms. On iOS, however, we've actually implemented the output image as a property. That gives you two other ways to get the output image: you can either call [filter outputImage] or filter.outputImage. These are all functionally equivalent; it's just a question of what your preferred coding style is.
Now, these three previous things that I've mentioned, instantiating a filter, setting the inputs, and getting the output, there's a shortcut which allows you to do all of those on one line. And I'm using that shortcut convention in these slides because it gives us a little bit more room on the presentation.
The idea behind this shortcut is you just call CIFilter filterWithName:keysAndValues: and, right after the name, specify a nil-terminated list of keys and values. And lastly, after the closing bracket, we call outputImage to get the output image. So that's a nice compact representation for applying a filter to an image.
So again, as I mentioned, this is the one line of code to apply a single filter to an image. As I mentioned in the introduction, however, you can chain together multiple filters. So how do we do that? Well, it's actually the same idea. It's very, very simple. The idea is we just chain in another filter, but now the input to the second filter is the output from the first filter. And that's it.
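A minimal sketch of that chaining with the keysAndValues: shortcut, assuming image is an existing CIImage; the parameter values are just examples.

```objc
// First filter: sepia tone.
CIImage *sepiaImage =
    [[CIFilter filterWithName:@"CISepiaTone"
                keysAndValues:kCIInputImageKey, image,
                              @"inputIntensity", @0.8, nil] outputImage];

// Second filter: hue adjust, fed by the output of the first.
CIImage *finalImage =
    [[CIFilter filterWithName:@"CIHueAdjust"
                keysAndValues:kCIInputImageKey, sepiaImage,
                              @"inputAngle", @2.0, nil] outputImage];
```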
One important thing to keep in mind is that no pixel processing is actually occurring while you're building this filter chain. All the real work of rendering the images is deferred until a render is requested. And that brings us up to the third section of this introduction, which is more detail about how to render. And to discuss that, I'd like to bring Chendi back up on stage.
Thanks, David. All right, at this point, you know how to create a filter chain and get an output image from your last filter. So what can you do with this? As we saw in the earlier demo, an easy way to get it on screen is to use a UIImageView and set the image as its contents. You can also save the result to the photo library, and if you're writing an image manipulation app, this will be a pretty common operation for your users. Another method of drawing on screen is using a CAEAGLLayer-backed view.
This is actually more performant than using a UIImageView, and it's similar to how most OpenGL apps on the App Store render their content. And lastly, we're going to discuss how to render to a CVPixelBuffer. You can use this rendering method to have Core Image process video frames in a larger AV Foundation or Core Video-based pipeline.
So if you want to set a UIImageView's contents to your rendered CIImage, you need a context and you need the image you want to render. With one line, you can just call the context's createCGImage:fromRect: with the area of interest, the CGRect that you want to render. This will give you a new CGImageRef.
Now, you can wrap this easily in a UIImage for display in your image view. Also note that since you have a CGImage, you can use normal Core Graphics function calls and also set the contents of a plain CALayer, for instance; anything that you would normally use a CGImageRef for.
One thing to note is that there is an orientation flag that you might want to set for the UIImage. When you take a photo with your iPhone or iPad 2, and it's in landscape or upside down, there's going to be a metadata key that says what the orientation is. Most SLRs and point-and-shoots also have information similar to this. Because of this, you're going to want to map the CGImage orientation value to the equivalent UIImageOrientation enum so the UIImageView orients the image properly.
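A sketch of that mapping and of wrapping the rendered CGImage in a UIImage; the helper function name is hypothetical, the value mapping follows the usual EXIF orientation convention, and context, result, image, and imageView are assumed from the earlier steps.

```objc
// Map an EXIF / kCGImagePropertyOrientation value (1-8) to UIImageOrientation.
static UIImageOrientation UIImageOrientationFromEXIF(int exifOrientation) {
    switch (exifOrientation) {
        case 1: return UIImageOrientationUp;
        case 2: return UIImageOrientationUpMirrored;
        case 3: return UIImageOrientationDown;
        case 4: return UIImageOrientationDownMirrored;
        case 5: return UIImageOrientationLeftMirrored;
        case 6: return UIImageOrientationRight;
        case 7: return UIImageOrientationRightMirrored;
        case 8: return UIImageOrientationLeft;
        default: return UIImageOrientationUp;
    }
}

// Render, then wrap the CGImage with the correct orientation for display.
CGImageRef cgImage = [context createCGImage:result fromRect:[result extent]];
int exif = [[[image properties] objectForKey:(__bridge NSString *)kCGImagePropertyOrientation] intValue];
imageView.image = [UIImage imageWithCGImage:cgImage
                                      scale:1.0
                                orientation:UIImageOrientationFromEXIF(exif)];
CGImageRelease(cgImage);
```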
[Transcript missing]
So now that you know that, how do you save to the photo library? Well, the code for that is identical with the CPU context as it is with the GPU context. You call the same method, createCGImage:fromRect:, to get the CGImageRef. Then you can use the Assets Library framework to call writeImageToSavedPhotosAlbum to save your image to the photo album. And those two or three lines are all you need to save to the camera roll.
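A minimal sketch of that save, assuming a rendered result, a context, and the original image as before; here the image's own properties are passed along as metadata, though you could also pass nil.

```objc
#import <AssetsLibrary/AssetsLibrary.h>

CGImageRef cgImage = [context createCGImage:result fromRect:[result extent]];
ALAssetsLibrary *library = [[ALAssetsLibrary alloc] init];
[library writeImageToSavedPhotosAlbum:cgImage
                             metadata:[image properties]
                      completionBlock:^(NSURL *assetURL, NSError *error) {
                          // Release the CGImage once the write has completed.
                          CGImageRelease(cgImage);
                      }];
```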
The other method of rendering I discussed is CAEAGLLayer-backed views. So it turns out that if you use a UIImageView to display your image, there's going to be an implicit call to glReadPixels every time you create the CGImage. And after you set the contents of the UIImageView, Core Animation has to do another GPU upload to get the contents into its own memory. So this turns out to be slower than you want. In our example app, since we weren't doing that complex a filter chain, it was still pretty performant. But if you've got any sort of complex filter chain or large images, it might be too slow.
When you have stuff like that, you're going to want to use a CAEAGLLayer to display the images. And again, this is how most OpenGL games display their content. So what you do is you create a CAEAGLLayer, which has a corresponding EAGLContext that contains the frame buffer and the render buffer. One thing to note is you must create your CIContext with the exact same EAGLContext that backs the CAEAGLLayer.
Normally in an OpenGL game, what happens is you'll issue a bunch of GL calls to draw stuff, and then when you're ready to display, you call glBindRenderbuffer and then EAGLContext presentRenderbuffer:. For Core Image, it's very similar. Instead of the GL calls, you set up your filter chain and finally call the new method drawImage:atPoint:fromRect: on the context.
This will draw your CIImage onto the render buffer for that EAGLContext. You're not done at this point because you still have to display it, so you still have to call glBindRenderbuffer and presentRenderbuffer: like you would in a normal game. And to demonstrate that, I have a demo app.
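A rough sketch of that flow, assuming the framebuffer and a colorRenderbuffer for the CAEAGLLayer have already been set up the way an OpenGL ES app normally would, and that result is the filtered CIImage from your chain.

```objc
#import <OpenGLES/EAGL.h>
#import <OpenGLES/ES2/gl.h>

// One-time setup: share the layer's EAGLContext with Core Image.
EAGLContext *eaglContext = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES2];
CIContext *ciContext = [CIContext contextWithEAGLContext:eaglContext];

// Per frame: draw the filtered CIImage into the render buffer, then present it.
[ciContext drawImage:result atPoint:CGPointZero fromRect:[result extent]];
glBindRenderbuffer(GL_RENDERBUFFER, colorRenderbuffer);
[eaglContext presentRenderbuffer:GL_RENDERBUFFER];
```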
So in this app, I'm using AV Foundation to stream live video from the front camera as CV pixel buffers. I then wrap the CV pixel buffers in a CI image and use the code from the previous slide to display on screen. And just like the first demo, I can use the UI popover to bring up a little bunch of sliders and adjust stuff in real time, like contrast, saturation, temperature, and so on. And this all happens in real time, and that's about it. So, yeah.
And the last method I'm going to discuss today is rendering to a CVPixelBuffer. The API for this is pretty simple. It's similar to the createCGImage method, except you need a CVPixelBuffer to render into. Once you have your context and your image, call render:toCVPixelBuffer:bounds:colorSpace: with the appropriate bounds and color space. This allows you to use Core Image to process individual frames within a larger Core Video or AV Foundation pipeline. And to demonstrate this, I have a slightly modified demo to show you.
So this is identical to the previous demo, except I have an additional record button here. So when I hit record, instead of rendering directly to the screen, my app first renders to a CV pixel buffer, passes that on to an H.264 encoder, and also uses that same image to display on screen. So I can hit record, adjust stuff like hue, tint, temperature, saturation, just like before. Stop recording. If I go to my Photos library, yeah, so you can see all the live edits I made show up in the encoded video. So yeah, that's it.
So I'll start with the previous demo that didn't do the recording. The AV Foundation callback looks like this. Basically, we grab the CVPixelBuffer from the sample buffer, create a CIImage around it, and we turn off color matching, so we pass in NSNull for the color space. Now, CVPixelBuffers come in rotated by 90 degrees, so we have to rotate the image for display on screen by applying a 90-degree CGAffineTransform.
We pass the image through our set of filters. In this case, the six sliders corresponded to inputs on three separate filters. Once we have our result image, we can draw directly to the GLES context with the drawImage:inRect:fromRect: call, and then present it to the screen.
And for the second demo, It's slightly more complex. There is a bunch of AV Foundation code I'm not going to include. It's mostly boilerplate code to set up your encoder and append sample buffers to that. But basically what you have now is the same stuff. You have the pixel buffer.
In this case, since we're rendering to a pixel buffer, we set up a CVPixelBufferPool to handle the buffers more efficiently, and you can read more about that in the AV Foundation docs. We create the CIImage again with the null color space, and filter the image. And since we're not rendering to the screen yet, we don't have to rotate the image until we finally render. So we initially render to a CVPixelBuffer here.
We wrap that CVPixelBuffer in another CIImage, rotate it, draw the rotated one onto the screen, and then append the non-rotated one to our AVAssetWriter. And that's how you can encode video and display it on screen at the same time. Back to slides.
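A condensed sketch of what such a capture callback might look like; the filter-chain helper, the pixel-buffer-pool helper, and the ciContext property are hypothetical names for pieces assumed to exist elsewhere, and the exact rotation transform is illustrative.

```objc
#import <AVFoundation/AVFoundation.h>
#import <CoreImage/CoreImage.h>

- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{
    CVPixelBufferRef cameraFrame = CMSampleBufferGetImageBuffer(sampleBuffer);

    // Wrap the camera frame with color management turned off.
    CIImage *frame = [CIImage imageWithCVPixelBuffer:cameraFrame
                                             options:@{ kCIImageColorSpace : [NSNull null] }];
    CIImage *filtered = [self applyFilterChainToImage:frame];          // hypothetical helper

    // Render the un-rotated result into a pixel buffer for the encoder...
    CVPixelBufferRef outputBuffer = [self pixelBufferFromPool];        // hypothetical helper backed by a CVPixelBufferPool
    [self.ciContext render:filtered
           toCVPixelBuffer:outputBuffer
                    bounds:[filtered extent]
                colorSpace:NULL];

    // ...then rotate a wrapped copy of that buffer for on-screen display.
    CIImage *forScreen = [[CIImage imageWithCVPixelBuffer:outputBuffer]
        imageByApplyingTransform:CGAffineTransformMakeRotation(-M_PI_2)];
    [self.ciContext drawImage:forScreen atPoint:CGPointZero fromRect:[forScreen extent]];
    // Finally, bind and present the render buffer, and append outputBuffer
    // to the AVAssetWriter's pixel buffer adaptor (omitted here).
}
```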
So there are a couple of key performance best practices to keep in mind. First, CIImages and CIFilters are autoreleased. So if you're doing a complex or long chain of filtering and rendering, you're going to want to wrap your code in autorelease pools. That way you can reduce memory pressure in case a CIImage is holding onto a large image in the background.
When you're creating your CVPixelBuffers, try to include the kCVPixelBufferIOSurfacePropertiesKey key. This creates the CVPixelBuffer with an IOSurface backing, and that allows us to do some secret sauce stuff in the background to make the rendering faster. The next point is very key. Do not create a CIContext every time you render. That's equivalent to writing an OpenGL app and creating a GL context for every frame. And CIContexts generally include an EAGLContext, so this is going to really slow down your app if you do so.
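For instance, a pixel buffer created with the IOSurface attribute might look like this; the dimensions are placeholders.

```objc
#import <CoreVideo/CoreVideo.h>

// Create an IOSurface-backed pixel buffer that Core Image can render into efficiently.
size_t width = 1280, height = 720;   // example dimensions
NSDictionary *attributes = @{ (__bridge NSString *)kCVPixelBufferIOSurfacePropertiesKey : @{} };
CVPixelBufferRef pixelBuffer = NULL;
CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                    kCVPixelFormatType_32BGRA,
                    (__bridge CFDictionaryRef)attributes,
                    &pixelBuffer);
```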
Another point is that because Core Animation and Core Image both use the GPU, they could contend for the same resources. So you should try avoiding scheduling CA animations while you're performing CI rendering on the GPU. If you've written enough Core Animation code, you know that once you have a certain number of layers or animations, it starts to get a bit stuttery when things get too complex.
And this will only get worse if you try to render with Core Image at the same time as these animations. So either use a CPU context to render when you have animations, or just schedule a very simple animation or no animation at all on screen while you're performing your rendering.
Note again that both contexts have maximum image sizes. You can query the sizes with the inputImageMaximumSize and outputImageMaximumSize calls on the context on iOS. Again, this is 2K by 2K on older iOS hardware, 4K by 4K on the iPad 2, and the CPU context supports up to 8K by 8K.
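Querying those limits is just two calls on the context named above; a quick sketch, assuming the context from earlier:

```objc
// Check the context's limits before deciding whether to downsample or tile.
CGSize maxInput  = [context inputImageMaximumSize];
CGSize maxOutput = [context outputImageMaximumSize];
NSLog(@"max input: %@  max output: %@",
      NSStringFromCGSize(maxInput), NSStringFromCGSize(maxOutput));
```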
And the last point is pretty important too. Performance generally scales linearly with the number of output destination pixels you have to render. So if you have an iPad 2, which is a 1024 by 768 screen, there's no point rendering a 4 megapixel image and then displaying that on screen scaled down. What you should do is try to render a screen size image for optimal performance.
And when you're performing your final save to disk or to the photo library, you can use the CPU context to render the large image on a background thread. And to get smaller images, you can use either Core Graphics or Image I/O, which have APIs to give you either thumbnails or cropped sub-rects of your current image. And with that, I'm going to give the mic back to David.
Thank you very much, Chendi. So the third section of today's discussion I'd like to talk about is something we're calling image analysis. Previously we've been talking about image filtering, and filtering has the properties that it's very cheap to set up a filter chain and all the work is done at render time.
Image analysis, however, goes beyond those normal constraints. The idea is that for a large class of operations, you may need to read the entire image in order to figure out what you want to do to the image. So we have two types of image analysis tools that are now available in Core Image that we're introducing to you on iOS. One is face detection, and the second is auto enhance, and we'll talk about them in that order.
So first, let's talk about Core Image face detection. So we all have images in our photo library with faces in them, and one very nice thing to do is to know where those faces are in the image. Note that face detection is a different problem than face identification or face recognition.
Today we're talking about face detection. And there's a lot of stuff you can do if you know where the faces are in an image. There are lots of interesting optimizations you can do, such as cropping or improving the quality of an image, if you know where the faces are.
So, how can we do that? Well, we have a new Core Image based face detection API. This API is available and identical on both iOS 5 and Mac OS X Lion. It's a very simple API to use. It has two classes. The first class is CIDetector, and the second class is the result of a detection, which is a CIFeature, specifically a CIFaceFeature. We've designed this API to be feature upgradable; while today we just have face features, you can imagine that could grow in the future to different types of detected objects.
So, how do we use this API? As I said, it's very simple. The first thing we do is create a detector. We call CIDetector detectorOfType:, and this time we specify that we want the one and only type, which is CIDetectorTypeFace. At this time, we can also specify some options. So, what options are important? Well, there's a very important option which allows you to tell the detector whether to be fast or thorough.
Face detection, like many problems in life, lets you choose between having the best answer in a long amount of time or a good answer in a shorter amount of time. And while in many cases the low-accuracy, high-performance setting is suitable, there may be certain applications that want to take longer and produce a better answer, for example finding smaller faces. So you have that choice in your application to direct the detector to be fast or thorough.
The second step is to search for the features in an image. We use the detector object we created, and we pass in the image we want to look for features in. We can specify options, and the result is an array of detected features. In this case, the options are also important. In the case of faces, the detector needs to know what direction is up in the image in order to look for upright faces. So this is a very important property. The way you pass it in is via the key CIDetectorImageOrientation.
In the vast majority of situations, the value you're going to pass in for this key is the orientation metadata that came from the image. So in this case, we're doing two things: we're getting the image properties and getting the orientation key from them, and then passing that in as the value for the key CIDetectorImageOrientation.
And again, the purpose of this is to tell the detector what direction up is. Once you call the detector and it returns an array of features, you can loop over those features to do whatever you want. In this very simple code example, we're going to do a few things.
The first thing we're going to do is loop over the features f in the array and ask each feature f for its bounds. All features have bounds, and those bounds are a CGRect in the Core Image coordinate space, which is lower-left based. So in this example, we're just going to print that to the console using NSLog.
The other thing we can do on a face feature is query it for interesting sub-features, such as the left eye location, the right eye location, and the mouth location. And again, this is very simple. The CIFaceFeature class has properties such as hasLeftEyePosition, which returns a Boolean.
And if that is true, then you can query for the leftEyePosition property, which returns a point with an X and a Y value. So again, it's very, very simple to use in your application. To show this in practice, on both Mac OS and iOS, I'm going to welcome Piotr Maj up to the stage to talk about how to use our face detection API.
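Before the demos, here is a compact sketch of the whole flow just described; the URL is a hypothetical placeholder.

```objc
CIImage *image = [CIImage imageWithContentsOfURL:url];

// 1. Create a face detector, choosing accuracy vs. speed.
CIDetector *detector =
    [CIDetector detectorOfType:CIDetectorTypeFace
                       context:nil
                       options:@{ CIDetectorAccuracy : CIDetectorAccuracyHigh }];

// 2. Tell the detector which way is up, using the image's own metadata.
NSNumber *orientation = [[image properties] objectForKey:(__bridge NSString *)kCGImagePropertyOrientation];
NSDictionary *options = orientation ? @{ CIDetectorImageOrientation : orientation } : nil;

// 3. Loop over the detected features.
for (CIFaceFeature *face in [detector featuresInImage:image options:options]) {
    NSLog(@"face bounds: %@", NSStringFromCGRect([face bounds]));
    if (face.hasLeftEyePosition)  NSLog(@"left eye:  %@", NSStringFromCGPoint(face.leftEyePosition));
    if (face.hasRightEyePosition) NSLog(@"right eye: %@", NSStringFromCGPoint(face.rightEyePosition));
    if (face.hasMouthPosition)    NSLog(@"mouth:     %@", NSStringFromCGPoint(face.mouthPosition));
}
```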
So I would like to show you today how to take advantage of the new CIDetector API, first on Mac OS, then on iOS. The best way to experiment with graphical filters on Mac OS is Quartz Composer. So let me show you how to wire up a simple Quartz Composer composition which performs face detection.
As you see, the output image consists of the background image and face frames, one frame on top of each face detected in this image. So, what does this composition look like? First, let's do the background. The background is simple. We just start with the input image and pass it directly to a Billboard patch, which is responsible for displaying the image on screen.
That was quick and easy. Now, how do we draw the face frames on top of the faces? To do this, we need to use a new patch in Quartz Composer on Lion, which is called Detector. It's located here. It takes an image as an input and returns an array of face features. We can look at it by pointing the mouse over these features. As we see, we have four faces, four elements in the structure. The rest of the job is done in a loop, represented by an Iterator patch. I will now jump into this Iterator patch by double-clicking on it.
Let's make it bigger. It may look scary, but actually it's very simple. All it comes down to is we extract from each face feature its individual components. So here we extract X position, Y, width, and height. We can look at these values. I don't know if you see it, but this value is pretty low.
It's 0.26 for the X value. Why is it so low? It is so low because the Detector patch uses Quartz Composer's normalized coordinate system, which goes from minus one to one. So in order to use it on a real image and actually see the result on screen, we need to convert it to the CGImage coordinate system, which is pixel based. This job is done in a Core Image filter, which uses a simple kernel.
Here is the part which does the math, converting the coordinates to pixels. It's also responsible for drawing the square on top of the face. The output of this filter is sent directly to the Billboard patch, and the face frame is drawn on top. We repeat this step for every face we found in the image, and this produces the output image. So this works on still images, but there is absolutely no reason why we could not attach a video input here.
So it's tracking me in real time. That's the Mac OS part. Now for the next demo, I need to switch to my iPad. OK, here it is. This is a simple app which allows me to pick an image. And we have faces detected in just a fraction of a second.
Two faces, or even five, with this simple app. And how does the code look? The interesting part starts in the image picker controller's didFinishPickingMediaWithInfo: callback from UIImagePickerController. We get a reference to the image the user has chosen from the library, and perform the detection in the detect-faces-and-draw-face-boxes method. Please note that I send this job to the background; it's good practice to offload this work to the background so the UI doesn't get a chance to freeze. So let's jump to this method and see what's happening here.
First, we create a CIImage from our UIImage. Then we instantiate a CIDetector with the accuracy level of our choice. Then we perform the actual detection by invoking the featuresInImage: method. This method returns an array of faces, over which we can iterate to compose our final image. Our final image will be composed using only Core Image filters.
So, for each face, I create a face box, or face frame, in the face-box-image-for-face method. And using the source-over compositing filter, I attach this reddish square on top of the background image. I repeat this step for every face, so finally every face finds its way onto the background image. If you're curious how I create this reddish square using only Core Image filters, here it is.
I use the CIConstantColorGenerator filter, which generates a constant color, like its name says. This color has infinite extent. Infinity is probably too much for our purpose, so we use another filter, CICrop, to crop this infinite color to the face bounds. Here, the input rectangle is the face bounds. And we just use the outputImage property of the CIFilter to get this box.
Now, having this image ready, we can use, as David showed in the presentation, the default CIContext to render this to a CGImage. And now it's trivial to just use it as the input for a UIImage and a UIImageView. What's important: this code, the detection part, you could copy and paste into a Mac OS application and it would look exactly the same. It's pretty easy. So that's how it looks on Mac OS. I have one bonus demo to show. So here I am. But not only me.
Thank you very much. David? I want to give thanks to the AV Foundation folks who did the fun mustache demo, and we happily stole it for our presentation as well. So that's face detection. And as I mentioned a little earlier, if you know where faces are in an image, there's lots of other interesting things you can do, including help improve the image based on whether faces are present or not.
And that brings us to the next subject of our discussion today, which is automatically enhancing an image. This was one of the things that we demonstrated on Monday where we were talking about some of the new features in iOS 5 where we have one-touch improvement to your images to produce a better image than you would necessarily get by default.
Great thing is we've actually provided this as an API for all of you guys to use as well. So how does auto enhance work? Well, like other problems of this class, we need to analyze the entire image first. And the idea is we analyze the entire image for its histogram, for its face region contents, and also for other bits of metadata that are present.
Once we have analyzed for that information, the Auto Enhancement API will return to you an array of CI filters that are custom designed to be applied to the image in question. Let me just repeat this one more time. So the array of filters that we return have had all the input parameters set up and customized for that specific image to produce the best possible output.
So, how does this API work? Well, again, as I said before, we return an array of filters. These are some of the filters that we apply today when we are doing auto enhancement. One of the first filters that we apply, based on our image analysis, is CIRedEyeCorrection. This in itself is a very complex problem: to be able to both detect and repair red eye artifacts, which are sometimes red, sometimes amber, sometimes white, and which result from camera flashes.
This is a very, very interesting problem, and we've provided that as part of the auto enhancement capabilities. Another filter that we apply is called CIFaceBalance, and this filter allows the auto enhancement to adjust the color of an image to produce pleasing skin tones.
Next, CIVibrance, which increases the saturation of an image when appropriate without distorting skin tones, which is also important. It's worth mentioning at this point that some of these things, like vibrance, may be familiar to users of Aperture. These are algorithms that we worked on in conjunction with those teams to use the best imaging technology we have at Apple to improve images, and we've brought them to you as part of this API.
The next filter that we can apply is CIToneCurve, a filter that adjusts the image contrast, and also CIHighlightShadowAdjust, which we use to bring up the shadow detail of images when appropriate. So what are some examples of how this looks in actual practice? We have three images, each of which has some degree of issue in a different area. This first image is a great photo of a girl, but if you look at the skin tones, it looks a little greenish. So we can actually improve that by using the face balance filter to make the skin tones more natural.
The next photo, we had nice content, but the shadows were a little dark, and we can make the image more lively by bringing up the shadows. And in this last photo, I'm hoping you guys can see, we've got a case of red eye, which is very, very hard to repair usually, but with our red eye filter, we can make that problem go away.
So that's how it works in practice. Let me talk to you in some detail about how you can use this in your application. It's a very simple API, first and foremost. The API is one call, autoAdjustmentFiltersWithOptions:, and you pass in an options dictionary.
And some of these options are actually pretty important. You may, for example, not want to do all the adjustments, which is what we return by default, but just do red eye, or just do the other auto enhancements, depending on what your application desires. So you can set kCIImageAutoAdjustEnhance to false, or kCIImageAutoAdjustRedEye to false, depending on which adjustment filters you want to perform.
Another very important option is to tell the auto-adjustment API what direction is up in your image. You do that, as I showed before in the face detection example, by asking the image for its properties, getting the kCGImagePropertyOrientation key, and then passing that in as an option with the key CIDetectorImageOrientation.
You need to do this if you want to get the best possible results out of the auto-enhancement. If you do not pass it in, then the detector may not find the face, and we will be able to do some adjustments, but we won't be able to necessarily find red eyes or do skin balance.
The next thing I'd like to talk about is the result of this API, which is an array of adjustment filters. Once you have that array, there are several options available to you. The most likely scenario is that you're going to want to chain those filters together and get an output image. And you do that using the following short bit of code. You basically loop over each filter in the adjustments.
For each filter, you set its input image to be the image that precedes it, and then you set the image variable to that filter's output. And then you just loop. The idea behind this loop is that you're chaining output to input, output to input, for that array of filters.
You can, however, use this opportunity to do other things. You can query each of those filters and see what the parameter values are. You could store the names of the filters and their parameters off so that if you want to apply the adjustments at a later time without incurring the cost of the analysis phase, you can do so. Typically, however, you're just going to chain the filters together, and you'll get an output image that's ready to render using any of the techniques we mentioned earlier.
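A minimal sketch of that chaining loop, assuming image is a CIImage created earlier:

```objc
// Pass the image's orientation so red eye and face balance can find faces.
NSNumber *orientation = [[image properties] objectForKey:(__bridge NSString *)kCGImagePropertyOrientation];
NSDictionary *options = orientation ? @{ CIDetectorImageOrientation : orientation } : nil;

// Chain each returned, pre-configured filter: output feeds the next input.
CIImage *enhanced = image;
for (CIFilter *filter in [image autoAdjustmentFiltersWithOptions:options]) {
    [filter setValue:enhanced forKey:kCIInputImageKey];
    enhanced = [filter outputImage];
}
// 'enhanced' is now ready to render with any of the techniques shown earlier.
```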
So now to give a demonstration, this is a very short demo of auto enhancement in progress, and then I'll pass the microphone over to Chendi again. Thanks. So this is a pretty simple demo. You open up the app, and there's a bunch of different images taken with, in this case, an SLR. And if we just tap on an image, Core Image will perform an auto-enhance, and you can see from the result that it's increased the contrast and improved the colors and the saturation.
I think I'll do two more. This one and that one. And that's it. So that's the end of the bulk of our conversation for today. I want to thank all of you for coming today, and I look forward to seeing all the great applications that you build using Core Image on iOS. If you have any questions, please contact Alan Schaffer. Also, if you didn't already see it, there is a session on capturing from the camera using AV Foundation on iOS 5. Thank you all very, very much.