Graphics & Media • OS X • 49:34
Core Image lets you create high-performance image processing solutions that automatically take advantage of modern GPU hardware on Mac OS X. See how you can harness its capabilities to enhance still images and create stunning visual effects. Learn recommended practices for using Core Image efficiently, and understand how to extend Core Image to leverage your own custom image processing algorithms.
Speakers: David Hayward, Daniel Eggert, Alexandre Naaman
Transcript
[David Hayward]
Good afternoon, and thank you for coming to today's discussion on "Image Processing and Effects with Core Image". We have a lot of fun stuff to talk about today. I'll start off by giving you a brief introduction to Core Image for those of you who may be new to the technology and discuss quickly how you can add it to your application.
After that I'll pass the stage over to Daniel, who will be talking about how to use Core Image most efficiently in your application to get best performance and then, finally, Alex will come up on stage and talk about writing some filters including one at the end with some really twisted math, but don't worry it's not on the final exam.
So, first a quick introduction to Core Image. Core Image is used throughout the Mac OS X operating system for everything from fun special effects like our Dashboard ripple effect to effects in iChat, Photo Booth, and screensavers, and it's also used for serious image processing in applications like iPhoto and Aperture, and in your applications as well. So, how does Core Image work? Well, at its core it's built on what we call filter kernels, which are small portions of code written in an architecture-independent language. That language is C-like with vector extensions, and it's based on a subset of the OpenGL Shading Language.
Core Image can then execute those kernels on either the CPU or the GPU, depending on how your application sees fit, and Core Image is built on other key technologies of Mac OS X, such as OpenGL and OpenCL, in order to get its best performance. So here's the basic concept of Core Image. The idea is you have filters, which perform per-pixel operations on images. In this very simple example here, we have an original image and we're going to apply a Sepia Tone filter to produce a new image that has that effect applied to it.
However, we can start adding additional filters and create chains of filters. For example, here we have the Sepia Tone filter and then we've added to that a hue adjustment filter, which gives us a little blue tone effect. So, this allows, as you can see, far more interesting effects.
One key feature of Core Image, however, is that it can concatenate multiple filters, and the idea behind this is to reduce the need for intermediate buffers wherever possible to get the best performance. And it's not just simple chains that are supported. You can actually have complex graphs of filters in order to achieve much more interesting results, such as the sharpening and gloom effects we see here in this example.
So, in addition to the framework for executing filters, Core Image includes a large set of built-in filters, which allow you to get started with Core Image right away. These built-in filters include several filters for doing geometry adjustments, such as affine transforms and crops. We have distortion effects like this glass distortion; a wide variety of blurs, Gaussian blurs, motion blurs; we have sharpening filters; we have color adjustments like this one, which is an invert filter; we have color effects like the Sepia Tone filter we mentioned earlier; we also have a bunch of stylized fun filters like this one, which is called Crystallize, which turns an image into crystals; we also have halftone effects and also tiling effects, which will take either a rectangle or a triangle and create an infinite image out of a portion of your input image; we have generators, which are for generating starburst effects and checkerboards, whatever you can imagine; we have transition effects, which are useful for using Core Image on video and allow you to segue from one image to another; we have composite operations like the standard Porter-Duff composite operations; and we also have a set of reduction operations, such as filters which will take a large image and reduce it down to its average color or its histogram, and these can be very useful as a foundation for other image processing algorithms.
So, as I alluded to earlier, Core Image supports large rendering trees, and one of our key features is that we will optimize that tree for you to get the best performance. The Core Image runtime has a just-in-time optimizing compiler, and one of its key features is that it defers its optimization until you actually draw the image, and this allows it to only evaluate the portion of your image that is needed to draw.
So, if you're zoomed in on a very large image, Core Image will only apply your filter to the portion that is visible. Similarly it also supports tiling of large images so if you have a very large image, it can break it up into pieces for you without you having to do all the additional work of tiling.
Core Image also performs optimizations that typical compilers don't. For example, it will concatenate multiple color matrix operations if they appear in series, or if you have a chain that involves a premultiply and an unpremultiply of alpha, it will optimize those away. It will also reorder scale operations, so if you have a complex filter that's being applied to a large image which is then downsized to the screen, Core Image is smart enough to move the downsample operation earlier in the processing tree so that the complex filter is evaluated on less data. Another optimization is that it only does color management when it needs to, which is typically on the input image and then finally when rendering to the display or to a file.
One thing to keep in mind is that these optimizations are not just about improving performance. By optimizing out sequential operations like the ones I've outlined here, we also get better quality, because there are fewer operations that can introduce quantization artifacts. So, that's a brief introduction to the architecture of Core Image. Let me talk for a few slides about how you can add it very easily to your application.
So, first off there are a few Core Image objects, which are Cocoa-based objects, that you want to be familiar with. First and foremost is the CIFilter object. This is a mutable object which represents the effect you want to apply, and it has a set of input parameters, which can be either numerical parameters or images, and the result of a filter is an output image, which you can then do further filtering on. Another key object type is the CIImage object, which is an immutable object that represents the recipe for an image.
This image can either represent a file just read from disk, or it can represent the output of a filter, as I just mentioned. The other key object to keep in mind is the CIContext object, and this is the destination to which Core Image will render your results; CIContext objects can be based on either an OpenGL context or a Core Graphics context, depending on what you see fit best for your application.
So, how do you add Core Image to your application? Well, you can basically do it in four easy steps. First, we want to create a CIImage object. In this brief example, we're going to create a CIImage from a URL. Second, we want to create a filter object. We create a filter in Core Image by specifying its name, in this example the CISepiaTone filter. At this time we can also specify the parameters for this filter: the input image and the amount of the effect we want to apply.
Thirdly, we want to create a CIContext into which to draw. Here we create a context based on a CGContext, and lastly, step four, we want to draw the filter into that context. So, we ask for the output image of the filter, and then we draw that into the context, and that's all there is to it.
Here are the same steps written in a code-like format. Here we have a slightly more complicated example in four lines of code: we're creating an image from a URL; we're applying the Sepia Tone filter as we did earlier; then on top of that we're applying a hue-adjustment filter, which will rotate the hue in the image by 1.57 radians. Lastly, we get the output image of that filter, and we draw it into our context.
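(In Objective-C, those four steps look roughly like the sketch below; this is not the exact slide code, and the `url` and `cgContext` variables are assumed to come from your application.)

```objc
// 1. Create a CIImage (here, from a URL supplied by your app).
CIImage *image = [CIImage imageWithContentsOfURL:url];

// 2. Create the filters: sepia tone, then a hue adjustment of 1.57 radians.
CIFilter *sepia = [CIFilter filterWithName:@"CISepiaTone"
                             keysAndValues:kCIInputImageKey, image,
                                           @"inputIntensity", [NSNumber numberWithFloat:1.0f], nil];
CIFilter *hue = [CIFilter filterWithName:@"CIHueAdjust"
                           keysAndValues:kCIInputImageKey, [sepia valueForKey:kCIOutputImageKey],
                                         @"inputAngle", [NSNumber numberWithFloat:1.57f], nil];

// 3. Create a CIContext based on a Core Graphics context.
CIContext *context = [CIContext contextWithCGContext:cgContext options:nil];

// 4. Ask for the output image and draw it into the context.
CIImage *result = [hue valueForKey:kCIOutputImageKey];
[context drawImage:result atPoint:CGPointZero fromRect:[result extent]];
```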
Now, one of the great things about Core Image is that in a very few lines of code, you can leverage all of the key technologies that sit below Core Image, such as OpenGL and OpenCL. For example, when Core Image converts those four operations into OpenGL work, it generates all of this OpenGL code for you, and this includes tiling, converting the kernels to shaders, and so forth.
Similarly, when Core Image leverages OpenCL to do its work, it will also convert that same set of operations into a healthy amount of OpenCL work. So, that's our brief introduction to Core Image and how to add it to your application. To talk about how to get the best performance out of Core Image, I'm going to pass the stage over to Daniel.
[ Applause ]
[Daniel Eggert]
Thank you, David. So David showed you how easy it is to use Core Image, and I want to show you a small demo that does exactly that. It's a very simple demo that uses Core Image. Next I want to take you through five topics related to Core Image and show you how to do things efficiently with Core Image and a few things to be aware of, and then, finally, take you through some debugging tips at the end.
So the demo is a very simple demo. It opens an image, and then inside an NSView subclass's drawRect: method, it applies one of the built-in Core Image filters to it and draws it to the screen. So let's take a look at what that looks like. So, this is the pointillizer demo app.
Let me drag an image onto the app, and it simply opens that image, and here's our custom NSView. This filter pointillizes, and I can change the radius, which is the input parameter of the filter, and you can see the dots getting larger, and I can drag the slider and they get smaller, and that's all there is to that demo. So this simple demo application is available on the attendee website. I suggest you download this app and check it out to get started with Core Image.
Also, most of the next few things I'm going to talk about are illustrated in this simple application. So, the five things I want to talk about. I'll start off with something that most of you have probably heard about, which is NSImage. This is the Cocoa image object that most of you probably know very well.
Something that people are often not aware of is that an NSImage can be both a source and a destination and hence is an inherently mutable pixel container. The content inside an NSImage can change depending on various situations. Also, an NSImage can contain both bitmap data and things like PDF data.
So the other image type that is available to you on the system is something called CGImageRef, which is the Quartz image object. This, in contrast to NSImage, is an immutable pixel container. It contains exactly one bitmap-based image, and this is the type you want to use for best fidelity when you're doing image processing. This is the type you want to read your data into, and if you're saving to disk, you want to use the CGImageRef-based APIs.
Again, the sample code will take you through some of these steps. Have a look at that. Next up, I want to talk a bit about the CPU and the GPU. You've probably all heard that the GPU is the new kid on the block and that you want to use the GPU, but you need to be aware that the CPU and the GPU each have unique benefits.
For example, the CPU is still what will give you the best fidelity, whereas the GPU will give you the best performance. So, it's really a tradeoff there. Another more subtle detail is that the CPU is more background friendly. On the CPU, you have thread scheduling, so if you want to run something in the background, you'll probably want to run it on the CPU.
The GPU has the obvious advantage that it offloads the CPU so if you have a lot of work on the CPU you might want to use the GPU. So, it really depends on your application. You need to think about what is the right thing to use for you.
Two examples: if you're applying an effect to an image and want to save it to disk, you're probably going for best fidelity, and in that case you want to use the CPU. If you're interactively updating the display, you probably want to use the GPU.
So now that you know which one of the two you want to use, how do you do it? David showed you how to create a CIContext. Well, when you create it, you can create it with an options dictionary. Inside that options dictionary, you set kCIContextUseSoftwareRenderer to YES, and that will make that context a CPU context. If you don't specify that option when you create the context, the context will be a GPU context. The other thing to note, as David already mentioned, is that a CPU context will use OpenCL on Snow Leopard.
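(As a quick sketch of that choice; the `cgContext` here is assumed to come from your app.)

```objc
// CPU (software renderer) context: best fidelity, more background friendly.
NSDictionary *options = [NSDictionary dictionaryWithObject:[NSNumber numberWithBool:YES]
                                                    forKey:kCIContextUseSoftwareRenderer];
CIContext *cpuContext = [CIContext contextWithCGContext:cgContext options:options];

// GPU context: simply omit the option and you get the GPU path.
CIContext *gpuContext = [CIContext contextWithCGContext:cgContext options:nil];
```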
The next thing is probably the most important thing to take away from this session. It's about your CIContext. The CIContext inside Core Image holds on to a lot of state and a lot of caches. That's not visible to you, but Core Image does it to ensure that your application gets the best performance, and you really need to keep your CIContext around and reuse it; otherwise all of those caches will be thrown away every time.
So, as I've written here, reusing your CIContext is usually the single change in your application that leads to the largest performance win. So check your application if you're using Core Image and make sure you're doing this. And how would you do that? If, like in the example app, you're drawing inside an NSView's drawRect:, you can simply use NSGraphicsContext and get the CIContext from there, and that will automatically do the right thing for you; it will reuse the same CIContext. If you are creating your own CIContext, you simply retain it in the beginning, you reuse it, and then at the end, before your application quits, you release the CIContext. So, remember this.
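(A minimal sketch of that drawRect: pattern; the `filter` instance variable is an assumption.)

```objc
- (void)drawRect:(NSRect)dirtyRect
{
    // AppKit keeps a CIContext associated with the current NSGraphicsContext and
    // reuses it across draws, so we don't create and destroy our own context here.
    CIContext *context = [[NSGraphicsContext currentContext] CIContext];
    CIImage *result = [filter valueForKey:kCIOutputImageKey];
    [context drawImage:result atPoint:CGPointZero fromRect:[result extent]];
}
```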
If you're not doing this, doing it might give you a large performance win. The next thing is color management, and the good story here is that you get color management for free. Core Image automatically does color management for you by respecting the input image's color space and the context's output color space. The filters are applied in a linear working space. So on the input side Core Image converts from the image's color space to the linear working space, and on the output side Core Image converts from that linear space into the context's output color space.
Sometimes people want to turn off color management, and you can do that. Again, in the options dictionaries, you can set kCIImageColorSpace to null on the input side, and that will turn off color management on the input side; likewise, on the output side you set kCIContextOutputColorSpace to null, and that will turn off color management on the output side. When you're rendering to the display, the display has a color profile, and we color manage to match the display's color space.
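(Concretely, that might look like the following sketch, passing [NSNull null] for the color-space keys; `url` and `cgContext` are assumed.)

```objc
// Turn off color management on the input side...
NSDictionary *imageOptions = [NSDictionary dictionaryWithObject:[NSNull null]
                                                          forKey:kCIImageColorSpace];
CIImage *image = [CIImage imageWithContentsOfURL:url options:imageOptions];

// ...and on the output side.
NSDictionary *contextOptions = [NSDictionary dictionaryWithObject:[NSNull null]
                                                            forKey:kCIContextOutputColorSpace];
CIContext *context = [CIContext contextWithCGContext:cgContext options:contextOptions];
```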
This color space can change, and there are two reasons why it could change. One is if the user drags your application's window onto another display, if the user has a multiple-display setup. The other situation is more subtle: if the user, while your application is running, goes into System Preferences and changes the color profile, you want your application to update the display so that your window still looks correct. You would do it like this.
You call setDisplaysWhenScreenProfileChanges:YES on your window, and then inside your drawRect: method, again, you use NSGraphicsContext and get your CIContext from there. That will make sure that your window is redrawn correctly when it moves from one display to another. This does not all work automatically, however, when the user changes the display profile in System Preferences. Go and check out the sample code; it shows you exactly how to handle that situation. Some of your applications might use offscreen caches that you've rendered something into and that you are then reusing. Those offscreen caches need to be invalidated when the display profile changes.
Again, it's kind of the same scenario, but slightly different. You can get notified about the display profile changing either in the window's delegate, where you can implement windowDidChangeScreenProfile: and clear your cache there, or there's a notification you can register for, called NSWindowDidChangeScreenProfileNotification, and you can use that to clear your caches. The fifth thing is threading, and, again, the good story here is that if you're using the CPU, Core Image will automatically use all available cores on the system. You will get multi-threading for free when rendering with Core Image on the CPU.
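(A sketch of that wiring, assuming a hypothetical window controller that owns the offscreen caches; the sample code shows the real thing.)

```objc
@interface MyWindowController : NSWindowController <NSWindowDelegate>
@end

@implementation MyWindowController

- (void)windowDidLoad
{
    [super windowDidLoad];
    // Redisplay the window when it moves to a display with a different profile.
    [[self window] setDisplaysWhenScreenProfileChanges:YES];
    [[self window] setDelegate:self];
}

// NSWindow delegate method; you could equally register for
// NSWindowDidChangeScreenProfileNotification and do the same work there.
- (void)windowDidChangeScreenProfile:(NSNotification *)notification
{
    [self clearOffscreenCaches];
}

- (void)clearOffscreenCaches
{
    // Throw away any offscreen renders; they were made for the old display profile.
}

@end
```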
There is another side of threading, obviously: if you are using multiple threads in your application, there are some things you need to be aware of when calling into Core Image. The CIContext is not thread-safe. Everything else inside Core Image is, but if you're using background threads for Core Image, you need to create separate CIContext instances for each thread that you're calling into Core Image from. You can use locking, obviously, and just use one shared context, but we recommend using separate CIContext instances, as that is usually the fastest approach; it depends on exactly how your application works which of these two possibilities is best.
The other thing is if you go down the road of calling into Core Image from a background thread, you need to set the stack size. Core Image actively uses the stack. We recommend setting the stack size to 64 megabytes. If you are using Cocoa, this is how you set the stack size on newly created threads. That brings us to debugging.
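(With NSThread, that might look like this sketch; the render selector is a placeholder for your own background work.)

```objc
// Core Image needs a generous stack when called from a background thread,
// so set the stack size before starting the thread.
NSThread *renderThread = [[NSThread alloc] initWithTarget:self
                                                 selector:@selector(renderInBackground:)
                                                   object:nil];
[renderThread setStackSize:64 * 1024 * 1024];   // 64 MB, as recommended above
[renderThread start];
```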
Now that you are using Core Image, you may want to know what is going on inside Core Image. We have something that is called render tree printing. It allows you to see on the console what Core Image is up to: we will print to the console every time you tell Core Image to render an image into a context.
There's an environment variable called CI_PRINT_TREE. When you set that to one, each of these draw calls will cause the filter tree to be dumped to the console and this gives you a peek into what Core Image is up to. One thing to note though is as David already mentioned Core Image does tiling whenever necessary.
If the output size of the image is very large, one single draw operation might actually cause multiple draws, and in that case you will see multiple outputs. So, in Xcode, what you would do is find your executable in your Xcode project: under Groups and Files there's a section called Executables. You find your application there, you right-click on it, select "Get Info", a window comes up, you go into the "Arguments" tab, and in there you hit the little plus at the bottom and you can add an environment variable.
Again, it's called CI_PRINT_TREE. You can set that to one. And there's a nice little checkbox on the side so you can just leave it in there and just toggle the checkbox if you wanted to turn it on or off. If it's turned on, it will impact performance because you are logging a lot so this is an easy way to turn it on and off. So, now you've turned it on what will actually happen, what will you see? You will see something like this and you will see many of these if you're using Core Image a lot, and obviously we think you should be using Core Image a lot.
The thing to see here is lines like these: each line that starts with these two asterisks is a render operation, a draw operation. Let's look closer at the first one of these. It looks something like this; this is the render operation, and you need to read it from the bottom going up. The first line we see, which is the last line, is your input image. In this case, it's a 292x438 image that is being put into Core Image, into the filter tree.
We go up: there's a kernel being applied to it, there's an affine transform being applied to the result of that, there are some more kernels being applied to the result of the affine transform, and finally this is being rendered to a context. In this case, the context name is fe-context-cl-cpu, which tells you this is a CPU context using OpenCL. So let's briefly look at the second one of the two that I had up on the display. Again, we read from the bottom. We start out with a FILL operation. Here's our input image; in this case, it's a 1200x900 image.
We go further up: there are some kernels applied to it, there's a source-over, some more kernels, and finally we end up at the context. In this case, the context name is different. It's fe-context-gl, which tells you that in this case the context is a GPU context using OpenGL. So what can you use this for? You're seeing all these things; what's the method behind the madness? Well, one easy debugging help that you can gain from this is that you can see how many times a thing is being rendered.
You can find your input images in this output. One easy thing is just to look at the sizes of the images. If you know that you're inputting an image that's 1200x900, you can identify those images and let's say you're taking an image, applying some filters to it and saving it to disk, you would only expect one render call being done on that image. If you're seeing 21 render calls, you should probably go back and look at your application logic and see if everything looks sane there.
The thing I mentioned earlier is that if you have large output sizes, Core Image might actually call draw multiple times because of tiling, but still, we have seen shipping apps that had poor performance simply because they were drawing too many times, not only when saving to disk but also when updating the display. This can give you some hints about what is actually happening.
Another thing that you can use this for is to see are you using the CPU when you expect to use a CPU? Are you using the GPU when you expect to use the GPU? You can, again, identify your input images and then see if the context name matches your expectations.
If you are rendering to the display for interactivity, you probably want to use the GPU; does that really happen? The CI_PRINT_TREE stuff is something, again, you can try with the demo application that is available on the attendee site. You can turn it on there and see what that small sample application does. That is all I have for you, and now I'd like to ask Alex on stage, who is going to tell you something about writing your own filters.
[ Applause ]
[Alexandre Naaman]
Thank you, Daniel. My name is Alexandre Naaman. I'm a software engineer on Core Image, and today I'm going to talk to you a little bit about writing your own custom CIFilter. So, so far we've seen today a little bit of the cast of characters and how you go ahead and put those together to create your own app in an efficient manner, and what I'm going to show you today is two samples: one very simple sample, which is going to be a desaturation filter, and then a more complex one based on an idea that M.C. Escher had in 1956 and actually never completed, so we're going to talk about those in a fair amount of detail.
So, the first question you're going to ask yourself is: why would I write a filter instead of just using what's already there? There are two main reasons why you would want to write your own filters: first, because the filter doesn't exist in the set that we ship in the OS, or because you can't create the effect that you're looking for by daisy-chaining any number of the existing filters together.
As for the way we're going to do this, there are two main ways. You can write a kernel and then write some Objective-C code around your kernel in your app, or you can prototype it inside of Quartz Composer. For the purposes of our demo today, we're going to do everything inside of Quartz Composer, and it's really easy, once you've implemented your algorithm inside of Quartz Composer, to bring it into an app.
So, our first sample is going to be relatively simple. We're just going to desaturate an image and we're going to have the slider that controls how desaturated it gets. So, this is what the composition looks like inside of Quartz Composer so it's fairly simple. We're going to take an input image and then we're going to pass it through our kernel that we're going to write and we're going to go through that step-by-step and desaturate it to a certain amount until we end up with a new image and that is the entire composition and the amount value is going to be a slider that we have that goes from zero to one and controls how desaturated the image gets. So, let's take a look at what the kernel is going to look like and how Core Image works.
So, here we're looking at a sub-part of the initial input image, because Core Image works on tiles; that's not important right now, but it will be in a moment when we talk about some of the things you need to keep in mind when you start writing your own filters. So, first off, Core Image is going to ask us to provide a new color value at every single pixel location (x, y).
So, if we're asked to render a point here, the first thing we're going to do is read its value. We're going to get the current coordinate, and then we're going to determine the value of the color in our input image at that location and unpremultiply it, because colors that come into kernels have alpha premultiplied.
The next thing we're going to do is compute a brightness for it, capital Y, and these values, the RGB weights, come from the sRGB color profile; you can find them by looking at that profile in ColorSync Utility. So, we're going to compute a luminance, capital Y, and we're going to create a new color that we're going to call the desaturated color, and we're going to assign its red, green, and blue components to be equal to capital Y and preserve the original alpha value.
So, now we have two colors, the unpremultiplied original and the desaturated color, and what we're going to return is some mixture of those two. And if this function mix looks familiar, it's because the kernel language that you use to write your own custom filters inside of CI looks a lot like GLSL; it's a subset of that.
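(As a rough sketch, not the exact code from the slide, such a kernel together with the Objective-C call that compiles it might look like this; the luminance weights assumed here are the sRGB/Rec. 709 ones.)

```objc
static NSString *const kDesaturateKernelSource = @""
    "kernel vec4 desaturate(sampler image, float amount)\n"
    "{\n"
    "    vec4 color = unpremultiply(sample(image, samplerCoord(image)));\n"
    "    float Y = dot(color.rgb, vec3(0.2126, 0.7152, 0.0722));  // luminance\n"
    "    vec4 desat = vec4(Y, Y, Y, color.a);                     // preserve alpha\n"
    "    return premultiply(mix(color, desat, amount));\n"
    "}";

CIKernel *desaturateKernel =
    [[CIKernel kernelsWithString:kDesaturateKernelSource] objectAtIndex:0];
```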
Now, we can vary the value of amount, and we'll have a more or less desaturated image, and if we do that, we can see how it affects our output. This is, again, just looking at a subsection of the image, and this is actually our kernel running in real time inside of Keynote. So, we took the kernel that we had inside of Quartz Composer, we just dragged it in there and varied that amount value, and we end up with this.
So, it's pretty simple to prototype the effects you're trying to generate. That being said this is a fairly simple sample, but there are two other things you really need to keep in mind when you start writing your own kernels and those are the Domain of Definition, or DOD, and Region of Interest.
So, in the case of the sample that we were just looking at, if we take our input image and suppose that it was of size 1200x1000, then after we've applied the effect, the image is going to be of the exact same size. This is what we call the Domain of Definition, which is to say: given your input image, once the filter is run, what is the size of the output image? In this case it doesn't change, so we don't have to tell Core Image to do anything special, but you can imagine that if you did a zoom or a blur, the image might get either smaller or larger, and you need to tell Core Image what to do in that situation. Now, as I mentioned earlier, when we perform the rendering, we actually provide you with tiles.
So, we do a tiled approach that is going to be optimized for the device that you're targeting. So, if we were trying to render this small section here, what we need to know is: what is the data that we need from our original input image? This is what we call the Region of Interest, or ROI, and in this case we have a one-to-one mapping; for every pixel in, we have another pixel out, and we're not reading any additional data, so we don't have to specify an ROI. But as soon as you write anything a little bit more complicated, you may have to take these things into account. So, let's look at another sample that's going to be slightly more tricky, and this is just going to involve transposing an image.
So, for every pixel, we're just going to swap the x and y coordinates. Very simple. And the kernel looks like this: we just sample the image, but instead of sampling the image at destCoord().xy, we're going to sample it at destCoord().yx. So, how does this affect the DOD? Well, if we start off with an image that's 600x400 and we don't tell Core Image otherwise, it's going to assume that the image we're trying to render is of the exact same size, which is to say 600x400.
That's not what we want. As you can see, I've colored in blue here the section where Core Image doesn't know what to do. It doesn't know where to grab the pixels from, so what we want is to tell Core Image that the DOD for this image is actually 400x600. So, we want to swap those values around.
The way we're going to do that is by getting the extent of the input image and creating a new filter shape that basically has the origin's x and y swapped and the width and height swapped, and then when we call apply, and this all happens in your filter's outputImage method, we're going to pass in one additional parameter, which is kCIApplyOptionDefinition, and give it the new shape that we just created. When we do this, Core Image will know that the result we're trying to render from this filter is going to be of size 400x600 and has a new origin. So, that takes care of the DOD.
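(A minimal sketch of that outputImage method; the transposeKernel and inputImage instance variables are assumed to be set up elsewhere in the filter.)

```objc
- (CIImage *)outputImage
{
    CISampler *src = [CISampler samplerWithImage:inputImage];
    CGRect extent = [inputImage extent];

    // DOD: swap the origin's x/y and the width/height, so Core Image knows the
    // transposed result is 400x600 rather than 600x400.
    CIFilterShape *dod = [CIFilterShape shapeWithRect:
        CGRectMake(extent.origin.y, extent.origin.x,
                   extent.size.height, extent.size.width)];

    return [self apply:transposeKernel, src, kCIApplyOptionDefinition, dod, nil];
}
```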
How does this affect the ROI? Well, this is the result that we're looking for, and as I mentioned earlier, when Core Image does its rendering, it tiles it. So, if we were, for example, to render a tile in the upper corner of the image, located at this location and of this size, and we don't tell Core Image otherwise, the result we're going to get is going to look like this, and this is a really common mistake when people start writing their own filters, even if they've actually been writing them for years. So, let's look at what happens if you don't specify the ROI. Let's look at our input image and at where that rectangle we tried to render came from. It's in the middle of nowhere.
There's no data for us to transpose here. We're reading basically garbage, so we're going to end up with something unknown, or basically whatever the sampling mode gives us, but not the results you want. So, once again, what we need to do is tell Core Image that the data that we're looking for comes from a different place.
So, we're going to swap the origin, the x and y of the origin, and we're going to swap the size, and we're going to do that by creating a regionOf: method inside of our filter and just swapping those values as I mentioned. One thing to keep in mind when you write your own filters is to always think about how it affects the ROI and DOD. If you're sampling at locations that aren't equal to the current destCoord, you probably need to write both of those, and if the rendering isn't correct, chances are that you need to tweak them a little bit more.
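(As a sketch, the ROI callback for the transpose filter might look like this; the selector is registered on the kernel once, where the kernel is compiled.)

```objc
// Registered once, right after the kernel is created:
//     [transposeKernel setROISelector:@selector(regionOf:destRect:userInfo:)];

// ROI: for a transpose, the source pixels we read come from the transposed
// rectangle, so swap the origin and size here as well.
- (CGRect)regionOf:(int)samplerIndex destRect:(CGRect)r userInfo:(id)info
{
    return CGRectMake(r.origin.y, r.origin.x, r.size.height, r.size.width);
}
```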
So, now that we're done with the simple example, let's talk about a slightly more complicated one. The code for this is available for download right now at developer.apple.com/mac/library/samplecode/Droste, and it's based on an idea that M.C. Escher had in 1956 and a lithograph that he tried to produce and actually never finished, but which can now be done in real time. So, we're going to take an image that looks like this and we're going to create what he called a cyclical annular expansion.
So, basically a recursive image that spins around, and it caused him to have some almighty headaches, and me too. [Laughter] So, let's look into this one a little bit more. Let's forget about the recursion for a split second here and look at how we're going to deform one level of this image, and if we animate that, it's going to look like this.
The basic idea here is that if we perform this repeatedly, so we start off with our first level of deformation and we just keep applying it over and over and over again to our input image, until the point where the data is too small to make any visible difference in our final output image, we're going to get our desired result.
Now, one thing you might have noticed here is that our image has actually been sheared and shrunk. So, in order to not end up with this kind of unfinished painting, we're going to add in one more layer on the background, and if we do that, we'll end up with the desired result.
The grid that we're going to use is going to look a little bit like this, and I'm going to talk about the math a little bit later, and I promise I only have two slides on math. [Laughter] So, let's talk about how we're first going to do this. We're going to use source over. So, if we start with our input image A and we apply the effect at level zero, we end up with image B, our first deformation.
We do this again, level one, so the scaled-down version, we get a new image, C. And if we do image C over image B, we start getting what looks like the result that we're looking for. I mean if we do this repeatedly N times, we're going to eventually get our final output image.
So, the question that we have to figure out is, and this is how Core Image works, it's going to ask you for the color value at a given pixel, and you have to figure out where that comes from in your source image. So, let's pretend we were trying to render the dot A' on the corner of that little table, and we need to figure out where that comes from in the original source image.
And you can see that the rectangle we're going to be cutting out for this image, where the recursion is going to happen, is in yellow, and even though this represents the first level of deformation, even after the very first level the image is going to get scaled down a little bit, and there are going to be areas that are outside of the bounds of our original image, pictured here in green, red, cyan, and blue, and we're going to have to figure out what to do with those as well. So, these are my two math slides.
I'm going to go over them quickly here. So, what we're going to do is going to look a lot like a logarithmic spiral. Here we have the equation for a logarithmic spiral, r equals a times e to the b theta, and, if you think back to your math classes, a is going to control the number of strands in the spiral and b controls the periodicity.
So let's take a look at how we're going to deform the image, with the inner circle here corresponding to the region that we're going to be cutting out and the outer circle corresponding to the bounds of the image. So, we've got two measurements: r1, our inner radius, and r2, the radius that we're trying to get to once we perform the deformation. Now, at theta equal to zero, e raised to the power of zero is equal to one; therefore, a is equal to r1. We need to figure out the values of a and b so we can pass them into our kernel a little bit later.
Then, when theta is equal to 2 pi, so once we've done one entire revolution, r2 is going to be equal to r1, the value a that we just figured out, times e raised to the power of b times 2 pi, and if we isolate the value of b, we end up with b equal to the log of r2 divided by r1, all of that divided by 2 pi. So, this works, but it's not a conformal map, which is to say it doesn't preserve angles.
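(Restating those two slides in symbols, using my own notation rather than the slide's:)

```latex
r(\theta) = a\,e^{b\theta}, \qquad
r(0) = a = r_1, \qquad
r(2\pi) = r_1\,e^{2\pi b} = r_2
\quad\Longrightarrow\quad
b = \frac{\ln(r_2/r_1)}{2\pi}
```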
So, if we had a synthetic image with no people in it, it would look fine, but with anything else it wouldn't look correct; the whole image looks skewed. Although we've preserved angles going radially out, we haven't preserved angles between the points, and this is not going to look correct. So, we need to do a little bit more tweaking to our kernel in order to get the desired result.
So let's look again at our image. The first thing we're going to do is convert it to a polar log coordinate system, so we've unwrapped it; then we're going to rotate it and scale it, and then finally replicate it along the x-axis, and when we do that, we will get a conformal map and we'll get something that looks like this, which is going to give us the desired result. Those are all my math slides.
[Laughter]
So, let's look at the kernel. Believe it or not this is the entire kernel for performing a Droste Effect.
So, we're going to pass in a few parameters: r.x is equal to a, or r1; r.y is equal to the log of r2 divided by r1, divided by 2 pi, which is the b in the equation we looked at earlier; and then a scaling factor, which initially is going to be equal to 1.0 and then for each subsequent operation is going to be equal to the image width divided by the inner rectangle width, and we're just going to keep taking that to the second power, third power, et cetera, to create each additional iteration.
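(A small sketch of that parameter setup; the names r1, r2, imageWidth, innerRectWidth, and n are assumptions, not the sample's actual identifiers.)

```objc
// r1 = radius of the inner (cut-out) region, r2 = radius out to the image bounds.
float a = r1;                                   // at theta == 0, r == a == r1
float b = logf(r2 / r1) / (2.0f * (float)M_PI); // one revolution maps r1 onto r2
CIVector *r = [CIVector vectorWithX:a Y:b];     // passed to the kernel as r.x, r.y

// Scaling factor: 1.0 for the first level, then (imageWidth / innerRectWidth)^n
// for each subsequent level n of the recursion.
float scale = powf(imageWidth / innerRectWidth, (float)n);
```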
So, our kernel is going to take a few parameters as inputs: the first one being the input image; the second one being the location of the center of the image, that is, the center of the rectangle that we chose to cut out from our image; and the third one being our values a and b, and then the scaling factor that we talked about. So the first thing we're going to do is move the coordinate that we're currently trying to evaluate relative to the center, so everything is relative to the center of the rectangle we're cutting out.
Then we're going to convert that point into polar coordinates, so we figure out the angle of that point, and we've got the distance squared, which we raise to the power of 0.5. So, now we've got our polar coordinates, and we're going to perform the rotation and the scaling, and then we're going to convert that point back into Cartesian coordinates and perform the exponentiation, which is going to give us the logarithmic spiral that we were looking for. The last step is to scale that point if necessary and then move it back to the center it came from, and that is the entire effect.
So this works, but as we saw earlier it requires a lot of passes over the data and that means a lot of intermediate buffers. So, you can imagine if you had multiple levels of recursion, this is going to end up being quite slow. So the question we want to ask ourselves is, can we do this in a single pass? The key thing to note here is that when b in our equation is equal to zero, we end up with r is equal to a.
That's it. So this is going to give us a hall of mirrors effect. So, if we were to try and cut out this rectangle here in yellow, what we really want to do is replicate what's not inside the yellow inside the yellow and do that repeatedly, which is where the scaling factor comes from that we talked about earlier and that's going to look like this. The question is, can we do this in a single pass? And I have good news, we can. So, let's take a look at this in a little bit more detail. We're going to cut out this section in yellow and we want to replicate the image in.
So, we're going to move it to the center once again, and then when we're asked to render a point, let's say on the corner of the armchair here, what we need to do is figure out where that point comes from within the outer section of the image: where is the valid data? So what we're going to do is draw a line from the center out and scale this point by that same scaling factor until we reach valid pixel data, and we'll look at the code for this in a second. And in the same way, as we saw, even on the first iteration we'll sometimes be asked to render data that falls outside of the image bounds, so sometimes we're going to have to take points that fall outside and bring them in.
So, we're going to divide the value of the current coordinate by that same scaling factor, and again, we'll do that repeatedly. In terms of testing for this, we have two points here which correspond to the rectangle that we've chosen, and those are simply at plus and minus the inner rectangle width and height divided by two, and then we have two additional points which correspond to the outer bounds of our image, which are just at (center.x, center.y) and (-center.x, -center.y). So let's look at the code.
So, the code is mostly unchanged. We're just going to add a few for loops here, and we can unroll these because it's a fixed number of iterations. We're just going to be passing in one additional parameter, which is the dimensions of half of the inner rectangle, and then the first for loop is going to take an existing point and scale it repeatedly based on whether or not it falls outside of the image bounds; if it doesn't, it won't change it.
It's just going to do divide by one so it won't affect the position and then if on the other hand we had a point that was inside the inner rectangle, we're going to keep scaling it out for a certain number of iterations until we get a point that is in the valid area and the rest is unchanged. So, we get hall of mirrors plus the Droste Effect in a single pass.
Now, that doesn't deal with alpha blending, and if we look at our picture of our iMac, you can see that the sections colored here in green look horrible. So, the question is, can we do this in a single pass? You can see that once we've applied the Droste effect it looks kind of bad.
So, we can use the same trick that we used for scaling points if we realize that alpha is not equal to one. So, if we're asked to render a point where alpha isn't equal to one, we're just going to move it in and keep accumulating values until we have alpha equals one. So in terms of code, it's going to look like this.
So we're just going to use Porter Duff alpha blending. We're going to have another for loop. Truth be told we probably didn't need four iterations because each new iteration involves a new sampling operation so that's on the expensive side, but that's the basic idea. We're just going to compute a new coordinate, look at the value for that and keep adding it to our current color if necessary and then if we do that, we'll get alpha blending. So we can do everything in a single pass.
So, let's talk a little bit about how this filter is in terms of performance. So, first off I talked a little bit about ROI earlier and this is a perfect example of when it's really important to specify an ROI because in this case we can't determine ahead of time when we're trying to render a small subsection of the image how much of the original image we need.
We might need the entire image to render just a small tile. So, this is not a tileable operation, which means that the maximum output size for the image you can create is going to be limited by the limits of the device, so the maximum texture size in OpenGL or the equivalent limits in OpenCL.
Also, had we gone down the multi-pass approach, we could have chosen to simply take our original input image, scale and rotate it, and do the source over. That's another consideration. Also, because we're not using the built-in affine transform and rotate methods inside of Core Image, we're going to end up with some aliasing artifacts, so we might want to do some multi-sampling in order to make our image look a little bit smoother.
And one thing that's really important to note is that, given the recursive nature of this algorithm, the very first thing you want to do is match the aspect ratio of the image you're applying this effect to with the aspect ratio of the rectangle that you're cutting out. So, let's take a look at how this works.
So, we've implemented this inside of Quartz Composer, and the first thing I'm going to do is click two points to indicate what portion of the image I want to cut out, and then I'm going to enable the effect, and we're done. So, we can turn this on and then we can change the values of a and b to see how that affects our result, and we can take a look at some of the other graphs and how they get affected by these things, including our little spiral here, which was kind of fun, and we can also do it on live video and we can go really crazy [applause] and you can see how I felt before I got on stage. [Laughter] Okay, and this is available for you right now. You can download it from the website via the URL that I showed you earlier. It's kind of spooky.
I know. Okay, if you're curious and want to learn a little bit more about the math behind all of this stuff, there are some people in the Netherlands who figured out how this all works, and without them this would never have been possible; they published a paper in the AMS, which you can get here as well, and Jos Leys, whose name I probably butchered, also has an interesting web page about the math behind all of this. One good reference for when you start writing your own kernels is the GLSL quick reference, which is available for download from the Khronos website, which I've listed here.
In addition, when you start writing your own kernels, there are a few good books you might want to look at: Digital Image Warping by George Wolberg, which has a lot of information about image resampling; GPU Gems 3, which has a good object-tracking demo written in Core Image that you can look at; and finally Digital Image Processing, which is a good overview of image processing in general and has a good chapter on color. You can contact Allan Schaffer at [email protected] or go to our developer forums at devforums.apple.com. And on that note, I'd like to thank you all for coming, and I hope you enjoy the rest of the conference.