Graphics and Games • iOS, macOS • 44:44
Core Image is the essential framework for handling image processing tasks in your photo and video apps. In this session, we'll explore new additions to the framework that allow you to achieve great performance in your filter chains and custom CIKernels. We'll also demo a new approach to prototyping in Core Image through the use of an interactive Python environment. Through these techniques you'll discover new ideas for building new creative effects as well as practical approaches to batch processing images for tasks such as image compositing and data boosting for machine learning.
Speakers: David Hayward, Emmanuel Piuze-Phaneuf
Transcript
All right [applause]. Thank you [applause]. Good afternoon everyone and thank you for coming to our session today on Core Image. My name is David Hayward, and I'm really excited to be talking about the great new performance and prototyping features our team has been adding to Core Image over the last year. We have a lot to talk about, so let's get right into the agenda.
So, the first thing we're going to be talking about today are some great new APIs we've added to Core Image to help improve the performance of your applications. After that, we're going to segue into another topic, which is how you can use Core Image to help prototype new algorithm development. And lastly, we're going to be talking about how you can use Core Image with various machine learning applications.
All right. So, let's get into this and start talking about performance APIs. There are two main areas where we've worked on performance this year. First, we've added some new controls for inserting intermediate buffers -- we'll talk about that in some detail. And second, there are some new CI kernel language features that you can take advantage of. So, let's start by talking about intermediate buffers.
As you are aware, if you've used Core Image before, Core Image allows you to easily chain together sequences of filters. Every filter in Core Image is made up of one or more kernels. And one of the great features that Core Image uses to improve performance is the ability to concatenate kernels in order to minimize the number of intermediate buffers. In many cases, to get the best performance you want to have the minimum number of buffers.
However, there are some scenarios where you don't want to concatenate as much as possible. For example, your application might have an expensive filter early on in the filter chain. And the user of your application at a given moment in time might be adjusting a filter that follows it in the graph. And this is a classic situation where it's a good idea to have an intermediate buffer at a location like this in between.
The idea is that by having an intermediate buffer here, the cost of the expensive filter does not have to be paid again when you adjust the secondary filter. So, how do you do this in your application? We have a new API, very aptly named insertingIntermediate. So, let's talk about how this affects our results. Instead of concatenating as much as possible, we will respect the location of the intermediate and concatenate as much as possible around it.
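In Swift, that might look roughly like this -- a minimal sketch, where the specific filters, file path, and slider value are placeholders rather than anything shown in the session:

```swift
import CoreImage

let inputImage = CIImage(contentsOf: URL(fileURLWithPath: "/tmp/photo.jpg"))!  // hypothetical path
let saturation = 1.2                                                           // stand-in for a UI slider value

// An expensive filter whose output we want Core Image to keep around...
let denoised = inputImage
    .applyingFilter("CINoiseReduction",
                    parameters: ["inputNoiseLevel": 0.02, "inputSharpness": 0.4])
    .insertingIntermediate()   // hint: keep an intermediate buffer at this point in the graph

// ...followed by a cheap filter the user is actively adjusting.
let adjusted = denoised
    .applyingFilter("CIColorControls",
                    parameters: [kCIInputSaturationKey: saturation])
```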
Some notes on this. One thing to keep in mind is that by default, Core Image caches all intermediate buffers, on the assumption that a subsequent render should be as fast as possible. There are times, however, when you might want to turn off caching of intermediates. For example, if your application is going to be doing a batch export of 100 images, there is little benefit in caching intermediates for the first one, because you'll be rendering a completely different image afterwards.
You can do that today in your application by using the context option cacheIntermediates and setting that value to false. However, if you are also using the new insertingIntermediate API that we just spoke about, you can still turn on caching for that one intermediate, even if this context option is turned off. So, this allows you to make sure that we cache exactly what you want and nothing else.
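Sketched in Swift, the two pieces fit together like this (the gray placeholder image just stands in for the output of an expensive chain):

```swift
import CoreImage

// Batch-export scenario: tell the context not to cache intermediates by default.
let context = CIContext(options: [.cacheIntermediates: false])

// Stand-in for the output of an expensive filter chain.
let expensiveResult = CIImage(color: .gray)
    .cropped(to: CGRect(x: 0, y: 0, width: 512, height: 512))

// Even with caching off at the context level, this one intermediate is still cached.
let pinned = expensiveResult.insertingIntermediate(cache: true)
```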
The next subject I'd like to talk about is some new features we've added to the kernel language that we use to apply image processing. One thing to keep in mind is that we have two different ways of writing kernels in Core Image. The traditional way is to use the CI kernel language. In this case, you have a string inside your source file -- either your Swift code or your Objective-C code -- and at runtime you make a call such as CIKernel(source:).
Later on, when you create an image based on that kernel, you can then render it to any type of Core Image context, whether that context is backed by Metal or OpenGL. When it comes time to render, however, that source needs to be translated -- either to Metal or to GLSL -- and that step has a cost. Eventually, that code is compiled to the GPU instruction set and then executed.
Starting last year in iOS 11, we added a new way of writing CI kernels, which has some significant advantages, and that's CI kernels based on the Metal Shading Language. In this case, you have your source in your project, and this source is compiled at build time rather than at runtime.
As before, you instantiate a kernel based on this code, in this case by using CIKernel(functionName:fromMetalLibraryData:). The advantage here is that this data can be applied without paying the cost of an additional compile. The caveat, however, is that it only works on Metal-backed CIContexts. But it gives a big performance advantage.
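A minimal Swift sketch of that flow (the metallib resource name and kernel function name are assumptions; the Metal source also needs to be built with the CI kernel compiler and linker flags described in the Core Image documentation):

```swift
import CoreImage

// Load the metallib that Xcode built from the .ci.metal sources at build time.
let url = Bundle.main.url(forResource: "default", withExtension: "metallib")!
let data = try Data(contentsOf: url)

// No runtime translation step: the kernel is created from precompiled binary data.
let kernel = try CIKernel(functionName: "myKernel", fromMetalLibraryData: data)
```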
So, starting in this release we're going to be marking the CI kernel language as deprecated, because while we will continue to support this language, we feel that the new way of writing Metal kernels offers a lot of advantages to you, the developer. For one thing, you get the performance advantage I outlined earlier, but it also gives you the advantage of getting build time syntax coloring on your code and great debugging tools when you're working with your Metal source. So, great.
[ Applause ]
So, with that in mind I want to talk about a few other things that we've added to our kernel language. For one thing, we have added half-float support. There are a lot of cases where your CI kernel can be perfectly happy with the precision that half float gives you. If you're working with RGB color values, half-float precision is more than adequate. The advantage of using half floats in your kernel is that it allows operations to run faster, especially on A11 devices like the iPhone X.
Another advantage of using half floats in your kernels is that it allows for smaller registers, which increases GPU utilization, which also helps performance. Another great feature we've added to the kernel language this year is support for group reads. This gives your shader the ability to do four single-channel reads from an input image with only one instruction, so this really can help. And as a complement to that, we also have the ability to write groups of pixels. This gives you the ability to write four pixels of an image with just one call inside your shader.
So, all three of these features can be used in your shaders to give great performance improvements. So, let me talk a little bit about an example of how that works. So, imagine today you have a simple 3 by 3 convolution kernel that is working only on one channel of an image. This is actually a fairly common operation, for example, if you want to sharpen the luminance of an image.
In a kernel like this, each time your kernel is invoked, it is responsible for producing one output pixel. But, because this is a 3 by 3 convolution, your kernel needs to read 9 pixels in order to achieve that effect. So, we have 9 pixels read for every one pixel written.
However, we can improve this by making use of the new group-write functionality. With group writes, your kernel can write a 2 by 2 group of pixels in one invocation. Now, of course this 2 by 2 group is a little bit bigger, so instead of 3 by 3, we need a 4 by 4 set of pixels read in order to be able to write those four pixels. But, if you do the math, you'll see that that means we have 16 pixels read for 4 pixels written. So, already we're seeing an advantage here.
The other feature we have is the ability to do gathers. In this example, we're reading a 4 by 4 block, or 16 pixels. And with this feature, we can do those 16 pixel reads with just four instructions. So again, if you look at the math on this, this means we're doing just 4 group reads for every 4 pixels written. And this can really help the performance. Let me walk you through the process of this on actual kernel code.
So, here's an example of a simple convolution like the one I described. Here, what we're doing is making 9 samples from the input image and we're only using the red channel of it. And then once we have those 9 values, we're going to average those 9 values and write them out in the traditional way by returning a single vec4 pixel value.
Now, the first step to make this faster is to convert it to Metal. This is actually quite simple. So, we start with code that looks like this, which is our traditional CI kernel language. And with, effectively, some searching and replacing in your code, you can update this to the new Metal-based CI kernel language. There are a couple of things that are important to notice here. We have added a destination parameter to the kernel, and this is important if you're checking the destination coordinate inside your shader, which a convolution kernel like this does.
Then we're using the new, more modern syntax to sample from the input, by just calling s.sample with s.transform. And the last thing we've done when updating this code is to change the traditional vec4 and vec2 parameter types to float4 and float2. But as you can see, the overall architecture of the code, the flow of the kernel, is the same.
All right. Step 2 is to use half floats. Again, this is an example where we can get away with the precision of half floats because we're just working with color values, and so again, we're going to make some very simple changes to our code. Basically, anywhere in our code where we were using floating-point precision, we're going to use half-float precision instead.
This means the sampler parameter and the destination parameter get an _h suffix on them, and anywhere in the code where we used float4 now becomes half4. So again, this is very simple and easy to do. Another thing to be aware of is that if you've got constants in your code, you want to make sure to add the h on the end of them, like the 9.0 we divide by.
So again, this is another simple thing. The last thing we're going to do to get the best performance out of this example is to leverage group reads and group writes. So, let me walk you through the code to do this. Again, we want to write a 2 by 2 group of pixels, and for that we need to read from a 4 by 4 group of pixels.
So, the first thing we're going to do is specify that we want a group destination. If you look at the function declaration, it now has a group destination_h data type. Then, we're going to get the destination coordinate like we had before, and that will point to the center of a pixel. However, that coordinate actually represents the coordinate of a group of 2 by 2 pixels.
The next thing we're going to do in order to fill in this 2 by 2 group of pixels is do a bunch of reads from the image. So, the first gather read is going to read from a 2 by 2 group of pixels -- in this case, the lower left-hand corner of our 16.
And it's going to return the value of the red channel as a half4 vector. The four values will be stored in this order -- x, y, z, w -- going in a counter-clockwise direction. This is the same order that is used by the gather operations in Metal, if you're familiar with those.
So, again in that one instruction we've done four reads and we're going to repeat this process for the other groups of four. So, we're going to get group 2, group 3, and group 4. Now that we've done all 16 reads, we need to figure out what values go in what locations. So, the first thing we're going to do is get the appropriate channels of this 3 by 3 sub group and average them together. And then we're going to store those channels into the result 1 variable.
And we're going to repeat this process for the other result pixels that we want to write -- r1, r2, r3, and r4. And the last thing we're going to do is call the destination's write function to write the 4 pixels all in one operation. So note, this is a little different from a traditional CI kernel, where you would have returned a value from your kernel; here you call the destination write instead. All right.
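From the Swift side, applying a kernel like this might look roughly as follows; the function name, metallib name, and image path are placeholders, and the ROI callback is outset by one pixel because the convolution reads one pixel beyond each output pixel:

```swift
import CoreImage
import CoreGraphics

let library = try Data(contentsOf: Bundle.main.url(forResource: "default",
                                                   withExtension: "metallib")!)
let kernel = try CIKernel(functionName: "sharpenLuminance", fromMetalLibraryData: library)

let input = CIImage(contentsOf: URL(fileURLWithPath: "/tmp/photo.jpg"))!  // hypothetical path

// A 3 by 3 convolution reads one pixel beyond each output pixel,
// so the region of interest is the destination rect outset by 1.
let output = kernel.apply(extent: input.extent,
                          roiCallback: { _, rect in rect.insetBy(dx: -1, dy: -1) },
                          arguments: [input])
```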
So, the great result of all this is that with very little effort, we can now get two times the performance in this exact shader. This is a very simple shader, but you can actually get similar results in many other types of shaders, especially ones that are doing convolutions. So, this is a great way of adding performance to your kernels. Before I segue, I'd like to tell people to go to the great new documentation that we have for our kernel language, both the traditional CI kernel language and the CI kernel language that's based on Metal.
I highly encourage you to go and read this documentation. But now that we've talked about improving the performance of how your kernels run, I'd like to bring up Emmanuel on stage, who will talk to you about how you can make your development process for new algorithms faster as well.
[ Applause ]
Thank you, David. Good afternoon everyone. It's great to be here. My name is Emmanuel, and I'm an engineer on the Core Image team. So, during the next half of this session, we'll shift our focus away from the Core Image engine and explore novel ways to prototype using Core Image. We'll also see how you can leverage Core Image in your machine learning applications. So let's get started. Since I want to talk about prototyping, let's take a look at the lifecycle of an image processing filter.
So, let's say that we are trying to come up with a foreground-to-background segmentation. And here, what this means precisely is that we'd like to get a mask which is 1.0 in the foreground, 0.0 in the background, and has continuous values in between. The difficulty in implementing such a filter heavily depends on the nature of the data you have available. So, for example, if you have an additional depth buffer alongside your RGB image, things can become easier. And if you're interested in combining RGB images with depth information, I highly encourage you to look at the session on creating photo and video effects using depth.
Today, I don't want to focus on these other sources of information; I want to focus on prototyping in general. So, let's say that we have this filter well drafted, and we know the effect we're trying to come up with -- in this particular case, a foreground-to-background mask. The very next natural step is to try implementing it, so you pick your favorite prototyping stack and you start hacking away, combining different filters together and chaining them in such a way that you achieve the filter effect that you're looking for.
So, let's say you did just that, and here we have an example of such a foreground-to-background mask. Now, if you're in an iOS or macOS environment, the very next natural step is to deploy that algorithm. You have a variety of frameworks that you can use, such as Core Image, Metal with Metal Performance Shaders, as well as vImage if you want to stay on the CPU.
That initial port from prototype to production can be quite time consuming, and the very first render might not exactly look like what you're expecting. And there is a great variety of sources that can contribute to these pixel differences, one of them being simply the fact that the way filters are implemented across frameworks can be quite different.
Take an example: here on the left-hand side, we have a blur that applies this nice feathering from foreground to background. That's an example of a filter that can leverage a great variety of performance optimizations under the hood to make it much faster. All these optimizations can introduce numerical errors which will propagate through your filter stack, thereby potentially creating dramatic changes in your filter output. Another problem that typically arises when you're porting your code is that, in your prototyping environment, a lot of the memory management is taken care of for you. So, you don't often run into issues of memory pressure and memory consumption until it's pretty late in the game.
Another topic, of course, that's important to consider is performance. Oftentimes, prototypes are written using CPU code, and we often overestimate the amount of performance we can get from porting our CPU code to GPU code, thinking that everything is going to become real-time. So, what if we could catch these concerns way earlier in our prototyping workflow? Well, we believe we have a solution for you, and it's called PyCoreImage: Python bindings for Core Image.
So, this combines the high-performance rendering of Core Image with the flexibility of the Python programming language. And by using Core Image, you also inherit its support for both iOS and macOS, along with more than 200 built-in filters. Let's take a look at what's under the hood of PyCoreImage.
So, PyCoreImage is made of three main pieces. It uses Core Image for its rendering backend. It uses Python for the programming interface. And it also has a thin layer of interoperability code, via NumPy, so it can work with your existing code bases. We believe PyCoreImage can help you reduce the friction between your prototyping and production-ready code. If you'd like to stay in a Swift-centric environment, a lot of this can be done as well using Swift Playgrounds, and we encourage you to look at the session on creating your own Swift Playgrounds subscription.
All right. Let's take a look at the main components of PyCoreImage. PyCoreImage leverages the Python bindings for Objective-C, PyObjC -- and interestingly, we've been shipping PyObjC since Mac OS X 10.5 Leopard. It was initially implemented as a bidirectional bridge between Python and Objective-C in the context of Cocoa app development, but since then it's been extended to support most Apple frameworks.
The calling syntax for PyObjC is very simple: you take your existing Objective-C code and you replace colons with underscores. There are a few more intricacies, and I encourage you to look at the API if you'd like more information. But let's take our CIVector class as an example here.
So, here we have some Objective-C code where we create an instance of a CIVector by calling [CIVector vectorWithX:Y:Z:W:]. Let's take a look at the PyObjC code. It's very similar: we import CIVector from the Quartz umbrella package, and we can call vectorWithX_Y_Z_W_ on the CIVector class directly.
One thing you may note here is that the code does not exactly look Python-like, and we're going to address that in just a few minutes. Now, let's take a look at the stack diagram for PyCoreImage. The rendering backend is using Core Image, and Core Image is very close to the hardware, so it's able to redirect your filter calls to the most appropriate rendering backend to give you as much performance as possible.
PyObjC lives on top of Core Image, and it can communicate with it through the Python bindings for Core Image via the Quartz umbrella package. Quartz is a package that also contains a variety of other image processing frameworks, such as Core Graphics, and all the classes that Core Image uses, such as CIVector, CIImage, and CIContext.
PyCoreImage lives on top of PyObjC; it essentially leverages PyObjC to communicate with Core Image and makes a lot of simplifications under the hood for you, so that you don't need as much setup code when you're working with Core Image. We'll take a look at this in just a moment. A lot of it is done through the CIMG class, which you can also use to interoperate with NumPy via a render call. And you can also wrap your NumPy buffers by using the class constructor directly.
All right. So, let's take an example of how you can apply a filter using PyCoreImage, and you'll see just how simple and powerful the framework is. The very first thing you want to do is import the CIMG class from the PyCoreImage package, which we can then use to load the image from file. Note that at this point we don't have a pixel buffer. Core Image creates recipes for images, and in this case the recipe just gives an instruction to load the image from file.
You can create a more complicated graph by applying a filter -- just calling the CI filter name on it and passing the input parameters, in this case a radius. And we can see that we are assembling a more complicated graph; if we zoom in on it, we can see that we have our blur processor right in the middle. If you want to get the pixel buffer representation, what you can do is call render on your CIMG instance, and what you get out is a proper NumPy buffer.
So, to make that possible, we need to make a few simplifications on how Core Image is called and do a bit of the setup code for you. For those of you who are already familiar with Core Image, this will not come as a surprise, but for those of you who are not familiar, please stay with me until the end. You'll see the few simplifications we made, and that should become very clear.
So, Core Image is a high-performance GPU image processing framework that supports both iOS and macOS as well as a variety of rendering backends. Most file formats are supported -- that, of course, means bitmap data as well as RAW files from a large variety of vendors -- and most pixel formats are supported as well.
So, for example, you can load your image in unsigned 8-bit, do your computation in half float, and do your final render in full 32-bit float. Core Image can extract image metadata for you -- for example, capture time and EXIF tags, as well as embedded metadata such as the portrait matte and portrait depth information.
Core Image handles color management very well. This is a difficult topic on its own that a lot of frameworks don't handle. Core Image supports many boundary conditions, infinite images, and has more than 200 built-in filters that you can use, so you don't need to reinvent the wheel. All right, so I don't think I need to convince you that that's a lot of information, and if you're trying to use Core Image in your prototyping workflow, the learning curve can be quite steep. So what we did is we kept the best of that list and made a few simplifications -- which, remember, can all be overridden at runtime. And since we'll be giving you the source code, you can actually hardcode these changes if this becomes your prototyping stack.
The first thing is that we still have the high-performance features of Core Image -- we still render to a Metal backend. Most file formats are still supported, in and out, and we can still extract the capture time and other metadata as well as the portrait depth and matte information. Last but not least, you have more than 200 built-in filters that you can use. The first change we made is that by default, all your renders will be done using full 32-bit float.
Second change: everything will be done using sRGB color spaces. Third, all the boundary conditions will be handled with clamping and cropping. What that means is, if you're applying a convolution, for example, your image will be extended infinitely, the filter will be applied, and the resulting image will be cropped back to your input size. This is a setting that can also be overridden at runtime.
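In Core Image terms, that default corresponds roughly to the familiar clamp-filter-crop pattern -- here sketched in Swift with an arbitrary blur radius and a placeholder path:

```swift
import CoreImage

let image = CIImage(contentsOf: URL(fileURLWithPath: "/tmp/photo.jpg"))!  // hypothetical path

// Extend the edge pixels infinitely, apply the filter, then crop back to the original extent.
let blurred = image
    .clampedToExtent()
    .applyingFilter("CIGaussianBlur", parameters: [kCIInputRadiusKey: 10.0])
    .cropped(to: image.extent)
```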
Finally, infinite images become finite so that we can get their pixel buffer representation. And that's really what PyCoreImage is under the hood. So, before looking at a great demo of all of this in practice, I just want to quickly go through a cheat sheet for PyCoreImage. Let's have a look at the API. As you saw earlier, we import the CIMG class from the pycoreimage package.
We can use this to load images with fromfile. Here's the Swift equivalent for those of you who are wondering: you can use CIImage(contentsOf:). You can use fromfile to load your portrait matte information directly, as well as your portrait depth, by just using the optional arguments usedepth and usematte.
You can interoperate with NumPy by wrapping your NumPy buffers in the CIMG constructor, or by calling render directly on CIMG instances to go the other way around. If you're in Swift, there's a bit more setup code to do. You need to first create a CIRenderDestination -- make sure to allocate your buffer beforehand.
Make sure to give it the right buffer properties, then create an instance of CIContext and start a render task. All of that is handled for you under the hood.
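For reference, a sketch of that Swift path (buffer size and pixel format chosen arbitrarily here, not taken from the session):

```swift
import CoreImage
import CoreGraphics

let image = CIImage(color: .blue).cropped(to: CGRect(x: 0, y: 0, width: 256, height: 256))

// Pre-allocate the destination buffer (RGBA8, 4 bytes per pixel).
let width = 256
let height = 256
let bytesPerRow = width * 4
var pixels = [UInt8](repeating: 0, count: bytesPerRow * height)

let context = CIContext()
try pixels.withUnsafeMutableBytes { buffer in
    // Describe the buffer we want Core Image to render into.
    let destination = CIRenderDestination(bitmapData: buffer.baseAddress!,
                                          width: width,
                                          height: height,
                                          bytesPerRow: bytesPerRow,
                                          format: .RGBA8)
    // Start the render task and wait for it to finish.
    let task = try context.startTask(toRender: image, to: destination)
    _ = try task.waitUntilCompleted()
}
```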
Core Image also supports procedural images, such as creating an image from a color or creating an image from a generator. Let's have a look at how to apply filters now. Applying a filter has never been easier: you take a CIMG instance, call the filter name directly on it, and pass the list of input parameters. Every CIMG instance is augmented with more than 200 lambda expressions, which directly map to the Core Image filters. If you're in Swift, this is the syntax you've seen before, I'm sure: applyingFilter, passing the filter name as well as the list of input arguments as a dictionary of key-value pairs.
To apply kernels, you can call applykernel on your CIMG instance, passing the source string containing your kernel code and the list of input parameters to that kernel -- we'll have a look at this in just a second. Then you just specify the extent over which you're applying that kernel, as well as the region of interest you're sampling from in the input buffer.
PyCoreImage provides a selection of useful APIs that you can use, such as composite operations -- here, a source-over -- as well as geometric operations such as translation, scaling, rotation, and cropping. All right. I just want to spend a bit more time on the GPU kernels, because that's an extremely powerful feature, especially for prototyping. So what we have here is a string containing the code for a GPU kernel, and what we have there is essentially a way for you to prototype that effect in real time.
This is an example of a five-tap Laplacian, and we're going to be using this for sharpening. So, we make five samples in the neighborhood of each pixel, combine them in a way that computes a local derivative -- which is going to be our detail -- and we add it back on top of the center pixel.
I don't want to focus too much on the filter itself, but on how to call it. So, we call applykernel on our CIMG instance, pass the source code that's just sitting above as a triple-quoted Python string, and pass the extent over which we're going to be applying the kernel.
And we define the region of interest, which is the region that we're going to be sampling from. If you're not familiar with the concepts of domain of definition and regions of interest, I encourage you to look at the online documentation for Core Image as well as previous WWDC sessions.
Because this is a convolution kernel, we are reading one pixel away from the boundary, so we need to instruct Core Image that we're going to be doing so, so that it can handle boundary conditions properly. All right. So, that was a lot of information, and looking at APIs is always going to be dry, so let's take a look at a demo and put all of this into practice.
[ Applause ]
All right. During this demo I'll be using a Jupyter notebook, which is a browser-based, real-time Python interpreter. So, all the results you're going to be seeing are rendered in real time using Core Image in the backend; none of this has been pre-computed, this is all done live. The very first thing I want to do here is import the utility classes we're going to be using, the most important one being the CIMG class from my PyCoreImage package. Then, we just have a bit of setup code so that we can actually visualize images in the notebook. Let's get started.
The first thing I want to show you is how to load images in. So, using fromfile here, we see that the type of my object is a PyCoreImage CIMG, and we can see that it's backed by an actual Core Image object. We can do a render on our image and have a look at its actual pixel representation using Matplotlib here.
This is our input image, and now I want to apply a filter on it, so let's take a look at the 200 plus filters that are supported in Core Image. Let's say that I'm interested in applying GaussianBlur here, and I'd like to know which parameters are supported by that filter, so I'm going to call inputs on my CIMG class, and I see that it supports input image -- I shouldn't be surprised -- as well as an input radius. So, I'm going to do just that here. Take my input image. Apply GaussianBlur filter on it with a radius of 100 pixels, and then show the two images side-by-side.
So, pretty easy, right? Okay. Let's keep going. Like I mentioned earlier, you can generate procedural images using Core Image, so we'll take a look at Core Image generators. The first thing we do here is call fromgenerator and specify the name of our generator -- in this case, CIQRCodeGenerator -- and pass in the message that we're trying to encode. Here it is in real time, so I can make changes to that message and see how that affects the QR code that's being generated.
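The Swift equivalent of that generator looks roughly like this (the message string is just a placeholder):

```swift
import CoreImage

// Generate a QR code procedurally; no input image is needed.
let message = "WWDC 2018".data(using: .utf8)!   // placeholder message
let qrFilter = CIFilter(name: "CIQRCodeGenerator")!
qrFilter.setValue(message, forKey: "inputMessage")
qrFilter.setValue("M", forKey: "inputCorrectionLevel")   // L, M, Q, or H
let qrImage = qrFilter.outputImage!
```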
Core Image also has support for labeling your images, so you can use the CITextImageGenerator to do that. Here's the example: "WWDC", using an SF font. All right, let's keep going. As I mentioned, we support interoperability to and from NumPy, so that's the next thing we're going to look at.
We're going to start with an image and apply some interesting and non-trivial effect to it -- in this case, a vortex distortion. The next thing we'll do is render that image, getting a NumPy array out of it. You can see its type here, as well as its shape, its depth, and a few statistics on it: its minimum, median, and maximum values.
We can also go the other way around and go from NumPy to Core Image. To do this, let's start with a NumPy array that's non-trivial -- in this case, a random buffer where 75% of the values have been set to black. The first thing I do here is wrap my NumPy array in the CIMG constructor, and we can see that we again have our CIMG class instance and the backing CIImage.
Now that I have my CIImage, I can apply a variety of filters on it. The first thing I'll do is apply a disc blur. Then a light tunnel filter, a change of contrast in my image, and an exposure adjustment as well as a gamma adjustment. Let's take a look at these filters side by side -- disc blur, light tunnel, exposure adjust, gamma adjust -- and here's our final effect. Pretty fun and really easy to work with.
So, let's put it all together. I'm going to start with a new image here, and what I'll be showing you in this demo is how we can do band processing. For those of you who are familiar with slicing images in Python, this is exactly what we're going to be doing: we'll define bands, or horizontal slices, in our image, and we'll only be applying filters to those.
Let's take a look at the code first. This is our add band function here, and we can see that at the very bottom of it, we render our image into the composite, which is an actual NumPy buffer, while the right-hand side is a CIImage. By using slicing like this, we force Core Image to only render that band, not the entire image, thereby being much more performant. So, let's do this, create five different bands in our image, and show the final composite.
Pretty amazing. And we've got the labels on top as well, which correspond to the filters being applied. It's really that simple to work with PyCoreImage. All right. I mentioned performance earlier, so let's take a quick look at this. The first thing I want to show you is that whenever you call render on your CIMG instances, the NumPy buffer is baked and cached under the hood. For example, here we create an image that we scaled down as well as applied a GaussianBlur to: the first call took 56 milliseconds; the second one, only 2 milliseconds.
Let's take a look at large convolutions as well. Core Image is extremely fast and is able to handle large convolutions as if it was nothing. Here we're using CIGaussianBlur with a sigma of 200, which is huge. Just to give you a sense: while I was showing you the image, I was actually executing the equivalent using scikit-image, and it had a 16-second running time. Now the same thing using Core Image: 130 milliseconds.
Yeah, it's that fast [applause] -- 200X, yeah. Thank you. All right, let's keep going. So, one of the most powerful features of PyCoreImage is its ability to create custom GPU kernels inline, execute them on the fly, and modify them on the fly. So, let's take a look at that.
All right. The first thing I want to show is how to use color kernels. Color kernels are kernels that only take a pixel in and spit a pixel out, and don't make any other samples around that pixel. So, here's our input image and here's our kernel: what we actually get in is a color, and we return a color out. Let's take a look at this effect here. I'm going to be swapping my red and blue channels with my blue and red channels, and we'll be inverting them.
Not a terribly exciting effect, but what I want to show you is that I can do things like start typing away and say, maybe I want to scale my red channel by my blue channel, and I want to just play with the amount of scaling I'm applying here, so we can go from 0.25 to pretty high values if we want to and generate interesting effects here.
It's extremely powerful, and this is all in real time, so you can really fine-tune your filters this way and make sure you achieve the effect that you're looking for. Let's take a look at a more complicated kernel here. We'll look at a general kernel, which is a bit like the Laplacian I showed you earlier -- a kernel that makes additional taps in the neighborhood of each pixel. We start with an image from file, which is the same image we saw earlier, and we have our kernel code here. Without going into the details, this is a bilateral filter, which is an edge-preserving blurring filter.
So let's just get the code in and use applykernel with some parameters that will allow us to get this very nice effect. And what we did here, essentially, is clip the non-redundant high frequencies in the image. Let's take a look at this a bit more closely.
Looking at a crop here, we can see how the strong edges are still there, but the fine frequencies that are not redundant were washed away. A bilateral filter can be used for many, many different purposes; in this particular case, we'll use it to do sharpening. And to achieve sharpening with this filter, we can simply take the image on the left and subtract the image on the right, giving us a map of the high frequencies, or details, of the image. Let's do just that. So, here what I'm doing is rendering my image to a NumPy buffer, rendering my bilateral-filtered image, and subtracting them using the operator overloading that's provided with NumPy. Let's take a look at the detail layer.
So, we have the detail layer on the left-hand side for the entire image, and a crop of the center of the image. Now, what we can do with this is add it on top of the original image. We're going to be doing just that here -- we're going to be adding it twice. By doing this, we achieve a form of sharpening. It's really that simple. If I wanted, I could go back to my kernel string and start hacking away and making changes there in real time.
The other thing I wanted to show you is how to load metadata from your images. Here I have an image that has a portrait effects matte loaded in, as well as portrait depth data. Here are the images side by side: the image on the left is the RGB image, in the center is the depth data, and on the right-hand side is the high-quality portrait effects matte, which we introduced in another session today. We can also look at the EXIF tags directly by looking at the underlying CIImage from the CIMG instance and calling properties. Here, we get information pertaining to the actual capture itself.
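In Swift, reading those auxiliary images and the EXIF properties looks roughly like this; the file URL is a placeholder, and the auxiliary options only return an image if the file actually contains matte or depth data:

```swift
import CoreImage
import ImageIO

let url = URL(fileURLWithPath: "/tmp/portrait.heic")   // hypothetical portrait capture

let rgb   = CIImage(contentsOf: url)!
let matte = CIImage(contentsOf: url, options: [.auxiliaryPortraitEffectsMatte: true])
let depth = CIImage(contentsOf: url, options: [.auxiliaryDepth: true])

// EXIF and other capture metadata come back as a dictionary on the image.
if let exif = rgb.properties[kCGImagePropertyExifDictionary as String] {
    print(exif)
}
```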
Like I said, we introduced the portrait effects matte in another session, Session 503, so I highly encourage you to look at it. So, without going into the details here, I'm going to be choosing this filter. If you are interested in knowing how we did this, I highly encourage you to take a look at that session. Pretty fun stuff [applause]. Thank you.
[ Applause ]
All right. Let's go back to the presentation. I want to switch gears a little bit here and talk about bringing Core Image and Core ML together. If you would like to get more information about working with the portrait matte and portrait depth information, I encourage you to look at the session on creating photo and video effects using depth. All right. Let's look at bringing Core Image and Core ML together. This year, we're really excited to announce that we're coming out with a new filter, CICoreMLModelFilter. It's an extremely simple yet very powerful filter that takes two inputs.
The very first input is the image itself that you want to filter; the second is an input Core ML model. And you get an output image which has been run through the underlying neural network. It's really that simple, and extremely powerful. Just to show you how simple the code is, let's take a look at Swift.
So, we have an input image on the left-hand side. All we need to do is call applyingFilter, pass the name of the new filter that we're introducing this year, and give it your Core ML model. It's really that simple.
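A hedged Swift sketch of that call -- the file URLs are placeholders, and I'm assuming the model parameter is keyed as inputModel:

```swift
import CoreImage
import CoreML

let input = CIImage(contentsOf: URL(fileURLWithPath: "/tmp/photo.jpg"))!           // hypothetical path
let model = try MLModel(contentsOf: URL(fileURLWithPath: "/tmp/Style.mlmodelc"))   // hypothetical compiled model

// Run the image through the model with the new filter.
let output = input.applyingFilter("CICoreMLModelFilter",
                                  parameters: ["inputModel": model])
```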
And if you'd like to look at other ways to leverage machine learning in your image processing applications, I encourage you to look at the other sessions: A Guide to Turi Create as well as Vision with Core ML. All right. On a related topic, one of the common operations we carry out on our training datasets in machine learning is data augmentation. Data augmentation can dramatically increase the robustness of your neural networks. In this particular case, let's say we're doing object classification and we're trying to determine whether that image is a bridge or has water in it.
So, augmentations on your original training dataset will increase the number of images you have in that dataset without needing to gather new images -- you essentially get them for free. There are many operations you can carry out. One of them is just changing the image's appearance: for example, the tint, the temperature, and the white point of your image.
Changing the spectral properties of your image by adding noise. Or changing the geometry of your image by applying transforms. Well, it turns out all of these are trivial to achieve using Core Image. Let's take a look at a few filters and how you can use them for your data augmentation purposes.
So, we have our input image on the left-hand side here. We can change the temperature and tint using CITemperatureAndTint. We can adjust the brightness, contrast, as well as saturation in your images using CIColorControls. We can change the frequency spectrum of your image using CIDither as well as CIGaussianBlur. And we can change the geometry of your image using affine transforms.
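A sketch in Swift of a small augmentation chain along those lines; the parameter values are chosen arbitrarily here, and a real pipeline would sample them at random per image:

```swift
import CoreImage
import CoreGraphics

let source = CIImage(contentsOf: URL(fileURLWithPath: "/tmp/train_0001.jpg"))!  // hypothetical path

let augmented = source
    // Shift the white point: temperature and tint.
    .applyingFilter("CITemperatureAndTint",
                    parameters: ["inputNeutral": CIVector(x: 6500, y: 0),
                                 "inputTargetNeutral": CIVector(x: 5200, y: 10)])
    // Adjust brightness, contrast, and saturation.
    .applyingFilter("CIColorControls",
                    parameters: [kCIInputSaturationKey: 0.9,
                                 kCIInputBrightnessKey: 0.05,
                                 kCIInputContrastKey: 1.1])
    // Change the frequency content with a mild blur.
    .applyingFilter("CIGaussianBlur", parameters: [kCIInputRadiusKey: 1.5])
    // Change the geometry with a small rotation, then crop back to the original extent.
    .transformed(by: CGAffineTransform(rotationAngle: .pi / 36))
    .cropped(to: source.extent)
```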
Let's take a look at all of this in practice. All right. So, we're back in my Jupyter notebook here, with the same setup as before. The first thing I want to show you is how to do augmentations using Core Image. We're loading an image in, and we're going to define our augmentation function here. What we'll be doing, essentially, is sampling from a random space for each of the filters I've defined here.
So, we'll be applying GaussianBlur, scaling, rotation, a few adjustments -- exposure adjustment, vibrance -- as well as dithering for noise. All right? Let's get that function in and have a look at a few realizations of that augmentation. My slider here controls the random seed that I'm using in the backend.
All right, pretty cool. Now, you might wonder how efficient that is, so here I'm going to be processing 200 of these augmentations in real time, and we'll take a look at how they are actually being saved to disk in real time. So let's just do that, to give you a sense of how fast that is.
That's really powerful. All right. The next thing I want to show you is how to use Core ML with Core Image. The first thing you do is load your Core ML model in, which we did here. We have a glass model, which we're going to be using to generate an interesting effect. So, let's start with a procedural image -- we've seen this one before.
Then, to make it a bit more interesting, we'll add some texture to it: we'll be adding multi-band noise to it as well as some feathering. All right, so this is the input image we're going to be feeding to our neural network, alongside the Core ML model that we have pre-trained. So, let's run this.
And -- There you go. WWDC 2018, just for you. All right. On that note, I want to thank you for coming to this session today. I hope you enjoyed this as much as we enjoyed preparing these slides for you. I highly encourage you to come and talk to us tomorrow at Core Image Technical Lab at 3:00 pm and thank you very much.
[ Applause ]