Media • iOS, macOS • 49:26
When using Portrait mode, depth data is now embedded in photos captured on iPhone 7 Plus. In this second session on depth, see which key APIs allow you to leverage this data in your app. Learn how to process images that include depth and preserve the data when manipulating the image. Get inspired to add creative new effects to your app and enable your users to do amazing things with their photos.
Speaker: Etienne Guerard
Transcript
Hi, my name is Etienne, and I'm happy to be here today to show you how you can use depth to apply new kinds of effects to your images. First, we're going to see what depth is and what it looks like. Then we're going to see how to load it and read it from our image files.
And then we're going to show you several examples of effects you can achieve with depth. And we'll conclude with how to save depth data. All right. So let's get started. What is depth? To answer that question, let's start with how we capture depth. Depth can be captured only on iPhone 7 Plus and only on iOS 11.
iPhone 7 Plus has a dual camera system that can be set to capture two images of the same scene at the same time and at the same focal length. The difference between those two images is called disparity. Disparity is a measure of the parallax effect: it measures how objects that are closer to the camera tend to shift more from one image to the other. Once we know disparity, we can compute depth with a simple formula. Depth is 1 over disparity.
So, in the remainder of this session we're going to talk about depth or disparity under the broad term of depth data. But remember, they're pretty similar, and one is the inverse of the other. For more information about how we capture depth, I would refer you to the session on capturing depth in iPhone photography that took place yesterday. All right.
So now that we know what depth and disparity are, let's take a look at what they look like in practice. And for that, I'm going to call my colleague Craig to show you what it looks like. Craig.
[ Applause ]
Thank you, Etienne. What we're seeing here is an image that was captured by the iPhone 7 Plus. And here is its disparity map. As we've learned, disparity refers to the distance between two corresponding points that were captured by the iPhone 7 Plus's dual camera system. Bright areas are closer to the camera and correspond to higher disparity values, while dark areas are farther away from the camera and correspond to low disparity values. So, let's go back to the image and look at the disparity map again. We can pinch to zoom in on an area. But we have one more trick we can do with this application. If I drag my finger across, we can view the data in 3-D.
[ Applause ]
We can zoom in and really get a good look at the range of data that's available to us. We can rotate all the way around and look at that again. And even switch back to the image data overlaid over top of it. So let's look at another image.
Here are some beautiful flowers. When I zoom in and rotate around, you see that we need to fill in those polygons with some data, so we just take the image values and stretch them along there. This speaks to the fact that the depth data is not a good representation for recreating a full 3-D scene. But this view is still interesting to look at. Also, it's important to note that the disparity map is a lower resolution than the full image: roughly half a megapixel for the disparity map versus 12 megapixels for the image.
We built this application with SceneKit, which made it really easy to implement. We took a mesh and then transformed the z positions of the vertices so that the brighter pixel values were closer to the camera. Also, we normalized and remapped the data so that it made sense when we viewed it in 3-D.
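As an editorial aside, here is a rough sketch of how a SceneKit mesh like the one in this demo might be put together. This is not the demo app's code; the grid construction, the `depthValues` array, and the `UIImage` texture are all illustrative assumptions.

```swift
import SceneKit
import UIKit

// Build a grid of vertices, push each vertex along z by its normalized depth
// value, and texture the mesh with the photo.
func makeDepthMesh(depthValues: [[Float]], photo: UIImage) -> SCNNode {
    let rows = depthValues.count, cols = depthValues[0].count
    var vertices: [SCNVector3] = []
    var texCoords: [CGPoint] = []
    var indices: [Int32] = []

    for r in 0..<rows {
        for c in 0..<cols {
            let x = Float(c) / Float(cols - 1) - 0.5
            let y = 0.5 - Float(r) / Float(rows - 1)
            let z = depthValues[r][c]            // brighter (closer) pixels get larger z
            vertices.append(SCNVector3(x, y, z))
            texCoords.append(CGPoint(x: CGFloat(c) / CGFloat(cols - 1),
                                     y: CGFloat(r) / CGFloat(rows - 1)))
        }
    }
    // Two triangles per grid cell.
    for r in 0..<(rows - 1) {
        for c in 0..<(cols - 1) {
            let i = Int32(r * cols + c)
            indices += [i, i + 1, i + Int32(cols),
                        i + 1, i + Int32(cols) + 1, i + Int32(cols)]
        }
    }

    let geometry = SCNGeometry(
        sources: [SCNGeometrySource(vertices: vertices),
                  SCNGeometrySource(textureCoordinates: texCoords)],
        elements: [SCNGeometryElement(indices: indices, primitiveType: .triangles)])
    geometry.firstMaterial?.diffuse.contents = photo
    geometry.firstMaterial?.isDoubleSided = true
    return SCNNode(geometry: geometry)
}
```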
We'll look at one more image. On this image, it's interesting if we look at the disparity map. And we zoom in and move around a little bit, we see that we have a few distinct planes to work with here. So, with that, we might get the idea that maybe it would be a good idea to quantize or threshold this depth data before we filter it for a more dramatic effect. So with that, I'd like to turn it back over to Etienne.
[ Applause ]
Thank you, Craig. All right. So now that we've seen what depth and disparity looks like, what kind of effects can we apply with that data? So let's take a look. Here's an example image and its disparity map. One effect we can apply is a depth blur effect.
And this is the effect that you can achieve when capturing with the Camera app in Portrait mode. We can get a bit more creative, and here's an example where we apply a different effect to the background and the foreground. Here we desaturate the background while increasing the saturation of the foreground to make those flowers stand out.
We can go even further than this. Here we are actually dimming the pixels in the background proportionally to the depth. And so this is just a couple of examples to give you a taste of what you can do with that data. And we're going to show you how to do this and more later in the talk.
Now let's see who could use that depth. Well, of course, if you have an editing application, you can now use depth to create and apply new kinds of effects to your images. But if you have a camera application, you can also opt in to capture depth and apply the very first depth effect to the images, such as your own depth blur effect, for example. If you have a sharing application, you may also want to take advantage of depth to apply cool effects before sharing images. All right. But before we can apply any effects, let's take a look at how to read depth data and load it into memory.
So let's take a look. Depth data is stored in image files alongside the image data, in a section called auxiliary data. Beware that the various image types in the system, such as UIImage, CGImage, or CIImage, do not contain depth information. You need to access the image file in order to read the depth data. So let's see how to do that.
If you're using PhotoKit, there are a couple of ways you can access the image file. You may be using PHContentEditingInput, for instance. Here's how you can request the content editing input for a particular PHAsset, and you can access the image file URL from the content editing input that way.
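Here's a minimal sketch of that request (the network-access option and the variable names are illustrative assumptions):

```swift
import Photos

let options = PHContentEditingInputRequestOptions()
options.isNetworkAccessAllowed = true   // allow iCloud originals to be downloaded

asset.requestContentEditingInput(with: options) { input, _ in
    // The full-size image URL points at the original file on disk,
    // which is where the auxiliary depth data lives.
    guard let fileURL = input?.fullSizeImageURL else { return }
    // Hand fileURL to ImageIO or Core Image, as shown below.
    _ = fileURL
}
```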
You may also use PHImageManager. You can ask the PHImageManager to request the image data for a particular asset, and that will give you back a Data object that contains the file data.
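And a minimal sketch of the PHImageManager route (again, variable names are assumptions):

```swift
import Photos

PHImageManager.default().requestImageData(for: asset, options: nil) { data, dataUTI, _, _ in
    guard let fileData = data else { return }
    // fileData is the full image file, including any auxiliary depth data;
    // you can create a CGImageSource from it with CGImageSourceCreateWithData.
    _ = fileData
}
```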
All right. So now that we have access to a file, let's see if it contains depth data. We're going to use ImageIO for this. We start from an image source that we create from our image file, and then we copy the image source properties. This will give you back a dictionary that looks like this one. You want to look for the auxiliary data key in that dictionary. The presence of that key tells you that the image file you're working with contains auxiliary data.
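A minimal sketch of that check, assuming `fileURL` came from one of the PhotoKit paths above:

```swift
import ImageIO

if let source = CGImageSourceCreateWithURL(fileURL as CFURL, nil),
   let properties = CGImageSourceCopyProperties(source, nil) as? [CFString: Any] {
    // Files captured with depth carry an auxiliary data entry in these properties;
    // a more direct check is shown a little further below.
    print(properties)
}
```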
You can look at the type of the data, and here you can see it's disparity; it could also be depth. One thing to note here is that the dimensions of the depth data are smaller than the dimensions of the full-size image. This is an image captured by iPhone 7 Plus: the full-size image is 12 megapixels, and the depth data is less than a megapixel. All right. So now that we know that we have a file with depth data, let's see how we can read it into memory.
So, it goes like this. We start with the auxiliary data from the file, and then we create an AVDepthData object, which is an in-memory representation for the depth data. From that object we can access a CVPixelBuffer that contains the depth data. The pixel buffer will be a single channel of data containing either depth or disparity, in 16-bit or 32-bit floating-point values. All right.
So let's see how to do that in code. Again, we start from our image source. Next we ask to copy the auxiliary data out of the image source. For that we request a particular auxiliary data type; here we are requesting disparity. This gives us back a dictionary that contains the auxiliary data. It can also return nil, and that will indicate that the image file does not contain auxiliary data of that particular type. So that's another way you can check if a file contains depth data.
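A sketch of that call, reusing the image source created in the earlier sketch:

```swift
import ImageIO

// `source` is the CGImageSource created from the image file, as above.
// Returns nil if the file has no auxiliary data of this type.
let auxDataInfo = CGImageSourceCopyAuxiliaryDataInfoAtIndex(
    source, 0, kCGImageAuxiliaryDataTypeDisparity) as? [AnyHashable: Any]
```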
Next, we can create an AVDepthData object from the auxiliary data representation that we got from ImageIO. And that AVDepthData has a couple of properties that you can query. For example, you can check its native data type, which is a pixel format. And if it's not the one you want, you can also convert it to a new pixel format.
So for example, here we ask for disparity float 16, because maybe we're going to use the disparity map on the GPU. And this will return a new AVDepthData object of the right format. So once you have a depth data object that you're happy with, you can access a CVPixelBuffer using the depthDataMap property.
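A minimal sketch of those steps, continuing from the auxiliary data dictionary above:

```swift
import AVFoundation

if let auxDataInfo = auxDataInfo,
   let depthData = try? AVDepthData(fromDictionaryRepresentation: auxDataInfo) {

    // Check the native format and convert if it isn't the one we want.
    var disparityData = depthData
    if depthData.depthDataType != kCVPixelFormatType_DisparityFloat16 {
        disparityData = depthData.converting(toDepthDataType: kCVPixelFormatType_DisparityFloat16)
    }

    // The pixel buffer that contains the disparity values.
    let pixelBuffer: CVPixelBuffer = disparityData.depthDataMap
    _ = pixelBuffer
}
```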
Once you have the CVPixelBuffer you can use it directly, or you can use it with Metal or Core Image. If you're working with Core Image, there's a convenient way you can load the depth data directly into a CIImage. Here's how to do it. When you create a CIImage from the contents of a file, you can now specify a new option, such as kCIImageAuxiliaryDepth or kCIImageAuxiliaryDisparity, to tell Core Image to load the depth image instead of the regular image. Once you have a depth image, you can always go back to the AVDepthData object by calling its depthData property. And keep in mind that you can always convert back and forth between disparity and depth using convenient CI filters such as CIDepthToDisparity and CIDisparityToDepth.
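Here's what the Core Image route can look like (a sketch; `fileURL` as before):

```swift
import CoreImage

// Load the embedded disparity map as a CIImage instead of the regular image.
let disparityImage = CIImage(contentsOf: fileURL,
                             options: [kCIImageAuxiliaryDisparity: true])

// You can get back to the AVDepthData object if you need it.
let avDepthData = disparityImage?.depthData

// And convert back and forth between disparity and depth.
let depthImage = disparityImage?.applyingFilter("CIDisparityToDepth")
let backToDisparity = depthImage?.applyingFilter("CIDepthToDisparity")
```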
All right. So now that we've read the depth data out of a file and into an image, we still need to take a couple more steps before we can start editing with it. If you remember, the depth data is lower resolution than the image. So, the very first thing that you want to do is to scale it up to the resolution of the image that you're working with. There are a couple of ways to do that, so let's take a look.
Here's our example image and its disparity map. If we scale up, let's say, this small, tiny portion there using nearest-neighbor sampling, you can see that it's very, very pixelated. So at the very least you would want to apply linear sampling to get a smoother result. You can also use a new CI filter, CIBicubicScaleTransform, to get an even smoother result.
However, beware that depth data is not color data. And so, instead of smoothing, maybe what you actually want is to preserve as much of the detail of the image as possible, so that the depth data matches the image more closely. You can do this with a convenient CI filter called CIEdgePreserveUpsampleFilter. This filter will upsample the depth data and try to preserve the edges from the color image.
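A sketch of both upsampling options, assuming `colorImage` is the full-size CIImage and `disparityImage` is the small disparity CIImage from earlier:

```swift
import CoreImage

// Edge-preserving upsample: the full-resolution color image guides the upsample.
let upsampleFilter = CIFilter(name: "CIEdgePreserveUpsampleFilter")!
upsampleFilter.setValue(colorImage, forKey: kCIInputImageKey)       // guide image
upsampleFilter.setValue(disparityImage, forKey: "inputSmallImage")  // low-res disparity
let upsampledDisparity = upsampleFilter.outputImage!

// Or a plain bicubic scale, for a smooth result without edge guidance.
let bicubicDisparity = disparityImage.applyingFilter("CIBicubicScaleTransform", parameters: [
    "inputScale": colorImage.extent.height / disparityImage.extent.height
])
```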
All right. Also, another thing you need to be careful of: for all of those resampling operations, we recommend that you use disparity over depth, because it will give you better results. Okay. A couple of things that you may want to do as well. You may want to compute the minimum and maximum values for the depth data, because there are many cases where you need to know those values for the particular effect that you want to apply.
Also keep in mind that the depth data is not normalized between 0 and 1. For example, disparity values can range from 0, which means infinity, to greater than 1 for objects that are closer than one meter away. Okay. Another thing you can do is to normalize the depth data.
So once you know the min and max, you can normalize the depth or disparity between 0 and 1. That's pretty convenient, first to visualize it, but also if you want to apply your depth effects consistently across different scenes.
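As a sketch of that normalization, assuming `minDisparity` and `maxDisparity` have already been computed (disparity is a single channel stored in red):

```swift
import CoreImage

let scale = CGFloat(1.0 / (maxDisparity - minDisparity))
let bias = CGFloat(-minDisparity) * scale

// Remap the red channel so the disparity values land in 0...1.
let normalizedDisparity = upsampledDisparity.applyingFilter("CIColorMatrix", parameters: [
    "inputRVector": CIVector(x: scale, y: 0, z: 0, w: 0),
    "inputBiasVector": CIVector(x: bias, y: 0, z: 0, w: 0)
])
```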
All right. So now that we've read our depth data and prepared it for editing, we're ready to start filtering with it. In this section we're going to show you several examples of depth effects you can apply. We're going to start with simple background effects that you can achieve using built-in Core Image filters. Then we're going to show you a custom depth effect that you can achieve using a custom CIColorKernel.
Then we're going to show you how you can apply your own depth blur effect using a brand new CI Filter. And finally, we're going to show you how to create a brand new 3-D effect using depth. So let's get started with the first one. And for that I'm going to call my colleague Stephen on stage for that demo. Stephen.
[ Applause ]
Thank you, Etienne. Good morning everybody. My name is Stephen. And now that Etienne has shown you how to load and prepare your depth data, it's my pleasure to be here to show you a couple of ways that you can use depth to achieve some new and interesting effects on your images. So we're going to jump right in with a demo. Okay. What we're looking at here is the Photos app, and I'm going to enter Edit here on this image. This effect is implemented here in a photo editing extension.
And now that we've got the rough parts out of the way, we're looking at the original image here. No edits applied. And I'm going to go ahead and turn on the effect now. What you see is that I've applied a desaturating effect to the image, but only to the background region.
And I can pick a different background effect to apply. In this case I've picked just a flat white image. And maybe I'm not terribly satisfied with where that threshold is between background and foreground, so I can pick a new threshold by tapping. And this is all based on the depth data.
So let me bring it back here to the front. And you can clearly see there's a pretty sharp boundary between what's considered background and foreground. There's actually a narrow region in between where we're doing a little bit of blending between the two. And I have control over the size of that blend region.
I can adjust that by pinching. And there you can see it looks like a pretty nice white fog effect. We've implemented this effect by making use of a blend mask. So let me show you what our blend mask looks like here. The black regions of the blend mask correspond to the background image. Solid white is foreground. And then everything in between is where we blend between the two. So, this is what that blend mask looks like as I pinch. We're adjusting the size and slope of that blend.
Okay, back to the original image. As many of you know, there are so many built in interesting Core Image filters we could choose to apply. I'm going to show you a couple of others. This one is a hexagonal pixelate filter. And a motion blur. Let's say I'm happy with this. And I'll save it back to my photo library. Okay, now let's talk about how we did this.
As I mentioned, we accomplished this effect by building a blend mask. And so I'll talk to you now about how we build that blend mask. The basic idea is that we're going to map our normalized disparity values into values between 0 and 1 for the mask. And so we want some high region of disparity values to map to 1 in our blend mask corresponding to the foreground. Some region of low disparity values to map to 0. And that will be the background region. And then all disparity values in between will blend with the linear ramp.
The first part of building this blend mask is to pick the threshold between background and foreground. So when the user taps on the image what we do is to sample the normalized disparity map at that same location and set that as our threshold between background and foreground. Now I'll show you the code for what that looks like.
This is all accomplished using built-in Core Image filters. The first one I'd like to show you here is the CIAreaMinMaxRed filter. This filter, when you render it into a single pixel, will return the minimum and maximum values of the image within the region that you specify. Here we're passing in the small rect that the user tapped on. The other thing to note about this line is that before we apply the filter we're clamping the disparity image, to ensure that if the user taps near the boundary of the image we won't sample any clear pixels outside the boundary.
On this line we're simply allocating a 4-byte buffer large enough to store a single pixel. And we render into that pixel on this line. Note that we're passing in nil as our color space. And this tells Core Image that we don't want it to do any color management for us. Finally we read the maximum disparity value out of the pixel's green channel and then remap it to the range of 0 and 1 by dividing by 255.
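Put together, that sampling step might look roughly like this (`normalizedDisparity` is the prepared CIImage and `tapRect` a small rect around the tap, in image coordinates):

```swift
import CoreImage

let minMax = normalizedDisparity
    .clampedToExtent()   // avoid sampling clear pixels near the image boundary
    .applyingFilter("CIAreaMinMaxRed", parameters: [
        kCIInputExtentKey: CIVector(cgRect: tapRect)
    ])

var pixel = [UInt8](repeating: 0, count: 4)
CIContext().render(minMax,
                   toBitmap: &pixel,
                   rowBytes: 4,
                   bounds: CGRect(x: 0, y: 0, width: 1, height: 1),
                   format: CIFormat.RGBA8,
                   colorSpace: nil)   // nil: no color management

// The maximum disparity within the tapped region lands in the green channel.
let threshold = Float(pixel[1]) / 255.0
```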
The other input the user has control over is the size and slope of the blend region. So as the user is pinching on the view, we're adjusting the size and slope accordingly. This mapping is the result of applying a CIColorMatrix filter, which I'll show you in a second.
But then we also apply a CIColorClamp to ensure that the values remain within the range of 0 to 1. So here's the code. First we apply our CIColorMatrix filter. Its inputs are essentially the slope and the bias that were selected by the user by tapping and pinching. And then on a single line we apply the CIColorClamp filter.
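A sketch of those two filters, assuming `slope` and `bias` were derived from the user's tap and pinch:

```swift
import CoreImage

let ramp = normalizedDisparity.applyingFilter("CIColorMatrix", parameters: [
    "inputRVector": CIVector(x: CGFloat(slope), y: 0, z: 0, w: 0),
    "inputBiasVector": CIVector(x: CGFloat(bias), y: 0, z: 0, w: 0)
])

// Keep the mask values within 0...1.
let mask = ramp.applyingFilter("CIColorClamp", parameters: [
    "inputMinComponents": CIVector(x: 0, y: 0, z: 0, w: 0),
    "inputMaxComponents": CIVector(x: 1, y: 1, z: 1, w: 1)
])
```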
Now that we've built the blend mask, the rest is straightforward. What you see on the left is the original image. And on the right you see that image with the background effect applied. When we apply the blend mask to the original image, the background region disappears. And when we blend that together with the background image, we get our final effect. One more slide of code.
Here's where we apply our background filter. This could be any filter you choose: there are many built into Core Image, or you could write your own. And then we apply the CIBlendWithMask filter, passing in both the background image and the mask. And that's it. That's how we accomplish this effect using a suite of built-in Core Image filters.
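As a sketch, with an arbitrary background filter standing in for whichever effect you choose:

```swift
import CoreImage

// Any background effect will do; a simple desaturating filter is assumed here.
let background = originalImage.applyingFilter("CIPhotoEffectMono")

// Where the mask is white we keep the original; where it's black we see the background.
let output = originalImage.applyingFilter("CIBlendWithMask", parameters: [
    kCIInputBackgroundImageKey: background,
    kCIInputMaskImageKey: mask
])
```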
And next I'd like to jump in and show you another demo. In the previous one we used the depth data indirectly, right? We used it to build a blend mask. For this one we're going to use disparity in more of a direct fashion.
And I'm going to pull up this other editing extension here to show it to you. Okay. Here we are in photos again. Let's pick this next extension. There we go. Original image. No edits applied just yet. I have a slider at the bottom, though. And I'll start to move that now. And you can see that the background fades to black, leaving us just this prominent foreground figure. That's a really nice effect, wouldn't you say? Let's save that back to the library and show you how this one's done.
What we're doing is mapping our normalized disparity values into a scale value, which we then apply directly to our pixels. And for this particular effect we're mapping our disparity values through an exponential function. When we start off, we raise our normalized disparity values to the power of 0, and that maps all of our scale factors to 1, producing no effect on the output image.
When we raise the power to 1 as we move the slider over, this is effectively the same thing as scaling our pixel intensities by the inverse of the depth. Because we're scaling by disparity directly. The effect becomes more interesting as we start to raise it to higher and higher powers. As you can see, the shape of the curve becomes such that there's a sharper distinction between background and foreground with the background quickly going to black.
So I'm going to show you the code for this effect in just one slide. We've implemented this effect as a custom CIColorKernel. And there are a couple of notable advantages to using a custom CIColorKernel. One of which is performance. If you are able to express your effect in terms of a custom CIColorKernel, Core Image is able to optimize that by concatenating your kernel into its render graph, thereby skipping any potentially costly intermediate images along the way.
The other nice thing about this is that Core Image allows us to pass in multiple input images. Core Image will automatically sample those for us, passing those sample values to us as parameters to our kernel function, which you see here. The first parameter is a sample from the original image.
The second is a sample from the normalized disparity map. And then third is the power selected by the user moving the slider. The first thing we do with a normalized disparity is to raise it to a power, as I mentioned. That gives us our scale factor. Then we take our scale factor and apply it to the intensity of the pixel while preserving the original alpha value.
This last line is a line of Swift code illustrating how we can apply our custom kernel to our original image once it's been constructed from the source code you see above. We pass in our image extent as well as a list of arguments. Here these are the original image, the normalized disparity map, and the power selected by the user. Note that these arguments correspond one to one with the parameters defined in our kernel signature.
Okay. So that's it. I've just shown you how to use a custom CIColorKernel to produce this really nice darkening background effect. And hopefully it gives you some ideas of other things you can do with custom CIColorKernels combined with depth to produce some nice effects. So now I'm going to invite my colleague Alex up onto the stage to show you something brand new in Core Image. Alex.
[ Applause ]
Thank you, Stephen. Good morning everyone. My name is Alexandre Naaman, and today I'm really excited to be here to talk to you about a new Core Image filter we have. So as you know, in iOS 10, using the iPhone 7 Plus, you could capture images with depth using the Camera app in Portrait mode.
Now, with iOS 11 and macOS High Sierra, we're enhancing those capabilities by allowing you to use those exact same algorithms via a new Core Image filter called CIDepthBlurEffect. So, now let's try and switch over to a demo and see how that works in real life. Yay! All right. So here we have an asset, a photo that was taken with depth. And we're just viewing the image without having applied any filters to it.
If I tap on this once, we can see what the disparity data looks like. I'm going to tap on it once more and we're back to the main image. And if I tap once more, we're going to see what happens when we use those two images together, along with the new CIDepthBlurEffect filter, to create a new rendered result. We should see the background get blurry.
Yay. So now, in addition to just applying the filter as is, there are many tunable parameters that we can set. And inside of this application I've set things up such that it will respond to a few gestures. So, if I now, for example, pinch, we can dynamically change the aperture and get a new simulated look. And so we can simulate any lens opening that we would like quite simply.
Another gesture I've set up in this application is such that when we tap at a different location it's going to change the focus rectangle. And so we can see right now the aperture is quite wide open. And only the lady in the front is in focus. But if I tap on the lady on the left, all of a sudden now she's in focus. The background is a little less blurry.
And the gentleman on the right is still a little blurry. And I can tap on him now and change the focus rect once again. And now they're all three in focus and the background still remains blurry. Now that we're done with our demo, let's go and look at how this happens --
[ Applause ]
-- in terms of code. Okay. So as I was mentioning, at its base CIDepthBlurEffect really just has two inputs: the input image and the input disparity image. And with those two, Core Image is going to extract a lot of metadata for you and apply the effect in order to render a new image.
Internally, however, there are many parameters that you can set, as I was mentioning earlier. We already know that you can set the input image and the input disparity image. And in the case of the application that we were looking at earlier, when we were tapping, we were setting the input focus rect.
And then as we were pinching, we were setting the aperture. So now that we have an idea of how we want to do this from a conceptual standpoint, let's take a look at how this will be done in terms of code. And this is effectively my only slide that has any code on it, so you can see how simple it is to use.
As we saw earlier, you can load a CI Image via URL, quite simply. This gives us the main image. And then in order to get the disparity image, all you need to do is use that same URL and ask for the auxiliary disparity information via the Options dictionary as Etienne mentioned.
Once we have our two images, we can create a filter. And we do that by name, CIDepthBlurEffect. And then we specify our two images. Once that's done, we can then ask for the output image via .outputImage. And we have a new CIImage that we can then render in any number of ways. And it's important to remember that a CIImage is really just a recipe for how to render, so this is actually a quite lightweight object.
In the case of the application that we were looking at earlier, all we had to do in order to render a new effect with a new look was to change two values. So in this case, we changed the input aperture. And we do that by calling setValue(_:forKey:) on the filter and specifying a float value between 1 and 22 to create a new simulated aperture.
And we specified a new rectangle where we wanted to focus via the inputFocusRect key, which corresponds to a lower-left-based rectangle in normalized coordinates. Once those two things are done, we can ask that filter for a new output image and then render as we wish.
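Putting those pieces together, a minimal sketch looks like this (`url` points at a Portrait-mode photo; the aperture and focus rect values are placeholders):

```swift
import CoreImage

let image = CIImage(contentsOf: url)!
let disparity = CIImage(contentsOf: url, options: [kCIImageAuxiliaryDisparity: true])!

let filter = CIFilter(name: "CIDepthBlurEffect")!
filter.setValue(image, forKey: kCIInputImageKey)
filter.setValue(disparity, forKey: "inputDisparityImage")

// Tunable parameters: a simulated aperture between 1 and 22, and a focus
// rectangle in normalized, lower-left-origin coordinates.
filter.setValue(8.0, forKey: "inputAperture")
filter.setValue(CIVector(cgRect: CGRect(x: 0.1, y: 0.4, width: 0.3, height: 0.3)),
                forKey: "inputFocusRect")

let rendered = filter.outputImage
```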
Now as I was mentioning, Core Image does a lot of things for you automatically by examining the metadata. There are, however, a few things you can do in order to further enhance the render that we don't do for you automatically. And those relate to finding facial landmarks. And you can use the new vision framework that we have in order to do this.
So, via the Vision framework you can get the left eye positions, right eye positions, nose positions, and chin positions. And you can specify up to four faces to be used inside of CIDepthBlurEffect. In the case of this image, because there are three faces, we would actually specify six floating-point values in a CIVector and set that for each landmark that we found. They come in pairs, so it would be x y, x y, x y.
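A hedged sketch of feeding eye positions from Vision into the filter. The key name "inputLeftEyePositions" follows the naming pattern described here but should be verified against the filter's inputKeys; `image`, `imageSize`, and `filter` are assumed from earlier.

```swift
import Vision
import CoreImage

let request = VNDetectFaceLandmarksRequest { request, _ in
    guard let faces = request.results as? [VNFaceObservation] else { return }

    var leftEyes: [CGFloat] = []
    for face in faces.prefix(4) {
        guard let points = face.landmarks?.leftEye?.pointsInImage(imageSize: imageSize),
              !points.isEmpty else { continue }
        // Average the landmark points into a single eye position in image coordinates.
        let sum = points.reduce(CGPoint.zero) { CGPoint(x: $0.x + $1.x, y: $0.y + $1.y) }
        leftEyes += [sum.x / CGFloat(points.count), sum.y / CGFloat(points.count)]
    }

    if !leftEyes.isEmpty {
        // Flat x, y, x, y, ... values, as described above.
        filter.setValue(CIVector(values: leftEyes, count: leftEyes.count),
                        forKey: "inputLeftEyePositions")
    }
}
try? VNImageRequestHandler(ciImage: image, options: [:]).perform([request])
```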
The next thing I'd like to talk to you a little bit about is dealing with rendering outputs of different sizes. Although the inputs are quite large, 12 megapixels, chances are you won't often be rendering the entire image, and you may want to downsample the output. Your initial reflex may be to just downsample the result of CIDepthBlurEffect. But that's actually not very efficient, because CIDepthBlurEffect is quite computationally expensive. It makes more sense to, instead, downsample the input.
And if you do this, we can then take advantage of the fact that the input image is smaller and perform some optimizations. In order to do this, however, you do have to set one more parameter, which is called the input scale factor. So in this case, if we wanted to downsample the image by 2, we would set the inputScaleFactor to 0.5. Doing so ensures that we sample appropriately from the image and also take into account other effects such as the noise in the image.
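A sketch of that, continuing with the `filter` and `image` from above:

```swift
import CoreImage

// Render at half size: scale the input down and tell the filter via inputScaleFactor.
let scale: CGFloat = 0.5
let smallImage = image.transformed(by: CGAffineTransform(scaleX: scale, y: scale))

filter.setValue(smallImage, forKey: kCIInputImageKey)
filter.setValue(scale, forKey: "inputScaleFactor")
```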
There are a few additional things that I'd like to mention about using CIDepthBlurEffect which are important to keep in mind. First and foremost, when you create the CIContext where you'll be using these filters, you're going to want to make sure that you're using half-float intermediates. And you can do this by specifying the kCIContextWorkingFormat to be RGBAh.
On macOS this is the default, but on iOS the default is 8-bit. If you don't do this, you will see the data get clipped, because disparity data comes in extended range; without specifying this, it will get clipped and the result won't look very good. So it's really important to remember to do this when you use this filter.
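A one-line sketch of that context setup:

```swift
import CoreImage

// Half-float (RGBAh) intermediates keep extended-range disparity values from
// being clipped; this is the default on macOS but not on iOS.
let context = CIContext(options: [kCIContextWorkingFormat: CIFormat.RGBAh.rawValue])
```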
Also, as I mentioned earlier, CIDepthBlurEffect will automatically set many properties for you on the filter. In order to do so, it will examine the metadata from the main image as well as the data that exists inside the auxiliary image. And so it's important to try to preserve that throughout your pipeline.
Core Image will do its best in order to ensure that this happens. But it's something you're going to want to keep in mind as you're using this filter. And Etienne's going to talk to you a little bit later about how to ensure that you do this when you save images.
All right. Well, the last thing I'd like to talk to you about today has to do with some internals of CIDepthBlurEffect. It's been mentioned many times already today that the main image and the disparity image are of very different resolutions. Internally, Core Image is going to upsample the disparity image up to a certain point in order to achieve the final result.
And this is an area where we feel like if you have additional processing time you could perhaps do something a little different. Maybe some of the methods that Etienne spoke of earlier. And that concludes pretty much everything I wanted to tell you about using the CI Depth Blur Effect. And I hope you all go and start adding it to your apps. And on that, I'm going to hand it back over to Etienne. Thank you very much.
[ Applause ]
Thank you, Alex. All right. So, so far we've seen interesting, cool new effects that you can do using depth data. But the depth data was really used as a mask to apply different effects to different parts of the image. And so now we want to show you something different. Something that actually uses depth as a third dimension. And this will give you a taste of what kind of new creative effects you could apply using this data. And to tell you all about it, I'm going to invite Stephen back on stage. Stephen.
[ Applause ]
Thank you, Etienne. It's good to be back with you. What I'm going to show you now is a true 3-D effect. And this particular effect that we're going to show you is called dolly zoom. Many of you are probably already familiar with what dolly zoom is, especially if you've ever seen a scary movie. But to get everybody up to speed a little bit, I'm going to show you a little animation of what's going on with dolly zoom. So what you're looking at here is a scene consisting of three spheres.
While the camera is dollying back and forth, it's also doing so in such a way that the field of view is simultaneously constrained, so that the gray sphere in the center, on the focal plane, remains at roughly the same size throughout the effect in the output image, which you see on the right. Everything else in the scene will move around due to perspective effects. So let's take a look. Let's switch over here to the device.
Perfect. All right. Let me pull up the dolly zoom editing extension. And I'll draw your attention now to the group of flowers in the center of the image. Those are on the focal plane. So as I begin to move the camera, there you see the dolly zoom effect in its full glory.
When I pull the camera in this direction in particular, you can really see the true 3-D nature of this effect, with the foreground flowers really popping out and the background sort of fading or pulling away from the camera. You do also see a couple of artifacts, of course. One of which is the black pixels that you see coming into view around the background. This is due to the fact that in its current configuration, the virtual camera's field of view is wider than that of the iPhone that captured the image.
And so the virtual camera sees more of the scene than the iPhone did at the time of capture. So we're just filling those pixels in with black. Similarly, the stretching you see in between the foreground flowers and the green leaves behind them is due to the fact that this camera, the virtual camera, has exposed some portions of the scene that weren't visible to the iPhone at the time of capture. One strategy you might take to work around some of these issues is to set a new focal plane.
So, now I've tapped on the yellow flower in the foreground, which is in the bottom right corner of the image to set that as the focal plane. And as I move the camera now in this direction you can see that none of the black pixels are coming into view.
Of course if I move the camera again in this direction, they show up again. And really, that 3-D effect is quite strong here. Correspondingly, I can tap on a background region of the image, such as the trees you see in the upper left corner. And when I pull the camera now in this direction, it really produces a pleasing sort of prominent effect on that central group of flowers.
So let's take a look now at how we implemented this. Because of the true 3-D nature of this problem, we turned to Metal as a true 3-D tool to produce the effect. We were able to get our system up and running quite quickly in Metal because of all the work that it does for us.
Basically all we had to do to start off with was to construct a triangular mesh that we mapped onto the image, much like what you saw in Craig's depth explorer demo at the beginning of the session. And we centered the image around the origin.
Metal also gives us the opportunity to program a couple of stages of its pipeline, one of which is the vertex shader. The vertex shader gives us the opportunity to process the geometry of the scene in some way. And we can also program the fragment shader, which gives us the opportunity to produce a color for each pixel in the output. We were able to reintegrate all of this 3-D Metal rendering back into our Core Image pipeline by using a CIImageProcessorKernel.
So here's the code for the vertex shader. Again, the job of the vertex shader is to process the geometry. And it does so one vertex at a time. So we get one vertex of the original mesh as input. And then we'll produce something new on the output. The first thing we do in the vertex shader is sample the depth at that vertex. That's this line you see here. We're storing it in a variable called z, which will get used in a couple of places in this shader.
The first of which is this magical line right here. This line is the line that every young engineer grows up dreaming that they'll write some day. Because this is where we do the math. There are three variables as input to this equation. One is the depth, which we just sampled above. And the others correspond to the user inputs of the focal plane and the camera's configuration.
This produces a scale factor, which we can apply to our vertices, which we do on this line right here. And we can apply a scale factor to it because the vertices are centered around the origin. So this scale factor serves to move vertices either radially away from the center or toward the center of the image. And this is what produces the illusion of three dimensions.
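The transcript doesn't capture the exact formula shown on the slide, but as an illustrative stand-in, here is one standard way to compute a dolly-zoom scale factor for a vertex: move the camera back by `dolly` while adjusting the focal length so points at `zFocus` keep their projected size.

```swift
// Illustrative only; not the session's shader code.
// z      : the depth sampled for this vertex
// zFocus : the depth of the chosen focal plane
// dolly  : how far the virtual camera has moved back
func dollyZoomScale(z: Float, zFocus: Float, dolly: Float) -> Float {
    // Ratio of the new projected size to the original projected size for depth z,
    // with the focal length compensated so the focal plane stays constant.
    return (z * (zFocus + dolly)) / (zFocus * (z + dolly))
}
```

In a vertex shader you would multiply the vertex's origin-centered (x, y) position by a factor like this and keep z for the depth buffer, which is what the next lines of the shader describe.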
Once we have transformed our vertex position, we output it here in the new output vertex while preserving the original depth value, z. And this is important because it will get passed into the z buffer machinery of Metal, which will then just do the right thing for us as pixels move around in the output and start to overlap each other. Also, we output the texture coordinate of the original vertex. This will get used by the fragment shader, which I'll show you now.
Remember, the fragment shader's job is to produce a color pixel output. And since Metal interpolates all of these texture coordinates for us, all we have to do in our fragment shader is to sample the original image at the interpolated texture coordinate. And that's it. That's really all the code you need to see to implement the dolly zoom effect. And hopefully it's given you some ideas of new directions you can take this in to produce your own brand new custom 3-D effects. We're really excited to see what you come up with. And now, I'll hand the stage back over to Etienne to finish up.
[ Applause ]
Thank you, Stephen. All right. So now that we've applied various new effects to our images, there's one more step that we need to take, and that's saving the depth data. So, you should always preserve the depth data. All right? That way your users will be able to use other apps like yours to apply their own depth effects on top of yours. Even if you don't use the depth data, you should always preserve it if it was present in your original image.
This will really ensure the best possible experience for your users. However, when you store the depth data, be sure it matches the geometry of the image data. If you don't apply the geometry correctly, the depth data will no longer match the image data, and further depth effects applied on top of that will no longer work properly. So let's take a look at the kinds of geometry transforms you might apply to depth data.
A very common operation is orientation. Oftentimes you get to work with a portrait image that was actually shot in landscape and has an EXIF orientation. So, the depth data may look like this, and you want to make sure to orient the depth as well. So make sure to apply orientation. Another very common operation is crop. Right? And so again, make sure you crop the depth data to match.
Now, you may have a more advanced transform that you also apply to your image, such as an affine transform like this one. Right? Or maybe you have a custom warp such as a perspective transform. Or maybe even a 3-D transform like the one we saw in the dolly zoom demo. In any case, you want to also apply the same transform to the depth data so that it matches the image perfectly.
Okay. So the key thing to remember here is to apply your transform at the native resolution of the depth data. So you want to scale your transform parameters from the full-size image down to the size of the depth image. Otherwise, your transform will be applied incorrectly, and the depth image will no longer match the output image.
Another thing to note is that depth data is not color data. So when you are rendering a new depth image, make sure you don't apply any kind of color matching to it. All right. So now that we've seen what kinds of transforms we may apply to depth data, we can render it into a new CVPixelBuffer.
Once we have a new CVPixelBuffer, we can create a new AVDepthData object from it. Here's how. We start from our original depth data object, and then we call replacingDepthDataMap(with:), passing in our newly rendered depth buffer. That returns a new AVDepthData object, which we can then save into our output image. Let's take a look at how to write depth data using ImageIO.
We start from an image destination that we create for our output file, and here we ask for the JPEG format. Please note that not all image formats support depth, but JPEG does. Next we add our output image to the image destination. And then we ask the AVDepthData object that we want to store in the image for a dictionary representation of the auxiliary data to store in the file.
This will return the dictionary for the auxiliary data, as well as, by reference, the type of the auxiliary data to store. Then we ask the CGImageDestination to add that auxiliary data, passing in the type and the dictionary. And finally, all we have to do is call CGImageDestinationFinalize to write everything down to disk.
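A sketch of the whole write path, wrapped in a function; `outputCGImage`, `imageProperties`, `originalDepthData`, and `newDepthPixelBuffer` are assumed to come from your own editing pipeline:

```swift
import AVFoundation
import ImageIO
import MobileCoreServices

func writeImageWithDepth(outputCGImage: CGImage,
                         imageProperties: [CFString: Any],
                         originalDepthData: AVDepthData,
                         newDepthPixelBuffer: CVPixelBuffer,
                         to outputURL: URL) throws {
    // Swap the transformed depth buffer into a new AVDepthData object.
    let newDepthData = try originalDepthData.replacingDepthDataMap(with: newDepthPixelBuffer)

    let destination = CGImageDestinationCreateWithURL(outputURL as CFURL, kUTTypeJPEG, 1, nil)!
    CGImageDestinationAddImage(destination, outputCGImage, imageProperties as CFDictionary)

    // Ask the depth data for its auxiliary-data dictionary and type, then add both.
    var auxDataType: NSString?
    if let auxData = newDepthData.dictionaryRepresentation(forAuxiliaryDataType: &auxDataType),
       let auxType = auxDataType {
        CGImageDestinationAddAuxiliaryDataInfo(destination, auxType as CFString, auxData as CFDictionary)
    }

    if !CGImageDestinationFinalize(destination) {
        // Handle the failure as appropriate for your app.
    }
}
```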
If you're working with Core Image, there's a new, very convenient way you can do this as well. If you are using CIContext to write the JPEG representation of a particular CIImage directly, in order to render it and save it to a JPEG file, you may now pass in, using an option key, the depth data object that you want stored as part of that file. Even better, if you have a depth image to which you have, say, applied a transform, you can also specify it as an option to that method, so that Core Image will render both the regular image and the depth image and save everything down to the file in one call. Very convenient.
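A sketch of that one-call path (`finalImage`, `newDepthData`, and `context` carried over from the earlier sketches):

```swift
import CoreImage

do {
    try context.writeJPEGRepresentation(
        of: finalImage,
        to: outputURL,
        colorSpace: CGColorSpace(name: CGColorSpace.displayP3)!,
        options: [kCIImageRepresentationAVDepthData: newDepthData])
    // A transformed disparity CIImage can also be passed, via
    // kCIImageRepresentationDisparityImage, so Core Image renders and saves both.
} catch {
    print("Failed to write JPEG with depth: \(error)")
}
```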
And so that's it. That concludes our session on editing images with depth. So, let's recap what we've learned today. We've learned what depth is and what depth and disparity look like.
We've learned how to read and prepare depth data for editing. And then we showed you several ways to apply new depth effects to your images. The first one was background effects using built-in Core Image filters. Then we had a custom darkening effect using a custom Core Image kernel. And then we showed you how you can apply your own depth blur effect using a new CIFilter.
And then we saw how you can create a brand new 3-D effect using depth. I hope that this session will inspire you to use depth data in your own applications, and I can't wait to see what cool effects you'll come up with. For more information, please go to the developer website at apple.com.
We have a couple of related sessions. There's a session on Advances in Core Image that's going to take place later today, so we strongly encourage you to go check it out. And there were a couple of sessions yesterday on PhotoKit and on how to capture depth with iPhone. And with that, I hope you have a good rest of the WWDC. Thank you very much.