
WWDC08 • Session 712

Advanced Media Application Development

Media • 59:09

Dive deep into QTKit, the framework for handling rich media, to learn advanced uses of its classes, data structures, and protocols. Learn the nuts and bolts of creating movie content, tracks, timecode, threading considerations, and more. Understand when and how to drop into the procedural QuickTime API. A critical session for advanced developers who are playing, capturing, and manipulating time-based media.

Speakers: Tim Monroe, David Underwood

Unlisted on Apple Developer site

Downloads from Apple

SD Video (722.4 MB)

Transcript

This transcript was generated using Whisper and has known transcription errors. We are working on an improved version.

Okay, great. Thanks, and welcome to session 712. This is Advanced Media Application Development. And we're going to focus on some more advanced uses of QTKit. I see there's a lot of people that weren't here for the first session. So let me just remind you that at noontime, there is a talk by one of the head programmers at Pixar up in the Presidio that I think will be very entertaining.

If you haven't heard him speak before, you absolutely owe it to yourself to go. And then this afternoon, from 2 onward, a lot of the QTKit engineers will be down in the lab, so you can come by and ask questions that you want answered. My name is Tim Monroe, and I will be joined shortly by David Underwood for this session.

Actually, the first thing I want to do is go to the demo machine. Every year, maybe you know this, we have what we call a stupid movie that sort of encapsulates one of the themes of WWDC. And I have gotten the honor of showing that movie to you today. So could I have the demo machine? So here for your viewing pleasure is the 2008 stupid movie.

[Transcript missing]

Some things are scalable. And some things are not.

Some things scale naturally. And some things probably shouldn't be compared at all. Like, not even a little bit. Know what I'm sayin'? Okay, back to slides please. This talk has nothing to do with scalability, but, or maybe it does. So here I want to put on my non-reading glasses.

I want to start off by talking about QTKit and the playback portion of it. And as you know, in Snow Leopard, our charter was not to add a bunch of APIs to give you guys new capabilities, but it was to go back and focus on stability and performance of the existing APIs.

And that was a great charter because it allowed us to go back in, look at some of the code that we maybe thought wasn't as good as it could have been, rework it, and add capabilities, make things faster, make them more stable for you. So we've gone in and completely reworked two classes.

Well, actually a number, but the one that you'll be interested in is QTMovie. We've rewritten it from scratch in order to uncouple it from the QuickTime framework and allow it to sit on top of other frameworks. And of course, the one that you've heard about is this QuickTime X framework, which gives us added capabilities that we'll talk about in a little bit.

We've also gone in and done significant work in making QTMovieView and QTMovieLayer faster, more capable, and actually less buggy. So we'll talk about some of that here today. I want to start off talking about what I call QTMovie best practices. And these are essentially trying to head off places where you may run into trouble. The tech support people and the mailing lists seem to get the same sorts of questions all the time, and I want to talk about that.

I want to just touch on a few of them here today to save you some trouble. Then I'll focus on these QTMovie improvements that I just mentioned. I'll show you some of the drawing improvements, that is to say the improvements in the movie layer and the movie view. And then David will come up and do roughly the same thing with what we have changed in the capture portion of QTKit. So let's talk about QTMovie Best Practices.

There are three of these that I want to talk about: one having to do with movie loading, one having to do with accessing APIs in the QuickTime framework directly, and then I want to talk about how you can move QTKit processing onto a background thread. So if you were at the introductory session this morning, you saw me use the API initWithFile:error: in order to open a file on disk and display it in a QTMovieView. What I want to suggest to you is that if you're going to be doing more advanced work with QTKit, you want to move to a different API, namely initWithAttributes:error:.

The idea is, here is the declaration for it: you will pass in a dictionary of attributes that you want the movie to have when it is opened. Now, of course, one thing you have to do is specify where the movie is. So you will pass in either QTMovieFileNameAttribute or QTMovieURLAttribute, and that will help QTMovie find the data that it needs to use. I've listed a few others that are likely candidates for inclusion in the dictionary that you pass to initWithAttributes:error:. Again, if you were at the earlier session, you saw that I made an API call to make the movie editable, which by default it is not.

You can save yourself the trouble by putting the editability attribute into this dictionary, and that will be set at the time you open the movie. You can also set a delegate on the movie, and that's useful for cases where a delegate method needs to be invoked early on in the open; putting the delegate in the dictionary ensures it is set at the time you open the movie.

And finally, pretty much any other attribute that can be set on the movie is fair game for inclusion in the dictionary that you pass to initWithAttributes:error:. And here I've listed just a couple. You might, for instance, want to set the looping state of a movie so that it loops continuously during playback. Or you may want to set the volume of the movie in order to get some non-default volume. So here is a simple example of the use of initWithAttributes:error:.

As you can see, I'm passing in a URL associated with the QTMovieURLAttribute. And I'm also setting the movie to be editable by passing in the QTMovieEditableAttribute. And then I just call initWithAttributes:error:, and I get back a QTMovie.
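A rough sketch of that call, assuming movieURL is an NSURL you already have in hand:

```objc
#import <QTKit/QTKit.h>

NSDictionary *attrs = [NSDictionary dictionaryWithObjectsAndKeys:
    movieURL,                      QTMovieURLAttribute,      // where the data lives
    [NSNumber numberWithBool:YES], QTMovieEditableAttribute, // open editable
    nil];

NSError *error = nil;
QTMovie *movie = [[QTMovie alloc] initWithAttributes:attrs error:&error];
if (!movie)
    NSLog(@"Could not open movie: %@", error);
```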

[Transcript missing]

Well, this is not something you want to do in a UI app because your user will sit there thinking your app has crashed.

But it's perfectly reasonable to do in a command line app. Maybe you have a command line tool that opens movies and exports them to some other format. In that case, it's perfectly okay to sit around waiting for all the movie data to arrive, and so you can force synchronous loading in this way.
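The attribute itself falls in the untranscribed passage above, but QTKit's documented knob for this is QTMovieOpenAsyncOKAttribute; here is a hedged sketch of a synchronous open for a command-line tool:

```objc
#import <QTKit/QTKit.h>

// Setting QTMovieOpenAsyncOKAttribute to NO asks QTKit not to return
// from the init until all of the movie data is available (or an error
// occurs). Fine for a command-line exporter; a bad idea in a GUI app.
NSDictionary *attrs = [NSDictionary dictionaryWithObjectsAndKeys:
    [NSURL fileURLWithPath:@"/tmp/input.mov"], QTMovieURLAttribute,
    [NSNumber numberWithBool:NO],              QTMovieOpenAsyncOKAttribute,
    nil];

NSError *error = nil;
QTMovie *movie = [[QTMovie alloc] initWithAttributes:attrs error:&error];
```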

For a GUI-based app, the alternative method is to monitor what we call the movie load state. The movie load state is an attribute of a movie, and these are the currently defined values that can be returned when you ask for a movie load state. There are two of these, actually three, to look at.

The first one is the error state, QTMovieLoadStateError. If you get back an error state, it means that something bad has happened during the opening of the movie, and you should just toss the movie. There's nothing you can do with it. The state at the bottom is QTMovieLoadStateComplete. It means we have every last bit of that movie available. We can play it, we can export it, we can do whatever you want with that movie.

Now, you don't actually need to wait until you have all the data to do certain things with it. If all you want to do is know how big the movie is, what its dimensions are, what its duration is, how many tracks it has, then you want to look for QTMovieLoadStateLoaded. In technical terms, that means that something called the movie atom is available to us, and that's where a lot of the data about the movie is stored. So let's look at how this would work.

So I can open my movie, again with initWithAttributes:error:, and then I set some object as an observer for the QTMovieLoadStateDidChangeNotification. That will get issued whenever QTMovie determines that the load state has gone from one level to a different level. And then your observer could look something like this. If you get QTMovieLoadStateError, toss the movie. There's nothing you can do with it.
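A sketch of that observer pattern; handleMovieError: and createMovieViewForMovie:size: are hypothetical helpers standing in for your own code:

```objc
#import <QTKit/QTKit.h>

- (void)watchLoadStateOfMovie:(QTMovie *)movie
{
    [[NSNotificationCenter defaultCenter]
        addObserver:self
           selector:@selector(movieLoadStateDidChange:)
               name:QTMovieLoadStateDidChangeNotification
             object:movie];
}

- (void)movieLoadStateDidChange:(NSNotification *)notification
{
    QTMovie *movie = [notification object];
    long loadState =
        [[movie attributeForKey:QTMovieLoadStateAttribute] longValue];

    if (loadState == QTMovieLoadStateError) {
        // Something bad happened during the open; toss the movie.
        [self handleMovieError:movie];
    } else if (loadState >= QTMovieLoadStateLoaded) {
        // The movie atom is here: size, duration, and track counts are valid.
        NSSize naturalSize =
            [[movie attributeForKey:QTMovieNaturalSizeAttribute] sizeValue];
        [self createMovieViewForMovie:movie size:naturalSize];
    }
}
```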

Once you've gotten to QTMovieLoadStateLoaded, at that point you can do the things you want to do. You can ask for the movie's natural size, you can create your movie view to be just that size, and you could even assign the movie to the movie view. Okay, let's talk about direct access to the QuickTime framework.

As you probably know, QTKit sits on top of QuickTime to do all of its work. QTKit, of course, has a limited API, and there are certain things you can do with the QuickTime framework, with its thousands and thousands of function calls, that you cannot do with QTKit.

Well, we give you a safety valve, or a trapdoor, that lets you get down into the QuickTime framework. And here are methods that return to you the QuickTime identifiers, or objects. So, for instance, if you have a QTMovie, you could ask for the QuickTime Movie that that QTMovie is representing.

You could also ask for the movie controller, which is another component that helps manage the playback of the movie, and also draws that little controller bar underneath the movie. Or if you're working down at the track or the media level, again, you can get the QuickTime track or QuickTime media associated with those objects.

So here's an example of when it might be useful to dip down into the QuickTime API. There is currently no QTKit API that lets you find out where a frame starts. So you're at some time in the movie, and you'd like to back up to the beginning of that frame. There's no QTKit API that will tell you the time that that frame starts at.

Well, we could write it ourselves in this way, by dipping down into the QuickTime API, and the operative API here is called GetMovieNextInterestingTime. And you can see that the first parameter to that function call is the QuickTime Movie that is returned by the quickTimeMovie method there. So I'm not going to look into this any more, just to let you know that if there's something QTKit doesn't do, and you need it for your application, you can dip down into the QuickTime API.
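A sketch of what that might look like, 32-bit only since it uses the procedural QuickTime framework; the method name is hypothetical:

```objc
#import <QTKit/QTKit.h>
#include <QuickTime/QuickTime.h>   // 32-bit only

// Hypothetical helper: find the start time of the frame at the movie's
// current time by dropping down to the procedural QuickTime API.
- (QTTime)startTimeOfCurrentFrameInMovie:(QTMovie *)qtMovie
{
    Movie movie = [qtMovie quickTimeMovie];   // the QuickTime primitive
    QTTime current = [qtMovie currentTime];

    OSType mediaType = VideoMediaType;
    TimeValue frameStart = 0;

    // A negative rate searches backward; nextTimeEdgeOK lets the current
    // time itself count as a frame boundary.
    GetMovieNextInterestingTime(movie,
                                nextTimeMediaSample | nextTimeEdgeOK,
                                1, &mediaType,
                                (TimeValue)current.timeValue,
                                -fixed1,
                                &frameStart, NULL);

    return QTMakeTime(frameStart, current.timeScale);
}
```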

Now, there are some caveats here. If you dip down into the QuickTime API, there's all sorts of fun calls that let you dispose tracks or dispose movies. Don't call those. You will just crash your application. Okay? Because QTKit assumes that those things are still there, and if you dip down and behind its back start deleting tracks from the movie, it'll get hopelessly confused and eventually crash.

QuickTime methods that return information to you are virtually always safe to call. However, when you go in and start changing attributes at that lower level, again, there's a possibility that there could be inconsistencies between what QTKit thinks and what QuickTime thinks. Now, you may not crash at this level, but strange things may happen.

And one thing to keep in mind is that this ability to dip down into the QuickTime API is not available to 64-bit apps. The very simple reason for that is that the QuickTime framework is not available in 64-bit. We do a little bit of inter-process communication from the 64-bit QTKit to be able to access the 32-bit QuickTime capabilities.

Now, one thing this should tell you is that if you need some of the functionality in the QuickTime framework in your application, and you want to run 64-bit, you should tell us about that so that we can give you equivalent functionality in QTKit, which is guaranteed to work in 64-bits.

[Transcript missing]

Okay, so those are things you may run into, just to keep in mind. Let's talk about QTMovie improvements. As I've said, QTMovie is a pretty thin wrapper on top of the QuickTime Movie and Movie Controller concepts. So QuickTime has some features and some limitations. One of the nice features of QuickTime is that when you open a movie, you can not just play it back, but you can also edit it. So it always sets things up for editing and playback.

It is not, as I mentioned just a few minutes ago, 64-bit capable. And, as we just saw, there's some stuff you need to do to make it thread-safe. And QTKit, of course, being built on top of QuickTime, inherits these features. Since there is no QuickTime in 64-bit, as I said, we do IPC in order to get things to actually work correctly. You must initialize a QTMovie on the main thread. And then if you want to operate on it on another thread, you need to do the work that we just showed.

So here's sort of the way you might look at how QTKit was originally structured. QTKit sits on top of QuickTime, and the parts of QuickTime that it sits most heavily on are called the Movie Toolbox and Movie Controller Components. Now, the movie toolbox itself will talk to other components, and just a couple of them are media handlers.

So if you have a video track in your file, you'll get a video media handler. If you have some sound in your file, you'll have a sound media handler. And they will talk to yet other software components, here the Sound Manager and the Image Compression Manager, in order to get the sound through your speakers or the bits up on the screen.

Now, the nice thing about this architecture is that we can swap parts of it out or add in new parts fairly easily. And one thing that happened quite a while ago was that we lessened our dependence on the Sound Manager and added support for Core Audio, a much more powerful sound processing API. And we did a similar thing with the Image Compression Manager.

We made a path where you could go through Core Video to give you better performance and better fidelity for your video. We can do the same thing for the Movie Toolbox and the Movie Controller, and that's what QuickTime X is. It's a new set of components that doesn't rely on any existing QuickTime functionality and that will play your graphics and your sound through those other software components.

So it's a new media pipeline that you can opt into for playback of your media files. It's more efficient than the original pipeline, as we'll see in just a minute. And, nicely, it is 64-bit capable. We don't need to do the IPC stuff that ties our hands in certain ways. And finally, it is much, much more thread-safe than QuickTime is.

How do you opt into the new media pipeline? How do you get that QuickTime X goodness? It's very simple. In the initWithAttributes:error: call, you add in one more attribute, which is the QTMovieOpenForPlaybackAttribute. When you pass that in with the value YES, you're essentially telling QTKit: I'm only ever going to play back this movie. I'm not going to edit it, I'm not going to export it, I'm just going to play it back.

So here's how the call would look. It's just the same code we had before, but with the addition of a new key-value pair, namely the open-for-playback attribute, and in this case it's set to YES. So this tells QTKit that you're willing to let it take a new code path.
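A sketch of that call; movieURL is a placeholder as before, and the editable attribute is dropped here because a playback-only movie can't be edited:

```objc
#import <QTKit/QTKit.h>

NSDictionary *attrs = [NSDictionary dictionaryWithObjectsAndKeys:
    movieURL,                      QTMovieURLAttribute,             // placeholder NSURL
    [NSNumber numberWithBool:YES], QTMovieOpenForPlaybackAttribute, // the new pair
    nil];

NSError *error = nil;
QTMovie *movie = [[QTMovie alloc] initWithAttributes:attrs error:&error];
// Playback-only: no editing, no export, and no access to the QuickTime
// primitives for this movie.
```

So let's take a demo of that, if I could go to the demo machine.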

So here's an application that I've got pre-built. And I'm going to open a movie. And now you can see down here that I am listing the current media stack. In this case, if I start it playing, it's going to use QuickTime 7. And I'll just use this button to start it playing. And we have the movie playing back using the code path that it's always used. Now here's a fun thing. I've got a button here that says change media stack.

I can change it on the fly. And now I'm using QuickTime 10. It looks the same, it sounds the same, but what's the performance like? Well, let's go back to QuickTime 7. Can we mute the audio there? Let's launch... Oh, somebody took it away. Good old Activity Viewer, I suppose.

Activity Monitor, oh, there it is. All right, so we're gonna look at the QTKit server and QTKit player. This is a 64-bit application. And to get the appropriate CPU load, we need to add together the QTKit server and the QTKit player. And what are we looking at there? Adding them together is about 25% of the CPU, roughly. Now let's go over here and change it to QuickTime X.

All the CPU load for the QuickTime, for the QTKit server, goes to zero. And what are we left with? Five percent. Whoa. So what was it before? 25 percent? Five percent. Which path would you rather take? So by adding in one attribute to our dictionary, can I go back to slides? We were able to significantly reduce the CPU usage of our application.

So, one of the things we have added for Snow Leopard is the new media pipeline that you get access to by specifying that new attribute. Now, when you go that path, certain methods in QTMovie will not work. Anything that exports, such as writeToFile:withAttributes:, will not work. Anything that attempts to edit the file will not work, because you've told us that you don't want to do those things.

No editing, no export. And, more importantly, because we're no longer built on QuickTime, you cannot access those QuickTime primitives. You can't use those API calls in the QuickTime framework, because there isn't any QuickTime framework for that particular file. Now, one more thing to keep in mind is that not all files that are playable by QuickTime are playable by the new media pipeline.

And when I say QuickTime there, I mean QuickTime 7, of course. The ones that can be played back are really those that can be played back on our devices, such as iPods and iPhones. And if we decide that the movie cannot be played by QuickTime X, we will fall back to the existing media pipeline. Without telling you, but it'll play back.

We may need to add in an API that lets you figure out which code path the movie is taking, but we have not exposed that in the seed that you have. So let me talk a little bit about improvements in the drawing subsystem. And for that, I just want to go straight to demos. If I could have the demo machine.

I want to show you two demos. In fact, you've seen one of these already in the graphics and media State of the Union. And I'm going to launch it. House of Mirrors here. It's a very simple... but at the same time very cool demo. Scalability, a review. Again, could I mute the audio on that? So here we have a QuickTime movie, our stupid movie, played into a QTMovieLayer.

And actually it's two layers, because the one layer is on top and the other layer is the reflection there. Now, this is the exact same QTMovie instance being attached to two different movie layers. That was not possible before Snow Leopard. You could try it, but it wouldn't work. You could not get this nice reflection.

In Snow Leopard, we've improved the movie layer support so that you can do this sort of thing. And of course, as you saw in the demo, we could add lots of layers. One QTMovie, lots of layers. Okay? And I want to show you one more demo. We have not only added that capability with the movie layer, but...

[Transcript missing]

So again, we didn't add any new API for that. We just added capability to QTKit. Yeah, I know there's bifocals.

Okay, so what can we do? We can take the same QTMovie and attach it to two different movie layers. We can take the same QTMovie and attach it to two different movie views. We recommend doing the movie view thing only if you're taking the QuickTime X code path.

Otherwise, the movie controllers attached to the two views start arguing with each other, and it's not a pretty sight. So that's all I want to talk about, and now I'd like to bring David Underwood up to talk about some of these new and improved features we have in the capture part of QTKit.

Thank you, Tim. So if you were here at the previous session or if you're familiar with QTKit Capture, you know that it's primarily focused on kind of what we perceived as pretty common use cases. So for example, a lot of applications are going to want to capture real-time media and record it to a QuickTime movie, for example.

But what we also did in the API was we left some openings, some way of really controlling what you're doing with your capture sessions, and also ways of getting at the data that's being captured. So you can do custom things with it that we didn't anticipate. So for the remainder of this session, I'm going to go over a few of those things and some detailed ways that you can control and get the most out of your capture sessions. I'm also going to show you some new APIs that we've introduced to make that even better. So what we'll go over. First thing I'm going to do is show you the small handful of new capture APIs we've added in Snow Leopard.

Then I'm going to go into detail about the QTCaptureConnection class. And this class plays an interesting role in the API, because it's a class that is behind everything, and you see it in a lot of the APIs, but you rarely have to interact with it directly. But it's actually very powerful, and I'll show you a few tricks and a few things you can get out of it.

I'm also going to go a little bit over the importance of observing real-time changes in your capture session as it's running. One of the key complexities with capture is that because it's a real-time system and things can change at any time, an application ideally should anticipate those changes and deal with them. And that'll be a bit of a theme that you'll see in a lot of these examples.

So, first, for new APIs in Snow Leopard, the biggest new API that we've added is QTCaptureAudioDataOutput. And this class is a QTCaptureOutput subclass that allows your application to get directly at the raw audio samples that are going through the capture session. So, this is pretty much analogous, if you're familiar with it, to QTCaptureDecompressedVideoOutput, in that it's basically a pipe, an outlet, that lets you get at those samples and do some custom processing on them, anything that we didn't think of that your application needs to do.

Some of the key features of QTCaptureAudioDataOutput: the format of the audio that it gives you is the canonical Core Audio linear PCM format. This is the format that's compatible with almost all audio units in Core Audio, so it's designed to be specifically compatible with audio units and make it easy to do your extra processing in Core Audio, which is going to be the most likely case.

We package the audio data in a QTSampleBuffer object, which is an object we already had defined in the API. And what this object does for you is it contains both the actual data itself and also metadata about that data. For example, it tells you the format that it's in.

That's very important. And it also gives you timing information. So everything in a live capture session is time-stamped relative to a certain time base. So you can use that for synchronization and other services to figure out the timing of the data. So if you're familiar with this API and you've been to these sessions before, you've probably seen a diagram like this, where we have our QTCaptureSession in the middle, and then we have our QTCaptureInput and QTCaptureOutput objects.

And in the past, I've plugged QTCaptureDecompressedVideoOutput as the way to serve open-ended use cases. All this class does is give you CVImageBuffers of your video frames as it gets them, as quickly as possible, and then you do what you want with them. QTCaptureAudioDataOutput is basically the same thing, but for audio. The big differences are that your input will be from some kind of audio source, and your output format will be these QTSampleBuffer objects instead of CVImageBuffers.

So just to quickly go over how you would use this class, it's very similar to the other QTKit capture output classes. First, you will create your QTCaptureAudioDataOutput, just using the standard initializers, and you'll set a delegate on it. And, in keeping with the pattern in other classes, this delegate is going to get called every time a sample buffer full of audio is received. And just a note on that: a single sample buffer can contain many, many samples of audio, as is natural for audio. So it can contain, you know, maybe 512 samples of audio at a time.

And then finally, once you've created that, you'll take that output and you'll add it to your capture session, just like you would with any other output. So, simple. The delegate will implement one method. There's one method defined, which is the captureOutput:didOutputAudioSampleBuffer:fromConnection: method. And most likely what you're going to do with this sample buffer that you're given is do some processing on it with Core Audio.
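A sketch of that setup; session and audioDataOutput are assumed to be instance variables of a controller class:

```objc
#import <QTKit/QTKit.h>

- (BOOL)addAudioDataOutputWithError:(NSError **)error
{
    audioDataOutput = [[QTCaptureAudioDataOutput alloc] init];
    [audioDataOutput setDelegate:self];
    return [session addOutput:audioDataOutput error:error];
}

// Called once per buffer; one buffer may hold many samples of audio.
- (void)captureOutput:(QTCaptureOutput *)captureOutput
    didOutputAudioSampleBuffer:(QTSampleBuffer *)sampleBuffer
    fromConnection:(QTCaptureConnection *)connection
{
    // Process sampleBuffer with Core Audio here (see below).
}
```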

And so QTSampleBuffer provides a few conveniences that make that relatively easy. One is that you can get a handle on an AudioBufferList object, which is the common structure that's used with Core Audio audio units and AUGraphs. And you also need to find out how many frames of audio are in that single sample buffer.

And you do that by calling the QTSampleBuffer numberOfSamples method. And then once you have that information, you do whatever you need to do with it. So just to show you a concrete example of this, I'm going to show you a quick demo application. So if we could go to the demo machine, please.

And this application is called AudioDataOutputToAudioUnit. And if you go to the WWDC attendee page for this session, you can download this application and play with it yourself. I'm just going to close this off. And what this application does for you is it's a simple example of how to take the audio buffers that you're getting from QTKit Capture and use them with the Core Audio API, which is going to be probably the most common use of this class. And it deals with a little bit of the impedance matching that you need to do between those APIs. So I'll just show that to you pretty quickly.

And all this application is going to do is get those audio buffers from a capture session. Then it's going to use an audio unit to apply an effect to those buffers. And then it's going to use the Core Audio ExtAudioFile API to write that to disk. So we're going to record an audio capture session with an effect. So this is a very simple application. It's a single-window application, so we just have one controller. I'm just gonna open that up.

And I'll jump through this a little bit. We don't need to build it from scratch because, again, this is kind of the common theme: a lot of the things you need to do here are similar to what you've done before. So, for example, we need to find an audio device. So we use defaultInputDeviceWithMediaType: and pass an audio media type. And we have to open it and make sure it opens successfully. Then we create our capture session, create our device input, and add that to the session. Pretty simple.
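A sketch of that setup:

```objc
#import <QTKit/QTKit.h>

NSError *error = nil;

// Find and open the default audio input device.
QTCaptureDevice *device =
    [QTCaptureDevice defaultInputDeviceWithMediaType:QTMediaTypeSound];
if (![device open:&error])
    NSLog(@"Could not open audio device: %@", error);

// Create the session, wrap the device in an input, and wire them up.
QTCaptureSession *session = [[QTCaptureSession alloc] init];
QTCaptureDeviceInput *input =
    [[QTCaptureDeviceInput alloc] initWithDevice:device];
[session addInput:input error:&error];
```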

And then the new thing here is we create our audio data output and set a delegate on it, to get those callbacks whenever a new audio buffer comes in, and add that as an output to the session. And really, I don't know if you can see it here, but this is basically all of the QTKit capture code in this application. Really, in this case, we're supporting these open-ended use cases.

Really, the meat of the work is going to be done by everything else, by Core Audio and the other things your application does. And really, in this case, we're just using QTKit Capture as a source for data and nothing else. So the amount that you do with QTKit Capture is pretty small, actually.

And also in the setup area, we're going to create our effects audio unit. In this case, we'll do a delay effect. It's pretty easy to hear. And we'll go and open that up. And here's where it gets a little bit interesting. So if you're familiar with Core Audio, and if you're not, there are some sessions at this conference that you should go to, or at least review.

Core Audio, both the Audio Unit API and the AU Graph API, employ what's called a pull model for getting data through these different units. And what that means is when you have a chain of audio units, or even just one audio unit, it's told to render periodically, or it's pulled on.

And when it's told to render, the thing that's telling it to render is getting the output from that. And what that means is that whenever the audio unit needs new data, it asks for it. It issues a callback that was set on it. And this is called a render callback.

So in other words, instead of data being pushed into it, it issues this callback whenever it needs more data. But QTKit Capture kind of has a very different set of goals. The audio data output is trying to give you these audio sample buffers as quickly as it can. So it's pushing them on you, effectively.

So we have a push model and a pull model. And you need to reconcile those. And that actually turns out to be relatively simple. And the way we're going to do that is by setting this render callback on the audio unit. And this is the callback that's going to be called when the audio unit needs new data. So that's what we do here. So I'm just gonna skip down here a little bit. Here is the delegate method that gets called by QTCaptureAudioDataOutput for every sample buffer. And I'll skip over all of this for a minute. I'll come back to it.

And the main thing we try to do here is we need two pieces of information, which I showed you in the slide. We need to find out how many frames of audio are in the specific sample buffer, so we get that here. And then we also need an audio buffer list, which we get here.

What we're gonna do with that audio buffer list is we're gonna assign it to... whoops... we're going to assign it to an instance variable in our class. So we're just going to store it away. And what we'll do immediately is we'll tell our audio unit to render. And this is the call that pulls on the audio unit. It pulls on the output. And what this will cause it to do is call its render callback, which we have down here.

And all that the render callback does is it takes this sample buffer that we just stored in our instance variable right here and just fills it in. It just fills in the pointers. It doesn't even need to copy any data. And this all happens synchronously. So when Audio Unit Render is called, the render callback is called synchronously.

So we don't have to worry about the data going away. We don't have to retain it or hold on to it or anything because... Once this audio unit render call returns, we'll be sure to have copied all that data in there. So that's how you reconcile those push and pull models in a very simple way. Just you need to do a little extra storage and then move over with that render callback.
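A hedged sketch of that push/pull handoff, modeled on the pattern described; effectAudioUnit, outputBufferList, currentInputAudioBufferList, and the Controller class name are my assumptions, not necessarily what the sample project uses:

```objc
#import <QTKit/QTKit.h>
#include <AudioUnit/AudioUnit.h>

// Push side: the QTKit delegate stores the captured buffers, then
// immediately pulls on the effect unit, which calls the render
// callback below synchronously.
- (void)captureOutput:(QTCaptureOutput *)captureOutput
    didOutputAudioSampleBuffer:(QTSampleBuffer *)sampleBuffer
    fromConnection:(QTCaptureConnection *)connection
{
    UInt32 numberOfFrames = (UInt32)[sampleBuffer numberOfSamples];
    currentInputAudioBufferList =
        [sampleBuffer audioBufferListWithOptions:
            QTSampleBufferAudioBufferListOptionAssure16ByteAlignment];

    AudioUnitRenderActionFlags flags = 0;
    AudioTimeStamp timeStamp = {0};
    timeStamp.mFlags = kAudioTimeStampSampleTimeValid;

    // Pull: renders numberOfFrames of effected audio into outputBufferList.
    AudioUnitRender(effectAudioUnit, &flags, &timeStamp,
                    0, numberOfFrames, outputBufferList);

    // The captured buffers are no longer needed once this returns;
    // hand outputBufferList to ExtAudioFileWriteAsync() to write to disk.
}

// Pull side: installed earlier with kAudioUnitProperty_SetRenderCallback.
// inRefCon is assumed to be the controller object above.
static OSStatus PushCurrentInputBufferIntoAudioUnit(
    void *inRefCon, AudioUnitRenderActionFlags *ioActionFlags,
    const AudioTimeStamp *inTimeStamp, UInt32 inBusNumber,
    UInt32 inNumberFrames, AudioBufferList *ioData)
{
    Controller *self = (Controller *)inRefCon;   // hypothetical class name
    // No copy needed: point the unit's buffers at the captured data.
    for (UInt32 i = 0; i < ioData->mNumberBuffers; i++)
        ioData->mBuffers[i] = self->currentInputAudioBufferList->mBuffers[i];
    return noErr;
}
```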

So once we have the output audio from our audio unit, we use the ExtAudioFile API, which is the high-level file-writing API in Core Audio, and write that out to disk. And we do that asynchronously, because we can. And it's that simple. So that doesn't sound so bad. You notice I skipped over a ton of code here, though.

And this code is very critical, actually. So as I alluded to earlier, the format, and many other things about a capture session, can change over time. So for example, even with just the built-in line-in on the Mac, the user can go and open Audio MIDI Setup and at any time change the sample rate of the audio that it's giving you. So if you're recording from that device live, the user can change that format at any time, and your application needs to be able to deal with that.

And the way that we deal with that is, for every single QTSampleBuffer that comes in from the audio data output, we're going to get its format description. And this is an object of type QTFormatDescription. And what we're going to get out of it is some information, namely the Core Audio AudioStreamBasicDescription. And we have a copy of the old AudioStreamBasicDescription that we were using previously.

And basically we're just going to check: did something about the AudioStreamBasicDescription change? That means the format of the audio changed, so we're going to have to reconfigure all of our audio units, and we're going to have to reconfigure the ExtAudioFile. And the things that are liable to change are the number of channels in the audio, so it can go from stereo to mono, for example, and the sample rate.

So if that's the case, we need to do some teardown. So we uninitialize our audio unit, and then we set it up again. So we need to set up our stream format; we set that AudioStreamBasicDescription on it. And we also set up the ExtAudioFile down here, so we update its output format to match the number of channels and the sample rate that we're writing. It's very important that your application does this.
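A hedged sketch of that per-buffer check; previousASBD and the two reconfigure methods are assumptions standing in for the sample's real code:

```objc
#import <QTKit/QTKit.h>
#include <CoreAudio/CoreAudioTypes.h>

- (void)checkFormatOfSampleBuffer:(QTSampleBuffer *)sampleBuffer
{
    QTFormatDescription *formatDescription = [sampleBuffer formatDescription];
    NSValue *asbdValue =
        [formatDescription attributeForKey:
            QTFormatDescriptionAudioStreamBasicDescriptionAttribute];

    AudioStreamBasicDescription asbd = {0};
    [asbdValue getValue:&asbd];

    // A new channel count or sample rate means the audio format changed.
    if (asbd.mChannelsPerFrame != previousASBD.mChannelsPerFrame ||
        asbd.mSampleRate       != previousASBD.mSampleRate) {
        [self reconfigureAudioUnitsWithFormat:&asbd];   // hypothetical
        [self reconfigureExtAudioFileWithFormat:&asbd]; // hypothetical
        previousASBD = asbd;
    }
}
```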

Your application always needs to be able to anticipate these changes and reconfigure whatever processing is being done as necessary. And that pretty much wraps it up. So if we build and run this thing... you see we have a pretty minimalist user interface here, and I'm just going to do a quick recording. Hello. Okay. I think that's enough. And let's just open that up.

Hello. Hello. Okay. Okay. Okay. That's going to get out of hand. And so that's a simple application, again using QTKit Capture just as a source for data, but really doing all the hard work in a different API or something that's specific to your application. So can we go back to the slides, please? So in Snow Leopard, we've also added a handful of other new APIs. And keeping with the general goals of Snow Leopard, they're all, for the most part, performance related. So for example, on QTCaptureDecompressedVideoOutput we've added a few APIs.

First, we've added an API that enables automatic frame dropping. A common problem that a lot of developers had with the decompressed video output was that if they were doing some kind of processing on their video frames, and it was taking longer than the rate at which the frames were coming in, they often didn't do anything to drop those frames.

The memory usage of the application would just get out of hand 'cause the frames would just kind of pile up further and further on a queue, and your memory usage would grow and grow until your application crashes. So what we've told developers to do in the past is shuttle all of that work off onto another thread and handle dropping frames yourself, basically. So if you see that you're getting behind, just throw out the frames as they come in.

But we realized there are some applications that don't really need that much control over their frame-dropping behavior. They just want to salvage their performance if they're running behind. So we've added a new API that lets QTCaptureDecompressedVideoOutput do that work for you, so you don't have to worry about shuttling your work to a different thread.

In a similar vein, we've also added an API that lets you specify the rate of the video frames that you get, the setMinimumVideoFrameInterval: API. And if your application doesn't need frames that often, this is an opportunity for the entire pipeline to be optimized, so you don't use as much bus bandwidth and you don't use as much memory; it's basically a performance hint for the API. QTCaptureFileOutput has gained similar types of APIs. It also has a way of specifying the frame rate.

We've also added an API for specifying the maximum size of the video that you're writing to disk. So, as you might know, a lot of the computers that we've shipped relatively recently have these giant HD iSights on them, built in, that produce these huge frames, and they're around 1280 by 1024.

And some applications might actually want that resolution, but often they don't, and hitting the disk with frames that big repeatedly is actually too much for most hard disks. So now you can control what's happening there. QTCaptureFileOutput has also gained a pause and resume recording API, so you can pause in the middle of recording a file and then resume it.

You no longer have to do weird things like change files and then stitch them together; we do that for you. One note is that the minimum frame interval API in both classes is not currently implemented in the seed you have today, but you will see that in the future.
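A sketch of those knobs, assuming videoOutput and fileOutput were created and added to the session earlier; note again that the minimum-frame-interval calls are not live in the seed:

```objc
#import <QTKit/QTKit.h>

// Let QTKit drop late frames instead of queuing them up behind you.
[videoOutput setAutomaticallyDropsLateVideoFrames:YES];

// Ask for at most ~15 fps (minimum interval between frames, in seconds).
[videoOutput setMinimumVideoFrameInterval:1.0 / 15.0];

// Cap the size of the frames written to disk, e.g. at 640 x 480.
[fileOutput setMaximumVideoSize:NSMakeSize(640.0, 480.0)];

// Pause in the middle of a file, and pick up again later.
[fileOutput pauseRecording];
// ...
[fileOutput resumeRecording];
```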

So now I'm going to move away from new APIs a bit and talk about some of the general things that you can do with QTKit Capture that really enable you to fine-tune what's going on in your capture session and also kind of understand what's going on. And one of the classes that's central to this is QTCaptureConnection. So I'm sure by now you've seen a lot of diagrams like this. We have, again, our capture session in the middle, and we have our inputs going into the capture session and our outputs taking the data coming out of the capture session.

What you haven't seen in most of these diagrams is there are actually these streams of data coming from the inputs into the session and coming from the session into the outputs, represented by these lines and arrows here. And in QTKit Capture, these are actually represented by a class called QTCaptureConnection.

QTCaptureConnection serves three fundamental purposes. First, it allows your application to identify one of these specific streams, one of those lines or arrows in the diagram; this can either be an identifier that's given to you or something that you give to part of the API. It also allows the application to control which of these streams of media are enabled at any given time.

So if an application doesn't need to take all of the data coming from a certain input, you can disable some of it using QTCaptureConnection. Finally, because QTCaptureConnection is sitting on top of the stream of data, it can give applications lots of up-to-date information about what's going on with that stream at a given time, such as its format or other useful information about the data stream. So I'm going to go into some additional detail about how you can use QTCaptureConnection to identify a specific stream.

So if you've gone through the headers and you're familiar with the API, you've probably seen a lot of QTKit capture APIs that either give you a QTCaptureConnection or take one as a parameter. And when they're doing this, they are effectively using this connection to identify one of these specific streams. So we have a good example down here: we have a QTCaptureConnection outputting to a movie file output. And if your application does this, you can implement a delegate method called captureOutput:didOutputSampleBuffer:fromConnection:.

And this method is called every time a new sample buffer enters the movie file output, so you can use it to precisely control what's going on. And say you're implementing this, and you really want to know: I have audio and video going into this movie file output; where did this sample buffer come from? Did it come from the audio connection or the video connection? The way you do that is you look at the QTCaptureConnection parameter in that method. That's why it exists as a parameter, so you can identify the stream.
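A sketch of that check inside the delegate method:

```objc
#import <QTKit/QTKit.h>

- (void)captureOutput:(QTCaptureFileOutput *)captureOutput
    didOutputSampleBuffer:(QTSampleBuffer *)sampleBuffer
    fromConnection:(QTCaptureConnection *)connection
{
    if ([[connection mediaType] isEqualToString:QTMediaTypeVideo]) {
        // This buffer came down the video connection...
    } else if ([[connection mediaType] isEqualToString:QTMediaTypeSound]) {
        // ...and this one down the audio connection.
    }
}
```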

Another really common example, and if you were at the last session you saw a little preview of this, is if you want to set compression options on the movie file output: you need to specify which stream of media is going to get compressed, because the movie file output might be writing out multiple streams. And you use QTCaptureConnection to do that. So in this case, say we want to compress the video coming out of our capture session: we find the QTCaptureConnection object and say, use these compression options for this video stream.

So when you need to pass a QTCaptureConnection to a specific API, you need to get that QTCaptureConnection from somewhere. And the way that you do that is both QTCaptureInput and QTCaptureOutput define an API called connections. And this method just returns an NSArray of the QTCaptureConnection objects that are currently associated with that input or that output.

So we have a few examples of this here. For example, if you wanted to get all the connections coming from a device input going into the QTCaptureSession, you'd just call the connections method. And similarly, if you wanted to get all the connections coming from the session into, say, your movie file output, again, you just call the connections method. And the time to call these methods is generally after you've built your capture session up. So you have your session and you've added all of your inputs and outputs, because that's the time at which these connections will have been created and will be defined; they're created implicitly.

So once you've gotten your array of connections, generally you want to find a specific connection in that array, so you want to narrow it down further. And the way that you do that is straightforward. You just iterate through the array and use some criteria to decide, is this the connection that I'm interested in? So here's an example, again, of setting compression options on a movie file output. And this is an example of what was being done in the last session and in many recording applications when you want to record to a movie file and compress it somehow.

So in this case, we have some video compression options that we've created. And what we're going to do is we're just going to iterate through the array of QTCaptureConnection objects owned by the movie file output. And whenever the connection is of the video media type, we say: okay, it's a video connection, so we want to compress the video, so we're going to apply those compression options for that connection.
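A sketch of that loop; the preset identifier is one of QTKit's built-in QTCompressionOptions identifiers:

```objc
#import <QTKit/QTKit.h>

QTCompressionOptions *videoOptions =
    [QTCompressionOptions compressionOptionsWithIdentifier:
        @"QTCompressionOptions240SizeH264Video"];

for (QTCaptureConnection *connection in [movieFileOutput connections]) {
    // Only the video stream gets these compression options.
    if ([[connection mediaType] isEqualToString:QTMediaTypeVideo]) {
        [movieFileOutput setCompressionOptions:videoOptions
                                 forConnection:connection];
    }
}
```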

In addition to acting as an identifier, QTCaptureConnection, as a representative of this stream of media, also has a way of controlling whether or not any media is going through it, and also specifics about the media going through it. So one of the most common cases, and it's a question that's come up a lot on the mailing list, and we actually have a tech note about it now, is say you're recording from a DV or an HDV camera.

The interesting thing about these cameras is that they give you a muxed, or multiplexed, media stream, which can contain both audio and video data. So when you create a QTCaptureDeviceInput around one of these devices, it's going to expose two QTCaptureConnections: both a video connection and an audio connection.

So say you have a simple scenario like this where you just want to preview it and record it to a file, but you're not actually interested in recording the audio. Maybe it's a security camera application and there's nothing interesting to record about the audio; it just wastes disk space. You just want to use this as a video source. So what you do there is you would disable the audio connection coming out of the QTCaptureDeviceInput. That's why we have that big red X over it.

And what will happen is when you disable a connection, any connections that are downstream of that just simply won't get any data. So you've effectively prevented the audio data from making it to the movie file output right here. And so this is a way of kind of narrowing down and specifying exactly what you want.

And the way that you do this is QTCaptureConnection defines an API called setEnabled:. And so in this particular example, we're showing in code what I did on that slide before, which is we're going through all the QTCaptureConnections owned by the device input. And since we don't care about audio, if the connection happens to be of the sound media type, we just disable it with setEnabled:NO.
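A sketch of that loop over the device input's connections:

```objc
#import <QTKit/QTKit.h>

// Mute the audio side of a muxed (e.g. DV) device input so only video
// reaches the downstream outputs.
for (QTCaptureConnection *connection in [deviceInput connections]) {
    if ([[connection mediaType] isEqualToString:QTMediaTypeSound]) {
        [connection setEnabled:NO];
    }
}
```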

Finally, QTCaptureConnection is also a provider of information to your application. It will allow you to get a snapshot of what's going on with that specific stream of media at a given time. And the easiest way to show you how it does this is just through a handful of examples. So, for example, say you want to display in your user interface what the format of the captured media is: you have a device input set up and you want to show the user the format that's coming off this device.

Well, one way of doing that is you can get the QTCaptureConnection's format description from that device input, and then QTFormatDescription has a convenience method called localizedFormatSummary, which will give you a nice human-readable string. And in this example, we just stick it into a text field.

Another pretty common example is that any application that records audio most likely wants to display a little level meter, so that the user knows if they're clipping the audio, if they're speaking too loudly or too quietly; whatever they're recording, they'll have an idea of its decibel level. And QTCaptureConnection exposes this as an attribute, because it's an optional property of the connection, called the QTCaptureConnectionAudioAveragePowerLevelsAttribute.

And what this lets you do is obtain the power level, in decibels, of the audio that you're looking at. So in this case, we'll maybe have an NSLevelIndicator that shows the audio level. And because the value is in decibels, we want to convert it to some kind of linear scale, so we take it as a power of 10, and that'll give you a nice, smooth audio level meter that you can query and update.
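A sketch of both of those conveniences; the power-levels attribute is assumed here to return one NSNumber of decibels per channel:

```objc
#import <QTKit/QTKit.h>
#include <math.h>

// Show the user a human-readable summary of the connection's format.
QTFormatDescription *desc = [connection formatDescription];
[formatTextField setStringValue:[desc localizedFormatSummary]];

// Drive an NSLevelIndicator from the connection's average power level.
NSArray *powerLevels =
    [connection attributeForKey:QTCaptureConnectionAudioAveragePowerLevelsAttribute];
double decibels = [[powerLevels objectAtIndex:0] doubleValue];

// Convert decibels to a roughly linear 0..1 value for the meter.
[levelIndicator setDoubleValue:pow(10.0, decibels / 20.0)];
```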

So in light of this, I've told you that QTCaptureConnection is a way of inspecting the current state of a stream. But I've already said repeatedly that these are things that can change often over time, and your application should deal with that. So even in the very simplest case, even if you trust your user implicitly and they're not going to try to undermine everything your application is doing, they're not going to unplug and plug in the device over and over again, and they're not going to go and mess around with Audio MIDI Setup.

Even if you know that they're not going to do that kind of thing, even then, when you've initially set up your capture session and you've added your inputs and your outputs to it, no data has gone through the QTCaptureConnection yet. You haven't started the session.

So it has no idea what the format of that data is or any properties of it, such as the audio levels. And so when you start your session, a little bit later, once that data starts coming through, it'll know. And so if your application is relying on that data, it needs to know when that information came in. And then, of course, once you've done that work, well, now you're also accounting for the case where the user is trying to destroy everything and is unplugging and plugging in devices all the time and doing other such things.

There are two ways of doing this, basically, and they're both common Cocoa patterns. One way is to use notifications, and QTCaptureConnection defines a number of notifications. One of them is the QTCaptureConnectionFormatDescriptionDidChangeNotification, and that will tell you whenever the format going through the connection was altered. And one example of this that you can think of is DV cameras: many consumer DV cameras have a toggle on them that switches between widescreen and standard.

And this is something the user can fiddle around with at any time, you know, so to switch the aspect ratio of the video. And that's a format change. It's a different video format, then. In addition, if a DV camera is a tape camera, the format of what's recorded on that tape can change.

So even if the user is not, you know, fiddling with the switch, if you're recording off of tape, that format can change at any time. And so, to respond to this, the notification will be posted whenever the format of the data going through that connection is altered. As an alternative, if that's what suits your application, you can use key-value observing. And again, it's just a standard Cocoa pattern: you add an observer directly to the QTCaptureConnection for the formatDescription key path. So it's roughly equivalent.
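A sketch of both patterns; the selector name is hypothetical:

```objc
#import <QTKit/QTKit.h>

// 1) Notifications:
[[NSNotificationCenter defaultCenter]
    addObserver:self
       selector:@selector(connectionFormatDidChange:)
           name:QTCaptureConnectionFormatDescriptionDidChangeNotification
         object:connection];

// 2) Key-value observing, roughly equivalent:
[connection addObserver:self
             forKeyPath:@"formatDescription"
                options:NSKeyValueObservingOptionNew
                context:NULL];
```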

So this is useful for QTCaptureConnection, but in fact this paradigm applies to any QTKit capture object that is liable to change at any given time. And the reason for that is, again, QTKit Capture is dealing with a lot of live things, like devices. So one thing that you might not have known, because it's not called out very explicitly, is that anywhere you see an API that uses attributeForKey: in QTKit Capture, the keys that you pass to it can be used with both key-value coding and key-value observing as a way of observing those changes. And I'll show you some examples of that in a second. So a very common example of that is QTCaptureDevice.

So the FireWire iSight that we've shipped in the past, you probably can't see it too well, has this little privacy iris on it. You rotate this little thing, and it covers up the video camera. And this actually talks back to the computer through hardware and says it's closed, so that applications can display that in their user interface. They can say: you've closed the iris; that's why you're not capturing any video.

And the way we expose this in QTKit Capture is through an attribute on QTCaptureDevice called the QTCaptureDeviceSuspendedAttribute. And this is something that can change live at any time, regardless of whether the device is associated with a capture session. And this is something that you want to observe changes on.

You don't want to poll, and you don't want to do anything silly like that. So one way of doing that is, again, to use notifications. And QTCaptureDevice exposes a QTCaptureDeviceAttributeDidChangeNotification to do that. And then you can query the user info of the notification to find out exactly what changed.

As an alternative, you can use key-value observing and observe the capture device directly. And in this case, you can use the attribute key directly as the key path on that object for key-value observing. So for any attribute that you see in a QTKit capture API, again, that attribute key is key-value observing compliant.
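A sketch of both approaches for QTCaptureDevice; the selector name is hypothetical:

```objc
#import <QTKit/QTKit.h>

// 1) Notifications; the userInfo dictionary names the changed attribute.
[[NSNotificationCenter defaultCenter]
    addObserver:self
       selector:@selector(deviceAttributeDidChange:)
           name:QTCaptureDeviceAttributeDidChangeNotification
         object:device];

// 2) KVO, using the attribute key itself as the key path:
[device addObserver:self
         forKeyPath:QTCaptureDeviceSuspendedAttribute
            options:NSKeyValueObservingOptionNew
            context:NULL];
```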

So just to sum things up a little bit: even though, out of the box, QTCaptureSessions and the inputs and outputs that you add to them often give you the useful behavior that you wanted, they can be tweaked to do custom things. And in addition, we provide certain QTCapture inputs and outputs that let you do custom processing that wasn't anticipated by the API.

And those outputs in particular are the audio data output and the decompressed video output; that's where you get your raw data to do some custom processing. You can also get fine-grained control in a QTCaptureSession by using QTCaptureConnection. And you can either use that to inspect and identify media streams or use it to actually turn certain streams of media off.

So, if you have any additional questions or problems, come see us in the lab. That's going to be, I believe, at 2. Yeah, it's going to be at 2 this afternoon. So, come in with your questions. We'll help you out with anything you have. And, you know, if you just want to chat, we'll be there, too.

And again, for more information, ping Alan Schaefer. He will direct you to someone who can answer your question, and probably will answer a lot of your questions anyway. And check out all of our documentation and sample code, both at our developer page and also at the attendee pages for this session, actually. They're not up there, but you should be able to find it through the WWDC attendee site.