Graphics and Imaging • 1:08:37
QuickTime includes rich APIs that make it easy to create and work with audio and video media. Gain insight into video export components, clean aperture mode, multi-channel audio processing, high-resolution audio export, the audio context mixing path, movie audio extraction, audio context inserts, closed-captioning, and more. This is a must-attend session for developers using QuickTime APIs to create high-quality content for the iPod and Apple TV.
Speakers: Kazuhisa Ohta, Sayli Benadikar, Brian Pietsch
Unlisted on Apple Developer site
Transcript
This transcript has potential transcription errors. We are working on an improved version.
Good afternoon. Welcome to our session: Creating High Quality Content With QuickTime APIs. My name is Kaz Ohta. There will be Sayli Benadikar and Brian Pietsch joining me later in this session to cover some of the topics. So what we're going to talk about at this session is the kinds of things you can do using QuickTime to help create content.
Basically, when you create content it usually involves some sort of workflow, much as shown in this diagram. You might have seen a nicer version of a similar diagram in the previous session. Basically you start by capturing material -- video content -- and process it, or possibly edit it. And then encode the content for delivery to your final destination, including things like iPod or Apple TV.
QuickTime can help in each of those stages. Today I would like to focus on a few areas. One is audio processing. We covered pretty much all the video stuff in the previous session, so I would like to spend some time in this session talking about the audio aspect of QuickTime.
What you can do with the QuickTime API, like extracting uncompressed audio or applying audio effects. Then we're going to talk about closed captioning, which is a new feature we added in QuickTime 7.1.6. We're going to talk about how you add closed captions to QuickTime movies and how you can play closed captions along with a QuickTime movie.
Then finally, we're going to talk about exporting QuickTime movies. We're going to cover some basic ways of exporting a QuickTime movie into various file formats, and we'll also cover ways to export to specific formats, including device formats like iPod and Apple TV. Now I would like to bring up Sayli Benadikar. She is going to cover the audio aspect of the session. Sayli?
Thanks, Kaz.
Hi everyone. I am Sayli. And in this first part we're going to talk about some of the QuickTime audio APIs that you can use during the processing step of your content creation workflow. So when we think about dealing with audio in QuickTime land, here are some of the APIs that you might think of using. On the left-hand side you have more of the presentation or playback set of APIs: things that you can use to control the rate, pitch, volume, et cetera during playback, or metering APIs to get at characteristics such as volume or frequency levels.
The data that is modified by these APIs isn't persistent; it doesn't save across exports. On the other side you have more of the authoring APIs, like the SCAudio compression API set, which you can use to configure your export compression settings. And then there's movie audio extraction, which is your way of getting at uncompressed movie audio data.
And sitting in the middle you have the audio context APIs, useful for both presentation and authoring. An audio context is this abstract notion of an audio rendering environment. Every movie has its own audio context, and there's this concept of playing to a destination context.
So if you're doing regular playback your destination is an audio device context. If you're extracting, your destination is an extraction context. So today we are going to focus mainly on these two API sets: the movie audio extraction set, and in addition, the audio context APIs dealing with inserts.
So let's dive right into movie audio extraction. What exactly is this? This is your way of accessing the uncompressed audio data of a movie. And you can choose to have this data such that all the data from all the tracks is mixed together, or you can choose not to. I'd like to make a distinction here between extraction and export. Export is when you're converting, usually, from one kind of compressed format to another. Extraction is when you're converting a compressed format to uncompressed PCM audio data.
This is our recommended API. It was introduced in QuickTime 7, prior to which developers had to resort to all sorts of trickery and tedious combinations of things to get this end result. But since its introduction we'd like you to use this instead of any of the other alternatives. And I will show you in the next slide how easy it is to work with the movie audio extraction APIs.
Another cool thing about this is that it's thread safe. So you can have little elf worker threads in the background doing your extraction for you. There's a caveat here, of course: the decoder that you're using for the extraction needs to be thread safe. And finally, if you are extracting using these APIs, the extraction is highly configurable.
By this I mean that you can specify what the output data is going to look like at the end of the extraction. You can specify the target format. By this I mean the sample rate, as well as what specific kind of uncompressed data you want, which is either float or integer, 16- or 32-bit, big-endian or little-endian, et cetera. And also the channel layout. You can use the start time property to specify a particular starting point if you don't want to start the extraction at sample zero.
And the All Channels Discrete property allows you to completely disable mixing. By default, like channels from the different tracks are mixed together. But if you set this property then you get all the channels separated, in order. So that can be a useful thing. Let's look at some code at a pseudo-code level.
Just to show you how easy it is to use these APIs. You can think of this as a four-step process. Step number one is begin extraction. Step number two is where you configure your particular extraction instance. And the way you configure it is with set-property calls. We have properties for the various things I mentioned on the previous slide. So say you want to set a particular target format.
You would use the AudioStreamBasicDescription property. You set that property. The way you do this is usually you get the property, modify the fields you want to change, and set it back. So that's how you change the AudioStreamBasicDescription. If you want to specify a particular channel layout, there's a property for that. You can use the current time property to set a specific start time that's not time zero.
And the All Channels Discrete property to disable mixing. So that's step number two. At the end of this you have a configured extraction session that you're ready to use. So what do you do? You pull the data with calls to MovieAudioExtractionFillBuffer. This happens in a loop.
So every time you get a buffer of data you write it to a file, or whatever you want to do with it. And you can go on until there's no more data to extract. Or say you want to extract just a specific duration; then you keep track of how many samples you have extracted and get out of the loop once you're done.
And step number four, you just end the extraction session. So you can see that QuickTime here has handled a lot of the intricacies related to setting up the audio converter, setting up mixers, et cetera. All that is abstracted away from you and you just get this really clean, high-level, convenient API to use. So we strongly encourage you to use it if your goal is to get the decoded audio data of a movie.
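In code, the four steps reduce to something like this minimal sketch, using the public extraction constants from Movies.h; the interleaved-stereo target format and the omitted error handling are illustrative choices, not from the session slides:

    #include <QuickTime/QuickTime.h>
    #include <stdlib.h>

    static OSStatus ExtractAllAudio(Movie movie)
    {
        MovieAudioExtractionRef session = NULL;
        OSStatus err;

        /* Step 1: begin an extraction session for this movie. */
        err = MovieAudioExtractionBegin(movie, 0, &session);
        if (err) return err;

        /* Step 2: configure. Get the default ASBD, modify the fields
           we care about, and set it back (interleaved stereo floats). */
        AudioStreamBasicDescription asbd;
        MovieAudioExtractionGetProperty(session,
            kQTPropertyClass_MovieAudioExtraction_Audio,
            kQTMovieAudioExtractionAudioPropertyID_AudioStreamBasicDescription,
            sizeof(asbd), &asbd, NULL);
        asbd.mChannelsPerFrame = 2;
        asbd.mFormatFlags      = kAudioFormatFlagsNativeFloatPacked;
        asbd.mBitsPerChannel   = 32;
        asbd.mFramesPerPacket  = 1;
        asbd.mBytesPerFrame    = asbd.mChannelsPerFrame * sizeof(Float32);
        asbd.mBytesPerPacket   = asbd.mBytesPerFrame;
        MovieAudioExtractionSetProperty(session,
            kQTPropertyClass_MovieAudioExtraction_Audio,
            kQTMovieAudioExtractionAudioPropertyID_AudioStreamBasicDescription,
            sizeof(asbd), &asbd);

        /* Step 3: pull decoded audio until QuickTime reports completion. */
        enum { kFramesPerPull = 4096 };
        float *samples = malloc(kFramesPerPull * asbd.mBytesPerFrame);
        UInt32 flags = 0;
        while (!err && !(flags & kQTMovieAudioExtractionComplete)) {
            UInt32 numFrames = kFramesPerPull;
            AudioBufferList bufList;
            bufList.mNumberBuffers = 1;
            bufList.mBuffers[0].mNumberChannels = asbd.mChannelsPerFrame;
            bufList.mBuffers[0].mDataByteSize   = kFramesPerPull * asbd.mBytesPerFrame;
            bufList.mBuffers[0].mData           = samples;
            err = MovieAudioExtractionFillBuffer(session, &numFrames,
                                                 &bufList, &flags);
            /* ... write numFrames frames from 'samples' to a file ... */
        }
        free(samples);

        /* Step 4: end the session. */
        MovieAudioExtractionEnd(session);
        return err;
    }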
The second API set that we saw in the figure is the audio context inserts. This insert API is new in Leopard. And what exactly is it? It's a way to tap into the QuickTime audio rendering chain. And those tap points can be made during playback, as well as during movie audio extraction.
Once you have access to QuickTime's data along this chain you can do all sorts of interesting things with it in your app. If this is during playback you could just watch the samples go by and do things like data visualization. Or, either during playback or extraction, you can get the data, add cool effects to it, and hand it back to QuickTime for the rendering. So what's cool about this is that until now QuickTime's audio chain has been sort of a black box. You haven't been able to get into it.
And this gives you points that you can actually tap into. And you can attach inserts both at the movie level and at the track level. The movie level works on all the audio data of the movie mixed together. The track level, as the name suggests, operates on just the data of a particular track.
So I thought a diagram would help here. This is a general audio flow diagram. We have a movie with two tracks; they are mixed together, and that data is then sent on to an extraction context, because this example deals with extraction. If we were playing back instead, then that mixed data would be sent out to the device.
So, a little more detail in that same diagram. Let's say you have two kinds of tracks: one's stereo, the other has three channels. The channels are mixed together to create what is called the movie summary mix. And that's the data that then gets sent onward.
The movie level insert fits right about there. So what it gets as its input is the movie summary mix, and the client does its own processing or effect-adding and hands it back to QuickTime. That data then makes it onward to the extraction context. It's good to note here that your particular insert might have constraints, like it can only do stereo-to-stereo effects. And that's perfectly fine. During the setup of the insert you communicate to QuickTime that your insert expects stereo data, and so that movie summary mix of left, right, center will be mixed down to stereo before it's handed to your insert.
A track level insert fits further upstream. It only operates on the data for the particular track. And here, too, a submix can be performed by QuickTime if your insert has certain constraints. You can have only one insert at each of the tap points. So a movie can have one movie level insert and as many track level inserts as the number of tracks. But this shouldn't really constrain you in any way, because you can do as much processing as you want on your end. You just present it as a single insert interface.
And another caveat I'd like to mention here is that inserts do not work on protected content. I'd like to now do a quick demo where we deal with inserts during playback as well as extraction, just to show you the kinds of things that you can do.
So just quickly, I'd like to let you know what I am going to do in this demo. There are three things I'd like to show you: movie level inserts during playback, track level inserts during playback, and track level inserts during extraction. And it looks like our machine's up. And you might not see me behind this, but I am here.
So this is an app that we've written and it should be available to you as sample code as part of the stuff that's related to the session. So you can go take a look at it. I am just going to open a movie first. And actually just play a quick segment for you.
Brad's an engineer on our QuickTime team, and he moonlights as a stupid-movie star and a classical guitarist. So a couple years ago he did this stupid movie, and we asked him this year to do kind of a voiceover over the original, talking about what he's playing and his experiences doing it. Kind of like a director's cut version of that movie. So if we look at the properties for this, this is just a regular old movie with five sound tracks. I am now going to open the audio context insert panel.
And try some effects. So the first thing I am going to do is apply an effect to the movie -- a movie level insert, basically, instead of a track level insert. This section here is the insert configuration area. So I want to use a band-pass filter as the processor for this insert. And I am going to specify that this insert expects a stereo layout and that it's going to produce stereo.
Notice how the four tracks of the movie create a summary mix that's quad, but since this insert expects stereo data, QuickTime will be mixing it down to stereo before handing it to the insert. And I am going to start off bypassed, just so you know what it sounds like without the insert. And then I'll bring it in.
- four guitars was originally recorded by the Los Angeles Guitar Quartet on their Labyrinth CD in 1998.
( Music )
- Next I'd like to apply a track level insert. I know for a fact that track number five contains Brad's dialogue track. So I am going to select that. And apply -
- let's see -
- a delay effect. And see what that sounds like. So in this case we're only affecting the data going out of that track. So you should hear that the rest of the audio is unaffected.
( Multiple voices speaking )
- Let's try a slightly funny effect.
- Of the four -
- common language -
- my kids to do it. My three kids --
- Those are his kids.
- I've got Milo there, who's now 6, Eliza who's 4 -
- that little baby Clara there is now 3 years old.
So the hardest part of doing this was keeping straight which shirts I had to wear in which chairs. And finally, all the shots of the band so that I remember --
Okay. So, Brad with a slight chipmunk voice. I would like to now extract. We just saw track level inserts during playback; so what happens if we extract with this effect in place? I will bring up the extraction panel here. And I don't want to extract the entire movie.
I just want a little segment at the end. So I am going to set this to, like -- and I can preview what I am about to extract just to make sure.
- And so we can now export this. What's going to happen is, during extraction, we're applying this insert effect to that specific track. So I'm going to hit Export, and let's just call this Extracted. And save it to the desktop. And we should see it somewhere here.
- the bass guitar and the rhythm one, and it's a lot of fun on to put together. Four shirts on a hot day. I hope you liked it.
- So those are some of the things you can do with inserts during extraction and playback. I'd like to now switch back to slides, please.
So how exactly did we do this? Say you want to apply chipmunk effects to your own tracks; how do you go about doing it? The general theory of operation is pretty simple. On your client app side you implement the signal processing logic needed for your particular effect. You then implement three callbacks, which I will get into in more detail; these callbacks are going to be called by QuickTime.
And then during the setup stage you register your insert with QuickTime. That's your time to communicate things like what you expect the input and output channel layouts to be for your insert, the addresses of your callbacks, et cetera. So for a movie level insert, this is what the registry structure looks like.
The first field is a pointer that QuickTime will send back with all your callbacks, just so you can identify your instance. The next four fields refer to what the insert expects on its input side -- the channel layout it expects -- and what it's going to produce on its output side.
And the last three are just the addresses of the three callbacks that you will be implementing. If it's a track level insert, you provide all the information that you do for the movie level insert -- so the first field is just the structure above -- and the additional thing that you need to communicate is which track you want to attach the insert to.
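For reference, the movie level registry structure is laid out roughly like this; the field order matches the description above, but treat the type and field names as approximations of the QuickTime 7.2 headers and verify them against Movies.h:

    typedef struct {
        void                                  *userData;               /* handed back in every callback */
        UInt32                                 inputChannelLayoutSize;  /* what the insert expects... */
        AudioChannelLayout                    *inputChannelLayout;
        UInt32                                 outputChannelLayoutSize; /* ...and what it produces */
        AudioChannelLayout                    *outputChannelLayout;
        AudioContextInsertProcessDataCallback  processDataCallback;
        AudioContextInsertResetCallback        resetCallback;
        AudioContextInsertFinalizeCallback     finalizeCallback;        /* optional; may be NULL */
    } QTAudioContextInsertRegistryInfo;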
So what are these callbacks that we're talking about? There are three callbacks that you need to implement. The first is reset. The reset callback gets called right at the start of processing, and also every time there's an interruption in the processing chain. This is the point where QuickTime communicates to your app the sample rate of the data it's going to hand it, and also the maximum number of frames that will be pulled per render cycle. And the client app should be able to deal with any sample rate it's given.
The signal processing that you're doing might be working with windows of samples, so it could be the case that you need a particular number of input samples before you can create meaningful data. If you have that kind of latency, then the app communicates that to QuickTime as the latency figure.
And also, things like reverbs might be creating samples beyond the end time of the source, so if you have a tail time, you can communicate that too. And like I said, the reset gets called pretty much any time there's an interruption in the rendering chain. So you need to reset your buffers; and it could be that time has moved and we are now playing backwards, in the case of inserts during playback. So you shouldn't depend on any state that you had previous to that.
The process data callback is where the crux of the processing happens. It gets called for every buffer of audio rendered: QuickTime hands the client data, you process it, add your effect, and hand it back to QuickTime. And the first process data call after a reset is when the latency gets pulled.
So if in the reset callback you specified a latency, then QuickTime will pull that many samples to clear out the latency. And if this is an insert during playback, then it will be called on the high-priority I/O thread. So make sure you don't do any expensive operations like memory allocation, taking locks, et cetera.
Here's the general prototype for this process data callback. And as you can see, AudioBufferList, AudioTimeStamp -- these data structures are the same as those used in Core Audio. So the calling conventions for inserts are very similar to audio unit conventions. So if you were to write your effects using audio units, that would work beautifully with the insert API. But that is not a requirement.
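The prototype mirrors a Core Audio render callback; roughly the following (a sketch, with the callback shape reconstructed from the description above -- the trivial gain effect is just for illustration):

    #include <QuickTime/QuickTime.h>

    static OSStatus MyProcessData(void                       *inUserData,
                                  AudioUnitRenderActionFlags *ioRenderFlags,
                                  const AudioTimeStamp       *inTimeStamp,
                                  UInt32                      inNumberFrames,
                                  AudioBufferList            *ioData)
    {
        /* Apply the effect in place. This trivial example just halves
           the gain of every Float32 sample QuickTime hands us. */
        for (UInt32 b = 0; b < ioData->mNumberBuffers; b++) {
            Float32 *p = (Float32 *)ioData->mBuffers[b].mData;
            UInt32   n = ioData->mBuffers[b].mDataByteSize / sizeof(Float32);
            for (UInt32 i = 0; i < n; i++)
                p[i] *= 0.5f;
        }
        return noErr;
    }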
You can roll your own processing logic and it should still work great. And the finalize callback is just a way for QuickTime to let the insert know that it's never going to call it again, so any resources that the insert might be holding on to, it can let go of now. It's optional, because depending on what the logic of your application is, you might have complete control over the events that cause an audio context to go away or an extraction session to end.
In which case you don't really need to be told that; you know that it's okay to release your resources. But in a more complicated scenario this could be a useful callback. So, right at the beginning we talked about extraction, then we talked about inserts. So how do we put the two together?
Extraction with inserts is just like regular vanilla extraction, with just a slight change. In step one you still begin your extraction session. Step three looks just the same, where you're calling fill buffer until you're done. Step four is when you end the session; that's exactly the same.
What's really changed is the configuration. You configure your session exactly how you want it, but then you do some extra work: you set properties to register your insert. So if it's a movie level insert that you want to attach during extraction, then you would set a property called register movie insert, and you hand in that registry structure that we went over earlier.
If it's a track level insert, you do the same, except you set a different property and you pass in the track level registry structure. And that's pretty much it. With just this combination of the two you can do quite a few interesting things.
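In code, step two gains one extra set-property call, along these lines; the property ID spelling here is an assumption based on the talk, so check the QuickTime 7.2 headers for the shipping constant:

    /* Register a movie level insert on an open extraction session.
       The property ID name below is assumed, not verified. */
    QTAudioContextInsertRegistryInfo regInfo = { 0 };
    /* ... fill in layouts and callback addresses as shown earlier ... */
    MovieAudioExtractionSetProperty(session,
        kQTPropertyClass_MovieAudioExtraction_Audio,
        kQTMovieAudioExtractionAudioPropertyID_RegisterMovieInsert, /* assumption */
        sizeof(regInfo), &regInfo);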
So this is the end of the audio part of this talk. Just a quick summary: we went over two API sets. The movie audio extraction APIs, which help you get at uncompressed movie audio data. And inserts, with the help of which you can add effects during playback and extraction. And cool ways of putting the two together. This might be useful to you during the stage when you're manipulating your data, processing it, et cetera, before you do your actual export to create the content. I would like to now invite Brian up to talk about closed captioning.
( Applause )
Thank you, Sally. My name is Brian Pietsch. I am an engineer on the QuickTime team. And I'd like to talk to you this afternoon about closed captions in QuickTime and how you can take advantage of that in your applications for both creating the content, as well as playing it back in your applications.
So, closed captions in QuickTime: sticking with the theme that you may have heard in some of the sessions at the beginning of the week, we're really taking advantage of open standards. We didn't want to reinvent the wheel with closed captions. Closed captions have been around for a long time, there are several standards out there that cover them quite well, and we thought we would take advantage of those.
So initially, for this first support that we've added in the just recently released 7.1.6, we've added support for what is known as the CEA-608 standard, which is essentially the standard on which the FCC based their regulations. So it's what you would see coming across a television set in North America.
These are commonly known as analog Line 21 captions. You've probably heard them referred to in several different ways. But essentially, what that is, is, you know, when you turn on a television set and there's captioning available, you see the standard black bars with the white text. So we really wanted to start with that in QuickTime, as that's the most common format used today.
The support in QuickTime for that is limited to a single field and a single channel. If you're familiar with closed caption technology, you know that captions are available in up to four channels of information, and you have options on your TV set to choose which one you want.
The first one, which is the most common one, is typically the direct captioning of what you've got. And then oftentimes there's a second one, which might be a foreign language translation, or a simplified translation for maybe a younger audience or something of that nature. Channels three and four are extremely rarely used, so I won't even go into them.
The support in QuickTime is done via a new media handler and a new track type. We could have gone and used the existing text track in QuickTime, but that would have meant, you know, going with a nonstandard type of thing. And we really wanted to take advantage of the standard, as I mentioned before.
The nice thing too about this is that with a new track type and media handler, we can make captions visual-context capable, which text tracks are not, as many of you are probably well aware. That means that you can display captions in a visual-context-enabled application and get all the performance that comes along with doing that.
Currently, captions are not searchable. If you're used to text, you might know that text is searchable in QuickTime; captions are not. And really, this is first-pass support for us. So the main focus here is playback, and also the ability to create the movie. We don't yet have a capture solution.
I just wanted to point that out. So I want to show you something here that kind of illustrates the format of the data before we get into how to actually get this into a movie in your own application. One of the most common file formats that you might see this represented in, in the digital world, once you've brought it off the Line 21 analog signal, is something called a Scenarist Closed Caption file.
And basically what that is, is a text-based file. And you'll see by looking at it that you have a time code followed by a series of hexadecimal bytes: a 2-byte pair followed by another 2-byte pair, et cetera. The Line 21 signal carries 2 bytes per field in the NTSC signal, so each one of those pairs of bytes represents one field. And the time code is the time at which that particular row of data starts displaying. That will be useful when we look at how we create the content in a moment.
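For reference, a Scenarist Closed Caption file starts with a fixed header line, and each caption row is a time code followed by hex byte pairs; the time codes and pairs below are made up for illustration:

    Scenarist_SCC V1.0

    00:00:02:00    9420 9420 94f2 94f2 c845 cc4c cf80 942f 942f
    00:00:05:15    942c 942c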
So I wanted to show you some pseudo code here, basically, that will show you how you can add this content to your own movies. And the great thing to note here is that there are actually no new APIs, no new data structures. You can do all of this with existing APIs.
There are some new constants which we don't yet have published in the API headers, but those will come soon. And you can note the FourCCs here. So the first thing you're going to do is create a track and media, which you would do in the normal content creation process.
And the FourCC for that media type is 'clcp', for closed caption. And you'll notice that I am using a time scale of 30,000. 30,000 is a good NTSC-based time scale: you can represent one frame in an NTSC world as 1,001 units in a 30,000 time scale. Not required, but we do highly recommend that you stick with that.
Setting up the sample description is extremely easy. Like I said, there are no new data structures. You can just use a standard vanilla sample description. And the only field of note there is the data format, which in this case is 'c608', a representation of the CEA-608 standard.
Then the only thing left, really, is the format of the actual media sample in the track. And if you look here, I've defined a structure -- I've defined this just for illustration purposes; there's no actual data structure in the headers that you need to follow. But this will help to illustrate the layout of the data. And what it is, basically, is just an atom. If you're familiar with an atom, it's a size field followed by a type field, and then an amount of data following that.
So you'll see here as we've set it up that we're allocating a buffer that is big enough to hold those two header fields, the size and the type, as well as the amount of data that you're putting in there. You're setting the size to be the size of the buffer, because it's important to note that the size actually includes the header itself. And you're setting the type for that particular atom to 'cdat', which is caption data.
And one thing I would like to point out here, too, is that the endianness is important. The QuickTime file format lives in a big-endian world. So you'll always want to make sure, even if you're on an Intel machine, that you're writing out your data in big-endian format.
And then finally you're just going to add those bytes, like you saw in the Scenarist closed caption file illustration earlier, to that buffer. And finally, you're going to call AddMediaSample2, and the one important thing to note there is the duration that you're adding for that sample. I've calculated it here as the data size divided by 2, since there are 2 bytes per NTSC field, times 1,001 -- 1,001 being the duration of an NTSC frame in a 30,000 time scale base.
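Condensed into code, those steps look roughly like this. This is a sketch built on the FourCCs from the talk ('clcp', 'c608', 'cdat'); track sizing and most error handling are left out:

    #include <QuickTime/QuickTime.h>
    #include <string.h>
    #include <stdlib.h>

    enum { kNTSCTimeScale = 30000, kNTSCFrameDuration = 1001 };

    static OSErr AddCaptionSample(Movie movie, const UInt8 *bytePairs,
                                  ByteCount dataSize)
    {
        /* Create the caption track and its 'clcp' media. */
        Track track = NewMovieTrack(movie, 0, 0, kNoVolume);
        Media media = NewTrackMedia(track, 'clcp', kNTSCTimeScale, NULL, 0);

        /* A plain vanilla sample description; only dataFormat matters. */
        SampleDescriptionHandle desc =
            (SampleDescriptionHandle)NewHandleClear(sizeof(SampleDescription));
        (**desc).descSize   = sizeof(SampleDescription);
        (**desc).dataFormat = 'c608';

        /* The sample is a 'cdat' atom: big-endian size (which includes
           the 8-byte header) and type, followed by the 2-byte pairs. */
        ByteCount bufferSize = 2 * sizeof(UInt32) + dataSize;
        UInt8 *buffer = malloc(bufferSize);
        ((UInt32 *)buffer)[0] = EndianU32_NtoB(bufferSize);
        ((UInt32 *)buffer)[1] = EndianU32_NtoB('cdat');
        memcpy(buffer + 8, bytePairs, dataSize);

        /* Two bytes per NTSC field: duration = (size / 2) * 1001. */
        TimeValue64 duration = (dataSize / 2) * kNTSCFrameDuration;

        BeginMediaEdits(media);
        OSErr err = AddMediaSample2(media, buffer, bufferSize, duration, 0,
                                    desc, 1, 0, NULL);
        EndMediaEdits(media);
        if (err == noErr)
            err = InsertMediaIntoTrack(track, 0, 0,
                                       GetMediaDuration(media), fixed1);
        free(buffer);
        DisposeHandle((Handle)desc);
        return err;
    }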
Some tips and tricks. I probably mentioned a few of these as I was going, but they're worth mentioning again. The first one is to use that NTSC Time Scale. It gives you the most accuracy when you're stepping through the samples. The sample data must be in multiples of 2 bytes, since we're following the standard of NTSC and line 21 analog. You're going to have 2 bytes per field. So you want to make sure you keep that consistency.
As far as selecting the size of your QuickTime media sample, there's no specific limitation, but what we're recommending is that you do one caption -- so, for example, one sentence or one speaker's line -- per sample. That's a nice trade-off. If we were to do each 2-byte field as an individual sample, you'd end up with a lot of samples;
versus doing everything in one giant sample, where you'd have a very small number of media samples, but a lot of data in each. So it's a nice trade-off. It gives you a little more flexibility in terms of scrubbing through the media -- QuickTime is able to handle that a little bit better -- and you're able to seek around and pick up where you left off.
And note on playback as well that the media handler will actually interpret the samples as it's playing through. So if you have a media sample that has, say, 2 seconds' worth of data, as QuickTime plays it through -- if it's the type of caption that, say, paints on as you're watching -- QuickTime will interpret those frames automatically for you. So it will look exactly as you'd expect it to look on a television set.
And then you can avoid edits when you're adding your media samples by simply extending the duration of the sample. You'll notice in my previous slide that when I calculated the duration of the sample, I took the data size, divided by 2, and multiplied by the duration of the frame. Well, that's the minimum duration of that sample. But you can certainly make the duration longer, and QuickTime will automatically handle that extra duration by padding it with null data.
So that was how to create the content; as you can see, it was very simple. How, then, would you display it back in your own application if you were presented with a movie that had this information? Basically, you're going to walk the tracks in the movie looking for tracks of type 'clcp' -- the same FourCC we saw before when we were creating it. And when you do find one of those tracks, if you want to show the closed captioning on it, what you're going to do is set a track property.
The QTSetTrackProperty API has been around a little while now. All you're going to do is set the closed caption property class and the display property, which is represented by 'disp', a simple Boolean value that turns it on and off. It's worth pointing out that that's not quite the same as enabling and disabling the track, as you may be used to. That mechanism is more of an authoring-type mechanism, whereas this is more of a user preference to turn captions on or off at runtime. And not everybody, for example, has access to QuickTime Pro with the functionality to turn the track on and off.
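A sketch of that walk, using the raw FourCCs from the talk in place of the then-unpublished constants:

    #include <QuickTime/QuickTime.h>

    /* Turn caption display on or off for every 'clcp' track. */
    static void ShowClosedCaptions(Movie movie, Boolean show)
    {
        long  index = 1;
        Track track;
        while ((track = GetMovieIndTrackType(movie, index++, 'clcp',
                                             movieTrackMediaType)) != NULL) {
            QTSetTrackProperty(track,
                               'clcp',   /* closed caption property class */
                               'disp',   /* display property */
                               sizeof(show), &show);
        }
    }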
So I want to give you a quick demo that shows the process of bringing in the content from the Scenarist closed caption format that I showed you earlier, and then playing it back in QuickTime Player. Should be on this machine right here. So included with the content of this session is sample code for a closed caption import component. I'm not going to go into that code, but basically it shows all the things that I demonstrated for you earlier today.
Oops. Pardon me while I fumble around a little bit here. So as you know, QuickTime import components can be added to your library folder. So I've already done that ahead of time just to show you that it is there. And basically what that movie import component is going to do is allow me to bring in a Scenarist closed caption file. So if I just open this real quick to show you what's in it -- maybe. Maybe not.
Okay, well. Basically it's a text file that looks exactly like what I showed you earlier. And what I can do then is drag this down into QuickTime, and it will open up using that movie import component. And you will see, a couple seconds into it, we've got caption data exactly like you would see on a television set. So what I can do here is take this -- I have selected all of it. I am going to copy it.
And what I am going to do is open up a movie -- I've created these captions to go along with this movie. So what I will do then is add them to the movie. I'm going to save it. And because of the little thing with glare, I have to save it and then reopen it. But what I'll do now is play this back for you. For your enjoyment.
( Music )
- All too often Lazy Larries try to speed through the development process with a disturbing disregard for careful planning. But this industrious engineer has discovered the fun and gratifying results that come with strict adherence to standards.
- What's a standard?
- Travel to exotic locations.
Engage in civil discourse. Enjoy enriching presentations guaranteed to thrill, excite, and inspire. Regular delivery of exciting reading material. With easy to remember standards names too. Learn about the standard process. So never underestimate the importance of standards. After all, it was standards that brought us cheap, reliable electricity. Oh crap. I wasn't work like this. Somebody call the -- bleep -- governor. Get out of my way.
( Music )
( Applause )
So I thought that one would be appropriate, given its adherence to standards. It came from a few years back. So that's pretty much it for my portion of the talk. I will be in the lab downstairs after this session if you have any specific questions.
I know I kind of went through this pretty quickly, but definitely take a look at the sample code if this is something you're interested in. And I hope that you'll all be captioning your own movies pretty soon. I am going to hand it back to Kaz now so he can go over some export information with you.
( Applause )
- Thanks, Brian. It's actually good to have this kind of movie as one of our assets, so we can use it as part of a demonstration without worrying about rights and all that stuff. Anyway, export. So what I am going to talk about -
- excuse me -
- is basically an overview of export.
And how you can run export in an application with a very simple piece of code. QuickTime can export to different kinds of file formats. I am going to give you an overview of what kinds of file formats you can export into, and when you might want to use each one.
And how you can specifically use an exporter to go to one of those formats. And that includes the device formats, like iPod, Apple TV, or even iPhone. And last, I would like to briefly touch on the work we've been doing on export performance.
So what does export do? That's a very simple question. Simply put, it's essentially converting one form of content to another. And that actually involves both file-format-level conversion and media-format-level conversion. As you already know very well, content can contain both audio and video, and they can be in different formats.
And then depending on the file format you're going to, the file format itself may dictate a particular video format or audio format. So this export functionality basically handles the conversion of both the media format and the file format. Here's a simple diagram that outlines what's in the export system.
Clearly, you need video codecs and audio codecs that handle the compression and decompression of the media data. And we have associated pieces of code that manage the codecs. We also have facilities to manage the file format, like writing samples down into the file container and reading them back, those kinds of things.
And then on top of that, we have something called an export component that basically bundles that functionality together and provides you with easy access to it. And we also have what we call the high-level API, which is a very simple, easy-to-use API. So you can run an export with a very small amount of code.
Something I would like to point out about this architecture is that it's designed as a plug-in component architecture. Many of the pieces are implemented as Component Manager components. For example, exporters are implemented as components: for each supported file format there's an exporter component. So, for example, we have an MPEG-4 file format exporter component.
Similarly, video codecs and audio codecs are implemented as components. So for each supported video or audio format, there is a codec component. This way you can mix and match export components with the video and audio codec components. And that basically gives us support for a variety of file formats and video and audio formats.
So let's look quickly at what's happening in the export process. This is also a simple diagram that shows the basic flow of data during export. Of course, both audio and video data can be in the source material, so we have to handle both audio and video compression. Let's focus on the audio side first.
The source data in the source movie needs to be decompressed, and that is actually done through the mechanism Sayli was explaining: movie audio extraction. And if there are multiple audio tracks in the source movie, they are decompressed and mixed down. That results in a single track of uncompressed audio data.
This uncompressed audio data is then fed into the audio compression machinery, which uses the audio codec of the type you want to use. And finally, the compressed audio data that comes out of the codec is laid down into the final container, which is oftentimes a file. And that is done by the Movie Toolbox.
A similar process happens on the video side as well. The video data in the source movie needs to be decompressed, and then if there are multiple video tracks they need to be composed to create a single picture -- or more accurately, a single stream of pictures. This is done by the video decompressor and the appropriate compositing machinery we have in QuickTime. And that produces a stream of pixel images, or pictures.
This series is now fed into the video compression machinery -- once again there's a mechanism to drive the video codec -- and the video codec produces a series of compressed video data. This compressed data is laid down into the final container by the Movie Toolbox or, depending on the file format, the appropriate file format handler associated with the exporter component, which is responsible for managing the interleaving of the video data and audio data appropriately. Once again, this is dependent on the file format.
So how do you run this export process in an application? That's very simple. What you need to start with is the source content, prepared as a QuickTime movie. In order to get this you can either run a capture, or you can read an existing QuickTime file, or you can import source material in other formats. QuickTime has a number of importers you can use to bring in media content in formats other than QuickTime movies; you can use one of those to ingest your source material into QuickTime.
Once you have a source QuickTime movie, then all you need to do is call this high-level API called ConvertMovieToDataRef. This is not a new thing; it's been around for quite some time. If you have used QuickTime, many of you have already seen this API, or used it.
But I would like to just do a recap of how simple it is. This is the code you would have to use. Basically, this is a single function call. You pass in the source movie, which is either imported or read into memory.
And the last line shows the flags parameter you can pass in. This basically tells the API to bring up a piece of UI -- what you see in QuickTime Player when you do an export in QuickTime Player. So basically you can provide a very similar user experience in your application.
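As a sketch, the call looks something like this, assuming you have already built a data reference for the output file (for example with QTNewDataReferenceFromFullPathCFString); kQTFileTypeMovie and 'TVOD' are the usual movie file type and creator codes:

    #include <QuickTime/QuickTime.h>

    static OSErr ExportWithDialog(Movie sourceMovie,
                                  Handle dataRef, OSType dataRefType)
    {
        return ConvertMovieToDataRef(
            sourceMovie,
            NULL,                   /* NULL track: export the whole movie */
            dataRef, dataRefType,   /* destination */
            kQTFileTypeMovie,       /* default type; the user can change it */
            'TVOD',
            createMovieFileDeleteCurFile
                | showUserSettingsDialog   /* brings up the export dialog */
                | movieToFileOnlyExport,
            NULL);                  /* no preselected exporter component */
    }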
So what this high-level API does is the following. It presents the configuration UI, as I said. The dialog looks like this: it's a Navigation Services-style dialog, and it has some custom controls on it. And basically you let the user of your application choose the type of export, or the target file format. As I said, there's one exporter component for each file format, so you basically set up the export type by choosing the target file format.
This also lets the user specify the location of the output. An exporter component usually creates a file as a result of the export, and you can specify the location where you want the export component to put the final output. And then there's also a control to bring up the exporter's settings dialog, to manage compression-specific options.
And one nice thing about this high-level API is that it remembers what the user of your application chose through this dialog. So when you invoke this function the next time around, it remembers what your user chose in the last call of this function. This information is saved in the QuickTime preferences. And of course, it runs the export; that's the most important part. If the user pushes the Save button, it runs the export.
So as you can see, with the use of this high-level API, it brings up the UI and lets the user of the application choose the export type. You may not need this much flexibility. For example, your application might have a particular destination format, or your application might have some specific convention about managing files, like a project folder or something along those lines.
In those cases, you want to specify that information through the API rather than letting your user pick it each time. Of course you can do that. But before going into that detail, I would like to briefly touch on the choice of file formats you can make for your application.
These are the typical formats -- as I said, the most important formats we support in QuickTime currently. That includes the QuickTime movie file format, the MPEG-4 file format, 3GPP, and the Apple device formats -- the formats that can go onto devices like iPod, iPhone, and Apple TV. And of course we can export to DV, which is the standard DV file format. Many of these file formats look similar; for example, MPEG-4 and 3GPP are both based on the ISO media file format standard.
And some of you might think that they're interchangeable. They are really not. So what I would like to say is that when you make a choice about the file format your application exports to, you have to consider it carefully, and you really have to know what your final destination is.
Okay. So then, here's how you export without bringing up the dialog. It's essentially similar. You see the function call here. What you have to do before calling the ConvertMovieToDataRef API is find the exporter component itself. This is done through the standard Component Manager function, OpenDefaultComponent.
What you have to specify is the type of exporter, which is an OSType. And then once you have that component open, you can pass it to ConvertMovieToDataRef, and it's going to use this component to run the export. And then when you're done, you just close the component, as always.
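A minimal sketch of the dialog-free path, using the MPEG-4 exporter as the example target (MovieExportType is the 'spit' exporter component type):

    /* Export without UI: open a specific exporter component and hand
       it to ConvertMovieToDataRef. */
    static OSErr ExportToMPEG4(Movie sourceMovie,
                               Handle dataRef, OSType dataRefType)
    {
        ComponentInstance exporter =
            OpenDefaultComponent(MovieExportType, kQTFileTypeMP4);
        if (exporter == NULL) return couldntGetRequiredComponent;

        OSErr err = ConvertMovieToDataRef(
            sourceMovie, NULL, dataRef, dataRefType,
            kQTFileTypeMP4, 'TVOD',
            createMovieFileDeleteCurFile | movieToFileOnlyExport,
            exporter);              /* use this component; no dialog */

        CloseComponent(exporter);
        return err;
    }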
Okay. And then there are a few more things I would like to touch on. These are what we call opt-in features. As I indicated earlier, this export functionality has been around for, I would say, a long time with QuickTime. And there are some features of QuickTime that were added recently; in order to take advantage of those features, you have to indicate your intention.
I have two examples here. One is high-resolution audio in export. QuickTime has had the capability of handling high-resolution audio since QuickTime 7. That includes multichannel audio and sample rates greater than 64 kilohertz. With this opt-in you can use those materials as a source of export, or you can export into formats that involve multiple channels or high sample rates. And it also brings you some level of thread safety.
This is not complete thread safety, unfortunately. As Sayli indicated, the codecs involved in the process have to be thread safe in order for the whole export process to be thread safe. So this is sort of conditional thread safety. But basically, what I'm trying to say here is that with this high-resolution audio opt-in, you can potentially make your export process thread-safe.
The other thing is the aperture mode. Those of you who were in the previous session now know what aperture mode means. It basically controls how your content will look when you open it in QuickTime Player or in applications that play content through QuickTime.
It matters when your source has a clean aperture or your source is encoded in a format that uses non-square pixels. What this opt-in does is get the same effect as you would get by setting the aperture mode in your playback application: it makes sure the result of the export looks the same as what you see during playback.
So here's what you do. I'm going to talk about the high-resolution audio opt-in first. This uses QTSetComponentProperty, a function that has been around for some time; it basically lets you set a particular property of a component. So in this case you are talking to the export component you opened -- that's actually the code I showed on the previous code slide.
And you set this property -- I can't even pronounce this name -- kQTMovieExporterPropertyID_EnableHighResolutionAudioFeatures. This is actually a single identifier; I couldn't fit it on a single line, so I broke it into two lines, but essentially it's a single property name. And the data type is Boolean, so you indicate this is a Boolean. You do this before running the export, so it goes in the middle of the code example I showed earlier.
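Continuing the exporter example above, the opt-in is a single call; the property class and ID here are the ones from the QuickTime 7 headers, set on the open exporter before ConvertMovieToDataRef:

    /* Opt in to high-resolution audio on the exporter component. */
    Boolean enableHighResAudio = true;
    QTSetComponentProperty(exporter,
        kQTPropertyClass_MovieExporter,
        kQTMovieExporterPropertyID_EnableHighResolutionAudioFeatures,
        sizeof(enableHighResAudio), &enableHighResAudio);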
Okay. And then here's how you manipulate the aperture mode. It's a little more complicated, but the idea is similar. This time, instead of the exporter component, you want to manipulate an attribute of the source movie. So now you use QTGetMovieProperty and QTSetMovieProperty. What the code is trying to do here is save the current aperture mode of the source movie, and then set the aperture mode you want. Okay, I have a nice highlight here.
That's the aperture mode you're using. In this example I set clean aperture. So let's say you want to use the clean aperture mode of the source movie. What you do is save the current aperture mode of the source movie, then set the clean aperture mode on the source movie, and then run the export. And of course you want to reset the aperture mode of the movie back to the way it was before this code ran.
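A sketch of that save/set/restore sequence; the property class and constant names here are my reading of the QuickTime 7 headers, so verify them against Movies.h:

    /* Save the movie's current aperture mode... */
    OSType savedMode = 0;
    QTGetMovieProperty(sourceMovie, kQTPropertyClass_Visual,
                       kQTVisualPropertyID_ApertureMode,
                       sizeof(savedMode), &savedMode, NULL);

    /* ...switch to clean aperture for the export... */
    OSType cleanMode = kQTApertureMode_CleanAperture;
    QTSetMovieProperty(sourceMovie, kQTPropertyClass_Visual,
                       kQTVisualPropertyID_ApertureMode,
                       sizeof(cleanMode), &cleanMode);

    /* ...run the export here, then restore the saved mode. */
    QTSetMovieProperty(sourceMovie, kQTPropertyClass_Visual,
                       kQTVisualPropertyID_ApertureMode,
                       sizeof(savedMode), &savedMode);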
Okay. Export to iPod, iPhone, and Apple TV. Many of you are interested, I am sure, in exporting the content your application handles into one of those formats. This is actually very simple. But there are a few things I would like you to know about this export. This is a new breed of exporters, and they behave slightly differently than the traditional exporters. They are basically not configurable: they don't have any settings UI. They are supposed to do the right thing based on the source movie and on the destination device.
And that overlaps with what I just said, but they are optimized for the device's capabilities. Those devices play H.264 video, so these exporters use the appropriate profile depending on the device -- unfortunately, the profiles are somewhat different. And those devices can play back AAC audio, so these exporters compress audio into AAC.
And the other nice thing about this approach is that as the capabilities of the devices evolve, we can update the exporters to take advantage of the new capabilities of the device. So you don't have to worry about adjusting your settings, updating your application, and so on and so forth.
One example of knowing the source movie and adjusting the settings appropriately is preserving the aspect ratio of the source. This has never been done in other exporters before. These exporters look at the dimensions of the source movie and then try to fit the image into the maximum dimensions the device is capable of dealing with. That way you get the maximum size without, like, squishing or distorting the content.
And the exporter adjusts the data rate depending on the dimensions of the picture it determines. So you don't have to worry about tweaking the data rate, or tweaking the size, trying to preserve the aspect ratio and all that stuff. And the exporters also maintain metadata supported by iTunes.
You might have already seen some of the metadata you can see in iTunes that comes with the files iTunes plays. If your content has that metadata in the source, these exporters will preserve it and transfer it over to the destination with appropriate transcoding.
Okay. So here's the code. It's once again very simple, and it's essentially the same as the example I showed before. This slide incorporates some of the points I made earlier about the opt-in features, like high-resolution audio and the clean aperture. The difference is that you just have to find the right component for your device. In this example, this is for Apple TV, and the subtype of the component happens to be 'M4VH'.
Currently, those component subtypes are not defined in the public headers. We will make sure that they are defined in future headers. In the meantime, a tech note or developer sample code will contain the list of component subtypes that you need to use depending on the type of device you want to export to. But essentially, this is just a matter of finding the right component. And this is the high-res property, and this is the source movie's aperture mode. And you're just passing the source movie into the component you've configured, and then export. So it's that simple.
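Putting the pieces together for Apple TV, a sketch might look like the following. The 'M4VH' subtype is my reading of the slide, and reusing it as the file type is an assumption; check the relevant tech note for the official list:

    static OSErr ExportForAppleTV(Movie sourceMovie,
                                  Handle dataRef, OSType dataRefType)
    {
        /* 'M4VH' per the slide; the public constant came later. */
        ComponentInstance exporter =
            OpenDefaultComponent(MovieExportType, 'M4VH');
        if (exporter == NULL) return couldntGetRequiredComponent;

        /* Opt in to high-resolution audio. */
        Boolean hiRes = true;
        QTSetComponentProperty(exporter,
            kQTPropertyClass_MovieExporter,
            kQTMovieExporterPropertyID_EnableHighResolutionAudioFeatures,
            sizeof(hiRes), &hiRes);

        /* Set the source movie's aperture mode to clean aperture
           (save/restore omitted here; see the earlier sketch). */
        OSType mode = kQTApertureMode_CleanAperture;
        QTSetMovieProperty(sourceMovie, kQTPropertyClass_Visual,
                           kQTVisualPropertyID_ApertureMode,
                           sizeof(mode), &mode);

        /* Run the export with the configured component. */
        OSErr err = ConvertMovieToDataRef(
            sourceMovie, NULL, dataRef, dataRefType,
            'M4VH', 'TVOD',                     /* file type: assumption */
            createMovieFileDeleteCurFile | movieToFileOnlyExport,
            exporter);

        CloseComponent(exporter);
        return err;
    }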
Okay. I would like to briefly touch on export performance. There's some work we've been doing on export performance. We've been hearing a lot of feedback about H.264 compression in particular: you can get a high-quality result, but oftentimes it takes a long time. We're taking steps to address the situation.
Like trying to optimize the CPU usage, using asynchronous I/O, and trying to take advantage of multicore systems, so that you can have scalable performance depending on the configuration you have. These improvements are currently limited to iPod, Apple TV, and iPhone export -- in particular, compression with H.264 video and AAC audio.
And this is actually available with the Leopard seed of QuickTime. Currently, it's only accessible in QuickTime Player; we're working to make sure that it will be accessible to the rest of your applications. Here's an example of the kind of performance improvement you can get through this approach. The blue bars result from QuickTime 7.1.6, and the orange, yellowish bars are from QuickTime 7.2.
This is a few builds before the Leopard seed, so the data may not exactly match the Leopard seed, or what we have by the time we ship Leopard. But this is just to give you an idea of the kind of improvement we are looking at. This is using an RGB source going to Apple TV export, and there are three data points: single-core, two-core, and four-core systems. And this is the same export from a 2vuy source; it shows a similar trend.
Okay. So, a quick summary. Basically what we covered was some of the ways you can use QuickTime to help your content creation workflow. That includes audio processing that lets you extract uncompressed audio from your source and then apply audio effects. Then we covered closed captioning, both creation and playback. And finally, I explained some basic concepts of export, (Inaudible), and some particulars about going to device export.
For more information, there's documentation and code examples available through the attendee web site. And there are a few labs. In particular, there's a QuickTime audio lab running in parallel to this session -- ( pause ) -- speakers including myself will be in the lab, so if you have more questions you can come and talk to us. And there's a QuickTime video lab tomorrow; probably some of us will be there, and people from the video group will be there, so you can ask questions about video compression and such.