QuickTime • 1:04:53
This session focuses on techniques for handling video and audio in your QuickTime application. Topics include media acquisition using the Sequence Grabber for capturing or processing, and playback of media using a video device such as a DV camera, media compression, video effects, and filters.
Speakers: Tim Cherna, Kevin Marks, Sean Williams, Tom Dowdy, Jean-Michel Berthoud
Unlisted on Apple Developer site
Transcript
This transcript was generated using Whisper; it has known transcription errors. We are working on an improved version.
Good afternoon. Does this work? Good. Excellent. Hi. Welcome to session 602, QuickTime for Video-Intensive Applications. And this is an application becoming more video-intensive. This is PowerPoint running a QuickTime movie, for all of those who were wondering. It's pretty cool. New this year at WWDC. My name is Tim Cherna.
I'm manager of the QuickTime Pro Media Group. We specialize in support for high-end video applications and audio applications and do the support for the DV codec and YUV SD and HD and Final Cut and iMovie and Adobe Premiere and a bunch of other developers. So that's kind of what we focus on.
Today's session, we're actually going to talk a little bit more about video and audio in general, things that you can do in QuickTime for any application that wants to become more video intensive. And the three areas that we're going to be talking about are acquisition, video processing, and video output.
Acquisition is basically capturing video and audio, and we're going to be talking a little bit about some of the new camera support we have in QuickTime 6 and Jaguar. Video processing, we're going to be showing you some techniques for doing cool things to video that you have in your movies or with your camera that you're capturing. And video output, finally, is the way that you can take the video and output it to an external video device.
Some of you have seen the overview slide, the graphic that we made. This is sort of showing the different components of QuickTime that exist in the system. And the ones that are highlighted are the ones that we're going to be talking about. So the standard compression, standard sound, the Sequence Grabber, and then some of the components that we have, which are the eat and spit, which are the import and export components, Vout, which is to send video to a device, and the digitizers. And with that, I'd like to introduce Kevin Marks, who's going to be talking about acquisition with QuickTime. Hi there. I just want to make it clear that acquisition is not about getting your company bought by Apple.
What I'm going to talk about is the Sequence Grabber, which is QuickTime's way of getting video and audio from the outside world into QuickTime movies that you can use in other applications like you saw this morning. There's two parts to what I'm saying. I'm going to give an overview of how the Sequence Grabber works. For those of you who aren't familiar with QuickTime, this API has been around for about 10 years now, and it's served us pretty well over that time.
I'm going to give some introductory stuff about how that works, and then I'm going to give some details of some of the changes we've made for QuickTime 6 and Jaguar, which will be of more interest to specialized developers who already know something about this. So there's a mixture of the two in this. And then we'll have some nice demos and lobsters and things.
So, QuickTime provides an abstraction that lets you capture video, audio, and other formats. The Sequence Grabber is a generalized way of capturing time-based media from an external device. At the moment, we only provide video and audio capture components. In principle, you could add other ones if you wanted to, and if there are other kinds you're interested in, let us know. The idea is that the Sequence Grabber abstracts this from your application, so you don't have to know anything about the hardware you're dealing with. You just say what kind of media you want, and let the user choose what they're going to capture.
The Sequence Grabber holds a series of channels. So in order to capture something, you create a Sequence Grabber component and you add channels to it for the kinds of media you want, video and audio channels. The channels are the ones that deal with the hardware, talk to the devices, and get the data out in a usable form and turn that into a movie. Any number of channels is possible. The constraint is normally the hardware you've got attached.
And the source code for this, the example source code, remains HackTV, which has stood the test of time as it's been source code for about 10 years now. It's been updated a bit for OS X, and we intend to update it a bit further in the future. URLs for source code will show up in the final slide.
So the basic way you would use the Sequence Grabber to capture something, there's a straightforward series of calls you make. You create the Sequence Grabber, and you create channels and add those to it. And then you would call SGSettingsDialog for each channel that you're interested in. This provides a very full user interface for the user to make complex series of choices about what's available. They can choose between different devices. They can choose the compression format.
And set other parameters about controlling the capture devices. For example, setting the input for a video camera, or adjusting color and that kind of thing. There's a large amount of user interface there that's constructed for you by the Sequence Grabber, to save you having to do that as well.
Once you've done that and you're ready, you call SGPrepare. This gives the devices time to start recording, to start buffering up data. It pre-allocates the files on the disk and gets ready, so when you call SGStartRecord, it can start straight away. Then while you're recording, you call SGIdle at regular intervals. At the moment, we haven't got this hooked up to the Idle Manager, but the rule of thumb is to call it at least as often as the frame rate you want.
On OS 9 and Windows, you would call this from the WaitNextEvent loop or in a tight loop. On OS X, you can use a CFRunLoop to call this. And when you're finished recording, you call SGStop. At that point, QuickTime will stop all the devices and then create the movie header that describes the file. This describes the media that's already recorded to the disk.
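In code, the call sequence just described looks roughly like this. This is a minimal Carbon/C sketch, not the full HackTV sample: error handling is omitted, and the stop condition and output file are placeholders.

```c
#include <QuickTime/QuickTime.h>

// Minimal sketch of the capture sequence described above.
void CaptureSketch(const FSSpec *outFile, Boolean (*userWantsToStop)(void))
{
    SeqGrabComponent sg = OpenDefaultComponent(SeqGrabComponentType, 0);
    SGInitialize(sg);
    SGSetDataOutput(sg, outFile, seqGrabToDisk);      // record to a movie file

    SGChannel videoChan, soundChan;
    SGNewChannel(sg, VideoMediaType, &videoChan);
    SGNewChannel(sg, SoundMediaType, &soundChan);
    SGSetChannelUsage(videoChan, seqGrabRecord | seqGrabPreview);
    SGSetChannelUsage(soundChan, seqGrabRecord);

    // Full user interface: device, input, compression, and other settings.
    SGSettingsDialog(sg, videoChan, 0, NULL, 0, NULL, 0);

    SGPrepare(sg, false, true);     // pre-roll the devices, pre-allocate the file
    SGStartRecord(sg);

    while (!userWantsToStop())
        SGIdle(sg);                 // call at least as often as the target frame rate

    SGStop(sg);                     // stops the devices and writes the movie header
    CloseComponent(sg);
}
```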
So moving on to some of the new things added in QuickTime 6. These are mostly incremental improvements to the Sequence Grabber. The basic way it works hasn't changed, but we've made a series of small changes to update it and provide new user interface and better user experience in various places. And some of this has involved adding new programming APIs to make life easier for developers too.
The Sequence Grabber has a way of finding the devices that are attached. Up to QuickTime 5, that would just tell you the devices. If you wanted to find any more information about them, you had to open them individually and query them. We've extended that structure so it will now tell you what inputs the devices have available, and you can pass a new flag to get that. In addition to that, we've added some new utility function calls to let you adjust these input parameters without having to make direct driver calls.
And without having to make the different driver calls for audio and video that you had to do before. So these are listed here. There are going to be more details in the QuickTime 6 developer documentation. But SGGetChannelDeviceAndInputNames lets you find out the device's name without calling the device list, walking the device tree, and then making further calls to find the input name. And SGSetChannelDeviceInput gives you a channel-independent way of changing the input.
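A rough sketch of those additions, continuing from the capture sketch above (so `sg` and `videoChan` are assumed to exist already; the input index passed at the end is purely illustrative):

```c
// Enumerate devices, including their inputs, in one call (QuickTime 6 flag).
SGDeviceList devices = NULL;
if (SGGetChannelDeviceList(videoChan, sgDeviceListIncludeInputs, &devices) == noErr) {
    short i;
    for (i = 0; i < (*devices)->count; i++) {
        SGDeviceName *entry = &(*devices)->entry[i];
        // entry->name holds the device name; with the new flag its input
        // list is filled in too, so no per-device open/query is needed.
    }
    SGDisposeDeviceList(sg, devices);
}

// Current device and input names, without walking the device tree yourself.
Str255 deviceName, inputName;
short  inputIndex;
SGGetChannelDeviceAndInputNames(videoChan, deviceName, inputName, &inputIndex);

// Channel-independent way to switch inputs on the current device.
SGSetChannelDeviceInput(videoChan, 1);
```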
We've made some significant changes to the Sequence Grabber panels, mostly in UI terms. They're now much more modern looking. They've been updated to fit in with the OS X Aqua style. They're resizable, and as part of that, we needed to give the SGPanels, which are the video digitizer's way of extending that user interface, a way of telling us how we can resize them.
So we have a new call, SGPanelGetDITLForSize, where we pass you the size that we want you to fit the dialog into, and you return the dialog layout for us. You don't have to support every size. If you support the largest and smallest sizes, we'll interpolate the dialog elements for you. And we have another utility called SGGetChannelRefCon, because that seemed to be missing from the API.
A key change we made to the Sequence Grabber panels and the settings is that we now store far more information so that we can properly identify the VDIG. The previous assumption in the Sequence Grabber was that video digitizers were on cards in your machine. You'd probably have one of them, and you wouldn't change it very often. In a world of USB cameras and FireWire cameras where devices are hot-pluggable, this isn't true anymore.
The devices can come and go, and we need to keep track of more information to find out which one's which, so that when you save and restore settings, you get the same camera back you had before, even if you've got two or three plugged in. So we now store extra information, but we're still careful: if we can't find the camera we're looking for, we'll fall back to a different camera, depending on what camera you've got. So you can still use the settings call in a general way and we'll cope with what you have. But if you've got the same camera you started with before, we'll make sure we get that one back again.
As part of these changes, we've made some changes to the way the Sequence Grabber interacts with the Video Digitizer. For those of you who are developing Video Digitizers, there are some small changes to the way this API works. In particular, the recommended video digitizer API is to use the VDCompressOneFrameAsync and VDCompressDone calls. VDCompressDone's second parameter was defined as a Boolean saying, yes, I have a frame, meaning one; zero, I don't have a frame.
We've redefined this to be the number of frames that you actually have queued up inside, so that we can make further calls to pull them out, to deal with variations in idle timing and that kind of thing. The notion is that the VDIG is doing the buffering and the Sequence Grabber will pull out the frames that you have. And as an application, you can get access to this information by setting a grab-compress-complete bottleneck proc, calling through to the Sequence Grabber, and then looking at this frame count value.
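A sketch of that application-side bottleneck, assuming the QuickTime 6 prototypes (the queued-frame-count parameter replaced the old Boolean); treat the details as illustrative rather than as a drop-in implementation:

```c
// Bottleneck: call the Sequence Grabber's default behavior, then inspect how
// many frames the VDIG still has buffered.
static pascal ComponentResult MyGrabCompressComplete(SGChannel c, UInt8 *queuedFrameCount,
                                                     SGCompressInfo *ci, TimeRecord *t,
                                                     long refCon)
{
    ComponentResult err = SGGrabCompressComplete(c, queuedFrameCount, ci, t);
    // *queuedFrameCount now holds the number of frames queued inside the VDIG.
    return err;
}

static void InstallBottleneck(SGChannel videoChan)
{
    VideoBottles vb;
    vb.procCount = 9;
    SGGetVideoBottlenecks(videoChan, &vb);      // keep the existing procs
    vb.grabCompressCompleteProc =
        NewSGGrabCompressCompleteBottleUPP(MyGrabCompressComplete);
    SGSetVideoBottlenecks(videoChan, &vb);
}
```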
Another change to this interaction: previously, the Sequence Grabber kept the VDIG in the dark. The VDIG didn't know whether it was being recorded or previewed or what was going on at all. It just had to handle giving the Sequence Grabber bits, and the Sequence Grabber would deal with what it did with them afterwards.
With the kinds of VDIGs that we need now, when you're plugging in multiple ones, they need to know when they should allocate bandwidth on the USB or the FireWire. They need to know more about what's going on so they can make sensible decisions themselves. VDIG developers up to now have had to use heuristics about our call sequence to try and guess what's going on. So we've added a new call, VDCaptureStateChanging, which tells the video digitizer when we're changing from a preview state to a recording state to a stopped state. And there are additional flags in there to provide further information.
Two of these flags are accessible from outside for applications to pass through. The first is seqGrabLowLatencyCapture. The idea of that is for applications such as video conferencing, live streaming, or real-time image processing, where you're not concerned about getting every frame; you just want to make sure you get a fresh frame. If you set that flag, it will be passed through to the video digitizer, and it can adjust its buffering accordingly.
The second flag, seqGrabAlwaysUseTimeBase, is designed to tell the video digitizer that it should really pay attention to the time base it's passed, rather than just rounding off every frame to the same duration, which is a technique that many VDIGs have used in the past because it fits better with the way broadcast thinks about video, as a series of uniformly spaced frames.
But if you're trying to synchronize that to another source inside the computer, it's more useful to know the exact start time of the frame, rather than round it off, because you find that over time they'll diverge. And if you're running a security camera application 24 hours a day for a month, any small divergence will be bad after that time.
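Both flags are ordinary channel usage flags, so passing them through is a one-liner; a small sketch, again assuming the `videoChan` from the earlier capture sketch:

```c
// Combine the new flags with whatever usage you already need.
SGSetChannelUsage(videoChan,
                  seqGrabRecord
                  | seqGrabLowLatencyCapture     // prefer fresh frames over every frame
                  | seqGrabAlwaysUseTimeBase);   // exact frame times, no rounding
```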
Other changes we've made to the Sequence Grabber and Video Digitizer interaction: we've made it easier for Video Digitizers to support more than one device. Previously, because we always used the component name as the thing we displayed, you effectively had to register one component for each device you had. And that was particularly cumbersome under OS X, because you end up registering them with every application you launch, rather than doing it globally. So we've added a Video Digitizer call, VDGetDeviceNameAndFlags,
which is designed for a VDIG to dynamically report the name of the device that's currently plugged in, rather than having a generic name from the VDIG. So this is particularly useful for hot-pluggable cameras, where you could have one generic VDIG supporting many different ones, or you could come up with a user way of naming the devices and then report those names back if you wanted to store preferences like that.
Additionally, we've added these unique ID APIs to help you identify a specific hardware device. FireWire devices have unique IDs by default. USB ones don't, though as part of the standard they can have those added in ROM, or you may have a private API to query them. The point of these is for the persistent settings that the Sequence Grabber stores.
The Sequence Grabber will call VDGetUniqueIDs and ask for the unique identifier for the device and the input in use, and it will store those in the preferences. When it wants to restore that, it will call VDSelectUniqueIDs. Note that it's select, not set, because we can't change the unique ID. We just want to say, if you have a device with that ID attached, please start using it.
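For VDIG authors, a heavily simplified sketch of how those two calls might be implemented. The Globals structure and its fields are hypothetical; only the call names and their UInt64 device/input ID parameters come from the QuickTime 6 headers:

```c
typedef struct {
    UInt64 attachedDeviceGUID;   // e.g. the FireWire GUID of the current camera
    UInt64 currentInputID;
} Globals, **GlobalsHandle;      // hypothetical component storage

static pascal VideoDigitizerError MyVDGetUniqueIDs(GlobalsHandle storage,
                                                   UInt64 *outDeviceID,
                                                   UInt64 *outInputID)
{
    *outDeviceID = (**storage).attachedDeviceGUID;
    *outInputID  = (**storage).currentInputID;
    return noErr;
}

static pascal VideoDigitizerError MyVDSelectUniqueIDs(GlobalsHandle storage,
                                                      const UInt64 *inDeviceID,
                                                      const UInt64 *inInputID)
{
    // "Select", not "set": only switch if a device with this ID is attached,
    // otherwise fail so the Sequence Grabber can fall back to another camera.
    if (*inDeviceID != (**storage).attachedDeviceGUID)
        return paramErr;
    (**storage).currentInputID = *inInputID;
    return noErr;
}
```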
Back to a more general piece. When we moved to OS X, we deprecated SndRecordToFile. Well, we had actually deprecated it a while back on OS 9, but we didn't implement it at all on OS X because it didn't fit in with the way that OS X works.
The recommended way to capture sound is to use the Sequence Grabber with a sound channel. In QuickTime 5.0.4 on Mac OS X, and now in QuickTime 6 on all platforms, we've adapted the sound recording to use much better sample rate conversion and to allow bit-depth and channel conversion, so that you can record from, say, a stereo 48 kHz source at 8 kHz mono.
[Transcript missing]
Other enhancements we've made to video capture: as was mentioned in Tim's keynote this morning, we've added support for the IIDC class of cameras. This is a general spec for FireWire cameras, so you can plug in different ones from multiple manufacturers, and there's a wire format defined for what they're sending.
The data is not compressed, but it can be YUV or RGB, and we've got some broad support for that. In addition, as I said, we've changed the settings user interface to add a series of features, which we'll demonstrate for you shortly, to bring it up to date with the rest of the OS. And there are some extra features specifically for supporting the IIDC cameras. So I'd like to invite Sean Williams up to show us these new IIDC cameras.
As Kevin mentioned, in Jaguar we support a new class of FireWire devices called the IIDC cameras. It's a mouthful of an acronym. It stands for something like Instrumentation and Industrial Digital Camera. But the spec is flexible enough that it provides for a wide range of devices, ranging from lower-priced consumer webcam-type cameras for under $100 up to higher-level $1,000 3-CCD fancy-optics devices.
So I'd real quickly like to demonstrate that for you all. Since it is FireWire, as you can imagine, you can plug up to 63 of these devices into your computer, and we'd like to essentially let you digitize from all the cameras you have FireWire bandwidth for.
So I'm going to quickly show the different scenarios in which multiple cameras can be used under Jaguar with QuickTime 6. So the first thing I'd like to do is show you a single application
[Transcript missing]
and you'll see one application using multiple inputs. Similarly, there's no reason you need to limit yourself to just one application. Let's go back to single window mode here.
Here I have one application again, grabbing from one camera. And I'll just start up a second application. And you have two streams running simultaneously, different apps. As long as I had the bandwidth, I could conceivably launch several more applications and be capturing from all of them concurrently. So finally, you can imagine applications such as maybe stereoscopic imaging or something like that where it would be useful to have one application simultaneously capturing from two different cameras.
And I've got a quick little demo of that sort of operation going on here. And as you can see, a single application could open up multiple video channels and concurrently be grabbing from them. So you can imagine different applications for that in the future. Finally, Kevin mentioned that the Sequence Grabber settings dialog has been revised to become more Aqua compliant, so I'd like to show you that briefly.
The first thing you'll notice is that as opposed to being a pop-up menu-driven interface, we now are a tab-based interface consistent with the look and feel of other OS X applications. Similarly, for the source menu, instead of being a pop-up, it's a scrollable list so I can just seamlessly toggle back and forth between devices.
The observant few, or many in here, would notice that there's a new set of panels that are available. Let me quickly jump over here. In addition to the existing compression, image, and source panels of the past, we've added a new color,
[Transcript missing]
In addition to the standard live preview window, we have another preview source where you could, for example, be looking at the compressed image. If you needed to, you could throw it into a vectorscope mode and quickly see...
[Transcript missing]
So that pretty much covers the IIDC and the new Sequence Grabber panels. So thank you very much, Kevin.
I'd like to introduce Tom Dowdy to talk about video processing. Thanks, Kevin. I'm taller than Tim is. One of the most common questions that we get asked when talking about video capture or movie playback is how to perform processing on the video. And it's a reasonable question for developers to ask.
There are lots of ways for QuickTime to perform operations on video, either video you've already captured or video you're in the process of capturing or video that you're in the process of playing back. And it can be confusing to find your way through all the APIs. So we thought we'd take some time today to talk about the various techniques you can use to get your hands on the video, perform modifications of it within your application.
So the first type of processing that we think is appropriate for people to consider is processing live capture. This is a situation where you may have a camera that's being digitized coming through QuickTime, and the compressed data can be given to your application before anything else is done to it. This is done through the application's SGDataProc, which you install. At that point, you can decompress the video, display it to the screen, you can process the video, save it to a file, or a combination of the two.
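A minimal sketch of hooking the data proc, in the spirit of the SGDataProc sample; the body of the proc is summarized in comments, and the decompression and drawing details are left out:

```c
// The data proc sees each compressed frame (or sound chunk) before anything
// else is done with it.
static pascal OSErr MySGDataProc(SGChannel c, Ptr p, long len, long *offset,
                                 long chRefCon, TimeValue time, short writeType,
                                 long refCon)
{
    // 'p' points at one compressed video frame for channel 'c'. Typical
    // handling: set up a decompression sequence from the channel's
    // ImageDescription, decompress into an offscreen GWorld, process the
    // pixels, then draw them and/or write them to a file yourself.
    return noErr;
}

static void InstallDataProc(SeqGrabComponent sg)
{
    // With a data proc installed you decide what gets kept; pass
    // seqGrabDontMakeMovie to SGSetDataOutput if you only want the callback.
    SGSetDataProc(sg, NewSGDataUPP(MySGDataProc), 0 /* refCon */);
}
```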
So what's good about this? Well, the good thing about this is it takes place live, interactively, while the user is digitizing the video. What's bad about this? Well, it takes place live, while the user is digitizing the video, which means that if your processing algorithm takes a long time, you can cause dropped frames to take place during the capture.
This isn't really an appropriate sort of thing for you to be doing if you're doing pro-level capture, but if you have a capture application that's for lower data rate or smaller size video or situations where perfect frame capture is not necessary, for example, maybe you're doing a preview capture of a tape, a roll-through, the user is going to do some adjustments, they want to take a look at it, and then they're going to later go make a second pass through the video, do a real capture.
It might be okay to do this kind of adjustment at that point. It's also fine to use for operations like webcams, or where you might be performing detection on the video stream which is coming in, and based upon the detection, do some actual capture. So that's the first technique.
The source code that demonstrates how to do live capture processing is the SGDataProc sample, which used to be known as Minimung. There is one really important thing about this sample code. If you've previously downloaded samples of Minimung, you're going to nod, right? If you've previously downloaded examples of Minimung and incorporated them in your application, you should make sure that you download the latest sample code and check it out, because there was one very small bug in the Minimung example code that could cause some problems in your application depending on what you're doing with the data.
So, enough of that. How about a demo? Okay, take it away, Kevin. Okay, so we've had a lot of requests for people to do live demos, and one of the most unusual ones we heard of was from someone called Robert Huber, who wanted to use the Sequence Grabber to detect lobsters in a tank. So we thought this was an interesting approach for the demo. So... I'm going to launch the Sequence Grabber, choose the camera here. Wrong one. Oops. and even I've got the Lobster Detection Program running here.
[Transcript missing]
Want to tell us how that works, Tom? Sure. I suspect that the person who is doing scientific detection of lobsters in a tank is using a slightly more sophisticated algorithm than the one we used here, which is based on the SGDataProc sample.
The first step is to capture the video. The data comes in, we decompress it into an off-screen. At that point we scan through the video, converting it into YUV and looking for the colors of red that are present in lobsters. And when we detect that there is enough red in the frame, we say, "Hmm, there must be a lobster there." We record the left, right, top, and bottom area where the lobster is.
That forms the rectangle that's drawn on the screen. And then based upon the area of the screen that's covered with the rectangle, well that's how big the lobster is and how much melted butter you would need. It's a recurring theme in the QuickTime team, lobsters for some strange reason. Slides, please.
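Purely for illustration, here is a sketch of that kind of bounding-box pass over a 32-bit ARGB offscreen. Unlike the demo it thresholds in RGB rather than converting to YUV, the thresholds are made up, and it assumes the pixels are already locked with LockPixels:

```c
static void FindRedBlob(PixMapHandle pm, Rect *outBox, long *outArea)
{
    Ptr   base     = GetPixBaseAddr(pm);
    long  rowBytes = QTGetPixMapHandleRowBytes(pm);
    Rect  bounds   = (**pm).bounds;
    short width    = bounds.right - bounds.left;
    short height   = bounds.bottom - bounds.top;

    SetRect(outBox, width, height, 0, 0);    // start inverted, grow as we find red
    *outArea = 0;

    for (short y = 0; y < height; y++) {
        UInt8 *px = (UInt8 *)(base + y * rowBytes);
        for (short x = 0; x < width; x++, px += 4) {
            UInt8 r = px[1], g = px[2], b = px[3];    // px[0] is alpha in ARGB
            if (r > 150 && r > g + 40 && r > b + 40) {    // "lobster red"
                if (x < outBox->left)   outBox->left   = x;
                if (x > outBox->right)  outBox->right  = x;
                if (y < outBox->top)    outBox->top    = y;
                if (y > outBox->bottom) outBox->bottom = y;
                (*outArea)++;             // proportional to melted butter required
            }
        }
    }
}
```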
Okay. So the next type of question we get all the time is, "Alright, I don't want to process the movie as I'm capturing it. I want to capture the movie and then later I want to do some processing on it." So, what goes on here? You've got a movie, it's saved out on disk somewhere, you read the movie in, it's going to come into QuickTime.
You direct that movie into an off-screen. QuickTime will give you the raw pixel information. It will go into the off-screen. At that point, you can perform whatever processing you want to do on the off-screen. For example, here we're placing a text overlay on top of the video. You can then send the data back into QuickTime, have it be recompressed and exported in whatever format QuickTime supports, in this case an MPEG-4 file, and it gets written out into a brand-new movie.
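A rough sketch of that offline pass: direct the movie into an offscreen GWorld, step through its frames, and process each one. Recompressing and writing the result out (via a compression sequence or an export component, as in Convert to Movie Junior) is left as a comment:

```c
static void ProcessMovieOffline(Movie movie)
{
    Rect box;
    GetMovieBox(movie, &box);
    OffsetRect(&box, -box.left, -box.top);

    GWorldPtr offscreen = NULL;
    NewGWorld(&offscreen, 32, &box, NULL, NULL, 0);
    SetMovieGWorld(movie, (CGrafPtr)offscreen, NULL);

    OSType    visualType = VisualMediaCharacteristic;
    TimeValue t = 0;
    while (t >= 0) {
        SetMovieTimeValue(movie, t);
        MoviesTask(movie, 0);               // forces the current frame to draw

        // ...process the pixels of 'offscreen' here, then hand the frame to a
        // compression sequence / exporter to build the new movie...

        GetMovieNextInterestingTime(movie, nextTimeMediaSample, 1, &visualType,
                                    t, fixed1, &t, NULL);   // t becomes -1 at the end
    }
    DisposeGWorld(offscreen);
}
```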
So, what are the pros and cons of this approach? Well, one of the pros is that all the frames get modified. Since you're processing them one at a time, you're sure that they all go through the system. There's no time constraints here. If you have an algorithm that takes one hour to process every frame, that's just fine. Every frame will still get processed. None will be skipped. It might take a while to do the whole video, but they all will go through.
The downside to this is it's not real time. And that's on both sides of the fence. If your algorithm does take a long time, it's going to take greater than real time for the movie to be processed or displayed to the user. Similarly, if your algorithm is faster than real time, you're going to proceed through the movie quicker than in real time. Once again, that might be confusing or annoying to the user.
Also, because this process is potentially lengthy, it's difficult to have more than one setting for the user. Typically, you set up the things at the beginning, and you do a batch process, and you go through all the frames. This isn't a really great thing for interacting with the user, and they want to change things as they go along.
But what are some things you might do here? Well, the real basic one is if you're just recompressing the movie from one format to another, that's a great thing to be doing here. But before you do that compression, you might want to be doing a 3:2 pulldown, you might want to blur or sharpen the image, deinterlace it, do all those other sorts of pixel image processing operations you might want to do on video. The canonical source for this, which has been out there for a while, is called Convert to Movie Junior. It shows an example of how to do this basic operation. But let's have a demo instead.
So what we're going to do for this demo is add an overlay onto the movie. So imagine you just made this great movie, you wanted to send that to someone for approval, but you wanted to make sure it had your overlay on it so they wouldn't use it before they paid you. So I'm going to open up this Jellyfish movie that I filmed earlier. choose the compression I'm going to use, in this case, JPEG, and run the processing.
[Transcript missing]
Couple of things to note about that demo. First of all, when you're doing video compression, Kevin brought up the standard video compression dialog and the options that were there.
If you have an application that's doing video compression, it's really strongly recommended that you use the standard compression dialog to let your user configure the options. By doing that, you make sure that whenever we add additional options specific to a particular codec, they're available to the user to tweak the settings.
If you need to do batch processing or you need to have scripting-type interfaces to this, you can also do the same settings via code, but it's always preferable to have some way for the user to get to the settings via the dialog and save them away so they can later use them. This is, like I said, the preferred way to do recompression or compression of data. And once again, the Convert to Movie Junior source code shows an example of how to bring up the dialog, how to run this compression and put the data back out into a movie.
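A small sketch of that pattern: let the user pick the settings in the standard dialog, then keep them around for later batch use. Error handling and the actual compression loop are omitted:

```c
ComponentInstance sc = OpenDefaultComponent(StandardCompressionType,
                                            StandardCompressionSubType);

// Shows the standard video compression dialog, including any codec-specific
// options that current and future codecs add.
ComponentResult err = SCRequestSequenceSettings(sc);
if (err == noErr) {
    QTAtomContainer settings = NULL;
    SCGetSettingsAsAtomContainer(sc, &settings);
    // ...store 'settings' in your preferences; later restore them with
    // SCSetSettingsFromAtomContainer before running the batch compression...
}
CloseComponent(sc);
```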
[Transcript missing]
The third type of processing that people want to do on movies is to process a movie in real time while it plays back. Now there's a feature of QuickTime movies that's been around since QuickTime 3.0 called the QuickTime Effects Architecture, and certainly it's worth taking a look at that if you're not aware of it.
We're not going to focus a lot on it today, but we're going to talk about it briefly for those of you who aren't familiar with it. The QuickTime Effects Architecture is a way to operate on tracks, a track or tracks, and do image processing or effects on those tracks. For example, transitions between one track or another, or a traveling matte, or a gradient wipe, or a blur or a sharpen.
The QuickTime Effects architecture, however, operates on individual tracks, not entire movies. If you're interested in how to do processing of movies using effects, you can look at the MakeEffectMovie source code that's available on Apple's website. Today we're going to focus more, however, on processing the movie within your application.
So this is a type of processing that operates on the entire movie at once. That is to say, the whole frame, which might be composed of multiple tracks. An example of this that you may be aware of today is the Video Adjustments feature, which is inside QuickTime Player. This feature is implemented by using the techniques that we're about to talk about.
So how does this all work? You have a movie, and QuickTime is going to play the movie back. Your application registers a draw complete proc and directs the movie off screen. QuickTime then decompresses the movie, producing raw pixel data, and calls your application back whenever there's actually a change that's taken place in what should be displayed to the user. You can then take the resulting data, you can display it to the screen, you can process it, save it away, do both, whatever is appropriate for the type of thing you're trying to do. Pros and cons again.
The user can interact with the playback, and it works on the whole movie at once, no matter how many frames there are. There is a downside to this. And that's that as the movie is playing, QuickTime will use its normal frame-skipping algorithm to skip frames if your processing or the video playback is not able to keep up with real time. That's so that the video stays in sync with the audio.
The other downside to this, it can be difficult to package your particular processing code up because it's taking place within your application. So if this is something that needs to be available in multiple applications or needs to be able to be delivered independent of the movie itself, the effect approach would be more appropriate.
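A minimal sketch of the draw-complete approach just described, assuming you already have an offscreen GWorld sized to the movie box; the processing itself is summarized in a comment:

```c
// Called by QuickTime whenever the displayed frame actually changes.
static pascal OSErr MyDrawComplete(Movie movie, long refCon)
{
    GWorldPtr offscreen = (GWorldPtr)refCon;
    // The fresh frame is now in 'offscreen': adjust levels, build histograms,
    // and so on, then blit the result to the window yourself.
    return noErr;
}

static void HookPlayback(Movie movie, GWorldPtr offscreen)
{
    SetMovieGWorld(movie, (CGrafPtr)offscreen, NULL);
    SetMovieDrawingCompleteProc(movie, movieDrawingCallWhenChanged,
                                NewMovieDrawingCompleteUPP(MyDrawComplete),
                                (long)offscreen);
}
```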
Like I said, a good example of use of this is live movie adjustments or adjusting the various video settings that are done in QuickTime Player today. The example code for this is called MovieGWorlds and it's been out on Apple's QuickTime example code site for a while. How about another demo? Okay.
So... What we're going to do here is play back the jellyfish movie again and adjust the color while it's playing. So it's playing here. You can see it's drawing a histogram of the red, green, and blue in the picture. Say we wanted to get rid of that blue and gradually turn it down so that the sea is green.
We can do that by adjusting the range that the blue occupies on the fly like this. And again, if we didn't like the red, we can turn that down and make the jellyfish greenish too and put the blue back to full. We've got a sort of reddish jellyfish. We can even completely mess up the thing by putting the...
[Transcript missing]
So you can see there's a fair amount of processing going on there. I mean, if nothing else, just displaying the histograms is something that might be useful to the user. But here we're actually allowing the user to adjust those things. And we're still achieving very good frame rate movie playback.
As I said, there is the possibility of frames being skipped here. That doesn't mean they have to be skipped. This machine is more than capable of keeping up with this. And I assure you that I'm not using particularly sophisticated technique for drawing that graph. So we'll go back to slides now.
Okay, so we talk about processing the video. So what do we mean by processing the video? How can you actually perform the processing? Well, as I said, one thing you could do is QuickTime has a number of effects that are built in: blurs, sharpens, film noise, transitions, et cetera. You could take advantage of these effects and call them from any of the processing techniques that I just outlined.
Another common thing to want to do is to write some custom code. If you want to look for lobsters in your application, you're probably going to have to write some custom code. I don't think the lobster finding algorithm is going to make its way into QuickTime any time in the near future, but you never know. So you're going to have custom code in your application to perform the processing. The biggest issue here is if you want to be able to... Excuse me.
That was great. Okay, so the biggest issue here is if you want to be able to perform your processing live on video as it's being captured or on movies as they're being played back and you want to have the user have a good experience, you need to concern yourself with performance.
But even if your application is just a movie grinding type application, you're wanting to perform processing on a movie and save it out, you still should think about performance because if your particular movie, as it processes the movie and saves it out into a movie, takes four times real time, well, your users are going to be a lot less likely to want to feed an hour's worth of video through that thing.
So performance is always an issue. So how to deal with performance? Well, the first thing is just do your basic optimization, right? Make things faster. That's always a good idea. But you should also take advantage of things like the hardware, such as the velocity engine that's available in the G4.
A lot of pixel processing algorithms lend themselves very much to use of the G4 velocity engine. Similarly, almost all of these algorithms have some amount of separability to them. So being able to process a single frame of video using multiple processors via threads is a very smart idea when this is the kind of thing you're talking about.
A third type of processing that you might not have thought of is using the OpenGL hardware. There's some sessions here that talk about integrating OpenGL with QuickTime. Unfortunately, the session was just before this one, so some of you might have missed that. But in the session, they talk about using the 3D engine as essentially a math engine to perform some of this processing for you. Not all algorithms are suited to this, and it can sometimes be very hardware-dependent, but once again, that might be appropriate for your type of image manipulation.
Another thing that's often an issue in this realm or this space is how to actually get the data back to the screen. Surprisingly enough, I mean, often you would think that that would be obvious, but it's not. And part of the reason it's not obvious is there's a bunch of different ways to do it.
The simplest method, if you don't have complicated performance requirements or special needs (these are probably listed in order from simpler to more complicated), is to make use of the 2D graphics calls. For example, CopyBits if you're familiar with the Carbon world, or NSImage if you're a Cocoa developer; that's the type of mechanism you want to use.
Another thing to consider, and a lot of people aren't aware of this, is that it's very easy to make a decompression sequence within QuickTime that can perform the 2D graphics operations for you for transferring data from an offscreen onto the screen. The big advantage of using a decompression sequence is that you're able to factor out the setup portion of your call from the actual drawing portion of your call. This can make the subsequent draws when you're processing multiple frames much faster.
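A sketch of that split between setup and per-frame drawing, assuming a 32-bit offscreen and a Carbon window; call CDSequenceEnd when you're done with the sequence:

```c
static ImageSequence          gSeq  = 0;
static ImageDescriptionHandle gDesc = NULL;

// One-time setup: describe the offscreen's pixels and begin a sequence that
// draws them into the window.
static void SetupBlit(GWorldPtr offscreen, WindowRef window)
{
    PixMapHandle pm = GetGWorldPixMap(offscreen);
    MakeImageDescriptionForPixMap(pm, &gDesc);
    long dataSize = QTGetPixMapHandleRowBytes(pm) * (**gDesc).height;

    DecompressSequenceBeginS(&gSeq, gDesc,
                             GetPixBaseAddr(pm), dataSize,
                             GetWindowPort(window), NULL,
                             NULL,              // whole source rect
                             NULL,              // identity matrix
                             srcCopy, NULL, 0,
                             codecNormalQuality, anyCodec);
}

// Per-frame draw: much cheaper because the setup was factored out.
static void BlitFrame(GWorldPtr offscreen)
{
    PixMapHandle pm = GetGWorldPixMap(offscreen);
    long dataSize = QTGetPixMapHandleRowBytes(pm) * (**gDesc).height;
    CodecFlags outFlags;
    DecompressSequenceFrameS(gSeq, GetPixBaseAddr(pm), dataSize, 0, &outFlags, NULL);
}
```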
Another way to get data to the screen is to build in your application a decompression component that performs the video processing and at the same time draws the data to the screen. This is a very powerful technique as used in several of the demos that we did here, because you are pipelining the processing of the pixels along with the drawing of the pixels to the screen.
It's also a very easy way to deal with all of the various frame buffer formats, clipping, cursor hiding, flushing of the screen, et cetera, by simply putting your code inside of a decompression component, using it within that environment, all those issues are dealt with for you by QuickTime.
Another way to get the data to the screen is to take the QuickTime movie that's in the GWorld and place it in an OpenGL texture and then use the normal OpenGL calls to render that texture to the screen. This has a great advantage because OpenGL, assuming that you've set up your textures properly and they're aligned appropriately and the hardware is set up so that it's doing all the things you need it to do, is able to asynchronously DMA the memory directly from the buffer you have up onto the card.
This allows you to essentially pipeline additional processing while that transfer is taking place. There are two examples on Apple's website that are worth taking a look at here. One is called Fast Textures, which is in the 3D section; another one is called OpenGL Movie. These both show techniques for taking data off screen and transferring it quickly on screen by using the OpenGL engine. How about more demos? We always like more demos. Everyone loves demos. Well, maybe not everyone.
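Before the demos, a rough sketch of the GWorld-to-texture path; the Fast Textures and OpenGL Movie samples show the production version. This assumes a 32-bit ARGB GWorld, a rectangle-texture-capable context, and leaves the Apple client-storage/texture-range extensions as a comment:

```c
#include <OpenGL/gl.h>
#include <OpenGL/glext.h>

static GLuint gTex = 0;

static void UploadGWorldAsTexture(GWorldPtr offscreen, GLsizei width, GLsizei height)
{
    PixMapHandle pm     = GetGWorldPixMap(offscreen);
    GLvoid      *pixels = GetPixBaseAddr(pm);

    // Row length is in pixels, not bytes, for a 32-bit pixmap.
    glPixelStorei(GL_UNPACK_ROW_LENGTH,
                  (GLint)(QTGetPixMapHandleRowBytes(pm) / 4));

    if (gTex == 0) {
        glGenTextures(1, &gTex);
        glBindTexture(GL_TEXTURE_RECTANGLE_EXT, gTex);
        glTexImage2D(GL_TEXTURE_RECTANGLE_EXT, 0, GL_RGBA, width, height, 0,
                     GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, pixels);
    } else {
        glBindTexture(GL_TEXTURE_RECTANGLE_EXT, gTex);
        glTexSubImage2D(GL_TEXTURE_RECTANGLE_EXT, 0, 0, 0, width, height,
                        GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, pixels);
    }
    // ...then draw a textured quad as usual; the upload can be made
    // asynchronous with the APPLE client_storage / texture_range extensions.
}
```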
Okay. So, we've got a... We showed various effects. We showed a single effect for the Sequence Grabber, for movie conversion, and for playback. What I'm going to do now is show that the effect you choose isn't dependent on the way you're doing it. So I'm going to show sequence grabbing with an alpha overlay.
So here you can see what it would be like if I had a lobster on my head. In this case, we're capturing the video with the Sequence Grabber, drawing it off screen, compositing that with the lobster image we used before, and drawing that back to the screen. And you can see the frame rate is keeping up pretty well because the Sequence Grabber is dealing with getting the data in the right format and getting it to the GWorld for you, and then you can do pixel processing after that.
Now we'll do the color adjustment on the Live Sequence Grabber now. So this is coming in. I can make myself more red. I can... Take out the blue. I can posterize myself. Make the screen go a funny color behind me. come up with this nice sort of 1970s-looking effect here. And again, this is doing the same pixel processing operations we were doing before on the movie as it was playing back, but we're doing it on the GWorld that's being fed from data from the Sequence Grabber, because the two parts are separable.
This is an example of applying a QuickTime effect. So... The film noise effect is being applied over the top of me, so I'm going to black and white. There are film scratches coming up. And as we have two IIDC cameras attached, we can add a QuickTime effect that takes two sources and blends them together.
So we can have this nice background here, and then we can bring up the heart around, head in the right place. I can maybe respond to it, Tom. So we can see we're performing this effect on there; both cameras are actually capturing, and that's what's going on here. And similarly, we can apply these same filters to the movie playback. So for example, we could do the movie playback of the jellyfish with the lobster over it in real time. The processing stage is independent of the way you get the data in and out. Okay. Thanks, Kevin. Okay, now to Jean-Michel.
Thank you, Tom. So Kevin talked to you about digitizing video. Then Tom talked to you about processing this video that you digitize on your computer. So the last step you might want to add to your application is the capability of outputting these movies to a specific video device. And that's what video output components are all about.
So originally QuickTime was capable of playing movies only on your computer desktop, and that's why we had to come up with this video output component, in order to support different devices, whether analog or digital. But basically the idea is to be able to send this video to a device which is not a computer desktop. So that does include third-party I/O cards, FireWire devices, and any other peripherals that third parties could add to your system.
Later on, we're going to show you a demo of this video output stuff, and the sample code is on our website; it's called Simple Video Output. It's a simple application, and you can use the code. There was an older sample posted with the same name, so you might want to check it out for the updated version we have now.
So movies play in a GWorld anywhere, whether you play them on your desktop or on your video output device. So this video output component provides the GWorld, so you can set it up and point to it. The thing you have to understand is that only QuickTime can use this GWorld. It's not a normal QuickDraw GWorld, so you cannot use QuickDraw and do a bunch of drawing to it. You have to use QuickTime in order to draw to this specific GWorld.
Video output components provide some information about their capabilities. They might be attached to an audio output device as well, so you might want to figure out what the hardware setup is to play audio and video on the same device. They can provide clock components so they maintain the synchronization between audio and video. Some of them support what we call the echo port, which is basically the capability to output to this specific device and to display a kind of preview on your desktop computer at the same time.
Video outputs are just components. You find them like any regular component: you use FindNextComponent, and the type of the video output component is QTVideoOutputComponentType. One thing you have to be careful of is that QuickTime has a built-in base video output component, which is always going to show up in the list of available video output components. In order to take it out of your list, so you present to the user only the real devices available on the system, you have to set the flags mask to kQTVideoOutputDontDisplayToUser, so we take it out of the list of available devices.
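A sketch of that search, filtering out the built-in base component:

```c
ComponentDescription cd;
Component            c = 0;

cd.componentType         = QTVideoOutputComponentType;
cd.componentSubType      = 0;
cd.componentManufacturer = 0;
cd.componentFlags        = 0;                              // flag must be clear...
cd.componentFlagsMask    = kQTVideoOutputDontDisplayToUser; // ...so the base one is skipped

while ((c = FindNextComponent(c, &cd)) != 0) {
    ComponentDescription info;
    Handle name = NewHandle(0);
    GetComponentInfo(c, &info, name, NULL, NULL);
    // add 'c' and its name to the list of real devices shown to the user
    DisposeHandle(name);
}
```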
So the way you get the device's capabilities is to use three APIs. The first one is QTVideoOutputGetDisplayModeList, which provides a list of all the different resolutions and timings that this device supports, like, for instance, NTSC, square pixel, non-square pixel; everything this device can support, it should list as a different mode in the list.
The way you figure out the sound capability of this video output device is to use QTVideoOutputGetIndSoundOutput. This will tell you if this device has a specific audio device attached to it or not. And the last one is the QTVideoOutputGetClock API, which lets you figure out if this specific device provides or doesn't provide a clock.
So how do you select this video output component once you've found it? Basically, it's a two-step process. For the first one, you have to set up this video output component and tell it how you've decided to output your movies. You do that using the video output component itself. The second step is to tell the QuickTime movie toolbox that the movie you had somewhere is going to be pointed to this video output component.
So there are four APIs in order to set up the video output component. You have to tell this video output component what display mode it has to use. You have to start it using the QTVideoOutputBegin API, just to tell the hardware you are about to play a movie on this device.
You have to get the GWorld from this component as well, and we're going to see how we use it later on. The last one is to tell this device if you want to have an echo port or not. In this example, I'm passing nil to the QTVideoOutputSetEchoPort API to tell this device I don't want to see anything on the desktop anymore.
So the second step is about the QuickTime movie toolbox itself. You have to tell the toolbox that this movie is about to point to this video output component. And the next one is to basically set the movie GWorld to whatever this video output component returned to you.
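A sketch of that two-step setup, where `c` comes from the component search above, and `movie` and `displayModeID` (picked from the mode list the component reported) are assumed to exist:

```c
QTVideoOutputComponent vout = OpenComponent(c);

// Step one: configure and start the video output component itself.
QTVideoOutputSetDisplayMode(vout, displayModeID);
QTVideoOutputBegin(vout);                       // grab the hardware

GWorldPtr voutGWorld = NULL;
QTVideoOutputGetGWorld(vout, &voutGWorld);
QTVideoOutputSetEchoPort(vout, NULL);           // nothing on the desktop

// Step two: tell the movie toolbox about it.
SetMovieVideoOutput(movie, vout);
SetMovieGWorld(movie, (CGrafPtr)voutGWorld, NULL);
```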
So it's pretty simple; there's just not that much to be concerned about. The echo port. The echo port is something, as I say, which is optional. So not all devices are able to output to their hardware and at the same time still play something on your computer desktop.
Our own built-in FireWire component, for instance, does support a kind of preview on the desktop. It's not playing all the frames, and it's not the best quality, but at least the user has some feedback of what's happening on the computer screen and is able to see, on the FireWire device connected, the full frame rate of the video.
In order to figure out if this vout component implements the echo port, you have to use ComponentFunctionImplemented with this selector, kQTVideoOutputSetEchoPortSelect. It's a pretty short one, isn't it? Then, as soon as a component does support the echo port, you can use this API, QTVideoOutputSetEchoPort, to set it to whatever window you want on your computer desktop. And the next thing you have to do is to use the regular QuickTime API to set the movie GWorld to basically your window.
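A small sketch of that check, assuming `vout`, `movie`, and a `previewWindow` from the surrounding application:

```c
// Preview in a window while the hardware plays the full-rate video.
if (ComponentFunctionImplemented(vout, kQTVideoOutputSetEchoPortSelect)) {
    QTVideoOutputSetEchoPort(vout, GetWindowPort(previewWindow));
    SetMovieGWorld(movie, GetWindowPort(previewWindow), NULL);
}
```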
So when you want to stop outputting this movie to this video device, basically you point back to your window's GWorld, and you use this API, ChooseMovieClock, basically to reset the clock that QuickTime is using for the movie. We'll see later on why I'm putting that in this slide. I didn't show you any example of the clock, but in case the component had a clock, you have to reset it.
So it's much, much safer to call this API every time you do that. And then you set the movie video output to nil. And the last thing you have to do, now that the movie is no longer pointing to this device, is to stop the hardware by calling QTVideoOutputEnd.
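A sketch of tearing the output back down, in the order just described:

```c
SetMovieGWorld(movie, GetWindowPort(movieWindow), NULL);   // back to the desktop
ChooseMovieClock(movie, 0);           // drop any clock the component supplied
SetMovieVideoOutput(movie, NULL);     // movie no longer points at the device
QTVideoOutputEnd(vout);               // release the hardware
CloseComponent(vout);
```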
[Transcript missing]
If the component does implement it, then you get this sound output component using QTVideoOutputGetIndSoundOutput. And the last thing you have to do in order to point the movie to this specific sound device is to basically loop over all the sound media tracks that your movie has and use MediaSetSoundOutputComponent with the sound component that the video output reported to you.
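In code, that loop looks roughly like this (a sketch; error handling omitted):

```c
// Route the movie's sound to the audio output attached to the video output
// device, if the component reports one.
Component soundOutput = NULL;
if (QTVideoOutputGetIndSoundOutput(vout, 1, &soundOutput) == noErr && soundOutput) {
    long i;
    for (i = 1; ; i++) {
        Track track = GetMovieIndTrackType(movie, i, SoundMediaType,
                                           movieTrackMediaType);
        if (track == NULL)
            break;
        MediaSetSoundOutputComponent(GetMediaHandler(GetTrackMedia(track)),
                                     soundOutput);
    }
}
```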
The clock. Most of these devices need some synchronization mechanism between audio and video. If they want that to happen, they need to provide their own clock, which is basically driven by their own hardware, rather than relying on the clock we have on the CPU side. So, same as the two other optional functions of this video output component.
You need to figure out if they do implement that by using kQTVideoOutputGetClockSelect. And if they do, then, same as for the sound one, you ask them for their clock component using QTVideoOutputGetClock. And to set it, you use the SetMovieMasterClock API with the clock component they returned to you.
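A small sketch of that, again assuming `vout` and `movie` from the setup above (the cast of the clock instance to a Component follows the Simple Video Output sample's pattern):

```c
// If the component supplies its own clock, make it the movie's master clock
// so audio and video stay locked to the device.
if (ComponentFunctionImplemented(vout, kQTVideoOutputGetClockSelect)) {
    ComponentInstance clock = NULL;
    QTVideoOutputGetClock(vout, &clock);
    if (clock)
        SetMovieMasterClock(movie, (Component)clock, NULL);
}
```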
So now let's switch to the demo and show you what all this stuff is all about. Can we switch to the demo one? Yeah. Okay, so we don't have video going straight into the projector, so I'm showing the video from the camera by digitizing it with the FireWire camera we were using earlier and displaying it in another window on the screen. So that's what that video input you can see there is now. It's a picture of the display on the camera.
I'm going to open the Simple Video Out example application, and it's searched the system for video output components. It's found that we've only got the FireWire video output component, and there's a choice of modes that this component can do. You'll notice that one of the features we've added in QuickTime 6 is DVCPro PAL, which is very handy if you come from Europe like myself and Jean-Michel. But as we're in America, we'll stick with NTSC.
And it's telling us the dimensions of the output port, 720x480, the refresh rate, which is rounded off to 29 from 29.97, and the pixel type it's expecting, which is DV. So we should try and play back a DV movie file to it. Fortunately, I have one of those I made earlier.
So you can see, as this has come up, it has switched the display on the camera out to show the frame. And as I scrub about in it, it will update with a slight lag over the movie I have here.
[Transcript missing]
We turn the sound off and I'll turn the video back on, and you can see that it's playing through. You notice we can get a choice of using the video out clock or the default clock for the movie. And there are additional options in this thing that use the API that Jean-Michel was just showing us. And this is a QuickTime movie like any other, so we can scrub through it.
[Transcript missing]
Thank you, Kevin. Back to Tim. Thank you, Jean-Michel; thank you, Kevin. It's a standard feature to have Kevin's children in our demos during our session. Slides, please? There they are. So now I get to close off and talk a little bit about the roadmap for some of the sessions that you may have seen or you may have wanted to see or you may see in the future.
So, just to give you an idea, the overview session this morning was an overview of the QuickTime technologies in QuickTime 6 and Jaguar and where we were and where we're going. And then 601 was also this morning which was talking about techniques you can use to build a savvy QuickTime application during 602 which is this session. 603 is just coming up. It's integrating QuickTime media together, interactive technologies which can tie different kinds of media together and then more interactive sessions following that this afternoon.
And then, and this is interesting, these are all different than mine. But in any case, the feedback forum is Friday at 10:30. Also Friday afternoon there's a session on QuickTime for the web, and a session later Friday afternoon on QuickTime for MPEG-4. And the important thing I wanted to point out as well is that every afternoon from 1 to 4 p.m. the QuickTime engineering team is having hands-on sessions.
And over here, the contact information is dts@apple.com, and Jeff Low is the technology evangelist; that's Jeff Low at apple.com. And I think I should have some URLs coming up, which are these. So the standard QuickTime documentation is at developer.apple.com/quicktime, and from there you can navigate down to the sample code, and you can see the URLs for the sample code.
All these URLs are actually on the developer.apple.com/wwdc/urls URL, I believe, so if you are crafty with your web browser you can find all of these. And you can see the different ones we showed, and the Simple Video Out sample, which has been updated, and the SGDataProc sample, which used to be Minimung. And there's the slide for the hands-on, which is 1 to 4 in room G, out all the way down, and you'll find lots of QuickTime engineers there. Thanks.