Configure player

WWDC Index does not host video files

If you have access to video files, you can configure a URL pattern to be used in a video player.

URL pattern

Use any of these variables in your URL pattern; the pattern is stored in your browser's local storage.

$id
ID of session: wwdc2004-734
$eventId
ID of event: wwdc2004
$eventContentId
ID of session without event part: 734
$eventShortId
Shortened ID of event: wwdc04
$year
Year of session: 2004
$extension
Extension of original filename: mov
$filenameAlmostEvery
Filename from "(Almost) Every..." gist: ...
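
For example, a pattern along these lines (example.com is just a placeholder for wherever your files are actually hosted) would resolve to https://example.com/2004/wwdc2004-734.mov for this session:

https://example.com/$year/$id.$extension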

WWDC04 • Session 734

H.264/AVC: Exceptional Video from 3G to HD

QuickTime • 58:08

H.264/AVC is the next-generation video codec that provides incredible video quality at a broad range of data rates. View this session to learn all about this ratified standard, including its history, technologies, and applications in the marketplace. See why you'll want to use H.264/AVC in your multimedia projects and find out why everyone is talking about this incredible video standard.

Speakers: Amy Fazio, David Singer, Greg Wallace, Hsi-Jung Wu

Unlisted on Apple Developer site

Transcript

This transcript was generated using Whisper; it has known transcription errors. We are working on an improved version.

So that's me. They call me the ecosystem manager. I do sort of industry relations for the engineering group, which includes standards. And it meant going to some of these meetings. So I'm going to give you a background on the H.264 codec, where it came from, roughly what it does, and so on. And then I'm going to pass it over to real engineers to tell you things that are actually interesting.

So first of all, I'm going to cover a few of these acronyms. By now, you've probably heard quite a few of them. So let's go through them very quickly. ITU-T, one of the two big standards bodies that brought you the codec, the other one, of course, being MPEG over there.

The Video Coding Experts Group, VCEG, is the group in the ITU-T that's responsible for video coding. And the team that they formed jointly between the ITU-T and MPEG was called the Joint Video Team, really imaginative name there. The codec has in its title advanced video coding in both the standards that are published, the H.264 standard from the ITU and MPEG-4 Part 10 from MPEG. And MSG is a food additive commonly used in the food that gets served at late-night standards meetings.

[Transcript missing]

Part of the reason for that is its broad field of application. It's taking over, or it's intended to take over, in broadcast from codecs like MPEG-2, and in stored content, again a space typically occupied by MPEG-2. But it's also targeted to cope with the conversational low-delay applications, video telephony, video conferencing, traditionally occupied by H.263.

It's also looking at the video on demand, the streaming market, traditionally occupied by a variety of codecs, including MPEG-4. And indeed multimedia messaging and new applications like that, which are, in some sense, nascent applications already, but they are currently occupied by H.263. So you can see a very broad range of applications that this codec's targeted for.

But what is it? What is this revolutionary new codec? Where does it come from? Actually, there's nothing singularly revolutionary inside the codec. It is the latest in a long line of codecs that do frame differencing, motion estimation, DCT transform, entropy coding, and so on. It's got the same family resemblance to all those codecs I talked about before. So this is not a huge new departure in video coding.

What is new is that many of the features that were coupled in previous codecs have been decoupled, so there's a lot more orthogonality in the codec, and I'll explore one of those in a moment. It's got all the best ideas from the previous standards and a whole load more ideas too. This is really a big, rich codec in terms of features and choices and technologies.

Now, as usual, the standard tells you what is the bitstream syntax and what do you have to do to decode that bitstream, and it says absolutely nothing about how you encode it. That's entirely your problem. So, given the fact that there's a lot of technology in there, and this liberal approach to what you can do to encode, this is a standard with legs in terms of being able to incrementally improve over the years. If you were looking at the sessions yesterday, you saw the graph that showed MPEG-2 improving, improving, improving over the ten years since its introduction, and we're expecting to see the same kind of improvement curve, if not better, in H.264.

So here's an example of where features have been decoupled. Traditionally, if you were doing prediction, if you were doing a P-frame, you had I/P-structured video. The P's predicted backwards from the previous P or I frame, right? Well, AVC allows you a stack of references, some in the past, some in the future.

You don't have to worry about which way they go. And likewise, if you had a B-frame, you would have one in the past and one in the future. And again, it's decoupled. And your dependencies here: your P-frame depended on your previous I or P, and your B-frame depended on your bracketing I and P. It was a very straightforward, simple structure.

Well, in AVC, P-frames have a single dependency, and B-frames have two. And who knows where they point? So the simple way it used to be ends up with diagrams like that: you have I-frames, which are decodable independently, and P-frames, which depend on the previous I or P. And the B-frames then decode from the bracketing I and P.

So you get this very nice, simple, regular structure. Well, in H.264/AVC-- yes, you have I-frames. But the P-frames can choose which way they point and what they're pointing at. They don't have to be to the adjacent one, and they don't have to be backwards. You'll notice that this P-frame over here is actually skipping the I-frame that's temporally preceding it to predict from something further away.

I have no idea why you would do that, but you can do that. And the B-frames, likewise, can go all over the place. Here's a classic B-frame predicting from its bracketing I and P. But this B-frame, hey, it's predicting from a B-frame. You didn't used to be allowed to do that. This B-frame, for some unknown reason, is predicting backwards from two things into the future. So you'll notice also that this here really is a synchronization point in this video. This I-frame here is not, because there's a frame after it which depends on something before it.
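
As a rough sketch of that decoupling (an illustration of the idea only, not the bitstream syntax or any real API), you can think of every coded frame as carrying its own list of references, which may point at any previously decoded frame, past or future in display order:

struct Frame {
    enum Kind { case i, p, b }
    let kind: Kind
    let displayIndex: Int
    let references: [Int]   // display indices of the frames it predicts from
}

// Hypothetical sequence: one B predicts from another B, and the last P
// skips the nearer I-frame, as in the examples above.
let sequence: [Frame] = [
    Frame(kind: .i, displayIndex: 0, references: []),
    Frame(kind: .b, displayIndex: 1, references: [0, 2]),   // classic bracketing B
    Frame(kind: .p, displayIndex: 2, references: [0]),
    Frame(kind: .b, displayIndex: 3, references: [1, 2]),   // a B predicting from a B
    Frame(kind: .i, displayIndex: 4, references: []),
    Frame(kind: .p, displayIndex: 5, references: [2]),      // skips the adjacent I-frame
]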

So this whole concept of what's an I-frame and what's a synchronization point, that's been decoupled as well. So you can see we have a lot of fun with the systems layer and coping with this kind of thing. And you'll learn a lot about that later in the week when we talk about video support.

Other high-level features: things like the way the codec is set up, the parameters you need to know to decode a frame or to decode a sequence. They used to be embedded in the stream in codecs up till now. Well, they've been taken out of stream now and they're provided in setup information. So now, finally, we've got the ability to actually flip between parameter sets while we're running and know that we gave that to you during setup.

There are a lot of error resilience and adaptability features in there: flexible macroblock ordering, arbitrary slice ordering. These are tools that allow you to code the frame in funny orders so that when you're doing error resilience, if you've lost information, you can do better concealment. So expect to see better concealment down the road in a few years.

You can have redundant picture coding. You can say, "Okay, here's this picture coded depending on the previous one, but if you didn't get the previous one, I have another copy of this picture depending on something else." So if your server and your client are in tight communication, sometimes you can recover from loss there by supplying a different version of the same picture.

So you can do that. You can also do that with stream switching support. You can supply a picture that says, "Ah, I know you were running the 100 kilobit stream, but you seem to be having a lot of trouble with that. Why don't I switch down to the 70 kilobit stream, and as it happens, I don't have to wait for a sync point. I have a frame here that predicts from the 100 kilobit stream into the 70 kilobit stream." So we can actually switch at non-sync points.

And in fact, you can use these switching pictures to do trick modes within the same stream. You can have switching pictures that predict backwards, so that if you're doing a rewind, you can skip back down the stream, down the switching pictures, and show the things in the past.

There's a better fit into the systems layers. In the past, video coding people haven't really worried about what they were going to fit into. Well, this time they did. The timing, for instance, which is typically embedded in stream for a video codec can now finally be out of stream. So, it's in your QuickTime movie file in the timing tables there or in your transport stream or in the RTP packets. And they defined not only a bit-structured syntax but also a packet-structured syntax for those of us who work on packet networks.

So, as I say, it comes from these two organizations, MPEG and the ITU, and it really is a family successor in both families, so it really meets up at the end here. And it was designed to get roughly a 50% bitrate gain over these. Obviously there are questions of exactly which profiles and levels and so on, and what kind of content. So what MPEG-2 might do in a megabit, we take about half a megabit to do.

And in I-frame-only mode, it actually compares pretty well to classic JPEG. So if you really need an I-frame-only coder, you could possibly use it, though JPEG, of course, still works really well. It's not a still-frame coder, of course. JPEG-2000 is a modern still-frame coder, which has a much more complex arithmetic coder, is wavelet-transform based, and so on. So if you really want a still-frame coder, this is not your baby.

Profile structure. As you know, standards bodies like to define lots of features, and then they come along and they try to define profiles that they believe the majority of users will fit into. So there are three profiles here. Profiles are typically onion skins in our standards. Well, in this case, we failed. This is not an onion skin diagram.

The baseline profile, this blue one here, contains all the core coding tools, obviously, and the ability to do IP sequences, but no bidirectional prediction. And a lot of the error resilience tools are in there. So that's in your baseline profile. That's what's been adopted, for instance, into cell phones. Then there's an extended profile, which adds the ability to do B-frames and the stream switching and some other tools as well.

This one is what you might expect to use, for instance, in full-on streaming applications, where you've got the error resilience and the stream switching. Then there's a main profile, which doesn't include the error resilience and stream switching, but does include a really heavy-duty arithmetic coding system, which really kicks in at higher bit rates and higher frame rates and so on. So that's the profile you might expect to use for HD applications or for standard definition applications. This is the profile that people coming from the MPEG-2 world tend to be looking at.

Boy, do we have levels. As you know, these standards are also divided into levels. Profiles tell you what technologies you are choosing. Levels tell you, well, how much complexity can you have. And we have plenty of levels. They have levels that are designed to give you roughly 30 frames a second at everything from sub-QCIF at the tiny end, all the way up through, you know, VGA and XGA and 16VGA is in there somewhere, and all the way up to 4K by 2K. So they really looked at the whole industry and said, okay, we can do a fine-grain division of this into levels. So there's plenty of levels to choose from, and I'm sure there are more coming.

What's going on in the standards body down the road? There actually are extensions underway. This codec is not completely done. They're working on fidelity range extensions, they call them, where we can do deeper pixels. Currently, it's an 8-bit codec. They're looking at 10-bit. We can do less subsampling of the chroma. If there's anybody in here who understands that, you can know that they're actually working on things that give you much better chroma fidelity for those professional applications. And they're also looking at alpha plane support in those upcoming extensions.

What's happening in the industry is that we're seeing a lot of companies, I mean a lot of companies, implementing this. The chair already has a list of more than 60 companies who have said publicly to him that they are working on this codec. That's a huge number of companies, right? And they span the entire gamut of hardware, software, and different fields of application. So there's a terrific amount of interest behind this codec.

An enormous amount actually compared to previous codecs at this stage in its development, only just after the publication.

[Transcript missing]

There's also an MPEG-2 byte stream packing, so those of you who work in the MPEG-2 field, cable and so on that will need to use MPEG-2 transport or whatever, that's also fully defined.

Licensing. So I know all of you get worried about licensing. This is an improvement over MPEG-4 Part 2 licensing. There are UCs, but they're limited and they're well-defined, much better defined, I think, than in the MPEG-4 Part 2 licensing. And there's less counting of things that you've not already counted. So it looks more like the MPEG-2 licensing than the MPEG-4 Part 2 license, which I know was confusing to a lot of people.

So for example, as a manufacturer, we tend to cover encoder and decoder fees. Can you use the technology that we ship you? Yes, you can. We've paid the fees for you to allow you to do that. So if you're a software developer, stop worrying. We don't think there's much to worry about there. If you're a content developer, yes, there are some cases where there are additional licenses that are payable. But they're well-defined, and I think you'll be able to work it out pretty easily. I'm not going to go through this in detail.

But in general, if you have questions about this, read the license agreement we gave you. I know that's something that goes completely against the grain for those of us in the computer industry to actually read a license agreement. But it might answer a question. Some of us actually read it before we publish them. So we try to make it accurate for you. If you're still in doubt, feel free to talk to us. Go to the sites of these two licensing organizations that are each issuing licenses for this.

Have a look at their licenses and their frequently asked questions. Your question may well be answered there. And if you need further clarification, feel free to contact us or them. But our basic message for you is, come on in, the water's lovely. It's not a jacuzzi. It's not going to be that comfortable, but it's not the Arctic either. This is survivable licensing water for those of you in the content business.

So, adoption in other standards arenas. What's going on here? So, the ITU-T has already published a video conferencing standard that includes this codec, and indeed the RTP payload format that I just mentioned, which covers the simple version of it. 3GPP for 3G cell phones: release 6 is almost done, and it's in there as the next-generation video codec for 3GPP. DVD Forum, HD DVD, we've already talked about. Greg's going to give you more information about that in a moment.

It's under final consideration at DVB, the European body, and ATSC, the Advanced Television Something Committee, in the US for television purposes. Japanese broadcasters, a consortium of Japanese broadcasters, have already said, "Yes, we're happy with the technology, and we're happy with the licensing. We think we can use this codec." So they're going ahead. And it's on track at ISMA for the next generation of interoperable Internet streaming specs. So with that, I'd like to pass it over to Greg Wallace, who's going to cover for you what's going on with H.264, advanced video coding, and the HD DVD business.

[Transcript missing]

Okay, so let's generically talk about this as HD optical disc. And to really show you where H.264 fits in there and what its competitors and prospects are, I need to tell you a little bit about these two different optical disc camps, which some of you have probably heard of, and I wouldn't be surprised if many of you have felt confused from time to time because the terms and the technologies are really confusing. So there's actually two optical disc camps, high-def optical disc camps. There's one called Blu-ray, and there is the DVD Forum, a separate organization, which is what brought you today's DVD. And they, interestingly enough, have chosen the term HD DVD.

Both of these camps are planning, or at least claiming and hoping, that they will launch, in other words have a consumer launch, well in time for Christmas 2005. Launch meaning that you'll be able to go to the Good Guys or Fry's, your favorite store, and actually buy an HD DVD player, or maybe it would be called a Blu-ray HD player, and also be able to buy HD movies on discs that will be the same size and more or less look just like today's DVDs, but they'll be much higher capacity in some cases, and they can play on these players. Part of the reason there's uncertainty about this is that Hollywood studios are still a little ambivalent about this launch.

If you're here from a Hollywood studio, raise your hand. Okay, good, I'll speak freely. Hollywood Studios like to earn as much money as they can, as all good companies do. And what they want to do is sell you their movies all over again, right? So one reason they're a little bit ambivalent about HD DVD is because they're not 100% sure that the time is right yet and that there's enough penetration of HD sets to really motivate everyone to buy their entire movie collection all over again. So that uncertainty is a major reason that the roadmap and which camp is going to win is a little bit uncertain.

I believe they will come around. There will be a certain momentum that gathers, just like with the original DVD, but it kind of makes life interesting. Also for that reason, you'll increasingly hear as you read press articles that both of these camps are increasingly pitching this as not just about high definition, but also about improved copy protection.

So to talk about the Blu-ray camp a little bit, you can think of Blu-ray as a really big bit bucket. They've decided to pursue an aggressive, at least relatively aggressive, blue laser technology. Blue laser basically means it's got a smaller or shorter wavelength than a red laser. Red lasers are what's used today in DVD players. And the blue laser can focus on a smaller pit, can resolve a smaller pit. So that, coupled with basically etching and molding technologies which can put smaller pits on the disc, is what gives the higher capacity.

Blu-ray, in fact, is 25 gigabytes on a single layer and 50 gigabytes on a dual layer. So that's compared to today's DVD, which is 4.7 and 8.5, single and dual. So it's pushing something like six times the capacity of today's DVD. Because they have such a big bit bucket, they are not particularly motivated to go to an advanced codec, to go to H.264. So at the moment, at least, they're planning on using MPEG-2 high definition only, the same MPEG-2 that's used for today's HD broadcast in the U.S. and also in Japan.

So this requires about 15 or 20 megabits per second to encode an HD signal at really pristine quality. And if you do the math and work out the capacity it takes at that data rate to store a two- to three-hour movie, you'll see that it can fit on this disc pretty easily.
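
(To make that arithmetic concrete with the numbers above: at roughly 18 megabits per second, a two-and-a-half-hour film is about 18 Mbit/s × 9,000 s = 162,000 Mbit, or a little over 20 gigabytes, which fits comfortably on a 25 GB single-layer or 50 GB dual-layer Blu-ray disc.)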

And there's not a lot of motivation, at least not on Hollywood's part, to say, "Oh, well, you know, can't use a higher compression rate and put two or three movies on a disc." Because they don't really want to sell you two or three movies on one disc. They want to sell you one movie at a time.

So one of the biggest challenges for Blu-ray is, because it's a relatively aggressive technology, there are questions about manufacturability. Can you replicate this at an affordable price? And basically it comes down to: are the yields coming off the line going to be high enough, soon enough, to be competitive with other technology? So HD DVD, the DVD Forum's HD optical disc technology and specification, is remarkably flexible, which is good and bad in comparison. It can place an HD DVD data set on either a red laser disc, the same kind we have today, or a blue laser disc.

The blue laser disc that they're developing uses a different blue laser technology from what the Blu-ray group is developing. You'll sometimes hear the name AOD, which stands for Advanced Optical Disc; that's just the name or acronym they've picked for their blue laser technology. And you can see it's not quite as aggressive in terms of capacity as Blu-ray. It's 15 or 30 gigabytes on a single or dual layer disc.

So, particularly if you want to use red laser and you want to consider putting a full-length movie on it, you absolutely have to have an advanced codec like H.264/AVC. This makes for an especially interesting technology for computer manufacturers because they already have a big installed base of red laser burners. As a matter of fact, dual-layer red laser burners are coming within the next year from the laser writer manufacturers.

With an 8.5 gigabyte dual-layer disc and using H.264 at around 8 to 10 megabits per second, you can actually put an entire two-hour feature-length film on a red laser disc. So the HD DVD players, the consumer electronics player boxes that will come out when HD DVD launches, will be able to play an HD DVD data set off of either a red laser or a blue laser disc.
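
(Concretely: at 9 Mbit/s, a two-hour film is about 9 Mbit/s × 7,200 s = 64,800 Mbit, roughly 8.1 gigabytes, which just fits on the 8.5 GB dual-layer red laser disc mentioned above.)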

So, to add to the flexibility, HD DVD has tremendous codec flexibility. It's actually going to be a requirement for the HD DVD players to be able to play back MPEG-2 Hi-Def, H.264, or Windows Media 9 video. These will all be mandatory for the player manufacturer. Now, if you're making publishing tools or encoders, it doesn't mean, of course, that you have to be able to encode in all three. You can encode in any one of them, but the players will have to play back all three.

As I was mentioning earlier, so H.264 will give you about the same quality as MPEG-2 at half or even a little less than half the data rate. So for 24p material in particular, depending on the complexity of the scene and whether it's 1280x720 HD format versus 1920x1080, you'll be able to go as low as 6 megabits per second and really get superb, pristine quality.

Just a little bit about the other codec that some of you have heard of here and the SMPTE process that you might have heard. So SMPTE is another standards committee. They have a long history of standardizing both television and film standards, largely within the U.S., but internationally as well.

And there's a technical committee of SMPTE called C24, which does compression standards. And SMPTE has agreed to Microsoft's proposal to, quote, "standardize" the Windows Media 9 video codec. The name that you may have heard in the press in the past year or so has been VC-9, but just as of the last meeting in Milwaukee a couple of weeks ago, that is in the process of being reexamined. So the name may be something entirely different, maybe VC-1 or some other acronym.

Just so you know, this standard is currently at the committee draft level. Our friends up north love to publicize that this is virtually done and out the door, but as a matter of fact, there are at least three more major hurdles to clear. The other thing that our friends love to publicize sometimes is that this VC-9 codec, or VC-1, whatever it will be called, will be virtually free. Well, that's not really quite accurate.

In fact, there's a licensing pool forming in MPEG-LA, which is one of the licensing entities, as Dave said, for H.264. It's also the licensing entity for MPEG-2. And they are, in fact, in the process of identifying key patents and forming a licensing pool for Microsoft's Windows Media 9 video codec, because it's very difficult these days to make an advanced codec and not inadvertently or intentionally, either way, use someone else's patented technology.

So I personally will be very surprised if the licensing terms end up being a lot different for VC-1 than they are for H.264. So with that, I'm going to turn it over to Hsi-Jung Wu, and he is the man who is doing the H.264 codec at Apple. Thank you very much.

Thanks. Hi. Cell phones are off, right? I don't know if that was-- that's cool. Hi, I have the distinct privilege of working with a bunch of cool codec guys. And I think some of them are here incognito. So that's a pleasure for me. So have you all seen all the demos that have been up yesterday, today? That kind of stuff? A little bit, no? So let's cut this out. Let's cut to the chase. No, just kidding. Demo one, please. And I'm going to-- That will, um, just for fun. Can we dim the lights at all? This, uh, this is for Amy, by the way. This is Amy's favorite trailer. Here we go. Turn to page 394.

The Masked Man Prison. He's a murderer. Sirius Black is the reason the Potters are dead. And now he wants to finish what he started. I want you to swear to me you won't go looking for Black. Why would I go looking for someone who wants to kill me? There's something moving out there. It was a Dementor, one of the guards of Azkaban, is searching the train for Sirius Black. It is not in the nature of a Dementor to be forgiving.

[Transcript missing]

I'm going to try to show as many trailers as I can in the full volume because they cut them off all the time. Oh, I've got to stop this. Quick time, oops. Can we go back to one, I think? OK, so one of these-- Here we go.

So I've got a bunch of slides. I'm going to try to run through them as fast as I can and get to the demos. But I'll give you a little bit of what people talk about H.264. And I'll go through just three slides of key technologies. And I'll just go through them briefly. And then we'll get to some of the demos and stuff.

Got about 40 minutes? Excellent. So let's do this. So H.264 is-- It really is a huge improvement over what we've seen before. I've been making these trailers for a while now, and I've turned into a trailer junkie, because I've never used to watch these things, but they're so beautiful when you actually do this.

And when you see this on a HD screen, like one of our 23-inch guys, or 30-inch now, I guess, it's just amazing when you look at it, just totally crisp and much better. And they say this, they say it's the same quality as MPEG-4 Simple at half the bitrate.

I think it is, and I'll show you something later that might not prove the point, but something like that. The way the compression works is nicer. As you start cranking up the compression, the image turns softer, and I'll show you some of that, too, as well. So I like it a lot. One of the things that Amy likes to say is this thing is scalable.

We use this on 3G material as well as the HD ones. The Harry Potter sequence that you just saw right now was a 1280 by, well, it's kind of widescreen, so probably like 600 or something like that. That's a smaller HD version, and that thing did it at 6 megabits. So that's pretty impressive.

And it can do that kind of stuff. And I'll show you stuff at 3G as well. And it handles that really well. And like Dave was saying earlier, this codec really is, and it's like a web of little things that just got better. There is new technology in there, but a lot of it is just like, it's more of like an evolution than it is, a brand new this, brand new that.

But the thing is, when you stack it all together, it does magic, I guess. And the big thing about that is, again, it's at the beginning of the curve. We've just sort of gotten to know this thing. And we're just going to get to know it better. And you can expect to see quality improvements, speed improvements, all that stuff coming up.

Three quick slides, promise, of technologies. So there is like a list of things. One of the things that codecs do is transform coding. The big thing about this transform coder is that it's integer based, which means that the reconstructions are bit exact, which means if I'm a decoder implementer and he's a decoder implementer, our decoders will be able to exactly reconstruct whatever it is that's been encoded.

Previously, in old-school kinds of coding, the precision to which you reconstruct your transform is up to the implementer, except there's a specification on the tolerance, how precise you have to be, that kind of stuff. Also, it's a 4 by 4, which means that, what did I say, small support reduces blocking and ringing artifacts. It's true. But basically, it means anything that you add within the 4 by 4 doesn't propagate to the other 4 by 4s. So you get crisper pictures.
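
For the curious, here is a minimal sketch of the 4 by 4 integer transform he's describing. The matrix below is the standard H.264 core forward transform; the scaling and quantization stage that a real encoder applies afterwards is left out:

// Core 4x4 forward transform of H.264: Y = Cf * X * Cf^T, integer arithmetic only,
// which is why every conforming decoder can reconstruct bit-exactly.
let Cf: [[Int]] = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]

func multiply(_ a: [[Int]], _ b: [[Int]]) -> [[Int]] {
    (0..<4).map { i in (0..<4).map { j in (0..<4).reduce(0) { $0 + a[i][$1] * b[$1][j] } } }
}

func transpose(_ m: [[Int]]) -> [[Int]] {
    (0..<4).map { j in (0..<4).map { i in m[i][j] } }
}

// Transform one 4x4 block of residual samples (scaling/quantization omitted).
func forwardTransform4x4(_ block: [[Int]]) -> [[Int]] {
    multiply(multiply(Cf, block), transpose(Cf))
}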

Improved intra prediction. Most people think of prediction in the time domain. Well, codecs since MPEG-1 have had prediction within a single image. But the thing about this guy is there's just a lot of ways to predict within an image. And that really helps the intra prediction. And what you might not know, I mean, all this compresses details better, gradients, blah, blah, blah, but it actually helps in high-motion areas as well. Because what happens when the motion fails? You throw in intra prediction, or intra blocks, and then the prediction really helps you code that thing.

A couple of things on motion. The motion block sizes, you've got a ton of them now. In MPEG-1, you've got this thing called the macroblock, right? And the macroblocks still exist. They're 16 by 16 blocks that tile the image. And typically what happens is, well, back then, you get to move that 16 by 16 block around the previous reference image to look for the best match. Well, in H.264, you've got 16 by 16 blocks, and you've got 16 by 8 blocks, all the way down to 4 by 4 blocks. You've got a lot of ways to search for previous stuff, and it's just very expressive.

And it really, really helps code some of the more complicated motion. Complicated motion doesn't mean a lot of motion. This isn't necessarily complicated, but a lot of stuff going in different ways, and a lot of details going on. So another thing about the motion estimation is this quarter-pel precision.

Quarter-pixel interpolation, quarter-pixel precision, I guess is what I said.

It's been around, but this one is actually very good. Basically what that means is every time you go sub-pel, you do a filtering, you filter your pixels. When you go quarter-pel, you get another filter in there. And you get a lot of good stuff. And it really provides some of this crisp stuff that I've been talking about. And you'll see that in demos as well.
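
As a sketch of what that filtering looks like (these are the standard H.264 luma interpolation rules, reduced to one dimension; a real decoder applies them over a 2-D neighbourhood):

// Half-sample positions use the six-tap filter (1, -5, 20, 20, -5, 1);
// quarter-sample positions are then averages of the two nearest
// full- and half-sample values.
func halfPel(_ p: [Int], at i: Int) -> Int {
    // Value halfway between p[i] and p[i+1]; caller keeps i-2...i+3 in range.
    let v = p[i-2] - 5 * p[i-1] + 20 * p[i] + 20 * p[i+1] - 5 * p[i+2] + p[i+3]
    return min(255, max(0, (v + 16) >> 5))   // round, divide by 32, clip to 8 bits
}

func quarterPel(_ p: [Int], at i: Int) -> Int {
    // Value a quarter of the way from p[i] towards p[i+1].
    (p[i] + halfPel(p, at: i) + 1) >> 1
}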

And then my final slide: it's got a loop filter that does deblocking on the 4 by 4 block boundaries. So you've heard of post filters and stuff like that, right? The distinction between this and, say, a previous codec is that the filter actually sits inside of the encoding loop. Generically speaking, encoders have a forward encoding path, and then they've got a little decoder in their belly, I guess, to give you the reconstructed frames. But this filter sits inside there.

So the encoder runs this filter as well as the decoder and helps you with the prediction. It gets rid of some of the blocky artifacts and helps you with the motion estimation. And because it's done in 4x4s, basically it's got the capability of touching every single pixel. And it's a very, very effective smoothing filter. And you'll see that too.

There's a bunch more things. So one more thing I'll talk about is the entropy coding. This context-adaptive this, context-adaptive that. And it's actually this fact that it's context-adaptive that is helping us a lot. And we see quite a bit of gain just because of this stuff. I mean, the first one is what you might know as Huffman. It's very close to Huffman, table-driven kind of stuff. The second one, the arithmetic coding, that's been around for a while, too. But the fact that they worked out the context stuff was brilliant, I guess.

and more. So if you end up stacking all this stuff together, you get what we get at the end. And you'll see that the categories have been around for a long time, and David is absolutely correct. This is just yet another one in the evolution of stuff. But when you put them all together,

[Transcript missing]

If you find the right path through this search space, you can get some pretty amazing quality.

What did I say? OK, so now we can go and drop this thing. Can we go to demo one, I think? Thanks. Let's see another one, since we've got a little bit of time. This one you might have seen this morning. You guys should have been there this morning. But if you haven't, here it is. It's never before seen, right? Something like that.

What's interesting about this one is it was just like a barrage of images and scenes and half-second shots. I'll show you a side-by-side here. Let me put him away so that you don't get hosed. And we can see this is Will Smith in stereo. Oh wait, I got to do the play all, don't I? So I'll do it again.

Homicide. Spooner. Identify. Detective. Wow. Richest man in the world. Can I offer you a coffee? Sure, why not? I don't think anyone saw this coming, so whatever I can do to help. Sugar. I'm sorry. For the coffee. Sugar? Ah. Oh, you thought I was calling you sugar. Hey, you're not that rich.

So that's a vintage Will Smith. I should probably stop it in a better place. That's all right. So one of them is 264. Let me stop him in a-- there. So you can see this. The one on your left is H.264. It says so. And they're about comparable quality, I would say. But the big deal is that MPEG-4 is running at-- I think 1100 kilobits, which is what it takes. 480 by whatever it is, 240, something like that. 1100 kilobits, 1.1 megabits. And H.264 is running at 550 kilobits, exactly half the rate.

So that ain't too bad, huh? So if you want to post beautiful stuff, this is one way to do it. So the other way to look at this, of course, I think you've seen this before, is this is what you could do then at the same bitrate. My robots don't kill people. That thing threw somebody out of a window. Is that registering with you? You're not suggesting? So if you crank the bitrate up to 1100 kilobits, you can actually send this in H.264.

This is 960 across by 520, something like that. So basically four times the size and the same bitrate. So there you go. Maybe I can show you something else as well. I can actually quit that. Can I ask a question? Yeah, why not? I might not answer it, though.

I'll answer that later. How's that? Okay. Any more questions? I'll just answer them later. All right. So with 26 minutes to go, let me show you a little bit more about what H.264 can do. A little bit about the artifacts. So this is a-- here. This is like the logo of somebody's fire truck.

It's nice, but it's got very hard edges and it's got these weird, very detailed areas. So if you remember that. So this is what you get with H.264 at a megabit of this. Oh, by the way, this is 640 by something. 640 by 400, 480. Actually, 640 by 480 exactly. Maybe not. Something like that. But this is H.264 running at what you would expect for standard-def sizes.

And you get this, and it's decent, right? And what I want to show you is what it does as it goes farther and farther down. At 500, you start to lose some of this stuff in here. I don't know if you can tell. Up there, you may not be able to tell.

But it looks pretty good at 500. And the edges are fairly crisp, right? If you really want to jerk this codec around, you might want to try 150 kilobits, right? Don't do this at home. You don't want to do standard def at 150 kilobits. But if you did, what you'll notice is that none of the edges go away. Isn't that pretty cool? Yeah. But you get smoothness through the details.

You know, I've been looking at codecs for a while. This one impresses me. And I can say that publicly, I think. Okay, fire truck, oh yeah. The scalability demo, I've got one for you. And I want to do it just to show you that it doesn't take a G5 to do this, so we've got a dual G5 in the back. This is somebody's stock iBook. I think it's a gigahertz. In fact, it is a gigahertz. And I want to show you a couple of things. We've seen Phantom on here, so we'll just show you Phantom again.

[Transcript missing]

What you were just seeing there is a megabit, broadband megabit type, standard def size phantom running on this G4 machine over here, gigahertz thing. And like I said, the codec actually, you can use it at all these various bit rates and sizes, and it doesn't fail, which is yet another amazing thing for me. I've got one at DSL rates. I call it DSL. It's about 300 kilobits. This would be what you would call--

[Transcript missing]

Doesn't it look good? It looks pretty decent. So, and if you wanted to see, for example, our 3G version.

The audio is a lot different. Our video hangs together, so I'm... Like I said, I'm... I'm pretty darn impressed with what can happen. Let's see. All right. So one of the things that we'll be shipping with our seed, or you'll find it in your seed, and you all got one, right? The tiger seed? Go get one if you don't have one. Is the ability to export one of these things. What we've done is we've taken all the complicated stuff and we've pulled it down to one. Can you switch screens? Oh, yeah, thank you. So I'm on demo one now.

to one little thing. So this is maybe a three-second standard depth size clip. Let me get this going here. What you can do with your seed to get it home, and we really encourage you to try this out. And this is why we did this. We went through and put this together.

You'll see in your list of exporters, maybe I should do this so that you guys can see what's going on here. You go into export, you get this little dialog up. And in the list of exporters, what you'll see is this thing called Apple H.264/AVC Preview Movie. Just remember it's a preview. And you hit options and you get one.

And it's: what do you want? So otherwise, you can go crazy with all the knobs, and we've figured out what to do with that kind of stuff. So let us deal with that, and you deal with this. So this is a standard-def movie, so maybe you want it at about 800 kilobits. So that's what you do. And I want to put it on my desktop, so I go to the desktop and I just save it. Cool. So it's going to pump for a while. But we want you to try this.

We're pretty proud of what it can do. And I just want you to get out there and get your content through this thing and see what kind of stuff we can do. And we're just in the middle of this, and this is sort of a little snapshot of what we've been doing.

And I just want you to play with it and I'd love to hear feedback from you guys about what we can do. So, it's done. No, it's not done. Okay. Do you want to leave it on that? Yeah, that's fine. It's done. That should have been about 50 seconds, by the way. So, in case you want to know how fast it is.

And sure enough, look at this thing. All three seconds of it. Nice. And let's see. How big is it? Would it say 800? Let's see. 9 times 8 is, what is that? 720? Am I doing my math right? Times 4, it's about 800 kilobits, so it actually works out. So the way we did our exporter is it does multi-passing automatically.

It chooses how many times it wants to go through it. And it picks, it hits the bit rate within 5%. So if you don't see a 5%, call me. No, don't call me. Call Amy. And give her a hard time, okay? So, and by the way, all the trailers that you've been seeing, we've been making them like this. So, you know, just so you know what kind of quality you can expect out of this stuff.

So. Um, yeah, I, I, that's it. Um, so let's go back to, um, Do I need Tiger to play them? Yes. Yes, you do. And I got a point in my-- you guys are all jumping. Yeah, that's point number two. You'll see that. So let me just go back over what I just said.

So we've got a decoder in the TigerSeed. And we've got a video-only exporter in the TigerSeed. It's integrated into QuickTime, as you would expect. And then the multi-pass business, it does that on its own. The UI is very simple, so you guys don't get confused. But it works.

And it requires a G4 or G5, both the decoder and the encoder. And one thing you should know, the bit streams are keyed to the tiger seed, which means don't go and re-encode your archive and hope that it will play later because it won't. So yes, I don't know who asked that question, but yes, it's tiger only. So, okay.
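
The Tiger-seed exporter being demoed here is long gone, so purely for readers who want to try the same thing today, here is a minimal modern-equivalent sketch using AVFoundation (not what was shown in the session; the file paths are placeholders, and the preset stands in for the manual data-rate setting in the demo):

import AVFoundation

// Placeholders: point these at a real source clip and a writable destination.
let input  = URL(fileURLWithPath: "/path/to/source.mov")
let output = URL(fileURLWithPath: "/path/to/h264-export.mov")

let asset = AVURLAsset(url: input)
guard let export = AVAssetExportSession(asset: asset,
                                        presetName: AVAssetExportPreset1280x720) else {
    fatalError("Could not create export session")
}
export.outputURL = output
export.outputFileType = .mov

// The export runs asynchronously; a command-line tool has to keep the
// process alive until it finishes, so wait on a semaphore here.
let done = DispatchSemaphore(value: 0)
export.exportAsynchronously {
    print("Export finished with status \(export.status.rawValue)")
    done.signal()
}
done.wait()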

What if we do all the questions at the end? And we've got just a few more minutes, so what I'll do is I'll end with my favorite disturbing movie, and then we can go into a question session. Oh, and can we go to demo one, please? Ten days ago, one of my satellites over in Antarctica discovered a pyramid. What exactly on the ice is this? It's not on the ice. It's 2,000 feet under it.

Let's make history. Oh, my God. Whoever built this pyramid believed in ritual sacrifice. Did you hear that? This room was called? Sacrificial chamber. It's like the best. So try it out. This is the codec to go to. I mean, this quality is-- I just haven't seen this stuff, this kind of stuff in a while. So I think we're done.