WWDC04 • Session 734

H.264/AVC: Exceptional Video from 3G to HD

QuickTime • 58:08

H.264/AVC is the next-generation video codec that provides incredible video quality at a broad range of data rates. View this session to learn all about this ratified standard, including its history, technologies, and applications in the marketplace. See why you'll want to use H.264/AVC in your multimedia projects and find out why everyone is talking about this incredible video standard.

Speakers: Amy Fazio, David Singer, Greg Wallace, Hsi-Jung Wu

Unlisted on Apple Developer site

Transcript

This transcript was generated using Whisper and may contain transcription errors.

So that's me. They call me the ecosystem manager. I do sort of industry relations for the engineering group, which includes standards. And it meant going to some of these meetings. So I'm going to give you a background on the 264 codec, where it came from, roughly what it does, and so on. And then I'm going to pass it over to real engineers to tell you things that are actually interesting.

So first of all, I'm going to cover a few of these acronyms. By now, you've probably heard quite a few of them, so let's go through them very quickly. ITU-T, one of the two big standards bodies that brought you the codec, the other one, of course, being MPEG over there. The Video Coding Experts Group, VCEG, is the group in ITU-T that's responsible for video coding. And the team that they formed jointly between ITU-T and MPEG was called the Joint Video Team, a really imaginative name there. The codec has "advanced video coding" in its title in both the standards that are published, the H.264 standard from the ITU and MPEG-4 Part 10 from MPEG. And MSG is a food additive commonly used in the food that gets served at late-night standards meetings.

Some history for you. Way back in the mists of time, some Olympian gods defined codecs like H.261 and H.262, and prizes if you know what H.262 is also known as. Then, back in '93, the ITU actually started two projects, H.26P and H.26L. H.26P became H.263, the codec we know and love, which has been evolving ever since. Then back in '97, the ITU formed the Video Coding Experts Group, actually under the chairmanship of the person who now chairs the Joint Video Team. And they issued a call for proposals back in '98, so we're still quite a ways back. And they actually adopted a baseline technology in '99. They called it TML1. Now, I don't remember what TML stands for. A little later, MPEG issued a call for proposals for a new video coding technology. And rather surprisingly, rather than having a corporation be the top response, the top response was from the ITU. And so they worked for a bit on the politics of forming a joint video team, and they actually started the Joint Video Team. At this point, things got pretty hot and heavy, and the meetings got pretty frequent. And you can see the gap here from forming the JVT to the technical freeze was extremely short.

It went to Final Draft International Standard and consent back in 2003, which means basically the standard's done and the final draft was issued. So you can see that it's been a long gestation but a very intense finish to this. These are the two largest standards bodies in this area, in the video coding area, ITU and ISO. There were perhaps 300, more than 300 people involved directly in the writing of this standard, in terms of writing contributions, going to meetings, writing emails, providing reference software, and so on. So a terrific effort from video coding people. There were meetings that had 150 to 200-plus people in the room frantically arguing about details of the codec. And they came from all walks of life in terms of their expertise. There were hardware people in there screaming, no, no, you can't define it that way, I'll never implement that in hardware cheaply. There were software people who wanted to do everything.

There were video telephony people who were interested in low delay, of course. DVD people, people who do multimedia authoring, streaming, general purpose multimedia companies, people from the cable and television industries, people who were doing equipment manufacturing for professionals and consumers and so on. And there were specialists in there, people who only understood things like entropy coding or transform coding, and general video coding people.

And the meetings ran often until 2 a.m., 3 a.m. I remember one that finished at 3 a.m. So this was, as I say, very intense with a very large group of people. So this is probably the most argued over, fought over, tested, verified, whatever video codec you've ever seen.

Part of the reason for that is its broadness of field of application. It's taking over, or it's intended to take over, in broadcast support from codecs like MPEG-2; in stored content, again typically a space occupied by MPEG-2. But it's also targeted to cope with the conversational, low-delay applications, video telephony and video conferencing, traditionally occupied by H.263. It's also looking at video on demand, the streaming market traditionally occupied by a variety of codecs including MPEG-4. And indeed, multimedia messaging and new applications like that, which are, in some sense, nascent applications already, but they are currently occupied by H.263. So you can see a very broad range of applications that this codec's targeted for.

But what is it? What is this revolutionary new codec? Where does it come from? Actually, there's nothing singularly revolutionary inside the codec. It is the latest in a long line of codecs that do frame differencing, motion estimation, DCT transform, entropy coding, and so on. It's got the same family resemblance to all those codecs I talked about before. So this is not a huge new departure in video coding.

What is new is that many of the features that were coupled in previous codecs have been decoupled, so there's a lot more orthogonality in the codec, and I'll explore one of those in a moment. It's got all the best ideas from the previous standards, and a whole load more ideas too. This is really a big, rich codec in terms of features and choices and technologies. Now, as usual, the standard tells you what is the bitstream syntax, and what do you have to do to decode that bitstream, and it says absolutely nothing about how you encode it. That's entirely your problem. So given the fact that there's a lot of technology in there and this liberal approach to what you can do to encode, this is a standard with legs in terms of being able to incrementally improve over the years. If you were looking at the sessions yesterday you saw the graph that showed MPEG-2 improving, improving, improving over the 10 years since its introduction and we're expecting to see the same kind of improvement curve, if not better, in H.264.

So here's an example of where features have been decoupled. Traditionally, if you were doing prediction, if you were doing a P-frame, you had IP structured video. The P's predicted backwards from the previous P or I frame, right? Well, AVC allows you a stack of references, some in the past, some in the future. You don't have to worry about which way they go. And likewise, if you had a B-frame, you would have one in the past and one in the future.

And again, it's decoupled. And your dependence is here: your P-frame depended on your previous I or P, and your B-frame depended on your bracketing I or P. It was a very straightforward, simple structure. Well, in AVC, P-frames have a single dependency, and B-frames have two, and who knows where they point? So the simple way it used to be ends up with diagrams like that. You have I-frames, which are decodable independently, and P-frames, which depend on the previous I or P, and the B-frames then decode from the bracketing I or P. So you get this very nice, simple, regular structure. Well, in H.264 AVC, yes, you have I-frames, ah, but the P-frames can choose which way they point and what they're pointing at. They don't have to point to the adjacent one, and they don't have to point backwards. You'll notice that this P-frame over here is actually skipping the I-frame that's temporally preceding it to predict from something further away. I have no idea why you would do that, but you can do that. And the B-frames likewise can go all over the place. Here's a classic B-frame predicting from its bracketing I and P, but this B-frame, hey, it's predicting from a B-frame. You didn't used to be allowed to do that. And this B-frame, for some unknown reason, is predicting backwards from two things in the future. Yeah? So you'll notice also that this frame here really is a synchronization point in this video. This I-frame here is not, because there's a frame after it which depends on something before it. So this whole concept of what's an I-frame and what's a synchronization point, that's been decoupled as well. So you can see we have a lot of fun with the systems layer in coping with this kind of thing. You'll hear a lot about that later in the week when we talk about video support.
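The decoupling described here can be sketched in a few lines of purely illustrative code: model each frame with an arbitrary list of reference frames, as AVC allows, and check which I-frames are genuine synchronization points. The frame names and dependency pattern below are hypothetical, not taken from the slide's diagram.

```python
# Illustrative sketch only, not a codec: in AVC, any frame may reference
# any earlier-decoded frame, so being an I-frame no longer implies being
# a random-access (sync) point.

frames = {
    # name: list of frames this frame predicts from ([] means I-frame)
    "I0": [],
    "P1": ["I0"],
    "B2": ["I0", "P1"],
    "I3": [],
    "P4": ["P1"],        # legal in AVC: skips past the I-frame I3
    "B5": ["I3", "P4"],
}

order = list(frames)  # display order, kept simple for the sketch

def is_sync_point(name):
    """An I-frame is only a sync point if no frame at or after it
    predicts from anything before it."""
    if frames[name]:
        return False  # predicted frames are never sync points
    idx = order.index(name)
    for f in order[idx:]:
        if any(order.index(ref) < idx for ref in frames[f]):
            return False
    return True

for name in order:
    if not frames[name]:
        print(name, "is a sync point:", is_sync_point(name))
```

Here I0 comes out as a real sync point, while I3 does not, because P4 reaches back past it to P1, which is exactly the decoupling the talk describes.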

Other high-level features: things like the way the codec is set up, the parameters you need to know to decode a frame or to decode a sequence, used to be embedded in the stream in codecs up until now. Well, they've been taken out of stream now, and they're provided in setup information. So now, finally, we've got the ability to actually flip between parameter sets while we're running and know that we gave that to you during setup. There's a lot of error resilience and adaptability features in there: flexible macroblock ordering, arbitrary slice ordering. These are tools that allow you to code the frame in funny orders so that when you're doing error resilience, if you've lost information, you can do better concealment. So expect to see better concealment down the road in a few years. You can have redundant picture coding. You can say, okay, here's this picture coded depending on the previous one, but if you didn't get the previous one, I have another copy of this picture depending on something else. So if your server and your client are in tight communication, sometimes you can recover from loss there by supplying a different version of the same picture. You've got stream switching support. You can supply a picture that says, ah, I know you were running the 100 kilobit stream, but you seem to be having a lot of trouble with that. Why don't I switch down to the 70 kilobit stream, and as it happens, I don't have to wait for a sync point: I have a frame here that predicts from the 100 kilobit stream into the 70 kilobit stream. So we can actually switch at non-sync points. And in fact, you can use these switching pictures to do trick modes within the same stream. You can have switching pictures that predict backwards, so that if you're doing a rewind, you can skip back down the switching pictures and show the things in the past.

There's a better fit into the systems layers. In the past, video coding people haven't really worried about what they were going to fit into. Well, this time they did. The timing for instance, which is typically embedded in stream for a video codec can now finally be out of stream. So, it's in your QuickTime movie file in the timing tables there or in your transport stream or in the RTP packets. And they defined not only a bit structure syntax but also a packet structure syntax for those of us who work on packet networks.

So as I say, it comes from these two organizations, MPEG and the ITU. And it really is a family successor in both families. So it really meets up at the end here. And it was designed to get a 50% bit rate gain, roughly, over these. Obviously, there are questions of exactly which profiles and levels and so on, and what kind of content. So what MPEG-2 might do in a megabit, we take about half a megabit to do. And in I-frame-only mode, it actually compares pretty well to classic JPEG. So if you really need an I-frame-only coder, you could possibly use this one, though JPEG of course still works really well. It's not a still-frame coder, of course. JPEG 2000 is a modern still-frame coder which has a much more complex arithmetic coder and is wavelet-transform-based and so on. So if you really want a still-frame coder, this is not your baby.

Profile structure, as you know, standards bodies like to define lots of features and then they come along and they try to define profiles that they believe the majority of users will fit into. So there are three profiles here. Profiles are typically onion skins in our standards. Well, in this case, we failed. This is not an onion skin diagram. The baseline profile, this blue one here, contains all the core coding tools, obviously, and the ability to do IP sequences, but no bidirectional prediction. And a lot of the error resilience tools are in there.

So that's in your baseline profile. That's what's been adopted, for instance, into cell phones. Then there's an extended profile which adds the ability to do B-frames and the stream switching and some other tools as well. This one is what you might expect to use, for instance, in full-on streaming applications where you've got the error resilience and the stream switching. Then there's a main profile which doesn't include the error resilience and stream switching but does include a really heavy-duty arithmetic coding system, which really kicks in at higher bit rates and higher frame rates and so on. So that's the profile you might expect to use for HD applications or for standard-definition applications. This is the profile that people coming from the MPEG-2 world tend to be looking at.

Boy, do we have levels. As you know, these standards are also divided into levels. Profiles tell you what technologies you are choosing. Levels tell you, well, how much complexity can you have. And we have plenty of levels. They have levels designed to give you roughly 30 frames a second at everything from sub-QCIF at the tiny end all the way up through, you know, VGA and XGA and 16VGA is in there somewhere, and all the way up to 4K by 2K. So they really looked at the whole industry and said, okay, we can do a fine-grained division of this into levels. So there's plenty of levels to choose from, and I'm sure there are more coming.

What's going on in the standards body down the road? There actually are extensions underway. This codec is not completely done. They're working on fidelity range extensions, as they call them, where we can do deeper pixels. Currently it's an 8-bit codec; they're looking at 10-bit. We can do less subsampling of the chroma. If there's anybody in here who understands that, you can know that they're actually working on things that give you much better chroma fidelity for those professional applications. And they're also looking at alpha plane support in the upcoming extensions.

What's happening in the industry is that a lot of companies are implementing this. The chair already has a list of more than 60 companies who have said publicly to him that they are working on this codec. That's a huge number of companies, right? And they span the entire gamut of hardware, software, and different fields of application. So there's a terrific amount of interest behind this codec, an enormous amount actually compared to previous codecs at this stage in their development, only just after publication.

How does the system support look? Obviously, you can't do a video codec in isolation. You have to fit it into a system. Most people want audio with their video and so on. So in the ISO committee, we actually did go through the iteration of defining a file format for this codec. It's fully specified. It's actually Part 15 of the same MPEG-4 standard. So there's another part that Frank can fill in on his little scatter diagram. And it allows you to do simple things. It's an ISO-family file. It's a QuickTime-family file, if you like. It looks like a QuickTime file.

And the simple things look simple in it. So if you want to do just straightforward video, man, you can just lay down a sample description and your video and you're done. But there's also extended tools in there to help you track dependency, help you do stream switching, help you do dynamic parameter set update, and so on. So again, there's new tools in the file format for those wanting to do complicated things. But for those wanting to do simple things, you've got the tools today. And at the IETF, there's a payload format that's, I believe, about to enter last call, so it's nearly done, which defines how to stream this codec over RTP for those of you who want to do real, true streaming. It's got lots of modes, lots of choices. It's the longest RTP payload format I have ever seen; it takes a lot of reading. It has three modes. There's a simple mode where you just basically put one piece of coded video information into a single packet and send it out. There's an aggregated mode where you can stick multiple pieces together. So those of you running at low bit rates, you can stick multiple things into the same packet. And then there's a full-on interleaved mode that allows you to do amazingly complicated things, and we have a suspicion that in a few years' time we might work out how to use that. There's also an MPEG-2 byte-stream packing, so for those of you who work in the MPEG-2 field, cable and so on, where you need to use MPEG-2 transport or whatever, that's also fully defined.

Licensing. So I know all of you get worried about licensing. This is an improvement over the MPEG-4 Part 2 licensing. There are use fees, but they're limited and they're well defined, much better defined, I think, than in the MPEG-4 Part 2 licensing. And there's less counting of things that you've not already counted.

So it looks more like the MPEG-2 licensing than the MPEG-4 Part 2 license, which I know was confusing to a lot of people. So, for example, as a manufacturer we pay the encoder and decoder fees. Can you use the technology that we ship you? Yes, you can. We've paid the fees to allow you to do that. So if you're a software developer, stop worrying; we don't think there's much to worry about there.

If you're a content developer, yes there are some cases where there are additional licenses that are payable, but they're well defined and I think you'll be able to work it out pretty easily. I'm not going to go through this in detail, but in general, if you have questions about this, read the license agreement we gave you. I know that's something that goes completely against the grain for those of us in the computer industry to actually read a license agreement, but it might answer a question. Some of us actually read it before we publish them, yeah? So we try to make it accurate for you. If you're still in doubt, feel free to talk to us, go to the sites of these two licensing organizations that are each issuing licenses for this. Have a look at their licenses and their frequently asked questions. Your question may well be answered there, yeah. And if you need further clarification feel free to contact us or them, yeah. But our basic message for you is, come on in, the water's lovely, it's not a jacuzzi, you know, it's not going to be that comfortable, but it's not the Arctic either, this is survivable licensing water for those of you in the content business. Thank you.

So, adoption in other standards arenas, what's going on here? The ITU-T has already published a video conferencing standard that includes this codec, and indeed the RTP payload format that I just talked about, the simple version of it. 3GPP, for 3G cell phones: release six is almost done, and it's in there as the next-generation video codec for 3GPP. DVD Forum, HD DVD, we've already talked about; Greg's going to give you more information about that in a moment. It's under final consideration at DVB, the European body, and ATSC, the Advanced Television Systems Committee, in the US, for television purposes.

Japanese broadcasters, a consortium of Japanese broadcasters already said yes, we're happy with the technology and we're happy with the licensing. We think we can use this codec. So they're going ahead. And it's on track at ISMA for the next generation of interoperable internet streaming specs. So with that, I'd like to pass it over to Greg Wallace, who's going to cover for you what's going on with H.264, advanced video coding, and the HD DVD business.

Thanks, Dave. It's really exciting to be here and talk a little bit about one aspect of H.264 and a very important application area. I'll just mention, by the way, that many, many years ago, I won't even say how many years ago, I was chairman of the original JPEG committee, and we thought the JPEG standard was pretty complex at the time. And then we thought MPEG-1 and MPEG-2 were really complex, but it's really impressive to see this collaboration that's taken place of the world's video codec experts creating something so sophisticated as this H.264 video standard. I'm here for this little segment kind of to be a single-issue candidate and just tell you a little bit about one important application area, and that's HD DVD. I've spent about the last almost seven years now kind of living and breathing professional authoring and encoding tools for DVD and tracking what's going on in the HD optical disc technology area. And so I think it's going to be something really important to consumers and professionals alike in the coming few years. So I thought it deserved a few slides just on that topic.

Okay, so let's generically talk about this as HD optical disk. And to really show you where H.264 fits in there and what its competitors and prospects are, I need to tell you a little bit about these two different optical disk camps, which some of you have probably heard of, and I wouldn't be surprised if many of you have felt confused from time to time because the terms and the technologies are really confusing. So there's actually two optical disc camps, high def optical disc camps. There's one called Blu-ray and there is the DVD Forum, a separate organization, which is what brought you today's DVD. And they, interestingly enough, have chosen the term HD DVD.

Both of these camps are planning, or at least claiming and hoping, that they will launch, in other words have a consumer launch, well in time for Christmas 2005. Launch meaning that you'll be able to go to the Good Guys or Fry's, your favorite store, and actually buy an HD DVD player, or maybe it would be called a Blu-ray HD player, and also be able to buy HD movies on discs that will be the same size and more or less look just like today's DVDs, but they'll be much higher capacity in some cases, and they can play on these players. Part of the reason there's uncertainty about this is that the Hollywood studios are still a little ambivalent about this launch.

If you're here from a Hollywood studio, raise your hand. Okay, good, I'll speak freely. Hollywood studios like to earn as much money as they can, as all good companies do. And what they want to do is sell you their movies all over again, right? So one reason they're a little bit ambivalent about HD DVD is because they're not 100% sure that the time is right yet and that there's enough penetration of HD sets to really motivate everyone to buy their entire movie collection all over again. So that uncertainty is a major reason that the road map and which camp is going to win is a little bit uncertain.

I believe they will come around. There will be a certain momentum that gathers, just like with the original DVD, but it kind of makes life interesting. Also for that reason, you'll increasingly hear, as you read press articles, that both of these camps are increasingly pitching this as not just about high definition, but also about improved copy protection.

So to talk about the Blu-ray method a little bit: you can think of Blu-ray as a really big bit bucket. They've decided to pursue an aggressive, at least relatively aggressive, blue laser technology. Blue laser basically means it's got a smaller, or shorter, wavelength than a red laser. Red lasers are what's used today in DVD players. And the blue laser can focus on a smaller pit, can resolve a smaller pit. So that, coupled with basically etching and molding technologies which can put smaller pits on the disc, is what gives the higher capacity.

Blu-ray in fact is 25 gigabytes on a single layer and 50 gigabytes on a dual layer. So that's compared to today's DVD, which is 4.7 and 8.5 single and dual. So it's something like pushing six times the capacity of today's DVD. Because they have such a big bit bucket, they are not particularly motivated to go to an advanced codec, to go to H.264. So at the moment, at least, they're planning on using MPEG-2 high definition only, the same MPEG-2 that's used for today's HD broadcast in the U.S. and also in Japan. This requires about 15 or 20 megabits per second to encode an HD signal at really pristine quality. And if you do the math and work out the capacity it takes at that data rate to store a two-to-three-hour movie, you'll see that it can fit on this disc pretty easily. And there's not a lot of motivation, at least not on Hollywood's part, to say, oh, well, you know, can't we use a higher compression rate and put two or three movies on a disc, because they don't really want to sell you two or three movies on one disc. They want to sell you one movie at a time.
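The capacity claim is easy to check with back-of-the-envelope arithmetic. This sketch uses decimal gigabytes and counts the video track only, ignoring audio and subtitle overhead:

```python
def movie_size_gb(mbps, hours):
    """Size of a constant-bit-rate video stream in decimal gigabytes."""
    bits = mbps * 1_000_000 * hours * 3600  # megabits/s -> bits
    return bits / 8 / 1e9                   # bits -> bytes -> GB

# HD MPEG-2 at the rates quoted in the talk:
print(movie_size_gb(20, 2.5))  # 22.5 GB: fits a 25 GB single-layer Blu-ray disc
print(movie_size_gb(15, 3.0))  # 20.25 GB: also fits
```

So even at the high end of the quoted range, a long feature fits on a single layer, which is exactly why the Blu-ray camp feels little pressure to adopt a more efficient codec.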

So one of the biggest challenges for Blu-ray is because it's a relatively aggressive technology, there are questions about manufacturability. Can you replicate this at an affordable price? And basically it comes down to are the yields coming off the line going to be high enough, soon enough to be competitive with the other technology?

So HD DVD, the DVD Forum's HD optical disc technology and specification, is remarkably flexible, which is good and bad in comparison. It can place an HD DVD data set on either a red laser disc, the same kind we have today, or a blue laser disc. The blue laser disc that they're developing is a different blue laser technology from what the Blu-ray group is developing. And you'll sometimes hear this name, AOD, which stands for Advanced Optical Disc. That's just the name or acronym they've picked for their blue laser technology. And you can see it's not quite as aggressive in terms of capacity as Blu-ray: 15 or 30 gigabytes on a single or dual layer disc.

So particularly if you want to use red laser and you want to consider putting a full-length movie on it, you absolutely have to have an advanced codec like H.264. So this makes for an especially interesting technology, I think, for computer manufacturers, because computer manufacturers already have a big installed base of red laser burners. And as a matter of fact, dual-layer red laser burners are coming within the next year from the drive manufacturers. And so with an 8.5 gigabyte dual-layer disc and using H.264 at around 8 to 10 megabits per second, you can actually put an entire two-hour feature-length film on a red laser disc. So the HD DVD players, the consumer electronics player boxes that will come out when HD DVD launches, will be able to play an HD DVD data set off of either a red laser or a blue laser disc.
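Rough arithmetic bears out the red-laser claim too. A sketch, again in decimal gigabytes with video only; the 8.5 GB capacity and 8-10 Mbps rates are the ones quoted in the talk:

```python
def max_hours(capacity_gb, mbps):
    """Hours of constant-bit-rate video that fit in capacity_gb (decimal GB)."""
    return capacity_gb * 1e9 * 8 / (mbps * 1_000_000) / 3600

print(round(max_hours(8.5, 8), 2))   # 2.36 hours: a two-hour feature fits at 8 Mbps
print(round(max_hours(8.5, 10), 2))  # 1.89 hours: at 10 Mbps, two hours is a squeeze
```

In other words, the claim holds at the low end of the quoted bit-rate range, with real discs needing a bit of headroom for audio and navigation data on top.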

So to add to the flexibility, HD DVD has tremendous codec flexibility. It's actually going to be a requirement for the HD DVD players to be able to play back MPEG-2 high definition, H.264, or Windows Media 9 video. These will all be mandatory for the player manufacturer. Now, if you're making publishing tools or encoders, it doesn't mean, of course, that you have to be able to encode in all three. You can encode in any one of them, but the players will have to play back all three.

As I was mentioning earlier, so H.264 will give you about the same quality as MPEG-2 at half or even a little less than half the data rate. So for 24p material in particular, depending on the complexity of the scene and whether it's 1280x720 HD format versus 1920x1080, you'll be able to go as low as 6 megabits per second and really get superb, pristine quality.

Just a little bit about the other codec that some of you have heard of here and the SMPTE process that you might have heard. So SMPTE is another standards committee. They have a long history of standardizing both television and film standards, largely within the U.S., but internationally as well. And there's a technical committee of SMPTE called C24, which does compression standards.

And they have agreed. SMPTE has agreed to Microsoft's proposal to, quote, standardize the Windows Media 9 video codec. The name that you may have heard in the press in the past year or so has been VC-9, but just as of the last meeting in Milwaukee a couple of weeks ago, that is in the process of being reexamined. So the name may be something entirely different, maybe VC-1 or some other acronym.

Just so you know, this standard is currently at the committee draft level. But our friends up north love to publicize that this is virtually done and out the door. As a matter of fact, there are at least three more major hurdles to get through. The other thing that our friends love to publicize sometimes is that this VC-9 codec, or VC-1, whatever it will be called, will be virtually free. Well, that's not really quite accurate. In fact, there's a licensing pool forming in MPEG LA, which, as Dave said, is one of the licensing entities for H.264 and is also the licensing entity for MPEG-2. And they are in fact in the process of identifying key patents and forming a licensing pool for Microsoft's Windows Media 9 video codec, because it's very difficult these days to make an advanced codec and not, inadvertently or intentionally, either way, use someone else's patented technology. So I personally will be very surprised if the licensing terms wind up being a lot different for VC-1 than they are for H.264. So with that, I'm going to turn it over to Hsi-Jung Wu, and he is the man who is doing the H.264 codec at Apple. Thank you very much.

Thanks. Hi, cell phones are off, right? That's cool. I have the distinct privilege of working with a bunch of cool codec guys, and I think some of them are here incognito. So that's a pleasure for me. So have you all seen all the demos that have been up yesterday and today, that kind of stuff? A little bit? No? So let's cut to the chase. Demo one, please, just for fun. Can we dim the lights at all? This is for Amy, by the way. This is Amy's favorite trailer. Here we go.

  • Turn to page 394.

  • He's a murderer.
  • Sirius Black is the reason the Potters are dead.
  • And now he wants to finish what he started.
  • I want you to swear to me you won't go looking for Black.
  • Why would I go looking for someone who wants to kill me? There's something moving out there.
  • It was a Dementor, one of the guards of Azkaban, is searching the train for Sirius Black.
  • It is not in the nature of a Dementor to be forgiving.

  • Expecto Patronum!

That was pretty, wasn't it? I'm going to try to show as many trailers as I can in full, because I know they cut them off all the time. Oh, I've got to stop this. QuickTime, oops.

Can we go back to one, I think? OK, so one of these -- here we go. So I've got a bunch of slides. I'm going to try to run through them as fast as I can and get to the demos. But I'll give you a little bit of what people are saying about 264. And I'll go through just three slides of key technologies, and I'll go through them briefly.

And then we'll get to some of the demos and stuff. Got about 40 minutes? Excellent. So let's do this. So 264 really is a huge improvement over what we've seen before. I've been making these trailers for a while now, and I've turned into a trailer junkie, because I never used to watch these things, but they're so beautiful when you actually do this.

When you see this on an HD screen, like one of our 23-inch guys, or 30-inch now, I guess, it's just amazing when you look at it, just totally crisp and much better. And they say it's the same quality as MPEG-4 Simple at half the bitrate. I think it is, and I'll show you something later on that point. The way the compression works is nicer, too: as you start cranking up the compression, the image turns softer, and I'll show you some of that as well. So I like it a lot.

One of the things Amy likes to say is that this thing is scalable. We use it on 3G material as well as HD. The Harry Potter sequence you saw just now was 1280 by -- well, it's kind of widescreen, so probably like 600 or something like that. That's a smaller HD version, and it did that at 6 megabits. So that's pretty impressive, and it can do that kind of stuff. And I'll show you stuff at 3G as well, and it handles that really well. And like Dave was saying earlier, this codec really is like a web of little things that just got better. There is new technology in there.

But a lot of it is just like, it's more like an evolution than it is a brand new this, brand new that. But the thing is when you stack it all together, it does magic, I guess. And the big thing about that is, again, it's at the beginning of the curve. We've just sort of gotten to know this thing. And we're just going to get to know it better. And you can expect to see quality improvements, speed improvements, all that stuff coming up.

Three quick slides, promise, of technologies. So there's a list of things. One of the things this codec does is transform coding. The big thing about this transform coder is that it's integer based, which means the reconstructions are bit exact: if I'm a decoder implementer and he's a decoder implementer, our decoders will exactly reconstruct whatever it is that's been encoded. Previously, in old-school coding, the precision to which you reconstruct your transform was up to the implementer, except there was a specification on the tolerance, how precise you have to be, that kind of stuff.
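As a rough illustration of why an integer transform gives bit-exact decoding, here's a small Python sketch using H.264's actual 4x4 forward transform matrix. The exact-rational inverse scaling is a simplification for clarity; the real codec folds that scaling into quantization.

```python
from fractions import Fraction

# H.264's 4x4 forward core transform matrix (integer entries only).
Cf = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def forward(block):
    # Y = Cf * X * Cf^T -- pure integer arithmetic, so every
    # implementation computes bit-identical coefficients.
    return matmul(matmul(Cf, block), transpose(Cf))

def inverse(coeffs):
    # Rows of Cf are orthogonal with squared norms diag(4, 10, 4, 10),
    # so X = Cf^T * (Y[i][j] / (d[i]*d[j])) * Cf recovers the block
    # exactly. (Real H.264 folds this scaling into quantization; exact
    # rationals are used here just to show the round trip is lossless.)
    d = [4, 10, 4, 10]
    scaled = [[Fraction(coeffs[i][j], d[i] * d[j]) for j in range(4)]
              for i in range(4)]
    out = matmul(matmul(transpose(Cf), scaled), Cf)
    return [[int(v) for v in row] for row in out]

block = [[52, 55, 61, 66],
         [70, 61, 64, 73],
         [63, 59, 55, 90],
         [67, 61, 68, 104]]
assert inverse(forward(block)) == block  # bit-exact reconstruction
```

Because both directions use only integers (and exact rationals), any two implementations that follow the same arithmetic agree on every pixel, which is the point the talk is making.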

Also, it's a 4x4 transform, which means -- what did I say? -- small support reduces blocking and ringing artifacts. That's true. But basically it means anything that goes bad within one 4x4 doesn't propagate to the other 4x4s. So you get crisper pictures. Improved intra prediction: most people think of prediction in the time domain, but codecs since MPEG-1 have had prediction within a single image.

But the thing about this guy is there are just a lot of ways to predict within an image. And that really helps the intra prediction. And what you might not know -- I mean, all these things compress details better, gradients, blah, blah, blah -- is that it actually helps in high-motion areas as well. Because what happens when the motion fails? Well, you throw in intra prediction, or intra blocks, and then the prediction really helps you code that thing.
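A sketch of what "predicting within an image" means, using three of H.264's nine 4x4 intra modes. The `top` and `left` neighbor arrays here are hypothetical reconstructed pixels, not values from the talk.

```python
import numpy as np

# `top` is the row of 4 reconstructed pixels above the block,
# `left` the column of 4 pixels to its left.
def predict_vertical(top, left):
    return np.tile(top, (4, 1))                 # copy the row above downward

def predict_horizontal(top, left):
    return np.tile(left.reshape(4, 1), (1, 4))  # copy the left column across

def predict_dc(top, left):
    dc = (top.sum() + left.sum() + 4) // 8      # rounded mean of all 8 neighbors
    return np.full((4, 4), dc)

top = np.array([100, 102, 104, 106])
left = np.array([98, 99, 100, 101])
print(predict_dc(top, left)[0, 0])  # -> 101
```

An encoder tries each mode, codes only the residual (block minus prediction), and signals whichever mode makes that residual cheapest, which is why having many modes "really helps."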

A couple of things on motion. The motion block sizes -- you've got a ton of them now. In MPEG-1 you've got this thing called the macroblock, right? And the macroblock still exists: there are 16x16 blocks that tile the image. And typically what happens is, well, back then, you got to move that 16x16 block around the previous reference image to look for the best match. Well, in 264 you've got 16x16 blocks and you've got 16x8 blocks all the way down to 4x4 blocks. We've got a lot of ways to search for previous stuff, and it's just very expressive. And it really, really helps code some of the more complicated motion. Complicated motion doesn't mean a lot of motion; it's not necessarily fast, but a lot of stuff going in different ways and a lot of details going on.

So another thing about the motion estimation is the quarter-pixel interpolation -- quarter-pixel precision, I guess, is what I said. It's been around, but this one is actually very good. Basically what that means is every time you go sub-pel, you filter your pixels. When you go quarter-pel, you get another filter in there, and you get a lot of good stuff, and it really provides some of this crisp stuff that I've been talking about. You'll see that in the demos as well.

And then my final slide: it's got a loop filter that does deblocking on the 4x4 boundaries. So you've heard of post filters and stuff like that, right? The distinction between this and, say, a previous codec is that the filter actually sits inside of the encoding loop. Generically speaking, encoders have a forward encoding path, and then they've got a little decoder in their belly, I guess, to give you the reconstructed frames. Well, this filter sits inside there. So the encoder runs this filter as well as the decoder, and it helps you with the prediction. It gets rid of some of the blocky artifacts and helps you with the motion estimation. And because it's done in 4x4s, it's got the capability of touching every single pixel, and it's a very, very effective smoothing filter. You'll see that too.
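The quarter-pel idea can be sketched in one dimension. The 6-tap (1, -5, 20, 20, -5, 1)/32 half-pel filter is the one H.264 specifies for luma; the sample values in `row` are made up for illustration.

```python
# 1-D sketch of H.264 luma sub-pixel interpolation: half-pel samples
# come from a 6-tap filter, quarter-pel samples are averages of the
# two nearest full/half-pel samples.
def half_pel(row, i):
    # half-pel value between row[i] and row[i + 1]
    acc = (row[i - 2] - 5 * row[i - 1] + 20 * row[i] +
           20 * row[i + 1] - 5 * row[i + 2] + row[i + 3])
    return min(255, max(0, (acc + 16) >> 5))   # round and clip to 8 bits

def quarter_pel(row, i):
    # quarter-pel value between row[i] and the half-pel sample to its right
    return (row[i] + half_pel(row, i) + 1) >> 1

row = [10, 12, 20, 40, 80, 120, 140, 150]     # hypothetical full-pel strip
print(half_pel(row, 3), quarter_pel(row, 3))  # -> 58 49
```

Each extra level of sub-pel precision adds a filtering step, which is the "another filter in there" the talk mentions; motion vectors can then point at these interpolated positions.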

And there's a bunch more. So one more thing I'll talk about is the entropy coding: context adaptive this, context adaptive that. And it's actually this fact that it's context adaptive that's helping us a lot. We see quite a bit of gain just because of this stuff. I mean, the first one is what you might know as Huffman -- it's very close to Huffman, table-driven kind of stuff. The second one, arithmetic coding, that's been around for a while too.

But the fact that they worked out the context stuff was brilliant, I guess. And more. So if you end up stacking all this stuff together, you get what we get at the end. And you'll see that these technologies have been around for a long time. And Dave is absolutely correct: this is just yet another step in the evolution of stuff. But when you put them all together, it does good. And we're not kidding about the 1.8 times 10 to the 15 modes over there. You see that? Yeah. That was courtesy of Chris John, who did the math. So it's complex, but if you find the right path through this search space, you can get some pretty amazing quality.
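For a flavor of the bitstream side, here's a minimal sketch of the unsigned Exp-Golomb code H.264 uses for many syntax elements: write codeNum + 1 in binary, then prefix it with one fewer zero than its bit length. The context-adaptive machinery the talk praises (CAVLC/CABAC) layers on top of codes like this and isn't shown here.

```python
def ue_encode(v):
    bits = bin(v + 1)[2:]              # v + 1 in binary
    return "0" * (len(bits) - 1) + bits

def ue_decode(s):
    n = s.index("1")                   # count leading zeros
    return int(s[n:2 * n + 1], 2) - 1  # read n + 1 bits, subtract 1

for v in (0, 1, 4, 9):
    print(v, "->", ue_encode(v))
# 0 -> 1, 1 -> 010, 4 -> 00101, 9 -> 0001010
```

Small, frequent values get short codewords with no table lookup at all, which is part of why the bitstream layer can stay simple while the adaptive coders chase the remaining gains.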

What did I say? So now we can go and drop this thing. Can we go to demo one? Let's see another one, since we've got a little bit of time. This one you might have seen this morning. You guys should have been there this morning. But if you haven't, here it is. It's never before seen, right? Something like that.

What's interesting about this one is it was just like a barrage of images and scenes and half second shots. I'll show you a side by side here. Let me put them away so that you don't get hosed. And we can see this is Will Smith in stereo. Oh, wait. I got to do the play all, don't I? So I'll do it again.

Spooner. Identify. Detective. Wow. Richest man in the world. Can I offer you a coffee? Sure, why not? I don't think anyone saw this coming. So whatever I can do to help. Sugar. I'm sorry. For the coffee. Sugar? Ah. Oh, you thought I was calling you sugar. Hey, you're not that rich.

So that's vintage Will Smith. I should probably stop it in a better place. That's all right. So one of them is 264. Let me stop it in a -- there. So you can see this. The one on your left is 264. It says so. And they're about comparable quality, I would say. But the big deal is that MPEG-4 is running at, I think, 1100 kilobits, which is what it takes at 480 by whatever it is, 240, something like that. 1100 kilobits, 1.1 megabits. And 264 is running at 550 kilobits, exactly half the rate. So that ain't too bad, huh?

So if you want to post beautiful stuff, this is one way to do it. The other way to look at this, of course -- I think you've seen this before -- is what you could do at the same bitrate. My robots don't kill people. That thing threw somebody out of a window. Is that registering with you? You're not suggesting? So if you crank the bitrate back up to 1.1 megabits, you can actually send this in H.264. This is 960 across by 520, something like that. So basically four times the size at the same bitrate. So there you go. Maybe I can show you something else as well. I can actually quit that. Can I ask a question? Yeah, why not? I might not answer it though.

Is the encoder engine built into QuickTime? I'll answer that later. How's that? OK. OK. Let's see. Any more questions? I'll just answer them later. So with 26 minutes to go, let me show you a little bit more about what 264 can do. A little bit about the artifacts. So this is -- here. This is like the logo off somebody's fire truck.

It's nice, but it's got very hard edges, and it's got these weird, very detailed areas. So remember that. This is what you get with 264 at a megabit. Oh, by the way, this is 640 by something. 640 by 400? 480? Actually, 640 by 480 exactly. Maybe not.

Something like that. But this is 264 running at what you would expect for standard-def sizes. And you get this, and it's decent, right? And what I want to show you is what it does as it goes farther and farther down. At 500 kilobits, you start to lose some of this stuff in here. I don't know if you can tell -- up there, you may not be able to tell. But it looks pretty good at 500, and the edges are fairly crisp. If you really want to jerk this codec around, you might want to try 150 kilobits. Don't do this at home; you don't want to do standard def at 150 kilobits. But if you did, what you'll notice is that none of the edges go away. Isn't that pretty cool? Yeah. But you get smoothness through the details. I've been looking at codecs for a while. This one impresses me.

And I can say that publicly, I think. OK, fire truck, oh yeah. The scalability demo -- I've got one for you. And I want to do it just to show you that it doesn't take a G5 to do this, though we've got a dual G5 in the back. This is somebody's stock iBook. I think it's a gigahertz. In fact, it is a gigahertz. And I want to show you a couple of things. We've seen Phantom on here, so we'll just show you Phantom again, if it would open. Here we go. Oh, this -- thank you.

What you were just seeing there is a megabit, broadband-megabit-type, standard-def-size Phantom running on this G4 machine over here, the gigahertz thing. And like I said, you can use the codec at all these various bitrates and sizes and it doesn't fail, which is yet another amazing thing for me. I've got one at DSL rates -- I call it DSL; it's about 300 kilobits. This would be what you would call...

QVGA, I guess. It's widescreen, so what I've done is I've cropped-- I've scaled it so that the number of pixels is the same as QVGA. So it's a little wider than 320, but the number of pixels are the same. And again, you'll see-- Doesn't it look good? It looks pretty decent. If you wanted to see, for example, our 3G version... The audio is a lot different. Our video hangs together, so I'm... pretty darn impressed with what can happen. Let's see.

All right. So one of the things you'll find in your seed -- and you all got one, right? The Tiger seed? Go get one if you don't have one -- is the ability to export one of these things. So what we've done is we've taken all the complicated stuff and we've pulled it down to one... Can you switch screens? Oh, yeah, thank you. So I'm on demo one now.

to one little thing. So this is maybe a three-second standard-def-size clip. And let me get this going here. What you can do with your seed when you get it home -- and we really encourage you to try this out; this is why we did this, we went through and put this together. Maybe I should do this so that you guys can see what's going on here. You go to Export, and you get this little dialog up. And in the list of exporters, what you'll see is this thing called Apple 264 AVC Preview Movie. Just remember it's a preview. And you hit Options, and you get one.

And it asks: what do you want? Otherwise, you could go crazy with all the knobs, but we've figured out what to do with that kind of stuff. So let us deal with that, and you deal with this. So this is a standard-def movie, so maybe you want it at about 800 kilobits. So that's what you do. And I want to put it on my desktop, so I go to the desktop and I just save it. Cool. So it's going to pump for a while. But we want you to try this.

We're pretty proud of what it can do. And I just want you to get out there and get your content through this thing and see what kind of stuff we can do. And we're just in the middle of this. And this is sort of a little snapshot of what we've been doing. And I just want you to play with it.

I'd love to hear feedback from you guys about what we can do. So it's done. No, it's not done. Do you want to leave it on that? Yeah, that's fine. It's done. That should have been about 50 seconds, by the way, in case you want to know how fast it is.

And sure enough, look at this thing. All three seconds of it. Nice. And let's see, how big is it? Would it say 800? Let's see. 9 times 8 is -- so what is that? 720. Am I doing my math right? Times 4. It's about 800 kilobits. So it actually works out. The way we did our exporter, it does multi-passing automatically. It chooses how many times it wants to go through it, and it hits the bitrate within 5%. So if you don't see it within 5%, call me.
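The back-of-the-envelope check being done here, as a sketch; the file size is hypothetical, chosen to match a 3-second clip exported at the requested 800 kilobits:

```python
# Does the exported file's size match the requested bitrate?
def achieved_kbps(file_bytes, duration_s):
    return file_bytes * 8 / 1000 / duration_s   # bytes -> kilobits per second

size_bytes = 300_000   # ~300 KB on disk (hypothetical)
duration = 3.0         # seconds of video
rate = achieved_kbps(size_bytes, duration)
print(round(rate))     # -> 800

target = 800
assert abs(rate - target) / target <= 0.05  # within the 5% the exporter promises
```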

No, don't call me -- call Amy and give her a hard time. And by the way, all the trailers that you've been seeing, we've been making them like this, just so you know what kind of quality you can expect out of this stuff. Yeah, that's it. So let's go back to -- Do I need Tiger to play them? Yes. Yes, you do. And I've got a point in my -- you guys are all jumping ahead.

Yeah, that's point number two. You'll see that. So let me just go back over what I just said. We've got a decoder in the Tiger seed, and we've got a video-only exporter in the Tiger seed. It's integrated into QuickTime, as you would expect. And then the multi-pass business, it does that on its own. The UI is very simple, so you guys don't get confused. But it works. And it requires a G4 or G5, both the decoder and the encoder.
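A toy sketch of what a multi-pass rate controller does conceptually; the numbers are made up, and this is not Apple's actual algorithm -- just the general idea of measuring first and allocating bits second.

```python
# Pass 1: measure each frame's relative complexity (hypothetical costs).
complexities = [3.0, 1.0, 2.0, 6.0]

# Pass 2: split the total bit budget in proportion, so hard frames get
# more bits and easy frames fewer, while the overall rate stays on target.
budget_kbits = 2400                     # 3 s at 800 kbps
total = sum(complexities)
allocation = [budget_kbits * c / total for c in complexities]

print(allocation)  # per-frame budgets summing to 2400
```

A real encoder iterates this (hence "it chooses how many times it wants to go through it"), refining the estimates each pass until the output lands near the requested rate.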

And one thing you should know: the bit streams are keyed to the Tiger seed, which means don't go and re-encode your archive and hope that it'll play later, because it won't. So yes -- I don't know who asked the question -- but yes, it's Tiger only. So, OK, when we ship. I think that's safe to say, isn't it? OK, there she goes. You listen to her when she's -- What if we do all the questions at the end? We've got just a few more minutes. So what I'll do is I'll end with my favorite disturbing movie, and then we can go into a question session. Oh, and can we go to demo one, please?

Ten days ago, one of my satellites over Antarctica discovered a pyramid. What exactly on the ice is this? It's not on the ice. It's 2,000 feet under it. Let's make history. Oh my God. Whoever built this pyramid believed in ritual sacrifice. Did you hear that? What did you say this room was called? The sacrificial chamber. It's like the best. So try it out. This is the codec to go to. I mean, this quality -- I just haven't seen this kind of stuff in a while. So I think we're done.