
WWDC03 • Session 720

Content Creation for Mobile Phones: Top 10 Success Factors

QuickTime • 59:33

The mobile phone is quickly gaining popularity as a medium for capturing and playing multimedia of all kinds. What are the key factors for successfully creating wireless multimedia? This session outlines the essentials in creating quality mobile content using QuickTime-based products and tools. Learn how to tailor your production process to accommodate the wide range of network bandwidths, mobile phone capabilities, and distribution processes.

Speakers: Aliza Hutchison, Kay Johansson

Unlisted on Apple Developer site

Transcript

This transcript was generated using Whisper, it has known transcription errors. We are working on an improved version.

So, let's take this clicker and see how it works. As she said, I'm the CTO of Popwire Technology. We are based in Sweden. We work with encoding systems, so that's the capacity in which I'm here, and we work a lot with wireless. So, what are we going to talk about today? Well, the goal is to understand, as I said, all aspects of wireless content creation. And I don't think we will cover all aspects of it. We'll talk briefly about business models, how the radio networks behave, content creation, how the content workflow works for an operator, carrier, etc. And I'll also show you tools for making this kind of content.

So, how it all started. It started basically with SMS, the short message service. I know it's coming up now in the US, where you send text messages to each other. SMS was actually created as a diagnostic system for GSM networks. But in 1998, it started to grow a lot and you had this kind of SMS messaging, person to person.

It grew a lot in Europe and Asia. After that, we got something called Enhanced SMS, which is basically an SMS with like two- or four-bit graphics with it. In 2000, they actually released it earlier than that, but in 2000, something called i-mode was released by DoCoMo. It was basically like a multimedia SMS.

You had graphics, you had sound, you had animation, etc., and you sent it to each other. So that's really when it started to take off. Then in 2002, the so-called MMS was released, which stands for Multimedia Messaging. It's like an SMS, but you can have video, audio, text, animations, etc. It's pretty close to what i-mode is.

And now we're in the phase where video is coming in. And basically today you have services where you can download video, and it works for different kinds of networks. I will talk about that later on. And you also have streaming. Mainly today streaming is about live streams like traffic surveillance, etc. And also video telephony is coming on the 3G networks. The cameraman doesn't like me because I move around, so I'm going to stand here. So, a brief look at the business models. If you look at SMS, there are a lot of operator SMS services.

But most are third-party SMS services like, let's say, weather, news, that kind of thing. And it's based on revenue sharing. So the operator gets a piece of the SMS and the SMS supplier also gets a piece of it, exactly. It also works as a bit pipe. And when I say bit pipe, I mean that when I send an SMS from one person to another, the operator gets paid for sending the SMS. In this case, it doesn't count bits.

It's a unit price, because an SMS has a fixed amount of size. So that's why it's a unit price. Then we have MMS, and it's pretty much the same thing. You have operator services based on MMS. You have third-party MMS services, revenue sharing. And you also have the kind of bit pipe where I send an MMS from me to another person.

When we look at video download and streaming, I put a question mark because it has just started. And I see that a lot of operators, carriers, don't really know what kind of business model they're going to have. So we have operators which act as content creators. Not that they sit down and record like a football game, but they take content from the TV station and edit it. So they make their own content.

Some of them work with content service suppliers, like revenue sharing, exactly as you did with SMS. And some only work as bit pipes. Basically, you have the same kind of business models all the way. When we talk about video telephony, it's time-based. It's the same as making a call. So that's pretty easy. So this is the kind of business model.

I'm not going to get into too much talking about this because I'm a techie guy. What kind of networks do we have today? We have the 2G networks, such as GSM, which we have a lot in Europe and Asia, and CDMA and TDMA in North America. And you can have 9.6 kilobits per second when we talk about data transfer if you want to do a multimedia service. And as you all can understand, you can't really do that much. It's not like you can send a streamed video at 9.6.

We have the 2.5 generation networks, which go up to 30 kilobits per second today. But the theory is that GPRS can go up to 171, and you have HSCSD, hard for a Swede to say in English, which is up to 56, and you have EDGE, which some argue is 3G.

So it's sort of 2.5.5 or something like that, which can go up to 474. But that's in theory. In practice, when we talk about 2.5G, the networks today give you 30 kilobits per second. That's the fact. Then we have 3G, which today only has 64 kilobits per second. You can get 128 kilobits per second services, and I will talk more about how that works later on.

But in practice, it's 64 kilobits per second today. In theory, it will be able to go up to 2 megabits per second, both WCDMA and CDMA2000. So that is the kind of network we are talking about when we talk about wireless today. I'm mainly going to focus on 2.5G and 3G because that's where you can use video.

So a brief look at what kind of services you have per network, because the network is in some way steering what kind of service you can have. Of course, for 2G it's speech, and this is true for all of them. Whether it's 3G or 2.5G, speech is one of the major services for the operators. It's still where they make the most money.

You have SMS, as we talked about before, where you do this kind of communication, person to person, infotainment, etc. And ringtones are very popular, at least in Europe. I don't actually know how it is in the US, but in Europe it's very popular to download the latest song from your favourite artist and have that as a ringtone. On 2.5G, you have MMS, which is like an enhanced SMS service.

So pretty much the same kind of service which you would have on SMS, but they've added multimedia to it. And you have video download, sports news, etc. And usually when we talk about GPRS networks, which is 2.5G, and we talk about streaming, we talk about traffic. You can have traffic surveillance, you can check out how the traffic is on that road, the 101, etc.

It's small clips you can download or stream: news, sports, etc. So it's the same kind of services all the time. And you have more content, or richer content, that you can use, depending on the bandwidth. For 3G, it's the same thing. You have MMS, you have video downloading, you have streaming, the same kind of services. And on top of that, you have video telephony.

And in Sweden, for instance, we have an operator called 3, which actually is up and running and doing these kinds of services, such as video download, where you can download news, sports, entertainment. You can get videos from your pop artists, etc. So this is really up and running. So this is how the services sort of lay down on the networks.

So, a bit about how a radio network acts. I'm not going to talk about the structure of a radio network, like RNCs, etc. If you have questions about that, you can take them in the Q&A. But basically, a radio network is built out of cells. And if we talk about GPRS, we start with that because that's the easiest one. You get time slots.

So you get the time slot, and many people think like, "Okay, it's a time slot. How is that working?" In a GPRS network, every call has a very, very small time slot. So let's say you have 100 slots per cell. Of course, you have a lot more. But this is just so I can calculate in my head. So, one time slot when you enter the cell is pretty much, in theory, 12 kilobits per second. But you always calculate with 10 because you have a lot of overhead, etc.

So, usually when you come in, typically you get three time slots. That's how it starts. Then you move over to the next cell. But in that cell, you have a lot of other users. You might just get two slots in that cell. And this is one of the things that differentiates radio networks from the internet, for instance: you actually move within the network. You're not connected to the same access point all the time. You're handed over to new access points all the time.

So, it could also be that you stay in the cell and X amount of people come into your cell. And then, once again, you might just get two time slots. So, 3G, well, it's the same kind of thinking, but we talk about megabits per second instead. It's the same kind of thinking. You have X amount of bitrate in a cell. So this is pretty much how the radio network acts.

So, let's talk about delivery or distribution. If you look at streaming for GPRS, as we said before, one time slot is 12 kilobits per second, but we calculate with 10 usually. So, we're talking about very, very low bit rates like 20 to 30 kilobits per second, which in essence is two to three time slots. But one of the most important things about a GPRS network, or actually an EDGE network as well, is that the operators can prioritize the services.
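As a rough illustration of the time slot arithmetic above, here is a minimal Python sketch. The figure of 10 kilobits per second of usable bandwidth per slot comes from the talk; the function names are made up for this example and are not part of any real tool.

```python
# Planning figure from the talk: about 10 kbit/s usable per GPRS time slot
# (12 kbit/s in theory, minus overhead).
USABLE_KBPS_PER_SLOT = 10


def usable_bitrate_kbps(time_slots: int) -> int:
    """Approximate usable bitrate for a number of assigned time slots."""
    if time_slots < 0:
        raise ValueError("time_slots must not be negative")
    return time_slots * USABLE_KBPS_PER_SLOT


def clip_fits(clip_kbps: float, time_slots: int) -> bool:
    """True if a stream at clip_kbps fits into the assigned time slots."""
    return clip_kbps <= usable_bitrate_kbps(time_slots)


if __name__ == "__main__":
    for slots in (1, 2, 3):
        print(slots, "slot(s):", usable_bitrate_kbps(slots), "kbit/s")
    # A 30 kbit/s stream needs all three slots; it no longer fits
    # once the cell hands you only two.
    print(clip_fits(30, 3), clip_fits(30, 2))
```

The point of the sketch is only that a stream encoded for three slots stops fitting as soon as the cell gives you two, which is why the talk keeps coming back to very low bit rates.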

And I haven't met an operator or carrier so far that doesn't prioritize speech. If there are any operators here who do the opposite, they can raise their hands, but I don't believe that. So, that means that if there are X amount of people in the cell and they want to use ordinary speech, the people who have been assigned GPRS time slots will be prioritized down, so you will get like one time slot or two or something like that.

And it's also like this, that they can prioritize in a way that says when you come into a cell, you have three time slots, but after 30 seconds or one minute, then you get down to two time slots, because they want to give more time slots in the beginning when the user comes into a cell, because you usually do something for like 10, 20, 30 seconds to begin with, and then you want to have good access. Then, of course, you have network congestion, you have a lot of packet loss. Wireless networks, radio networks, it's like a... low-reliability internet.

You can have half packets coming through a radio network, which doesn't exist on the internet, for instance. And the other thing is latency. When you do streaming, you have the RTCP keep alive, and because of the latency, you might not get that keep alive back to the stream server.

So, depending on how often the stream server expects to get this RTCP callback, it will shut down after a while. So that's why it's important to do short clips when you talk about streaming on GPRS, because if you do a 30-second clip, nothing's going to happen, because it usually takes about a minute before the stream server expects to get the RTCP back.

[Transcript missing]

Then you have the same thing as on a GPRS network, network congestion, packet loss. It's much better than on a GPRS network. And still, most of the 3G networks are circuit switched, not fully like a packet-based network. So it's a packet bearer which runs on a circuit switched environment within the radio network. And the circuit switch is basically like point-to-point, like that's what you do when you have an internet, you know, point-to-point connections. So that's what circuit switch is.

And then you have X amount of bottlenecks in the infrastructure which you have to consider as well. For instance, on the internet you have companies like Akamai and those kinds of companies that place relay and cache servers out in the network. That doesn't exist on radio networks to date, but that's also something that is coming in the next releases. So today you have bottlenecks with... You can think of it as routers as well which you have to take into consideration. It's not really routers, it's RNCs, radio network controllers, etc.

The most important, however, is the firewall. I don't know how many people have asked me about firewalls. Because the problem is that usually operators, carriers, don't open UDP access in firewalls. So you can't stream. And everyone's trying to stream and you just can't do it. In Europe, they've started opening it up, and I know that in Asia they have it. A lot is opened up there. But it also depends on the business model. Do the operators allow third-party services which don't have a contract with the operator? So that is one of the major issues with streaming on these kinds of wireless networks.

So, I'm going to talk about something called 3GPP. 3GPP is not really a standard. It's something called the Third Generation Partnership Project, which is where you have pretty much all the big telcos like Ericsson, Nokia, Motorola, DoCoMo, Vodafone, Hutchison, those kinds of companies. And they made a file format called .3GP. And now you can ask the question, why its own file format? In the beginning, it was actually using .MP4. Well, the answer is, in 3GPP, you use an AMR speech codec.

And in MP4, you're allowed to have MPEG-4 audio such as AAC, or CELP, as the audio or speech codec, but you don't have AMR. So, that's why they created their own file format. And now, you can have a file format called .3GP. But it's very similar to MP4, which is very similar to QuickTime. It's built on atoms, etc. It's very similar. It has some extra atoms to handle AMR.
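To make "built on atoms" a bit more concrete, here is a minimal Python sketch that walks the top-level atoms of a QuickTime-style file. It relies only on the general atom layout shared by QuickTime, MP4 and 3GP files: a 4-byte big-endian size followed by a 4-byte type code. The sample bytes are synthetic and built just for this example (including the `3gp4` brand string), and the function names are illustrative assumptions, not part of any real library.

```python
import struct


def list_top_level_atoms(data: bytes):
    """Return (type, size) for each top-level atom in a QuickTime-style file."""
    atoms = []
    offset = 0
    while offset + 8 <= len(data):
        size, kind = struct.unpack(">I4s", data[offset:offset + 8])
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type code.
            if offset + 16 > len(data):
                break
            size = struct.unpack(">Q", data[offset + 8:offset + 16])[0]
            header = 16
        elif size == 0:
            # A size of 0 means the atom runs to the end of the file.
            size = len(data) - offset
        if size < header:
            break  # malformed atom, stop rather than loop forever
        atoms.append((kind.decode("latin-1"), size))
        offset += size
    return atoms


def make_atom(kind: bytes, payload: bytes) -> bytes:
    """Build one atom with a 32-bit size field (for the synthetic sample only)."""
    return struct.pack(">I4s", 8 + len(payload), kind) + payload


if __name__ == "__main__":
    sample = (
        make_atom(b"ftyp", b"3gp4" + b"\x00" * 4)
        + make_atom(b"moov", b"")
        + make_atom(b"mdat", b"\x00" * 10)
    )
    print(list_top_level_atoms(sample))
```

Pointing the same function at the bytes of a real .3GP or .MOV file should list atoms such as `ftyp`, `moov` and `mdat`, which is the family resemblance the talk is describing.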

In a .3GP file, you're allowed to have MPEG-4 Simple Visual Profile Level 0. I will explain a bit later what the difference is. You can have H.263 baseline, AMR narrowband, AMR wideband, and AAC low complexity, which is what you can have in MP4 as well. And you can also have timed text. That's what you can have in MP4.

SMIL 2.0 Basic is not really in the 3GP file. I just wanted to put that out because the players on the phones are supposed to be SMIL players. So, the presentation language is SMIL 2.0 Basic. And in that, you can call things like 3GP files and RTSP links. And I heard a lot of questions about it, but it's actually SMIL. 3GPP uses SMIL as the presentation language.

So, what's the difference between 3GPP and ISMA? I assume many of you know about ISMA. Well, one of the things is this MPEG-4 Simple Visual Profile Level 0. In ISMA you have Level 1. And the difference between 0 and Level 1 is basically nothing. It's 64 kilobits per second, but on Level 0 you can only have one video object, and on Level 1 you can have four.

So, this is just to minimize everything for phones. So, Level 0 was created by MPEG-4 for 3GPP. So, that's why it's Simple Visual Profile Level 0. Both have AAC, but when you do streaming, in ISMA you use generic packetizing, which in my personal opinion is much better, but in 3GPP they use LATM packetizing.

And then of course there is the H.263 baseline, and in ISMA you can have MPEG-4 from Simple Visual Profile 1 to 3. I think they also put in Advanced Profile at this point. And the AMR codec. So this is what differentiates 3GPP and ISMA. And ISMA is like 3GPP, it's not really a standardization body. So that's why it's easy to compare them. They just point at different standards which they use.

So, a quick talk about terminals. I say all support 3GPP, and that's not really true, but I'm from Europe, and in Europe it's very true. So, let's start with Nokia. Nokia is the leading provider of handsets. And everyone tells me that they use RealPlayer. Yeah, they do. But the RealPlayer in the Nokia phone supports 3GPP. So you can play a .3GP file in a Nokia phone. You can, however, not stream 3GPP to it. You can only stream Real format.

But the new phone, the 6600, supports streaming of both 3GP and Real. So if you make a .3GP file, it will work in the Nokia phones even if it's the RealPlayer. Among the Sony Ericsson telephones, the P800 is a very popular phone, which is a GPRS telephone.

They have the PacketVideo player, which is an excellent player. It supports 3GPP, it supports streaming according to 3GPP and ISMA. So that's why, and I'm also going to talk about it later, in QuickTime, for instance, you have streaming, but it's not streaming according to 3GPP, it's according to ISMA, but it will work in a Sony Ericsson telephone.

And all use QCIF size, 176x144. And then on top of this, you have the Motorola telephones, you have the NEC telephones, Matsushita, etc., etc. And most of them in the world support 3GPP, and some of them also support proprietary formats like Real or Windows Media or something like that.

So, a quick thing about MMS. This is also something I saw on the QuickTime list, a lot of misunderstanding. MMS is a multipart MIME-formatted message. It's like an email. It's basically an email you're sending, but it's to the phones. And you can have JPEG images, GIF images, AMR sound, you can have SMIL in it, you can embed a 3GP file, you can have RTSP links, etc.

So, that's what an MMS is. So, when you make a 3GP file and you want to transfer it to the phone, it's not an MMS message you're sending. You're just transferring a file and playing it back. So, an MMS is not a 3GP file.

You can use a 3GP file in MMS. And video messaging, which a lot of phones have today, is actually not MMS. From a technical point of view, it's more like just an email with an attachment. But that is moving more into standardization. So, you have trouble sometimes.

When you make a clip on one phone and you want to send it to another, you might not be able to play it, or the phone might not even be able to recognize it. Because it's a proprietary way of signaling what kind of message you're getting to the phone.
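The "it's basically an email" description of MMS earlier in this part can be illustrated with Python's standard `email` package. This is only a sketch of the multipart idea: a SMIL presentation part plus an embedded 3GP clip. Real MMS messages are encoded differently on the wire, and the helper name, the file name `clip.3gp` and the dummy clip bytes are assumptions made up for this example.

```python
from email import encoders
from email.mime.application import MIMEApplication
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart


def build_mms_like_message(smil_text: str, clip_bytes: bytes) -> MIMEMultipart:
    """Bundle a SMIL presentation and a 3GP clip into one multipart message."""
    message = MIMEMultipart("related")

    # The presentation part: SMIL tells the player how to lay out the media.
    presentation = MIMEApplication(smil_text.encode("utf-8"), "smil")
    presentation.add_header("Content-ID", "<presentation>")
    message.attach(presentation)

    # The media part: the 3GP file travels inside the message as one part.
    clip = MIMEBase("video", "3gpp")
    clip.set_payload(clip_bytes)
    encoders.encode_base64(clip)
    clip.add_header("Content-Disposition", "attachment", filename="clip.3gp")
    message.attach(clip)
    return message


if __name__ == "__main__":
    smil = '<smil><body><video src="clip.3gp"/></body></smil>'
    mms = build_mms_like_message(smil, b"\x00\x00\x00\x08free")  # dummy bytes
    print([part.get_content_type() for part in mms.get_payload()])
```

The sketch also shows the distinction made in the talk: the 3GP file is one part inside the message, not the message itself.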

All MMSs are distributed by an MMSC, which stands for Multimedia Messaging Service Center. It's pretty easy. And that's pretty much like an email server. For all of you working in IT, it is an email server, basically. And when you get an MMS, you have something called a notification, and that is sent by the PPG, the Push Proxy Gateway. I think everyone knows what that is.

Anyway, what it does is it sends an SMS to your phone saying, "You have an MMS." And you say, "Okay, fine." You click "Okay." Then it goes to the MMSC and downloads it. So it's like a store-and-forward function, exactly like an SMS. So, let's get in to talk about content.

Oh, so this is how it works. How to get content. Well, I see a lot of things like this with the operators, carriers. This is not really their business. But they're starting to look at it because they need to. And many times they ask the wrong questions. I've seen companies, for instance, I was in a meeting with the BBC, and they came to the BBC and they said, we want to have streaming, we want to have streaming content. The BBC didn't understand what they were asking about, so they gave them, oh, we have Real and Windows Media files. They came back and said, oh, what would we do with this? And said, oh, we have to use Nokia because that's the only phone that supports Real.

But what they should have asked is, what kind of content do you have? What can you give us? Because the BBC has, I don't know, an enormous amount of MPEG-2, for instance. They do not have a lot of streaming. Just 1% of what the BBC produces is actually in a streaming format, for instance. So that is usually one of the problems, that they ask the wrong questions. So that's good for you to know if you're a content creator.

Make sure you understand what they're actually asking for. And also another thing is that internet content, you can't use that in the phones. I also hear that: there's a lot of content on the internet, why can't I just use it in the phone? Well, pretty easy. It's not adapted for wireless.

And if you just look at the size of movies on the internet, you realize that you can't get them into a mobile phone. And then everyone's talking about real-time transcoding, and it really doesn't work, because you would need, like, a building of this size to do real-time transcoding of these internet formats.

So, production-wise, well, the productions are obviously not made for mobile phones. There are very few production companies out there which make a football game to work on a cell phone. And it's basically like when you do encoding or compression for any kind of thing: when you work with these low bit rates, you want to avoid transitions, you want to have clean cuts, you don't want to have too much zoom, you don't want to have too much movement. And on a phone, which has an even smaller screen size, it's even more true.

And also, when they get content, sometimes it's not in a production format. It's not like you can sit down and edit; you get like MPEG-2 or something, which usually requires pretty expensive equipment to work with. So it's not like you get Motion JPEG or a YCbCr file or something like that. You usually get a distribution format.

Rights is another interesting thing which is an issue. Some operators have the rights to the video, the images, but they do not have the rights for the commentary. So that means that they have to take it in and put on their own commentaries on it. So they need to edit, they need to do voiceovers on the content coming in.

So, what I've seen today is that how to drive the business is the operator in the short term has to be the content creator. They have to take responsibility for getting the content. Because it's hard to go to a content provider and say, I want to have content, I want you to put it in 3GP, etc., etc., because it's not that much money for them to make at this point.

So the idea is that you migrate in the future from the operator being the content creator, supplier, whatever we call it, to actually get out to the real content creators. So they need to migrate to business and say, supply them with content, but that comes when you have a lot of users, of course, because then it's a business also for the content creators.

I'm not going to talk too much about this. This is just to give you an example of how a streaming system looks at an operator. But I'm going to do it very briefly. From the left, you have this CMS, which is the Content Management System. And then we have, which I'm going to talk about later, a product from us, the Compression Engine, which actually does the compression. So you put content into the Content Management System. It's automatically encoded into the kinds of formats you'd like to have. And then it's distributed to the stream servers and HTTP servers, et cetera.

What usually differentiates an operator carrier, in this case, from an internet provider is that they have some kind of RTSP proxies or something like that. What the RTSP proxy does is it takes care of billing, et cetera, because this has to be connected to billing system. And still, I don't think there's any internet service which really makes money. So this is probably a new thing. So what happens is when I set up an RTSP call, the proxy will intercept that.

It will go down, as you see. It will check with the billing system to see, does this person have enough credit? Yes, he or she does. It also checks with the Content Management System to see, is it a unit price for the content? Is it per packet? And from that, it gets an OK or a not-OK saying, yeah, you can play this content.

And after the session is over, it creates a CDR, which is used for the billing, actually. So in the CDR, it says that this person looked at this link for this amount of time, whatever it is. And then it comes on the bill. You also have all this kind of authentication.
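The proxy flow described above (check credit, check the price of the content, then write a CDR when the session ends) can be sketched in a few lines of Python. Everything here is hypothetical: the class names, the fields of the record, the prices and the `rtsp://example.invalid/...` URL are invented for illustration, and a real operator system will look different.

```python
from dataclasses import dataclass


@dataclass
class ChargingRecord:
    """Hypothetical CDR: who watched what, for how long, and what it cost."""
    subscriber: str
    url: str
    seconds_watched: int
    charged_cents: int


class BillingProxy:
    """Toy stand-in for an RTSP proxy that talks to billing and the CMS."""

    def __init__(self, credit_cents, unit_price_cents):
        self.credit_cents = dict(credit_cents)          # billing system view
        self.unit_price_cents = dict(unit_price_cents)  # CMS price list

    def authorize(self, subscriber: str, url: str) -> bool:
        """OK only if the content is known and the subscriber can pay for it."""
        price = self.unit_price_cents.get(url)
        if price is None:
            return False
        return self.credit_cents.get(subscriber, 0) >= price

    def close_session(self, subscriber: str, url: str, seconds_watched: int) -> ChargingRecord:
        """Charge the unit price and emit the record used for the bill."""
        price = self.unit_price_cents[url]
        self.credit_cents[subscriber] -= price
        return ChargingRecord(subscriber, url, seconds_watched, price)


if __name__ == "__main__":
    clip = "rtsp://example.invalid/news.3gp"
    proxy = BillingProxy({"alice": 100}, {clip: 60})
    if proxy.authorize("alice", clip):
        print(proxy.close_session("alice", clip, seconds_watched=28))
    print(proxy.authorize("alice", clip))  # remaining credit no longer covers it
```

The sketch charges a unit price per clip; a per-packet model would need the proxy to count traffic during the session instead.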

Down below, you see you have AAA. That's the authentication service. So when you log in from a client, you need to-- the network knows who you are and what access you have, et cetera. So it's a bit complex, the whole chain. But this is just to give you an idea of how it works for a wireless operator.

So, now we're going to talk a bit about creation. And that slide is not supposed to be there. Cool, this is by QuickTime. I do it twice then. Okay, small operations. You can have a workflow which is pretty straightforward. You have this content, it's quiet. Basically, you have like a DV camera or whatever it is.

You use like Final Cut Pro or something, you capture it in, you edit the file, you do the encoding of it into 3GP file and you directly distribute it. With QuickTime now released, which has 3GP support, any kind of third-party software can use this QuickTime API to create content.

So, even if you wouldn't use Final Cut Pro, but maybe Avid or some other editing software, you will be able to create 3GPP compliant content.

[Transcript missing]

Now we're going to get into techy stuff about how we encode. So first of all, we're talking about low bandwidths, below 64 kilobits per second.

The most important thing is to have as high quality content coming in as possible. This is true for all kinds of encoding and compression. The best, the highest quality you can get, that's what you should start with on the source. Of course, use good quality codecs. You have to do some testing to see which kind of codecs you like.

How does it work? High quality pre-processing is just as important. You have to scale the pictures, you have to crop the pictures, you have to correct gamma, etc., etc., and if you have poor pre-processing, well, you don't get good quality. And also you have to reduce the frame rate, and of course you have to do that all the time. I mean, you reduce frame rate whatever you do when you go to the internet, usually. But on a wireless network, instead of maybe going to 12.5, if it's PAL, or 15, you might go to 6.25 or 5 or something like that.

Because it's better to have better spatial quality on the frames instead of smooth movement, because the screens on the mobile phones aren't that fast. So the kind of smooth motion that you're trying to achieve by having more frames is achieved anyway on the mobile phones. So it's better to have a lower frame rate and better quality on the frames themselves.
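The trade-off between frame rate and per-frame quality is simple arithmetic, sketched below in Python. The 40 kilobits per second figure is the video bit rate used in the demo later in this session; the function name is made up for this example, and the calculation ignores keyframes and container overhead.

```python
def average_bits_per_frame(video_kbps: float, frames_per_second: float) -> float:
    """Average bit budget per frame at a given bit rate and frame rate."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return video_kbps * 1000 / frames_per_second


if __name__ == "__main__":
    for fps in (12.5, 6.25, 5):
        bits = average_bits_per_frame(40, fps)
        print(f"{fps} fps -> {bits:.0f} bits per frame")
```

Halving the frame rate doubles the average number of bits available for each frame, which is the "better spatial quality" the talk is after.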

You have bandwidth constraints, of course, and always use CBR, Constant Bit Rate. I don't know how many of you really know exactly what it means. Constant Bit Rate is not flat. Constant Bit Rate means that you can fluctuate within a certain time frame, and that is set by the Video Buffer Verifier.

So if you set that to two seconds, you will have a window going over the file all the time, and within those two seconds, if you set it to 64 kilobits per second, it can never go over that. So that is how the Video Buffer Verifier works. So that's very important, so you have a very, very constrained bandwidth.

Drop frames. If you have an encoder which can drop frames to maintain bitrate, use that. Because this is the most important thing about wireless. You don't have any headroom to work with. And of course, natural keyframes, if possible. That's the best way to do it. But also, if you can, set the longest distance to a keyframe, because that also has to do with error handling. If you lose a keyframe, you don't want it to take too long until you can recover the images. And of course, error handling in the radio network. When we talk about streaming, you want to be able to set packet size, how many frames per packet, etc.
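The sliding-window description of constant bit rate given above can be checked with a short Python sketch. Note the assumption: this follows the simplified picture from the talk (no window of the chosen length may exceed the bit budget), not the full buffer model used by real encoders, and the function name and sample frame sizes are invented for illustration.

```python
def respects_cbr_window(frame_bits, fps, max_kbps, window_seconds):
    """True if no window of window_seconds carries more bits than max_kbps allows."""
    frames_per_window = max(1, round(fps * window_seconds))
    budget_bits = max_kbps * 1000 * window_seconds
    window_sum = 0
    for index, bits in enumerate(frame_bits):
        window_sum += bits
        if index >= frames_per_window:
            # Slide the window: drop the frame that just fell out of it.
            window_sum -= frame_bits[index - frames_per_window]
        if window_sum > budget_bits:
            return False
    return True


if __name__ == "__main__":
    # 5 fps, 2-second window, 64 kbit/s: the budget is 128,000 bits per window.
    bursty = [40000] + [9000] * 9   # one big keyframe, then small frames
    too_hot = [13000] * 20          # flat, but slightly over the budget
    print(respects_cbr_window(bursty, fps=5, max_kbps=64, window_seconds=2))
    print(respects_cbr_window(too_hot, fps=5, max_kbps=64, window_seconds=2))
```

The first sequence passes even though its keyframe is far larger than the average frame, which is the point made in the talk: constant bit rate is not flat, it only has to hold over the window.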

And something called RVLC, reversible variable-length coding. And that's very useful when you have congestion on a network. So the codec can determine itself on the player side. It can build new pixels. So this is dependent on the players. You can use RVLC. And also, some operators have video telephony, and they want to use video telephony to play static files, not live ones. Then you usually need to use RVLC, because that's how the video telephony client expects to get the frames.

Also, as we talked about before, make short clips. Until capacity increases, about 30 seconds to a minute is a good file length. As we said before, new users are prioritized in cells when they're coming in. Speech is prioritized, etc. So make short clips because speech, as I said, has priority over all kinds of data.

So, streaming versus download. Well, streaming for wireless, the pros are of course that memory capacity in phones is limited, so streaming in that sense is very good. There's less overhead in RTP than in HTTP, so you can put more pixels or bits into the media itself, which of course is very good at these low bit rates.

It's lightweight DRM in itself, because you have to be very skilled to get into a phone, hack that, and download the video. On the other hand, why do you want to download a video of 64 kilobits? So, but that's one of the things. And it usually is viewable in about 2 to 4 seconds.

Cons. Yeah, of course, firewalls. Same thing. Firewalls, firewalls. And it's not supported by all phones at present. I think most of the phones by the end of this year and the beginning of next will have support for streaming. Download. Well, it's supported by all phones. All phones. So that is a good thing, of course.

The con is that most of them don't support progressive download, which might seem a bit strange, but that's just a fact. They don't support interleaved content. So it needs to be fully downloaded before you can start viewing the pictures, images, or video. Sorry. And you need large storage on the phone. In our sense, on a computer, it's not anything large, but you still need to have the storage.

This is the thing everyone tells me, but the quality of download files is usually much better. Because you are not constrained to bandwidth limitations, you can have a higher bitrate because it's just download time. Well, yeah, that's true for internet and it's in theory true for wireless as well. But the download time, well, if you have breaking news and you have to wait four minutes to get the breaking news, it's not breaking news anymore. But one of the more important things is that you pay per packet.

So, whether it's a download or a stream, you pay for the amount of packets you send. And then people say, well, it's usually a unit price sometimes. Well, the operator has a cost for every packet, and this is what drives it. If you can make content at a low bandwidth, good enough so the user wants to pay for it, and it costs as little as possible for the operator, that's when you're going to get business, because it costs for the operator. So, you have to treat download as a stream. You don't have to be as constrained.
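To put numbers on "treat download as a stream", here is a small Python sketch of the size and download-time arithmetic. The 30 kilobits per second link speed is the practical 2.5G figure quoted earlier in the talk; the clip bit rates and the 30-second duration are example values chosen for illustration, and the function names are made up. Protocol overhead and retransmissions are ignored.

```python
def clip_size_kbit(media_kbps: float, duration_seconds: float) -> float:
    """Total payload of a clip in kilobits (overhead ignored)."""
    return media_kbps * duration_seconds


def download_seconds(size_kbit: float, link_kbps: float) -> float:
    """Time to move the clip over a link of the given speed."""
    if link_kbps <= 0:
        raise ValueError("link_kbps must be positive")
    return size_kbit / link_kbps


if __name__ == "__main__":
    link = 30  # kbit/s, the practical 2.5G figure from the talk
    for kbps in (40, 300):
        size = clip_size_kbit(kbps, 30)
        wait = download_seconds(size, link)
        print(f"{kbps} kbit/s clip, 30 s long: {size:.0f} kbit, {wait:.0f} s to download")
```

With these example numbers, the higher bit rate clip carries several times as many bits, so it both takes several times longer to arrive and costs the operator several times more to deliver.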

But don't think, I can do 300 kilobits per second because I want to have a better quality, because the operator or your customer, whoever it is, will come and knock on your head and things like that. So, this is very important to think about. So, this is where the QuickTime thing was supposed to be. So, let's get over to this machine, which is laptop number one.

Do I change or do you change? What do you want me to do? Ah, so that's how it works. So, let's take a look at QuickTime then. I have a small file here somewhere. This is just a little video. This is Motion JPEG, 30 seconds, 48 kilohertz PCM sound.

So, if I want to export it into a 3GP file,

[Transcript missing]

So, video. I have my MPEG-4 codec. I can use H.263 for instance. But I choose to do it at 40 kilobits a second. I use 12.5 frames at this point because it's PAL content, so it's 25. That's why I use 12.5. I put a keyframe every second second.
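The arithmetic behind these choices can be sketched as follows: keeping every second frame of 25 fps PAL material gives 12.5 fps, and a keyframe every two seconds then falls on every 25th output frame. The function names are illustrative only and do not correspond to any QuickTime API.

```python
from fractions import Fraction


def decimated_rate(source_fps, keep_every_nth):
    """Frame rate left after keeping one frame out of every `keep_every_nth`."""
    return Fraction(source_fps) / keep_every_nth


def keyframe_interval_frames(fps, seconds_between_keyframes):
    """Number of output frames between two keyframes."""
    return fps * seconds_between_keyframes


half_rate = decimated_rate(25, 2)                 # PAL, every second frame kept
print(float(half_rate))                           # frame rate in frames per second
print(keyframe_interval_frames(half_rate, 2))     # frames between keyframes
```

Using exact fractions avoids rounding drift when the source rate is divided further, for example down to a quarter of the PAL rate.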

And in this case, it's audio speech, and this is AMR narrowband you're using right now. And you have fixed bit rates for it, as you can see when you drag it. These are called modes. And this is mode 7, the highest one. And it's always mono. AMR only supports mono.

It's always 8,000 Hz of sampling. Frames per sample is when you do streaming. And then we come into the things about streaming. Well, if I do streaming basic, this is QuickTime hinting. This is not 3GPP hinting. But it would work in a P800, for instance. And it will probably work in some other phones as well because some of them support streaming according to ISMA. But in this case, I don't want to have it streamed. So, okay. Put it on desktop. Save. And off it goes.
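For reference, the eight AMR narrowband modes can be tabulated in a few lines. The sketch below assumes the standard 20 ms AMR frame and counts speech bits only; it leaves out file and packet headers, so real file sizes will be slightly larger.

```python
# Speech bits carried in one 20 ms frame, for AMR narrowband modes 0 through 7.
AMR_NB_BITS_PER_FRAME = [95, 103, 118, 134, 148, 159, 204, 244]
SAMPLE_RATE_HZ = 8000   # AMR narrowband always samples at 8 kHz, mono
FRAME_MS = 20           # duration of one AMR frame


def mode_bitrate_kbps(mode: int) -> float:
    """Speech bitrate of an AMR narrowband mode in kbit/s."""
    frames_per_second = 1000 // FRAME_MS
    return AMR_NB_BITS_PER_FRAME[mode] * frames_per_second / 1000


def samples_per_frame() -> int:
    """Audio samples covered by one AMR frame."""
    return SAMPLE_RATE_HZ * FRAME_MS // 1000


for mode in range(len(AMR_NB_BITS_PER_FRAME)):
    print(f"mode {mode}: {mode_bitrate_kbps(mode):.2f} kbit/s")
```

Mode 7, the one selected in the demo, works out to 12.2 kbit/s, which matches the audio setting used later in the session.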

This is now using QuickTime Pro, as I said before. But this kind of thing is accessible through the QuickTime API. So any kind of third-party software, any kind of software can use this to create this kind of content. Then, as you saw, there's not much about cropping, etc., pre-processing, but that is what the other kind of software you probably will use to do that in. So that's how you build it on top of the QuickTime API, basically. And this is just an 800 MHz old PowerBook, so don't worry about the speed. I think.

So, here's the file. With the beautiful AMR sound. This is of course not made for this kind of rich audio. But that's what you have to live with sometimes. Yeah, so. Let's get back to the...

[Transcript missing]

No? Now it is. Now it's not. So. What I showed you right now was a small operation. This is where you can use Final Cut Pro or that kind of editing tools and directly go to 3GP.

You can use QuickTime Pro if you have that to take in files, edit them, do that kind of thing. When you get up to a medium scale operation, when you have more content coming in from several sources, where you have several editing stations like Final Cut Pro, etc., then you have a demand for larger sort of tools.

So what I'm going to show you right now is actually a Popwire product called Compression Master, which isn't released yet. It will be released on the 15th of July. So this is the beta version. And all of you know what beta means. So, let's go over to the laptop.

Great. So, this is the Compression Master. We're going to start to do some files here. We're going to start with some settings, actually. So I've done some settings beforehand here. So I open up this and we go to 64 Kbit. And in this, of course, you do the same thing. You choose the file format. You can choose a lot of others, but now we're talking about 3GP. So it's a 3GP file. I want to use MPEG-4, AMR narrowband. I want to do streaming.

So let's move on to the video side. On the video, I put this into-- well, 64, maybe 42, probably. Something like that. I put the key frame at 25 because still this is PAL format I'm going to use. Simple profile level 0 of the MPEG-4. This is very important. Why there's a button like this is because the MPEG-4s usually detect, depending on the bitrate, which profile it is. But since profile 1 and 0 is basically the same, you need to be able to override that.

And it also is because some operators will come to you and say, I want to have 128 kilobits of content, but it still has to be simple visual profile level 0. So this is why you have this override thing. Then we have the buffer verifier. And I should actually set this to the buffer on the phone. So if you know that the phone has a two-second buffer, usually if it's a 3G phone, it means that it can handle 64 kilobits a second. So you have to sort of calculate as well.

But I set this to two seconds. The sliding window over the content will say that during this two-second sliding window, it can never be above 42 kilobits per second. I say frame skip probability, which in this case is dropping frames, because I want to maintain the bandwidth. So I put that pretty high, like 44% or something.
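The sliding-window constraint can be illustrated with a small check over per-frame sizes. This is a simplified model written for illustration: it only averages encoded frame sizes over a moving two-second window, and it is not the MPEG-4 video buffering verifier or the encoder's actual algorithm.

```python
def max_window_kbps(frame_bits, fps, window_s=2.0):
    """Highest average bitrate (kbit/s) over any window of `window_s` seconds."""
    if not frame_bits:
        return 0.0
    # Number of frames that make up one window, at least one.
    n = min(len(frame_bits), max(1, round(fps * window_s)))
    window = sum(frame_bits[:n])
    peak = window
    for i in range(n, len(frame_bits)):
        window += frame_bits[i] - frame_bits[i - n]  # slide the window by one frame
        peak = max(peak, window)
    return peak / (n / fps) / 1000


def fits_budget(frame_bits, fps, cap_kbps, window_s=2.0):
    """True if no window exceeds the bitrate cap."""
    return max_window_kbps(frame_bits, fps, window_s) <= cap_kbps


# Assumed example: 4 seconds of 12.5 fps video, 3200 bits per frame, one large frame.
steady = [3200] * 50
bursty = list(steady)
bursty[30] = 12000
print(fits_budget(steady, 12.5, cap_kbps=42))
print(fits_budget(bursty, 12.5, cap_kbps=42))
```

When a window goes over the cap, an encoder has to spend fewer bits somewhere in that window, which is where dropping frames comes in.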

And I can use RVLC, etc., but that's not what I'm going to do right now. I put the frame rate down to 6.25 frames per second, because I'm thinking about, I'm going to play this on a, I don't know, I can't say the brand, but it's a Swedish brand. And I choose the QCIF format.

I can choose whatever I like, actually, but I use the QCIF format as it is now. I don't want to do any cropping on this file. And then I can do, like, you know, gamma corrections, all that kind of thing, but that's not really important now for the demo.

Audio, same thing, AMR, Narrowband, 12.2 kbps, sample rate, etc. I can do metadata, and I put in stream and I say 5 frames per second. 5 frames per packet, sorry. So, now I've done my settings. I can take a look at some others as well. This is like a 128.
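Adding up the settings chosen so far gives a rough bitrate budget for the 64 kbit/s target. The sketch assumes that whatever is left after video and audio is headroom for packet and container overhead; the talk does not state how that remainder is used. It also assumes the standard 20 ms AMR frame when converting frames per packet into packets per second.

```python
def headroom_kbps(channel_kbps, video_kbps, audio_kbps):
    """Bitrate left over after video and audio, rounded to avoid float noise."""
    return round(channel_kbps - video_kbps - audio_kbps, 3)


def audio_packets_per_second(frames_per_packet, frame_ms=20):
    """Audio packets sent per second when each packet carries several AMR frames."""
    return 1000 / (frames_per_packet * frame_ms)


print(headroom_kbps(64, 42, 12.2))      # kbit/s not used by the media itself
print(audio_packets_per_second(5))      # packets per second at 5 frames per packet
```

Putting more frames in each packet lowers the packet rate and the per-packet header cost, at the price of losing more audio when a single packet is dropped.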

But the important thing here is about the bandwidth constraint. All I'm doing is I'm trying to keep the bandwidth. That's the most important thing, which means sometimes I will get lower quality. But that's actually the fact of life. So, that's how it is. So, let's put in a file. Let's do the same file as we did before. I add on a setting to it.

So I used the 64 and I put it on desktop and off we go. And this is of course how you can do batches. You can put on a lot of files, a lot of settings. It's just about, you know, how many files do you want to do.

And with this sense you can build a small but very effective system. So you can have several editing stations coming in like MotionJPEG or you export it as native from Final Cut Pro and you just put it in here. You have your settings which you created. You can sit and work with them if you like to or you can just use the settings you've done once. So, while that one is going, let's get over to the slides. Oh, I forgot this one.

So, when you have a large-scale operation, then you need to have an automated system. And that's exactly what we built. So, with the Compression Master, you can export XML settings. The settings I showed you before can be exported, and you put them into something called the Compression Engine, which acts with watch folders. So, when you've done this once, the content provider can put in like MPEG-2s on a watch folder, and the compression engines will automatically start to encode for it.
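The watch-folder idea can be sketched in a few lines. Everything here is hypothetical: the XML layout, the setting name, and the `encode` callback are stand-ins for illustration and are not the format or interface of Compression Master or the Compression Engine.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical settings file; the real exported XML schema is not shown in the talk.
SETTINGS_XML = """<setting name="64k-3gp">
  <video codec="mpeg4" kbps="42" fps="6.25"/>
  <audio codec="amr-nb" kbps="12.2"/>
</setting>"""


def load_setting(xml_text):
    """Parse the hypothetical settings XML into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.get("name"),
        "video": dict(root.find("video").attrib),
        "audio": dict(root.find("audio").attrib),
    }


def scan_once(watch_dir, seen, encode):
    """Pass every file not seen before to `encode`; return the newly handled paths."""
    handled = []
    for path in sorted(Path(watch_dir).iterdir()):
        if path.is_file() and path not in seen:
            encode(path)
            seen.add(path)
            handled.append(path)
    return handled


# Small demonstration with a temporary folder and a stand-in encoder.
setting = load_setting(SETTINGS_XML)
with tempfile.TemporaryDirectory() as folder:
    (Path(folder) / "news.mpg").write_bytes(b"placeholder")
    seen_files = set()
    scan_once(folder, seen_files,
              lambda p: print(f"encoding {p.name} with {setting['name']}"))
```

A production system would call `scan_once` in a loop and would also need to wait until a file has finished copying before encoding it, which this sketch does not attempt.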

So, just to give you an idea, an average portal, if you look to the right, you have like two news feeds, 10 times 30 seconds, financial information, traffic. It's not a lot of files you actually want to encode. It's not a lot of, if you look at it from sort of service, it's not that much, but it still generates a lot of data. So, you have about 243,000 megs that has to be converted daily, and the turnover time from when you drop it into the system should not take more than one minute. And then you need to have a pretty powerful system to do that.

This is a real-world example of how an operator works. They hired a post-production company which has the Compression Master, which creates the XML settings. So they work like a sort of quality control of how it's done. And they can also make trailers of Nemo, of course, very good film, directly and work with it.

Because it's a trailer, you sit down and you work with it, you test with it, and you make this in the 3GP file, you send it to the operator. But when we talk about automated service like news, sports, etc., what happens is that the broadcasters deliver it as MPEG-2 files, which they have on the Grass Valley servers or Quantel or whatever it could be. And they take these MPEG-2 files, which they produce.

So they don't have to do any special production in their case. They don't have to do any pre-formatting of files. They just drop it into the watch folders. And the compression engines on the right, which reside at the operator, start automatically to compress this into 3GP files.

And the other thing is that they also take in files and they do editing of it. And it's a tape transfer. So they take the tape, put it into Final Cut Pro, edit it, and then put it into the Compression Master, Compression Engines. And this is pretty much the same thing.

This is pretty cumbersome. So the next step, what they're looking at, is to set a compression engine at the content provider, still having the post-production company deliver the XML file, so you control what kind of settings it is. The production company drops their MPEG-2 directly into the compression engine, and it goes maybe as a YUV file, so you can edit it in the Final Cut Pro without going to tape, which takes a lot of -- takes away a lot of time. Or it can go direct as a 3GP file, so you will be able to remove that. So that's how an operator works today. So right now, I'm going to show you -- so let's go to the laptop about how this workflow actually goes.

So, in my Compression Master, right now I'm looking at these machines. I have a watch folder on these Xserves, which I see here. I've done two settings. What I can do is that I can from here export a setting. So I take this 128, I don't think I have that one. So let's export this one. So export it. Run Dexport. Yes.

So, now I have this setting. So, now I'm going to move it over to the compression engine side. You just move it over. This is Demo Guides. So, now I have three settings in here. So, now I can take any file from here, when I've edited it, and I can drop it in.

And now it's a 100 megabit network, so it will take a while to do this. So while I'm doing that, we can switch over to this machine because I'm going to put in a lot more files, but it will take too long to do it from there. So let's go to the Xserves. So, right now, here is my watch folder, and now I've done... Some media files. I've copied this one. This is a 30-second clip. So, let's put this in.

So now I'm copying it over to the compression engines. And I don't know if you can see if it started working or anything. I'm still copying, so...

[Transcript missing]

And this is done purely automatically on this Xserve. So, let's step this up. Let's put in some MPEG-2s. So, let's take all these, put them in there as well. Oh, the network hates me. So, still the files are coming out there. So we can take a look at, for instance, this one. This is the MPEG-2 file, which is converted.

[Transcript missing]

Sorry about that. So in this way, I can just have this system running all the time, and it does it fully automatic. And I think now it's a lot of files. Is it working? Does anyone see anything? One of them. Oh, it's a network. Sorry about that. Let's see what happens here. This is typical.

Well, anyway, I'm putting in like 50 files in this system. So now it's starting automatically. And this is the kind of system that the operator needs to have when they're talking about automated processes. So they put this sort of clusters of engines, of Xserves, for instance, and they just drop in files and they work towards doing it because the turnover time has to go so fast to do this.

So they really need to have it working. Like a one-minute clip should come out in the phone within maybe one to two minutes later on. So that's how fast it needs to go. And since the network isn't really doing what it's supposed to do, you won't really see this, unfortunately.

Can you go to the... Yeah, you have it here. So now you see the files are coming up here. So this is like the... another... the Tricia McNeil video. And as you see, this is down from MPEG-2, so the quality becomes pretty good if you put in really good content.

Because this is 64 kilobits total, so the video is about 40 kilobits a second at this point. So, I think you get the idea on how the compression engines work and you have even more files working. I think now most of them are up and running in the Xserves. So, let's go back to the slides.

[Transcript missing]

First of all, you need to understand the operator business. You need to understand if it's a unity price, if it's a price per packet. You need to understand if the operator is actually competing with services. Because today the operators will compete with what kind of service you have, so you really need to understand that.

You need to be fully integrated with the existing production environment, which means that you need to be fully integrated with what the TV producers are doing, like MPEG-2, whatever it is, so you can take that kind of form and with the legacy systems as well, so you don't add any time on that kind of pre-formatting. Time to show time is very important. Breaking news has to be breaking news.

Because why should you pay for service to your mobile phone if you won't get it before anyone else, if you can just go home and put it on the TV? The output format, I say, shall fully support the standard set by 3GPP, because that means that you will have the most coverage of all terminals, telephones, etc. Use as high quality source material as possible.

[Transcript missing]

Of course, the most obvious thing is to optimize the compression for distribution in wireless networks and for display on mobile devices. Streaming vs. downloads: same thing there. Treat download exactly as a stream because the operator has a cost for every bit. And you definitely need to be the expert because this is new for operators.

Even if you don't work with the whole chain, you need to understand how a stream server works, you need to understand maybe how the RTSP proxy works, you need to understand all the way. So be the expert, help them out because that is one of the major problems. And the most important thing is test. It seems like I've heard so many people say, "Well, we did a clip and we tried it in a phone and it worked." And that's all they do.

This is completely new for all people involved doing this kind of services. So you really need to test to try out to see how does it look, what kind of content can I have, how does it work, etc. That's the only way to do it. So... Wrap up. Which means you. I'd like to thank Kay Johansson very much for his presentation today.

Next, we usually go through a roadmap slide. Unfortunately, session 721 that was going to follow the session in the room next door has been canceled. So I encourage you to go to session 722, Advanced QuickTime Interactivity, which actually is in this room. Then on Thursday we have a variety of other QuickTime sessions and of course our feedback forum that's going to be at 3:30 and it's now in the Marina Room.

We also have a couple of sessions on Friday as well. If you have any questions about QuickTime development, please contact Guillermo Ortiz, who is our QuickTime technology manager at [email protected]. For more information, we have a list of these websites for you. First of all, there is a lot of information about 3GPP on the apple.com website.

Popwire Technology can be found at popwire.com to look at all the products that Popwire has created for this industry. Obviously, there's some general information on our website about QuickTime, developer information, and so forth. Some other things not to miss, hopefully you've seen these a couple of times by now, but we have a QuickTime content creation and QuickTime development lab. The lab hours are there, open until 6:00 PM tonight, then Thursday only until 4:30, and also on Friday.