Content Creation for Mobile Phones: Top 10 Success Factors - WWDC 2003

QuickTime • 59:33

The mobile phone is quickly gaining popularity as a medium for capturing and playing multimedia of all kinds. What are the key factors for successfully creating wireless multimedia? This session outlines the essentials in creating quality mobile content using QuickTime-based products and tools. Learn how to tailor your production process to accommodate the wide range of network bandwidths, mobile phone capabilities, and distribution processes.

Speakers: Aliza Hutchison, Kay Johansson

Unlisted on Apple Developer site


Transcript

This transcript was generated using Whisper, it may have transcription errors.

So let's take this clicker and see how it works. As she said, I'm the CTO of Popwire Technology. We are based in Sweden. We work with encoding systems, so that's the capacity in which I'm here, and we work a lot with wireless. So, what are we going to talk about today? Well, the goal is to understand, as I said, all aspects of wireless content creation. And I don't think you will get all aspects of it. We'll talk briefly about business models, how the radio networks act, content creation, how the content workflow works for an operator, carrier, etc., and also show you tools for making this kind of content.

So how it all started. It started basically with SMS, short message service. I know it's coming up now in the US. You send these text messages to each other. SMS was actually created as a diagnostic system for GSM networks. But in '98, it started to grow a lot. And you had this kind of SMS messaging, person to person. It grew a lot in Europe and Asia. After that, we got something called enhanced SMS, which is basically an SMS with like two- or four-bit graphics with it. In 2000, they actually released it earlier than that, but in 2000, something called i-mode was released by DoCoMo. It was basically like a multimedia SMS.

You had graphics, you had sound, you had animation, etc., and you sent it to each other. So that's really when it started to take off. Then in 2002, the so-called MMS was released, which stands for multimedia messaging. It's like an SMS, but you can have video, audio, text, animations, etc. It's pretty close to what i-mode is.

And now we're in the phase where video is coming in. And basically today you have services where you can download video. And it works for different kinds of networks. I will talk about that later on. And you also have streaming. Mainly today streaming is about live streams like traffic surveillance, etc. And also video telephony is coming on the 3G networks. The cameraman doesn't like me because I move around. So I'm going to stand here.

So, a brief look at the business models. If you look at SMS, there are a lot of operator SMS services. But most are third-party SMS services like, let's say, weather, news, that kind of thing. And it's based on revenue sharing. So the operator gets a piece of the SMS, and the SMS supplier also gets a piece of it. It also works as a bit pipe. And when I say bit pipe, I mean that when I send an SMS from one person to another, the operator is just paid for sending the SMS. In this case, it doesn't count bits. It's a unit price, but an SMS always has X amount of size, so that's why it's a unit price. Then we have MMS, and it's pretty much the same thing. You have operator services based on MMS. You have third-party MMS services with revenue sharing, and you also have the kind of bit pipe where I send an MMS from me to another person.

When we look at video download and streaming, I set a question mark because it has just started, and I see that a lot of operators, carriers, don't really know what kind of business model they're going to have. So we have operators which act as content creators. Not that they sit down and record like a football game, but they take content from the TV station and edit it. So they make their own content.

Some of them work with content service suppliers, with revenue sharing, exactly as you did with SMS. And some only work as a bit pipe. So basically, you have the same kind of business models all the way. When we talk about video telephony, it's time-based. It's the same as making a call. So that's pretty easy.

So this is the kind of business model. I'm not going to get too much into talking about this, because I'm a techie guy. What kind of networks do we have today? We have the 2G networks, like GSM, which we have in Europe and Asia a lot, and CDMA and TDMA in North America. And you can have 9.6 kilobits per second when we talk about data transfer if you want to do a multimedia service. And as you all can understand, you can't really do that much. It's not like you can stream video with 9.6.

We have the 2.5-generation networks, which go up to 30 kilobits per second today. But in theory GPRS can go up to 171, and you have HSCSD, hard for a Swede to say in English, which is up to 56. Thanks. And you have EDGE, which some argue is 3G, so it's sort of 2.5.5 or something like that, which can come up to 474, but that's in theory. In practice, the networks today, when we talk about 2.5G, are at 30 kilobits per second. That's the fact. Then we have 3G, which today only has 64 kilobits per second. You can get 128 kilobits per second services, and I will talk more later on about how that works. But in practice, it's 64 kilobits per second today. In theory, it will be able to go up to two megabits per second, both WCDMA and CDMA2000. So that is the kind of network we are talking about when we're talking about wireless today. I'm mainly going to focus on 2.5G and 3G because that's where you can use video.

So a brief look at what kind of services you have per network, because the network is in some way steering what kind of service you can have. Of course, for 2G, speech. And this is true for all. Whether it's 3G or 2.5G, speech is one of the major services for the operators. It's still where they make the most money. You have SMS, as we talked about before, where you do this kind of communication person to person, infotainment, etc. And ringtones are very popular, at least in Europe. I don't actually know how it is in the US, but in Europe it's very popular to download the latest song from your favorite artist and have that as a ringtone. On 2.5G, you have this MMS, which is like an enhanced SMS service. So pretty much the same kind of services which you would have with SMS, but they've added multimedia to it. And you have video download, sports, news, et cetera. And usually when we talk about GPRS networks, which are 2.5G, we talk about streaming. You can have traffic surveillance. You can check out how the traffic is on that road, the 101, et cetera. It's small clips you can download or stream, news, sports, et cetera. So it's the same kind of services all the time, but you have more content that you can use, or richer content you can use, depending on the bandwidth. For MMS, it's the same thing. For 3G, you have MMS, you have video downloading, you have streaming, the same kind of services, and on top of that, you have video telephony.

And in Sweden, for instance, we have an operator called 3, which actually is up and running and doing these kinds of services, such as video download, where you can download news, sports, entertainment. You can look at videos from your pop artists, et cetera. So this is really up and running. So this is how the services sort of lay out across the networks.

So, a bit about how a radio network acts. I'm not going to talk about the structure of a radio network, like RNCs, etc. If you have questions about that, you can take them in the Q&A. But basically, a radio network is built out of cells. And if we talk about GPRS, we start with that because that's the easiest one.

You get time slots. So you get the time slot. And many people think, like, okay, it's a time slot. How is that working? In a GPRS network, every call has a very, very small time slot. So let's say you have 100 slots per cell. Of course, you have a lot more. But this is just so I can calculate in my head. So one time slot when you enter the cell is pretty much, in theory, 12 kilobits per second. But you always calculate with 10 because you have a lot of overhead, etc.

So usually when you come in, typically you get three time slots. That's how it starts. Then you move over to the next cell. But in that cell, you have a lot of other users. You might just get two slots in that cell. And this is one of the things that differentiates radio networks from the Internet, for instance: you actually move within the network. You are not connected to the same access point all the time. You are handed over to new access points all the time. But it can also be much simpler: you can be in the cell, X number of people come into your cell, and then once again, you might just get two time slots. So, 3G, well, it's the same kind of thinking, but we talk about megabits per second instead. You have X amount of bitrate in a cell. So this is pretty much how the radio network acts.
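The time-slot arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the speaker's rule of thumb (plan with 10 kilobits per second per slot); the function names and the example route are assumptions made for this sketch, not part of any GPRS tooling.

```python
# Rule of thumb from the talk: one GPRS time slot is about 12 kbit/s in theory,
# but you plan with 10 kbit/s because of overhead.
PLANNING_KBPS_PER_SLOT = 10


def usable_bitrate_kbps(time_slots: int) -> int:
    """Planning bitrate for a given number of allocated time slots."""
    return time_slots * PLANNING_KBPS_PER_SLOT


def worst_case_bitrate_kbps(slots_per_cell: list[int]) -> int:
    """Lowest planning bitrate seen while being handed over between cells."""
    return min(usable_bitrate_kbps(slots) for slots in slots_per_cell)


# Hypothetical route: three slots in the first cell, two in a busier cell, then three again.
route = [3, 2, 3]
print(usable_bitrate_kbps(3))
print(worst_case_bitrate_kbps(route))
```

Encoding for the worst case along the route, rather than for the best cell, is one way to read the advice to stay at 20 to 30 kilobits per second on GPRS.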

So let's talk about delivery or distribution. If you look at streaming for GPRS, as we said before, one time slot is 12 kilobits per second, but we calculate with 10 usually. So we're talking about very, very low bit rates, like 20 to 30 kilobits per second, which in essence is two to three time slots. But one of the most important things about a GPRS network, or actually an EDGE network as well, is that the operators can prioritize the services. And I haven't met an operator or carrier so far that doesn't prioritize speech. If there are any operators here who do the opposite, they can raise their hands, but I don't believe that. So that means that if there are X number of people in the cell and they want to use ordinary speech, the people who have been allocated GPRS time slots will be prioritized down, so you will get like one time slot or two or something like that. And they can also prioritize in a way that says when you come into a cell you have three time slots, but after 30 seconds or one minute, you get down to two time slots. They want to give more time slots in the beginning when the user comes into a cell, because you usually do something for like 10, 20, 30 seconds to begin with, and then you want to have good access. Then of course you have network congestion, you have a lot of packet loss. Radio networks are like a low-reliability internet. You can have half packets coming through a radio network, which doesn't happen on the internet, for instance.

And the other thing is latency. When you do streaming you have the RTCP keep-alive, and because of the latency you might not get that keep-alive back to the stream server. So depending on how often the stream server expects to get this RTCP callback, it will shut down after a while. So that's why it's important to do short clips when you talk about streaming on GPRS. Because if you do a 30-second clip, nothing's going to happen, because it usually takes about a minute before the stream server expects to get the RTCP back.
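As a rough illustration of the reasoning above, the check below compares a clip's length with the time a stream server waits for an RTCP report. The 60-second default is only the speaker's "about a minute" figure, and the function name is made up for this sketch; real servers use their own configurable timeouts.

```python
def clip_finishes_before_timeout(clip_seconds: float,
                                 rtcp_timeout_seconds: float = 60.0) -> bool:
    """True if playback ends before the server would give up waiting for RTCP.

    Assumes the worst case for the network: no RTCP report ever makes it
    back to the stream server during the session.
    """
    return clip_seconds < rtcp_timeout_seconds


print(clip_finishes_before_timeout(30))   # short clip, ends before the timeout
print(clip_finishes_before_timeout(180))  # long clip, at risk of being shut down
```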

If we talk about 3G streaming, well, it's a bit different. They have something called a PDP context. And you can have different kinds of PDP contexts. The idea is that when you do surfing, you get the PDP context called best effort. That's like the internet. You get 64 kilobits, but it's best effort. It depends on how many share the available bandwidth in that cell.

At this point, this is how all 3G networks act. In the next release of 3G, you will have something called a dedicated PDP context or streaming context, which means when you do the surfing and you access an RTSP stream, you will set up a new PDP context, which is a dedicated 64-kilobit PDP context. So that means you will have 64 kilobits for sure all the way. But that's not really there yet.

Then you have the same thing as on the GPRS network: network congestion, packet loss. It's much better than on the GPRS network. And still, most of the 3G networks are circuit switched, not fully like a packet-based network. So it's a packet bearer which runs on a circuit switched environment within the radio network. And circuit switched is basically point-to-point, you know, point-to-point connections. So that's what circuit switched is. And then you have X number of bottlenecks in the infrastructure, which you have to consider as well. For instance, on the Internet, you have companies like Akamai and those kinds of companies who place relay cache servers out in the network. That doesn't exist on radio networks to date, but that's also something that is coming in the next releases. So today you have bottlenecks, which you can think of as routers, which you have to take into consideration. They're not really routers; they're RNCs, radio network controllers, et cetera.

The most important, however, is the firewall. I don't know how many people have asked me about firewalls, because the problem is that usually operators and carriers don't open UDP access in firewalls, so you can't stream. And everyone's trying to stream, and you just can't do it. In Europe, they've started opening it up, and I know that in Asia they have it. A lot is opened up there, but it also depends on the business model. Do the operators allow third-party services which don't have a contract with the operator? So that is one of the major issues with streaming on these kinds of wireless networks.

So I'm going to talk about something called 3GPP. 3GPP is not really a standard. It's something called the Third Generation Partnership Project, which is where you have pretty much all the big telcos like Ericsson, Nokia, Motorola, DoCoMo, Vodafone, Hutchison, those kinds of companies. And they made a file format called .3GP. And then you can ask the question, why its own file format? In the beginning, it was actually using .mp4. Well, the answer is that in 3GPP, you use the AMR speech codec. And in MP4, you're allowed to have MPEG-4 AAC or CELP as the audio or speech codec, but you don't have AMR. So that's why they created their own file format called .3GP. But it's very similar to MP4, which is very similar to QuickTime. It builds on atoms, etc. It is very similar. It has some extra atoms to handle AMR.

In a .3GP file, you are allowed to have MPEG-4 simple visual profile level zero. I will explain a bit later what the difference is. You can have H.263 baseline, AMR narrowband, AMR wideband, and AAC low complexity, which is what you can have in MP4 as well. And you can also have timed text. That's what you can have in a 3GP file. SMIL 2.0 Basic is not really in the 3GP file; I just wanted to put it up because the players on the phones are supposed to be SMIL players. So the presentation language is SMIL 2.0 Basic, and in that you can reference 3GP files, RTSP links. I've heard a lot of questions about it, but 3GPP actually uses SMIL as the presentation language.

So what's the difference between 3GPP and ISMA? I assume many of you know about ISMA. Well, one of the things is this MPEG-4 simple visual profile level zero. In ISMA you have level one, and the difference between level zero and level one is basically nothing. Both are 64 kilobits per second, but on level zero you can only have one video object and on level one you can have four.

So this is just to minimize everything for phones. Level 0 was created by MPEG for 3GPP. So that's why it's simple visual profile level 0. Both have AAC, but when you do streaming, in ISMA you use generic packetizing, which in my personal opinion is much better. But in 3GPP they use LATM packetizing.

And then of course there's the H.263 baseline, and in ISMA you can have MPEG-4 from simple visual profile 1 to 3. I think they also put in advanced profile at this point. And the AMR codec. So this is what differentiates 3GPP and ISMA. And ISMA is like 3GPP, it's not really a standardization body, so that's why it's easy to compare them. They just point at different standards which they use.

So a quick talk about terminals. I say all support 3GPP. And that's not really true. But I'm from Europe, and in Europe it's very true. So let's start with Nokia. Nokia is the leading provider of handset terminals. And everyone tells me that they use RealPlayer. Yeah, they do. But the RealPlayer in the Nokia phone supports 3GPP. So you can play a .3GP file in a Nokia phone. You can, however, not stream 3GPP to it. You can only stream Real format.

But the new phone, the 6600, supports streaming of both 3GP and Real. So if you make a .3GP file, it will work in the Nokia phones even if it's the RealPlayer. Among the Sony Ericsson telephones, the P800 is a very popular phone, which is a GPRS telephone. They have the PacketVideo player, which is an excellent player. It supports 3GPP. It supports streaming according to 3GPP and ISMA. So that's why I'm also going to talk about it later. In QuickTime, for instance, you have streaming, but it's not streaming according to 3GPP, it's according to ISMA. But it will work in the Sony Ericsson telephone. And all use QCIF size, 176 times 144. And then on top of this, you have the Motorola telephones, you have the NEC telephones, Matsushita, et cetera, et cetera. And most of them in the world support 3GPP, and some of them also support proprietary formats like Real or Windows Media or something like that.

So, a quick thing about MMS. This is also something I saw on the QuickTime list. A lot of misunderstanding. MMS is a multipart MIME formatted message. It's like an email. It's basically an email you're sending, but it's to the phones. And you can have JPEG images, GIF images, AMR sound. You can have SMIL in it.

You can embed 3GP files. You can have RTSP links, etc. So that's what an MMS is. So when you make a 3GP file and you want to transfer it to the phone, it's not an MMS message you're sending. You're just transferring a file and playing it back. So an MMS is not a 3GP file. You can use a 3GP file in an MMS. And video messaging, which a lot of phones have today, is actually not MMS. From a technical point of view, it's more like just an email with an attachment. But that is moving more into standardization.
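To make the "it's basically an email" point concrete, here is a minimal sketch that uses Python's standard `email` package to assemble a multipart message holding a SMIL presentation, a JPEG, and a 3GP clip. It only illustrates the container idea from the talk; a real MMS is encoded and transported differently on the network side, and the file names, SMIL content, and placeholder bytes below are invented for the example.

```python
from email.message import EmailMessage

# Hypothetical SMIL presentation that references the other two parts by name.
SMIL = """<smil>
  <body>
    <par dur="5s">
      <img src="photo.jpg"/>
      <video src="clip.3gp"/>
    </par>
  </body>
</smil>
"""


def build_mms_like_message(jpeg_bytes: bytes, clip_bytes: bytes) -> EmailMessage:
    """Bundle a SMIL presentation and its media into one multipart message."""
    msg = EmailMessage()
    msg["Subject"] = "Goal clip"
    msg.add_attachment(SMIL.encode("utf-8"), maintype="application",
                       subtype="smil", filename="presentation.smil")
    msg.add_attachment(jpeg_bytes, maintype="image",
                       subtype="jpeg", filename="photo.jpg")
    msg.add_attachment(clip_bytes, maintype="video",
                       subtype="3gpp", filename="clip.3gp")
    return msg


# Placeholder payloads; real JPEG and 3GP data would go here.
message = build_mms_like_message(b"placeholder jpeg bytes", b"placeholder 3gp bytes")
print(message.get_content_type())
print([part.get_content_type() for part in message.iter_parts()])
```

The 3GP clip is just one part among several, which matches the distinction drawn above: an MMS can carry a 3GP file, but a 3GP file on its own is not an MMS.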

So you have trouble sometimes: when you make a clip on one phone and you want to send it to another, you might not be able to play it, or the phone might not even be able to recognize it, because it's a proprietary way of signaling what kind of message you get into the phone.

All MMSs are distributed by an MMSC, which stands for Multimedia Messaging Service Center. It's pretty easy. And that's pretty much like an email server. For all of you working in IT, it is an email server, basically. And when you get an MMS, you have something called a notification. And that is sent by the PPG, the Push Proxy Gateway. I think everyone knows what that is. Anyway, what it does is it sends an SMS to your phone saying you have an MMS, and you say okay, fine, you click OK, and then it goes to the MMSC and downloads it. So it's like a store-and-forward function, exactly like in an SMS.

So let's get into talking about content. Oh, so this is how it works. How to get content. Well, I see a lot of things with the operators and carriers. This is not really their business. But they're starting to look at it because they need to.

So many times they ask the wrong questions. I've seen companies, for instance, I was in a meeting with BBC, and they had come to BBC and said, we want to have streaming content. BBC didn't understand what they were asking about, so they gave them, oh, we have Real and Windows Media files. They came back and said, oh, what do we do with this? And said, oh, we have to use Nokia, because that's the only phone that supports Real. But what they should have asked is, what kind of content do you have? What can you give us? Because BBC has, I don't know, an enormous amount of MPEG-2, for instance. They do not have a lot in streaming formats. It's just about one percent of what BBC produces, for instance, that is in a streaming format. So that is usually one of the problems, that they ask the wrong questions. That's good for you to know if you're content creators: make sure you understand what they're actually asking for.

And another thing is internet content. You can't use that in the phones either. People say, but there's a lot of content on the internet, why can't I just use it in any phone? Well, pretty easy: it's not adapted for wireless. And if you just look at the size of movies on the internet, you realize that you can't get them into a mobile phone. And then everyone's talking about real-time transcoding, and it really doesn't work, because you would need like this kind of building to do transcoding in real time of these internet formats.

So production-wise, well, the productions are obviously not made for mobile phones. There are very few production companies out there which produce a football game to work on a cell phone. And it's basically like when you do encoding or compression for any kind of thing. When you work with these low bit rates, you want to avoid transitions. You want to have clean cuts. You don't want to have too much zoom.

You don't want to have too much movement. And on a phone, which has an even smaller size, it's even more true. And also, when they get content, sometimes it's not the production format. It's not like you can sit down and edit. You get like MPEG-2 or something, which usually requires pretty expensive equipment to work with. So it's not like you get a Motion JPEG or like a YCbCr file or something like that. You usually get a distribution format.

Rights are another interesting issue. Some operators have the rights to the video, the images, but they do not have the rights to the commentary. So that means that they have to take it in and put their own commentary on it. So they need to edit, they need to do voiceovers on the content coming in.

So, what I've seen today about how to drive the business is that the operator in the short term has to be the content creator. They have to take responsibility for getting the content, because it's hard to go to a content provider and say, I want to have content, I want you to put it in 3GP, etc., etc., because there's not that much money for them to make at this point. So the idea is that you migrate in the future from the operator being the content creator, supplier, whatever we call it, to actually get out to the real content creators. So they need to migrate the business over to the content suppliers. But that comes when you have a lot of users, of course, because then it's a business also for content creators.

I'm not going to talk too much about this. This is just to give you an example of how a streaming system looks at an operator.

But I'm going to do it very briefly. From the left, you have this CMS, which is the Content Management System. And then we have, which I'm going to talk about later, a product from us, the Compression Engine, which actually does the compression. So you put content into the Content Management System. It is automatically encoded into the kind of formats you'd like to have.

And then it's distributed to the stream servers and HTTP servers, et cetera. What usually differentiates an operator or carrier, in this case, from an internet provider, is that they have some kind of RTSP proxies or something like that. What the RTSP proxy does is it takes care of billing, etc. Because this has to be connected to a billing system. And still, I don't think there's any internet service which really makes money. So this is probably a new thing. So what happens is when I set up an RTSP call, the proxy will intercept that. It will go down, as you see, and check with the billing system to see, does this person have enough credit? Yes, he or she does. It also checks with the content management system to see, is it a unit price for the content, is it per packet? And from that it gets an OK or a NOK saying, yeah, you can play this content. And after the session is over it creates a CDR. The CDR is used for the billing, actually. So in the CDR it says that this person looked at this link for this amount of time, whatever it is, and then it comes on the bill. You also have all this kind of authentication. Down below you see you have AAA, that's the authentication service. So when you log in from a client, the network knows who you are and what access you have, etc. So it's a bit complex, the whole chain, but this is just to give you an idea of how it works for a wireless operator.

So, now we're going to talk a bit about creation. And that slide is not supposed to be there. Cool. This is about QuickTime. I'll do it twice then.
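Going back to the RTSP proxy flow described above (intercept the request, check credit, check the price, answer OK or NOK, write a CDR when the session ends), the sketch below models it with plain Python objects. Every name here, including `RtspBillingProxy`, `CallDetailRecord`, and the example URL, is hypothetical, and the sketch assumes a unit price per clip rather than per-packet charging.

```python
from dataclasses import dataclass, field


@dataclass
class CallDetailRecord:
    """What the billing system later turns into a line on the bill."""
    subscriber: str
    url: str
    seconds_watched: int
    charge: float


@dataclass
class RtspBillingProxy:
    credit: dict[str, float]        # stand-in for the billing system
    unit_prices: dict[str, float]   # stand-in for the content management system
    cdrs: list[CallDetailRecord] = field(default_factory=list)

    def authorize(self, subscriber: str, url: str) -> bool:
        """Return True (OK) or False (NOK) for an intercepted RTSP request."""
        price = self.unit_prices.get(url)
        if price is None:
            return False  # unknown content, refuse
        return self.credit.get(subscriber, 0.0) >= price

    def close_session(self, subscriber: str, url: str,
                      seconds_watched: int) -> CallDetailRecord:
        """Charge the unit price and record who watched what, and for how long."""
        price = self.unit_prices[url]
        self.credit[subscriber] -= price
        cdr = CallDetailRecord(subscriber, url, seconds_watched, price)
        self.cdrs.append(cdr)
        return cdr


URL = "rtsp://example.invalid/news.3gp"
proxy = RtspBillingProxy(credit={"alice": 5.0, "bob": 0.5}, unit_prices={URL: 1.0})
print(proxy.authorize("alice", URL))  # enough credit
print(proxy.authorize("bob", URL))    # not enough credit
print(proxy.close_session("alice", URL, seconds_watched=30))
```

Authentication (the AAA step) is left out; in this sketch the subscriber name is simply trusted.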

Okay. Small operations. You can have a workflow which is pretty straightforward. You have this content acquisition. Basically, you have like a DV cam or whatever it is. You use like Final Cut Pro or something. You capture it. You edit the file. You do the encoding of it into a 3GP file and you directly distribute it. With the QuickTime now released, which has 3GP support, any kind of third-party software can use this QuickTime API to create content. So even if you wouldn't use Final Cut Pro but maybe Avid or some other editing software, you will be able to create 3GPP-compliant content. So if you have QuickTime Pro, in that you can export as 3GP files, but you don't need to have it if you're working from an application.

So if you're in like iMovie, you can create a .3GP file directly from there. I will actually show that. That's why this isn't supposed to be here. So, now we're going to get into techie stuff about how to encode. So first of all, we're talking about low bandwidths, below 64 kilobits per second.

The most important thing is to have as high quality content coming in as possible. This is true for all kinds of encoding and compression. The best, the highest quality you can get, that's what you should start with as the source. Of course, use good quality codecs. You have to do some testing to see which kind of codecs you like. How does it work?

High quality pre-processing is just as important. You have to scale the pictures. You have to crop the pictures. You have to correct gamma, et cetera, et cetera. And if you have poor pre-processing, well, you don't get good quality. And also you have to reduce the frame rate. And, of course, you have to do that all the time. I mean, you reduce the frame rate whatever you do when you go to the Internet, usually. But on a wireless network, instead of maybe going to 12.5, if it's PAL, or 15, you might go to 6.25 or 5 or something like that. Because it's better to have better spatial quality in the frames instead of movement, because the screens on the mobile phones aren't that fast. So the kind of jerky motion that you're trying to reduce by having more frames is smoothed out anyway on the mobile phones. So it's better to have a lower frame rate and better quality in the frames themselves.
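The trade-off between frame rate and per-frame quality comes down to simple division: at a fixed bitrate, halving the frame rate doubles the average bit budget per frame. A small sketch, using the 40 kilobits per second video rate from the demo later in the session as an example figure and assuming 1 kilobit = 1000 bits:

```python
def average_bits_per_frame(video_kbps: float, frames_per_second: float) -> float:
    """Average bit budget per frame at a fixed video bitrate."""
    return video_kbps * 1000 / frames_per_second


# PAL source at 25 fps, then the reduced rates mentioned in the talk.
for fps in (25, 12.5, 6.25):
    print(fps, average_bits_per_frame(40, fps))
```

This is only an average; an encoder spends far more on keyframes than on the frames in between.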

You have bandwidth constraints, of course, and always use CBR, constant bit rate. I don't know how many of you really know exactly what it means. Constant bit rate is not flat. Constant bit rate means that you can fluctuate within a certain time frame, and that is set by the video buffer verifier. So if you set that to two seconds, you will have a window going over the file all the time, and within those two seconds, if you set it to 64 kilobits per second, it can never go over that. So that is how the video buffer verifier works. So that's very important. So we have a very, very constrained bandwidth. Drop frames.
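The sliding-window description of constant bit rate above can be sketched as a check over per-frame sizes. This follows the speaker's simplified picture (no window of the given length may exceed bitrate times window length); an actual video buffer verifier is specified as a buffer model in the MPEG standards, and the function name and sample numbers here are made up.

```python
def fits_cbr_window(frame_bits: list[int], fps: float,
                    target_kbps: float, window_seconds: float) -> bool:
    """True if no sliding window of frames exceeds the bit budget for that window."""
    window_frames = max(1, round(fps * window_seconds))
    budget_bits = target_kbps * 1000 * window_seconds
    running = 0
    for i, bits in enumerate(frame_bits):
        running += bits
        if i >= window_frames:
            running -= frame_bits[i - window_frames]  # frame that left the window
        if running > budget_bits:
            return False
    return True


# 5 fps, 64 kbit/s, two-second window: the budget is 128,000 bits per 10 frames.
keyframe_then_small = ([40000] + [8000] * 9) * 2  # fluctuates, but stays inside
bursty = [40000] * 4 + [1000] * 6                 # 160,000 bits in under a second

print(fits_cbr_window(keyframe_then_small, fps=5, target_kbps=64, window_seconds=2))
print(fits_cbr_window(bursty, fps=5, target_kbps=64, window_seconds=2))
```

The first sequence is far from flat, yet it passes, which is the point being made: constant bit rate constrains the window, not each individual frame.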

If you have an encoder which can drop frames to maintain bitrate, use that. Because this is the most important thing about wireless. You don't have any headroom to work with. And of course, natural keyframes if possible. That's the best way to do it. But also, if you can, set a maximum distance to a keyframe. Because that also has to do with error handling. If you lose a keyframe, you don't want it to take too long until you can recover the images. And, of course, error handling in the radio network. When we talk about streaming, you want to be able to set packet size, how many frames per packet, etc. And something called RVLC, reversible variable length coding. And that's very useful when you have congestion on a network, so the codec on the player side can itself rebuild sort of new pixels. So this is dependent on the players. You can use RVLC. And also, some operators have video telephony, and if they want to use video telephony to play static files, not live, then you usually need to use RVLC, because that's how the video telephony client expects to get the frames.

Also, as we talked about before, make short clips. Until capacity increases, it's about 30 seconds to a minute. That's a good file length. As we said before, new users are prioritized in cells when they're coming in. Speech is prioritized, et cetera. So make short clips. Because speech, as I said, has priority over all kinds of data.

So streaming versus download. Well, streaming for wireless: the pros are, of course, that memory capacity in phones is limited. So streaming in that sense is very good. There's less overhead in RTP than in HTTP, so you can give more bits to the media itself, which of course is very good at these low bit rates. It's lightweight DRM in itself, because you have to be very skilled to get into the phone, hack that, and download the video. On the other hand, why would you want to download a video of 64 kilobits? But that's one of the things. And it usually is viewable in about two to four seconds. Cons: yeah, of course, firewalls, same thing, firewalls, firewalls, and it's not supported by all phones at present. I think most of the phones by the end of this year and the beginning of next will have support for streaming.

Download, well, it's supported by all phones, so that is a good thing, of course. The cons are that most of them don't support progressive download, which might seem a bit strange, but that's just a fact; they don't support interlaced content. So it needs to be fully downloaded before you can start viewing the pictures, images, or video, sorry. And you need large storage on the phone. By computer standards it's not anything large, but you still need to have the storage. So, this is the thing everyone tells me: but the quality of download files is usually much better. Because you're not constrained by bandwidth limitations, you can have a higher bitrate because it's just download time. Well, yeah, that's true for the internet, and it's in theory true for wireless as well. But the download time, well, if you have breaking news and you have to wait four minutes to get the breaking news, it's not breaking news anymore. But one of the more important things is that you pay per packet.

So whether it's a download or a stream, you pay for the number of packets you send. And then people say, well, it's usually a unit price sometimes. Well, the operator has a cost for every packet, and this is what drives it. If you can make content at a low bandwidth good enough that the user wants to pay for it, and it costs as little as possible for the operator, that's when you're going to get business, because it costs the operator. So you have to treat download as a stream. You don't have to be as constrained.
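The "treat download as a stream" advice is easy to check with arithmetic: the size of a clip is bitrate times duration, and both the wait and the operator's packet cost scale with it. A sketch, assuming the practical 30 kilobits per second GPRS figure quoted earlier and ignoring protocol overhead:

```python
def clip_size_kbit(encoded_kbps: float, duration_seconds: float) -> float:
    """Total payload of the clip in kilobits."""
    return encoded_kbps * duration_seconds


def download_seconds(encoded_kbps: float, duration_seconds: float,
                     network_kbps: float) -> float:
    """Time to fetch the whole clip when progressive download is unavailable."""
    return clip_size_kbit(encoded_kbps, duration_seconds) / network_kbps


# A 30-second clip over a 30 kbit/s GPRS link at three encoding rates.
for encoded_kbps in (30, 64, 300):
    wait = download_seconds(encoded_kbps, 30, network_kbps=30)
    print(encoded_kbps, "kbit/s ->", wait, "seconds to download")
```

At 300 kilobits per second the 30-second clip takes five minutes to arrive and carries ten times the data of the 30 kilobits per second version, which is the cost the operator ends up paying for.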

But don't think, I can do 300 kilobits per second because I want to have better quality. Because the operator or your customer, whoever it is, will come and knock on your head and things like that. So this is very important to think about. So this is where the QuickTime thing was supposed to be. So let's get over to this machine, which is laptop number one. Do I change or do you change?

What do you want me to do? So that's how it works. Let's take a look at QuickTime then. I have a small file here somewhere. This is just a little video. This is Motion JPEG, 30 seconds, with 48 kilohertz PCM sound. So, if I want to export it into a 3GPP file, I go in, Movie to 3G, as always, I mean, like you always do, and put it on the desktop. And I have the options. So, on the file format, I have something called Mobile MPEG-4, and this is basically done for DoCoMo. And I don't know that much about this, so Apple has to answer questions about it, but this is basically done for DoCoMo because they have an MP4 file format, which is basically where you can have AMR and things like that. So, now we're going to go to 3GPP. 3GPP Release 5.1 means that you can have timed text in it. With 4.3 you can't. So basically I don't have text right now, so I can choose whichever, but I choose the 5.1. I choose the size, and in this case it's the QCIF format, because 3GPP says that you should use QCIF, which is 176 times 144, or sub-QCIF, which is 128 times 96. One interesting thing here is that the QCIF aspect ratio is 11:9 and sub-QCIF is 4:3, and people usually get that wrong. And on the audio track here, I say I want to do speech. So, video: I have my MPEG-4 codec. I could use H.263, for instance, but I choose this one, and I do it at 40 kilobits a second. I use 12.5 frames per second at this point, because it's PAL content, so it's 25, so that's why I'm using 12.5. I put a keyframe every other second.
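The aspect-ratio point is easy to check. This is a small sketch using the frame sizes quoted above; the "squeeze" calculation assumes a 4:3 source scaled to QCIF without cropping, which is an illustrative case rather than something shown in the demo.

```python
from fractions import Fraction

# Frame sizes as quoted in the talk.
FRAME_SIZES = {"QCIF": (176, 144), "sub-QCIF": (128, 96)}


def aspect_ratio(width, height):
    """Reduce width:height to lowest terms."""
    return Fraction(width, height)


for name, (width, height) in FRAME_SIZES.items():
    ratio = aspect_ratio(width, height)
    print(f"{name}: {width}x{height} -> {ratio.numerator}:{ratio.denominator}")

# Assumed case: a 4:3 picture scaled to fill QCIF without cropping.
# The picture ends up this fraction of its correct width.
squeeze = aspect_ratio(176, 144) / Fraction(4, 3)
print(f"4:3 source in QCIF is squeezed to {squeeze} of its width")

# Halving the PAL frame rate gives the rate used in the export.
print(f"PAL 25 fps halved: {25 / 2} fps")
```

If the source is 4:3, either crop it to 11:9 before scaling to QCIF or use sub-QCIF, so the picture is not distorted.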

And in this case it's audio for speech, and this is AMR narrowband you're using right now. You have fixed bit rates for it, as you can see when you drag it, and these are called modes, and this is mode seven, the highest one. And it's always mono; AMR only supports mono. It's always 8 kilohertz sampling. Frames per sample is for when you do streaming. And then we come to the things about streaming. Well, if I do streaming, basically this is QuickTime hinting. This is not 3GPP hinting, but it will work in a P800, for instance, and it will probably work in some other phones as well, because some of them support streaming according to ISMA. But in this case I don't want to have it streamed. So, okay, put it on the desktop, save, and off it goes. So, this you can use in QuickTime Pro, as I said before, but it is also accessible through the QuickTime API, so any kind of third-party software can use it to create this kind of content. I haven't spoken much about cropping and that sort of pre-processing, but you can do that as well, and that's how you build on top of the QuickTime API. And this is just an old 800 megahertz PowerBook, so don't worry about the speed. I think.
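For reference, the fixed AMR narrowband modes can be listed in a few lines. The talk only says that mode seven is the highest; the full table of rates and the 20 ms frame length below come from the AMR narrowband codec definition, not from the session.

```python
# AMR narrowband bit rates in kbit/s, indexed by mode 0..7.
AMR_NB_MODES_KBPS = (4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2)
FRAME_MS = 20            # each AMR frame covers 20 ms of speech
SAMPLE_RATE_HZ = 8000    # narrowband AMR is always 8 kHz mono


def bits_per_frame(mode):
    """Speech bits in one 20 ms frame (kbit/s times ms gives bits)."""
    return round(AMR_NB_MODES_KBPS[mode] * FRAME_MS)


def samples_per_frame():
    """PCM samples consumed by one frame at 8 kHz."""
    return SAMPLE_RATE_HZ * FRAME_MS // 1000


for mode, rate in enumerate(AMR_NB_MODES_KBPS):
    print(f"mode {mode}: {rate} kbit/s, {bits_per_frame(mode)} bits per frame")
print(f"{samples_per_frame()} samples per frame")
```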

So, here's the file. With the beautiful AMR sound. AMR is of course not made for this kind of rich audio. But that's what you have to live with sometimes. Yeah, so. Let's get back to the... slide. Am I doing something wrong? No? No, it is. No, it's not. So, what I showed you right now was a small operation. This is where you can use Final Cut Pro or that kind of editing tool and go directly to 3GPP files. You can use QuickTime Pro, if you have that, to take in files, edit them, do that kind of thing. When you get up to a medium-scale operation, when you have more content coming in from several sources, where you have several editing stations like Final Cut Pro, etc., then you have demand for larger sorts of tools.

So what I'm going to show you right now is actually a Popwire product called Compression Master, which isn't released yet. It will be released on the 15th of July. So this is the beta version, and all of you know what beta means. So, let's go over to the laptop.

Great. So this is the Compression Master. And we're going to start to do some files here. We're going to start with some settings, actually. So I've done some settings beforehand there. So I open up this and we go to 64 kbit. And in this, of course, you do the same thing. You choose the file format. You can choose a lot of others, but now we're talking about 3GP. So it's a 3GP file. I want to use MPEG-4 and AMR narrowband. I want to do streaming.

So let's move on to the video side. On the video, I put this at, well, for 64, maybe 42, probably something like that. I put the keyframe at 25, because this is still PAL in the format I'm going to use. Simple profile level 0 of MPEG-4. This is very important. The reason there's a button like this is that MPEG-4 encoders usually detect, depending on the bitrate, which profile level it is. But since level 1 and level 0 are basically the same, you need to be able to override that. And it's also because some operators will come to you and say, "I want to have 128 kilobits of content," but it still has to be simple visual profile level 0. So this is why I have this override thing. Then we have the buffer verifier, and I should actually set this to the buffer on the phone. So if you know that the phone has a two-second buffer, usually, if it's a 3G phone, it means that it can handle 64 kilobits a second, so you have to sort of calculate as well. But I set this to two seconds. The sliding window over the content will say that during this two-second sliding window it can never be above 42 kilobits per second. I set the frame skip probability, which in this case means dropping frames, because I want to maintain the bandwidth, so I put that pretty high, like 44 percent or something. And I can use things like RVLC, etc., but that's not what I'm going to do right now. I put the frame rate down to 6.25 frames per second because on a, I don't know, I can't say the brand, but it's a Swedish brand. And I choose the QCIF format. I can choose whatever I like, actually, but I use the QCIF format as it is now. I don't want to do any cropping on this file. And then I can do, you know, gamma corrections, all that kind of thing, but that's not really important now for the demo.
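The sliding-window constraint described here can be checked on an encoded clip with a few lines of code. This is a simplified sketch of the idea as stated in the talk (average rate over any two-second window stays under the ceiling); it is not the encoder's actual buffer verifier, and the frame sizes in the example are made up for illustration.

```python
def peak_window_kbps(frames, window_ms=2000):
    """Highest average bitrate over any trailing window.

    frames is a list of (timestamp_ms, size_bits) sorted by timestamp.
    Each window covers the interval (t - window_ms, t] ending at a frame.
    """
    peak = 0.0
    start = 0
    bits_in_window = 0
    for t_end, size_bits in frames:
        bits_in_window += size_bits
        # Drop frames that have slid out of the window.
        while frames[start][0] <= t_end - window_ms:
            bits_in_window -= frames[start][1]
            start += 1
        # Bits per millisecond is the same number as kbit/s.
        peak = max(peak, bits_in_window / window_ms)
    return peak


def fits_ceiling(frames, ceiling_kbps, window_ms=2000):
    """True if no window exceeds the ceiling."""
    return peak_window_kbps(frames, window_ms) <= ceiling_kbps


# Hypothetical frame sizes: one frame per second, then a burst.
steady = [(0, 40000), (1000, 40000), (2000, 40000)]
burst = steady + [(3000, 60000)]
print(fits_ceiling(steady, 42), fits_ceiling(burst, 42))
```

A check like this is useful when testing against a specific handset: a file can have the right average bitrate and still exceed the phone's buffer during a short burst.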

Audio, same thing: AMR narrowband, 12.2 kilobits per second, sample rate, etc. I can do metadata, and I set it to stream, and I say five frames per second, five frames per packet, sorry. So now I've done my settings. We can take a look at some others as well. This is like a 128. But the important thing here is the bandwidth constraint. All I'm doing is trying to keep the bandwidth; that's the most important thing, which means sometimes I will get a lower quality, but that's actually a fact of life, so that's how it is. So let's put in a file, and let's do the same file as we did before. I add a setting to it. So I use the 64 and I put it on the desktop and off we go. And this is of course how you can do batches. You can put in a lot of files, a lot of settings. It's just about, you know, how many files do you want to do? And in this sense, you can build a small but very effective system. So you can have several editing stations coming in with, like, Motion JPEG, or you export it as native from Final Cut Pro and put it in here. You have the settings which you created, and you can keep working with them if you like and use the settings you've done. So while that one is going, let's get over to the slides. Oh, I forgot this one.
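The frames-per-packet setting trades latency against header overhead. The sketch below uses the 12.2 kbit/s rate and the five frames per packet from the demo; the 20 ms AMR frame length and the 40 bytes of uncompressed RTP, UDP, and IPv4 headers are standard figures rather than numbers from the talk, and the extra payload-format header bits that a real AMR RTP packet carries are ignored.

```python
import math

AMR_FRAME_MS = 20
HEADER_BYTES = 12 + 8 + 20  # RTP + UDP + IPv4, assuming no header compression


def packet_stats(frames_per_packet, mode_kbps=12.2):
    """Return (payload_bytes, packets_per_second, total_kbps) for audio."""
    speech_bits = round(mode_kbps * AMR_FRAME_MS) * frames_per_packet
    payload_bytes = math.ceil(speech_bits / 8)
    packets_per_second = 1000 / (AMR_FRAME_MS * frames_per_packet)
    total_kbps = (payload_bytes + HEADER_BYTES) * 8 * packets_per_second / 1000
    return payload_bytes, packets_per_second, total_kbps


for n in (1, 5):
    payload, pps, total = packet_stats(n)
    print(f"{n} frame(s) per packet: {payload} payload bytes, "
          f"{pps:.0f} packets/s, {total:.2f} kbit/s on the wire")
```

Under these assumptions, packing five frames per packet means 100 ms of speech per packet and far less of the channel spent on headers than sending one frame at a time.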

So, when you have a large-scale operation, then you need to have an automated system. And that's exactly what we built. So, with the Compression Master, you can export XML settings. The settings I showed you before can be exported, and you put them into something called the Compression Engine, which works with watch folders. So, when you've done this once, the content provider can put, like, MPEG-2s in their watch folder, and the Compression Engine will automatically start to encode them.
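The watch-folder pattern itself is simple to sketch. The code below is a generic illustration, not Popwire's implementation: `encode_with_settings` is a hypothetical stand-in that only writes a placeholder file, and the settings file name is made up.

```python
import os
import tempfile


def encode_with_settings(src_path, settings_path, out_dir):
    """Hypothetical encoder stand-in: writes a placeholder .3gp file."""
    base = os.path.splitext(os.path.basename(src_path))[0]
    out_path = os.path.join(out_dir, base + ".3gp")
    with open(out_path, "w") as out:
        out.write(f"encoded {os.path.basename(src_path)} "
                  f"with {os.path.basename(settings_path)}\n")
    return out_path


def scan_once(watch_dir, settings_path, out_dir, seen):
    """Encode every file in watch_dir that has not been handled yet."""
    produced = []
    for name in sorted(os.listdir(watch_dir)):
        path = os.path.join(watch_dir, name)
        if name in seen or not os.path.isfile(path):
            continue
        seen.add(name)
        produced.append(encode_with_settings(path, settings_path, out_dir))
    return produced


if __name__ == "__main__":
    watch_dir = tempfile.mkdtemp()
    out_dir = tempfile.mkdtemp()
    open(os.path.join(watch_dir, "news_clip.mpg"), "w").close()
    print(scan_once(watch_dir, "64kbit.xml", out_dir, set()))
```

A production system would call `scan_once` on a timer and would also need to wait until a file has finished copying before encoding it, which matters when sources arrive over a slow network.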

So, just to give you an idea of an average portal: if you look to the right, you have two news feeds, 10 times 30 seconds, financial information, traffic. It's not a lot of files you actually want to encode. If you look at it as a service, it's not that much. But it still generates a lot of data. So you have about 243,000 megs that have to be converted daily, and the turnaround time from when you drop it into the system should not be more than one minute. And then you need to have a pretty powerful system to do that. So this is a real-world example of how an operator works. They hired a post-production company, which has the Compression Master and creates the XML settings, so they work as a sort of, what do you say, quality control of how it's done. And they can also make things like trailers, of Nemo of course, very good film, and work with it directly, because it's a trailer.

You sit down and you work with it, you test it, and you make this into a 3GPP file and send it to the operator. But when we talk about automated services like news, sports, etc., what happens is that the broadcasters deliver it as MPEG-2 files, which they have on their Grass Valley servers or Quantel, or whatever it could be, and they take these MPEG-2 files, which they produce anyway, so they don't have to do any special production in their case.

They don't have to do any pre-formatting of files. They just drop them into the watch folders, and the compression engines on the right, which reside at the operator, automatically start to compress them into 3GPP files. The other thing is that they also take in files and edit them, and that's a tape transfer. So they take the tape, put it into Final Cut Pro, edit it, and then push it into the compression engines. And this is pretty cumbersome.

So the next step, what they're looking at, is to put a compression engine at the content provider, still having the post-production company deliver the XML file so you control what kind of settings are used. The production company drops their MPEG-2 directly into the compression engine, and it comes out maybe as a YUV file, so you can edit it in Final Cut Pro without going to tape, which saves a lot of time. Or it can go directly to a 3GPP file, so you will be able to remove that step. So that's how an operator works today. So right now, I'm going to show you, so let's go to the laptop, how this workflow actually goes.

So, in my Compression Master, right now I'm looking at these machines. I have a watch folder on these Xserves, which I see here. I've done two settings. What I can do is that I can, from here, export a setting. So I take this 128, I don't think I have that one. So let's export this one. So export it. Run the export. Yes.

So, that one as well. So, now I have this setting. So, now I'm going to move it over to the Compression Engine side; you just move it over. Ah, this is the demo gods. So now I have three settings in here. So now I can take any file from here when I've edited it and I can drop it in, and now it's a 100 megabit network, so it will take a while to do this.

So while I'm doing that, we can switch over to this machine, because I'm going to put in a lot more files, but it would take too long to do it from there. So let's go to the Xserves. Thank you. So right now here is my watch folder, and now I have some media files; I've copied this one. This is a 30-second clip. So, let's put this in.

So now I'm copying it over to the compression engines, and I don't know if you can see whether it has started working. It's still copying. So now the files are coming out, and this is done fully automatically on the Xserves. So let's step this up. Let's put in some MPEG-2s. Let's take all of these and put them in there as well. Oh, the network hates me. So, still the files are coming out here. So we can take a look at, for instance, this one. This is the MPEG-2 file, which is converted. Thank you. Audio only. So these are the files coming out of it. Sorry.

Sorry about that. So in this way, I can just have this system running all the time, and it does it fully automatically. And I think now it's a lot of files. Is it working? Does anyone see anything? One of them, oh, it's the network, sorry about that. Let's see what happens here. This is typical.

Well, anyway, I'm putting like 50 files into this system. So now it's starting automatically. And this is the kind of system that the operator needs to have when they're talking about automated processes. So they put in these sorts of clusters of engines, of Xserves, for instance, and they just drop in files, because the turnaround time has to be so fast. So they really need to have it working. Like, a one-minute clip should come out on the phone within maybe one to two minutes. So that's how fast it needs to go. And since the network isn't really doing what it's supposed to do, you won't really see this, unfortunately. Can you go to the... Yeah, you have it here. So now you see the files are coming up here. So this is like another Patricia McNeil video.

And as you see, this is done from MPEG-2, so the quality becomes pretty good if you put in really good content, because this is 64 kilobits total, so the video is about 40 kilobits a second at this point. So I think you get the idea of how the compression engines work, and you have even more files working. I think now most of them are up and running on the Xserves. So let's go back to the slides.
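The 64 kilobit budget from the demo works out as follows. This is just the arithmetic on the figures quoted in the session (64 total, a 42 kbit/s video ceiling, 12.2 kbit/s AMR audio); how the remainder is actually consumed by packet headers and container overhead is not stated in the talk.

```python
def headroom_kbps(total_kbps, video_kbps, audio_kbps):
    """Bandwidth left after video and audio, in kbit/s."""
    return round(total_kbps - video_kbps - audio_kbps, 1)


TOTAL_KBPS = 64.0   # target rate of the demo setting
VIDEO_KBPS = 42.0   # video ceiling set in the demo
AUDIO_KBPS = 12.2   # AMR narrowband rate used in the demo

left = headroom_kbps(TOTAL_KBPS, VIDEO_KBPS, AUDIO_KBPS)
print(f"{left} kbit/s left for packet headers and other overhead")
```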

So, to conclude this. First of all, you need to understand the operator's business. You need to understand whether it's a flat price or a price per packet, and you need to understand whether the operator is actually competing on services, because today the operators will compete on what kind of services they have. So you really need to understand that.

You need to be fully integrated with the existing production environment, which means that you need to be fully integrated with what the TV producers are doing, like MPEG-2, whatever it is, so you can take those kinds of formats in, and with legacy systems as well, so you don't add any time for that kind of pre-formatting.

Time to show is very important. Breaking news has to be breaking news, because why should you pay for a service on a mobile phone if you won't get it before anyone else, if you can just go home and put on the TV?

The output format, I say, shall fully support the standard set by 3GPP, because that means that you will have the most coverage of all terminals, telephones, etc.

Use as high-quality source material as possible, because that's the only way you can get really good quality at low bit rates. So don't use pre-compressed content from the internet or whatever it is.

Understand the network constraints. Learn to understand how much bandwidth you have to play with, and how the operators prioritize the services. This is very important to know. Do they prioritize speech? How much? Do you get three time slots when you get in, etc.?

Of course, and most obviously, optimize compression for distribution in wireless networks and for display on mobile devices. Streaming versus download, same thing there: treat download exactly as a stream, because the operator has a cost for every bit. And you definitely need to be the expert, because this is new for operators. Even if you don't work with the whole chain, you need to understand how the streaming server works, you need to understand maybe how the RTSP proxy works, you need to understand all the way. So be the expert, help them out, because that is one of the major problems. And the most important thing is: test. I've heard so many people say, well, we did a clip and we tried it in a phone and it worked.

And that's all they do. This is completely new for all the people involved in doing these kinds of services. So you really need to test, to try it out, to see how does it look, what kind of content can I have, how does it work, et cetera. That's the only way to do it. So, wrap up, which means you.

I'd like to thank Kay Johansson very much for his presentation today.

Next, we usually go through a roadmap slide. Unfortunately, session 721, which was going to follow this session in the room next door, has been canceled. So I encourage you to go to session 722, Advanced QuickTime Interactivity, which actually is in this room. Then on Thursday, we have a variety of other QuickTime sessions.

And, of course, our feedback forum, which is going to be at 3:30, and it's now in the Marina Room. We also have a couple of sessions on Friday as well. If you have any questions about QuickTime development, please contact Guillermo Ortiz, who is our QuickTime technology manager, at QuickTimeMan at Apple.com. Thank you.

For more information, we have a list of websites for you. First of all, there is a lot of information about 3GPP on the apple.com website. Popwire Technology can be found at popwire.com, where you can look at all the products that Popwire has created for this industry. Obviously, there's some general information on our website about QuickTime, developer information, and so forth.

Some other things not to miss, hopefully you've seen these a couple of times by now: we have a QuickTime content creation and QuickTime development lab. The lab is open until 6 p.m. tonight, then Thursday only until 4:30, and also on Friday.