Graphics & Media • iOS, OS X • 50:12
HTTP Live Streaming lets you deliver video using HTTP from a standard Web server. Gain a practical understanding of the details behind deployment of live and on-demand streams. Learn how to design for mobility and the best practices for delivering video into your application or on the web.
Speakers: Roger Pantos, Eryk Vershen
Unlisted on Apple Developer site
Downloads from Apple
Transcript
This transcript has potential transcription errors. We are working on an improved version.
Welcome to this year's HTTP Live Streaming Session. My name is Roger Pantos, and I work on our streaming client and also I am one of the people who defined the protocol and continues to do that. So it was exactly one year ago that I stood up here at WWDC and introduced this new thing called HTTP Live Streaming.
And how are we doing? Well, apparently a lot of people have been waiting for a simple, standards-based way to cost effectively broadcast live video to wireless devices. People are jumping on this. We started with our launch partners, Major League Baseball, CNN, Akamai, Inlet, Envivio. And since then we have seen everyone from top-tier networks like ABC, Sky, to start-up companies use this technology to produce incredible applications, incredible.
We worked with some of the most innovative companies in media. Folks like Netflix and they have helped us focus on the most important aspects of delivering their content, your content to mobile devices. Over the past year we have refined our protocol and our implementation, the protocol, our implementation of it.
And our goal has been to give you the tools that you need to take your application from incredible to mind blowing App Store chart topping oh my God I am living in Star Trek!
[ Laughter ]
[ Applause ]
And so today what we're going to do is give you a quick overview of how the technology works and walk you through the new things that we've added to it in iOS 4.0. And to do that what I would like to do is introduce our media technology evangelist for HTTP Live Streaming. His name is Eryk Vershen. Eryk.
[ Applause ]
Thanks Roger.
Good morning everybody! So it's been a busy year and I want to talk, my talk is going to cover four main things. First of all we are going to have a technology walk-through for those of you who may not be that familiar with the technology or haven't been looking at the details recently, then we are going to talk about the new features and functionality that we've added over the past year. Some of that was added early in the year. Some of that's been added very recently.
And I'll have Roger back up for a demo and we'll talk about the tools that we provide so that you can create your own HTTP Live Streams. And then lastly, I will talk about some tips and tricks. So let's get started. Now, before we get started with the technology walk-through I want you to think about how we do streaming with HTTP. I mean, HTTP doesn't do streaming, right? It just delivers discrete files. So essentially what we're doing is we're turning your long movie or your live content into discrete files.
Now, you've probably seen this slide last year or some version of it, workflow or architecture now. Whenever we're doing-- converting video, what we are going to do-- we have to initially have some video. Now, there is two disparate things here. I might have live material or I might have some video on demand material. And the workflow is slightly different between those two.
So we're going to start out with an audio/video source that might be in the case of live that might be an SDI connection that's coming in to my desktop machine, or in the case of video on demand I have already got a movie file around on my file system.
Now the first thing we have to do is run it through a media encoder. That means turning it into H.264 and AAC so we can play it on an iOS device and it also means wrapping it in an MPEG-2 Transport Stream. Now once we wrapped it in an MPEG-2 Transport Stream we are going to pass it into our segmenter. And what the segmenter is doing is it's chopping that movie into discrete chunks of roughly equal length.
Now the segmenter is also going to create at the same time a playlist that lists those segments, and it's going to put those files somewhere that the web server can see them. Now, there is nothing special about the web server in this context. It is an ordinary web server. The web server would make it available in the cloud and so you can download it on your devices.
Now remember I talked to you about, we've got these two different worlds, either I've got a live stream or I've got video on demand. In the case of live streaming that first part is going to happen at the same time as I am serving it out. I am going to be continually getting new content.
I am going to have to create new segments and serve those out. But in the case of video on demand, I'll have done that initial phase once and I'll be serving it out repeatedly. Now, the center of this is segments and playlists. So, I'm going to be creating segments, here let's imagine I've got several segments that are 10 seconds long, and you'll see that 10-second number a lot because that's the number we tend to recommend for the length of a segment. Now, segments by themselves aren't any good. I need to have a playlist and that playlist has to point at the segments.
And the playlist is really the centerpiece of how the client finds out about the material. So the playlist does several things. First of all, it lists the segments in playback order and if I am doing a live stream this defines the playback window. That is, in a live stream I'm not necessarily showing you everything because in fact the stream might exist 24/7. I can't have all that content around all the time. So I am going to give you a window into the content that rolls along.
Now, you are going to want to protect your content in some cases and so we can encrypt the segments, but in order for the client to decrypt them it has to know how to-- what key to use. So we have to have in the playlist the key associated with this encryption.
And an important feature of HTTP live streaming is that I can adapt to different bit rates. So as my network characteristics change I can go ahead and the client can fetch a version that's suitable for that bit rate. And so the playlist has to be able to define multiple variations of the content.
Now first, let's look at what's a fairly straight forward playlist, a video on demand playlist. Now, when you look at this the main thing to notice is it's just a list of URLs. Those are URLs of segments. Now, initially they can be absolute URLs but generally speaking you're going to want to make them relative URLs. That's more portable, it's just going to work better.
Now the other lines in the file, you know, they've all got a hash mark in front of them and that makes them comments. But if those comments start with a word that we recognize they become tags that actually affect what is going on. Now the first tag the EXTM3U tells the client what the format of the file is and at the moment we only support the one format so that is always going to be the first tag in your file.
The second tag, TARGETDURATION indicates the maximum duration of any segment in seconds, 10 seconds here. And you'll notice that there is this EXTINF tag in front of each segment indicating the duration in seconds of that segment. In this case they are all 10 segments, 10 second segments, but they could be shorter.
The next tag to notice is the MEDIA-SEQUENCE tag. This indicates the sequence numbers associated with the segments that will become more important when I show you a live playlist later on. Only thing I want you to notice now is that sequence number does not have anything to do with the filenames associated with the segments.
The last tag I want you to notice is the ENDLIST tag. Now, the ENDLIST tag indicates to the client that this playlist is complete. It is never going to change. So the client is going to fetch this playlist and it knows-- it never changes. It is complete. Now, if I am doing live I can't give you a complete playlist. So here is an example of a live playlist and it looks a lot like my video on demand playlist except there is no ENDLIST tag and when there is no ENDLIST tag the client knows that he has to re-fetch this periodically.
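As a rough illustration of the tags just described, the sketch below builds a video on demand playlist and checks for the ENDLIST tag. The file names and the helper names `build_vod_playlist` and `is_complete` are assumptions made for the example, not something shown in the session.

```python
import math


def build_vod_playlist(segments):
    """Build a video on demand playlist.

    segments: list of (uri, duration_in_whole_seconds) in playback order.
    The URIs are relative, as the talk recommends.
    """
    # TARGETDURATION is the maximum duration of any segment, in seconds.
    target = max(math.ceil(duration) for _, duration in segments)
    lines = [
        "#EXTM3U",                          # always the first tag
        f"#EXT-X-TARGETDURATION:{target}",
        "#EXT-X-MEDIA-SEQUENCE:0",          # unrelated to the file names
    ]
    for uri, duration in segments:
        lines.append(f"#EXTINF:{duration},")  # duration of the next segment
        lines.append(uri)
    lines.append("#EXT-X-ENDLIST")          # playlist is complete, never changes
    return "\n".join(lines) + "\n"


def is_complete(playlist_text):
    """True if the client never needs to re-fetch this playlist."""
    return "#EXT-X-ENDLIST" in playlist_text.splitlines()


# Hypothetical segment names; the last segment is shorter than the target.
vod = build_vod_playlist([
    ("fileSequence0.ts", 10),
    ("fileSequence1.ts", 10),
    ("fileSequence2.ts", 8),
])
print(vod)
```

Dropping the final `#EXT-X-ENDLIST` line from the same text is what would make a client treat it as a live playlist and re-fetch it.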
Well, how often? TARGETDURATION is going to indicate that. Although if I have, one of my segments is shorter that is going to vary that a little bit. And let us say that I've highlighted the sequence number here because the client knowing he has to re-fetch this also implies that the server has promised that he is going to update it.
So let's imagine that I pulled the next time. Now it is up to it, it knows my sequence has changed and the set of files that are in the playlist have changed. Let's do it again. OK, so now my sequence number is 3 and the list of files have changed.
And what we're giving you is a rolling list of the content. Now, the sequence number has to stay consistent between one version of the playlist that I download and the next version. So every time I get that playlist, if I am dropping a file off the front, I've got to bump that sequence number to keep it consistent.
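The rolling window can be sketched in a few lines. This is a minimal model under stated assumptions: the class name `LivePlaylist`, the window size, and the segment names are all invented for illustration and do not describe Apple's segmenter.

```python
from collections import deque


class LivePlaylist:
    """Rolling window over a live stream (hypothetical helper)."""

    def __init__(self, window_size, target_duration=10):
        self.window_size = window_size
        self.target_duration = target_duration
        self.media_sequence = 0   # sequence number of the first listed segment
        self.segments = deque()   # (uri, duration) pairs currently listed

    def add_segment(self, uri, duration):
        self.segments.append((uri, duration))
        # Each segment dropped off the front bumps the sequence number,
        # which keeps successive versions of the playlist consistent.
        while len(self.segments) > self.window_size:
            self.segments.popleft()
            self.media_sequence += 1

    def render(self):
        lines = [
            "#EXTM3U",
            f"#EXT-X-TARGETDURATION:{self.target_duration}",
            f"#EXT-X-MEDIA-SEQUENCE:{self.media_sequence}",
        ]
        for uri, duration in self.segments:
            lines.append(f"#EXTINF:{duration},")
            lines.append(uri)
        # No ENDLIST tag: the client knows it must re-fetch periodically.
        return "\n".join(lines) + "\n"


live = LivePlaylist(window_size=3)
for name in ["a.ts", "b.ts", "c.ts", "d.ts", "e.ts"]:
    live.add_segment(name, 10)
print(live.render())
```

The segment names here deliberately carry no numbers, to underline that the sequence number is independent of the file names.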
Now, we can keep doing this as long as we want and naturally I am not limited to having only 5 or 6 segments in the playlist. I could have 10, I could have 100, I could have 500. When I have that on the live playlist the client is going to come in and the client is going to start by default, near the end of that playlist. But the client can seek around and so what I am giving you is a window into what's happening as it goes along. Now there is a third kind of basic playlist which is what we call an Event Playlist.
Now the difference with an Event Playlist, it looks a lot like, to start out it looks like a Live Playlist. It does not have an ENDLIST tag which means the client knows he's got to fetch it again. This time when the client fetches it we're just going to add a segment on to the end. We're not going to get rid of the segments on the front. We are going to keep on adding segments. We can do this again as long as we want until at some point it is over.
Now what would I do with an Event Playlist? Well, for an event, for a sporting event, for a rock concert something like that I'd want to deliver it while it was live but I'd want it to complete at the end, alright. So once the client sees that ENDLIST the client knows that this playlist is done, I do not have to fetch it anymore. Now, if I am delivering an event like that I really want to protect my content so I'm going to want to turn on encryption.
So here's a playlist that has some encryption and what we've added is this key tag. Now, the key tag indicates the method of encryption we're using. At the moment we only support AES-128 although you can, if you have a portion of your content that is not encrypted, you can switch out of encryption by specifying the encryption method as being NONE.
Now the other point to notice here is the URI. The URI indicates where the client should go to fetch the key that he needs in order to decrypt the segment. Now, this key is going to apply to all subsequent segments until a point where I specify another key.
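The rule that a key applies to every following segment until another key tag appears can be sketched as a small parser. The key URLs, segment names, and function names below are assumptions for illustration only.

```python
import re


def parse_attributes(text):
    """Parse NAME=value pairs; quoted values may contain commas."""
    pairs = re.findall(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)', text)
    return {name: value.strip('"') for name, value in pairs}


def keys_for_segments(playlist_text):
    """Return (segment_uri, key_uri) pairs; key_uri is None if unencrypted."""
    current_key = None
    result = []
    for line in playlist_text.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-KEY:"):
            attrs = parse_attributes(line[len("#EXT-X-KEY:"):])
            # METHOD=NONE switches encryption off for what follows.
            if attrs.get("METHOD") == "NONE":
                current_key = None
            else:
                current_key = attrs.get("URI")
        elif line and not line.startswith("#"):
            result.append((line, current_key))
    return result


# Hypothetical playlist: two keys, then an unencrypted tail.
SAMPLE = """#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXT-X-KEY:METHOD=AES-128,URI="https://example.com/key1"
#EXTINF:10,
seg0.ts
#EXTINF:10,
seg1.ts
#EXT-X-KEY:METHOD=AES-128,URI="https://example.com/key2"
#EXTINF:10,
seg2.ts
#EXT-X-KEY:METHOD=NONE
#EXTINF:10,
seg3.ts
#EXT-X-ENDLIST
"""

for uri, key in keys_for_segments(SAMPLE):
    print(uri, key)
```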
At this point, the client's going to go and fetch that key and that's going to apply to subsequent segments going on. Now, at this point we've only been talking about one kind of, you know, one example of a playlist, it's only got one data rate. We want to be able to handle variants. We want to be able to handle multiple data rates at the same time. So what's a variant? A variant is a version of the stream at a particular bit rate.
Now, each variant is in a separate playlist and what we call the Variant Playlist or the Master Variant Playlist describes all of the variants that I have available. Now the client is going to be given that variant playlist and the client is going to switch based on the measured bit rate-- on the bit rate that it is actually seeing over the network which variant he should play and the client's player has been tuned to minimize stalling playback. We want to give the user a good experience. We don't want him to have the video drop out if we can help it.
Now, I've got a nice picture here of a variant playlist. Let us imagine that I have a playlist at some kind of a medium resolution and bit rate, and let's make two more variants available, one at a lower bit rate and one at a higher bit rate. Now, what I do is I create a variant playlist that points at the individual playlist. I hand that to the client and the client can play it.
Notice I put the audio here and the audio hasn't changed size. The reason I am doing that is you want your audio to be identical between all the streams and I mean identical, not just the same bit rate and sample rate but actually the very same audio. And the reason you want to do that is if you don't do that you can get pops and clicks when you switch between streams.
It is a bad user experience and you'd rather, the best way to go about this is to make the audio completely consistent between the variants. Now, here is an example of what a variant playlist looks like. It looks a lot like another playlist. It is just a list of URLs, but in this case those URLs are to other playlists. And instead of being preceded by that EXTINF tag that we saw with each of the segments it is preceded by the STREAM-INF tag.
The STREAM-INF tag ties the individual variants together and in particular specifies the bandwidth that is the maximum data rate that this version of the stream can take. I want to call out two of these variants in particular. The first one, because the first one is the default, when the client picks up the variant playlist it's going to start out with the first stream.
The other one I want to call out is the last one, that's a 64 kilobit stream, audio only. You want to have a low data rate stream for fallback if you happen to be on cellular. You want to have something that the client can go down to that you're basically going to be able to serve no matter what happens.
Oh yes, one last thing I wanted to point out about variant playlists. The variant playlist, even though it does not have an ENDLIST tag, is not reread. Once you've read the variant-- the client has read the variant playlist, it assumes that the set of variations isn't changing. Now, if the individual variations are Live or Event Playlist, as soon as it sees an ENDLIST tag on one of the individual variants, as soon as it hits that point, that ends the stream. It's not the case that you could say, oh well, I just do not want to serve that bit rate anymore. I'll put an ENDLIST on it, it is like, no, these streams are all supposed to be the same content.
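To make the variant playlist concrete, here is a sketch that parses one and picks a variant for a measured bit rate. The selection rule is a deliberately naive assumption (highest bandwidth that fits, else the lowest); the talk does not describe the real client's tuning. Playlist names and bandwidth values are invented.

```python
import re

# Hypothetical variant playlist: the first entry is the default,
# the last is a 64 kilobit audio-only fallback.
MASTER = """#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1280000
mid/prog_index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=640000
low/prog_index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=2560000
high/prog_index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=64000
audio/prog_index.m3u8
"""


def parse_variants(master_text):
    """Return [(bandwidth_bits_per_second, playlist_uri)] in listed order."""
    variants = []
    pending = None
    for line in master_text.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-STREAM-INF:"):
            match = re.search(r"(?<![A-Z-])BANDWIDTH=(\d+)", line)
            pending = int(match.group(1)) if match else None
        elif line and not line.startswith("#") and pending is not None:
            variants.append((pending, line))
            pending = None
    return variants


def choose_variant(variants, measured_bps):
    """Naive rule (an assumption): best variant that fits, else the lowest."""
    fitting = [v for v in variants if v[0] <= measured_bps]
    if fitting:
        return max(fitting, key=lambda v: v[0])
    return min(variants, key=lambda v: v[0])


variants = parse_variants(MASTER)
print("default:", variants[0])
print("at 700 kbit/s:", choose_variant(variants, 700_000))
```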
Now, in terms of playback you can play back using Safari either on the web or mobile Safari and the best way to do that these days is with the HTML5 video element but in-- that's the wrong version it's supposed to say iOS. In iOS there are several possibilities, UIWebView which gives you something like HTML5 but also MPMoviePlayerController which has existed for a while and now in iOS 4 AVPlayerItem which is part of AV Foundation, allows you to play back HTTP live streams.
Now, let's start talking about new features. I've got four main new features that I want to talk about. The first is Stream Discontinuities. Streams aren't continuous. That is, they aren't always the same. I might be wanting to deliver something where I am delivering a set of different short movies, TV shows, or whatever.
And I'm going to be stitching those together into my stream that I am actually delivering and those might be encoded at different times. There might be variations in between them. So we've got to handle those discontinuities. We also want to provide metadata that goes along with the streams and we want that metadata to be associated with particular times.
The third thing is custom protocols. I will go into more detail when I talk a little bit later but basically this allows you to have a greater degree of control over how your keys are delivered to the client. I'll talk about performance improvements and then I have a few odds and ends I'll talk about after that.
OK, discontinuity. So let's say I have a whole bunch of movies that I am delivering and I really like to put some kind of bumper at the front of the movie. Some sort of idents or some sort of branding that indicates these are coming from my site. Now, how am I going to do that? Cause I won't have just three movies. I may have hundreds or thousands of movies. Well, I could take that bumper and I could merge it into the movie but then if I decided to change the bumper I am going to have to re-encode all those things in order to make it work right.
And further, I've used that space: if I've got a thousand movies, I've got a thousand copies of that bumper that I stuck on my server. What was the point of that? So it is a brittle solution. Now, you could say, well what if we just delivered the bumper as one movie.
You know, we'll play one movie and then play the next movie. And you think that might work but there is a problem with that, and that is that the client when you switch to a new movie forgets about what's going on, what it was getting in terms of data rate, because it doesn't know that essentially we're going to be getting this from the same place.
And so what will happen as I start playing my bumper I'll have a low data rate which will start out because we want to start out conservatively and make sure the client's likely to be able to read it. And then it is going to go up and then when I hit the end of the bumper I am going to go back to my movie and I have to start ramping up.
So I am going to get this break in quality. Now, further if I'm deciding to do these things in the middle, if it's TV shows and I am doing a station ident in between shows, then again I am getting these drops in quality as I go along. So we really want a different solution. Because our streams can change we can have timecode breaks. We can change the encoding parameters. So the solution is to let the client know that there is a change coming up. We do that with a discontinuity tag.
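A sketch of how separately encoded pieces might be stitched into one playlist with a discontinuity tag between them. The function name `stitch` and the file names are hypothetical; the 18-second bumper is modeled as one 10-second and one 8-second segment.

```python
def stitch(parts, target_duration=10):
    """Join separately encoded pieces into one playlist.

    parts: list of pieces, each a list of (uri, duration) segments.
    A discontinuity tag goes before the first segment of every piece
    after the first, warning the client that timecodes or encoding
    parameters may change at that point.
    """
    lines = [
        "#EXTM3U",
        f"#EXT-X-TARGETDURATION:{target_duration}",
        "#EXT-X-MEDIA-SEQUENCE:0",
    ]
    for index, part in enumerate(parts):
        if index > 0:
            lines.append("#EXT-X-DISCONTINUITY")
        for uri, duration in part:
            lines.append(f"#EXTINF:{duration},")
            lines.append(uri)
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


bumper = [("bumper0.ts", 10), ("bumper1.ts", 8)]
movie = [("movie0.ts", 10), ("movie1.ts", 10), ("movie2.ts", 10)]
stitched = stitch([bumper, movie])
print(stitched)
```

Because the bumper segments are referenced rather than merged into the movie, a single copy on the server can front every movie, and swapping it means editing playlists, not re-encoding.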
So here is an example of a stream that has a discontinuity tag in it. And I think that, well that is perfectly fine. OK we're done. Well, what if we want to encrypt it? There we go. So if I am encrypting, OK it still looks straightforward, what's the problem with this? The problem is the default Initialization Vector for encryption is the sequence number, and you're going, I know you're going What? What's an Initialization Vector and why do I care? OK, what is encryption trying to do? Encryption is trying to make your data, which is definitely not random, look random.
And the problem is at the beginning of a segment it's hard to make it look random. It's just not that easy. And what Initialization Vector does in essence is make the beginning of the segment look more random. Now, ideally an Initialization Vector should be a random sequence of bits that changes often enough.
So, OK our default Initialization Vector for encryption is the sequence number. So what are our sequence numbers? 0,1,2,3. OK, so where's the problem here? Well, the first problem is, I've got this bumper, right? And what if I decide to change the bumper? Right now it's 18 seconds. What if I decided in the future to make that 22 seconds? Then it's going to take three segments, right? So the sequence number for the movie is going to start out at 3 instead of 2 and now I'd have to re-encrypt all my movies. Well, that's not good. The other problem is these aren't really outstanding Initialization Vectors. They've got lots of zeros in them.
The solution is to add an attribute to the key that allows us to specify an Initialization Vector which is a 128 bit number. And that Initialization Vector is going to apply to all the subsequent segments until I specify another Initialization Vector. Now those of you who were looking closely might have noticed that there was a new tag in these playlists, the VERSION tag.
And the VERSION tag, we have to add the VERSION:2 because the Initialization Vector is not compatible with the previous version, the old client wouldn't be able to understand Initialization Vector. So we have added the VERSION tag and that is required. Now, you can put the VERSION tag, you can put a VERSION tag with version number 1 into your playlist and that will be fine with an old client because old clients, if they don't recognize the tag it just becomes a comment.
There is one other point I wanted to make about Initialization Vectors and that's that you can continue to specify the Initialization Vector. You can re-specify it with the same encryption key that you already have. So in this case, I've specified a new Initialization Vector starting with a third segment and another one with a fourth. Now notice that it's the same encryption key. Now, we're not going to re-fetch that encryption key.
If we look at the URI and see it's a URI that we already know about, we are not going to re-fetch it. And that also holds true across, across multiple variants. If I have a playlist and I am using the same encryption key on the variant when I switch to a different variant, if I've already seen the key I don't have to re-fetch it.
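The difference between the default Initialization Vector and an explicit one can be shown with a short sketch. It assumes the default is the sequence number written as a 128-bit big-endian value, which is why small sequence numbers are mostly zeros. The key URL and the helper names are illustrative, and `os.urandom` is just one possible source of random bits.

```python
import os


def default_iv(sequence_number):
    """Default IV: the segment's sequence number as a 128-bit big-endian value."""
    return sequence_number.to_bytes(16, "big")


def random_iv():
    """An explicit IV drawn from the operating system's random source."""
    return os.urandom(16)


def iv_attribute(iv_bytes):
    """Format 16 bytes as the hexadecimal value used in the key tag."""
    return "0x" + iv_bytes.hex().upper()


def key_tag(key_uri, iv_bytes):
    """Key tag carrying an explicit IV (needs #EXT-X-VERSION:2 in the playlist)."""
    return f'#EXT-X-KEY:METHOD=AES-128,URI="{key_uri}",IV={iv_attribute(iv_bytes)}'


# Sequence number 3 yields an IV that is almost entirely zeros.
print(iv_attribute(default_iv(3)))

# With an explicit IV the movie no longer depends on its sequence number,
# so lengthening the bumper does not force re-encryption.
print(key_tag("https://example.com/key1", random_iv()))
```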
Now, Timed Metadata. So if you were at the, the graphics and media state of the union yesterday, they would have talked about synchronized metadata. Well we're talking about the same thing when we say Timed Metadata. The reason I'm saying Timed Metadata is that's what, what we call it in the code. And what's Timed Metadata? So it is metadata so it is data about the video and it's timed. It occurs at a specific movie time.
We want to communicate this info about a specific moment in time and we want to communicate it to our particular player, our dedicated player app. This is not totally generic in the sense that I can just send arbitrary metadata and an arbitrary client will be able to understand it. I can say well, why do we have to add this into the movie? We could do it as an independent channel. I can do that already.
It is like, well you could but it's kinda hard to synchronize if I am getting this through another TCP connection or something, it becomes harder to rewind, to seek, to replay that stuff properly because now I've got to seek in these two independent channels. So we add a time stamped information stream into the movie and I'll give you an example. So, here we've got our movie playing and some metadata is coming along with it. Now, it doesn't have to be text like we're seeing here. It could be just a number like 92 miles per hours or it could be a picture.
And we already use this to time stamp an audio only stream. We also use it to add pictures to an audio only stream. But now we're making it available in iOS 4 to your apps as well. So what can you do with it? Besides text, you can use images to overlay.
Maybe I've got a bug, you know, a station ident bug that I want to put over my stream and I don't want to actually encode it into the movie. Maybe I've got text to display like we saw. I could even use this to do subtitling although it's not ideal for that.
I can use it to mark points in the movie, things of interest, chapters, or other things I could mark where I am doing insertions, where my bumper or where an ad occurs. Or to give you a more complicated example I could use this when, say, filming a lecture like this, and I've got a camera on me and I've also got the slide deck.
Now, I could feed the slide deck as discrete pictures and I could add metadata along that said whether I should display the slide by itself, the slide with me as a picture in picture, or me as the main screen and the slide is a picture in picture. So there's a lot of flexibility here.
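One way to picture the lecture example: the app keeps the timed cues it has received, sorted by movie time, and looks up the one in effect at the current playback position. The cue times, layout names, and function name below are invented for illustration; they are not part of any Apple API.

```python
import bisect

# Hypothetical cues: (movie time in seconds, layout to show from then on).
CUES = [
    (0.0, "speaker"),
    (12.0, "slide"),
    (45.5, "slide_with_speaker_pip"),
    (90.0, "speaker_with_slide_pip"),
]
CUE_TIMES = [time for time, _ in CUES]


def layout_at(movie_time):
    """Return the layout in effect at movie_time, or None before the first cue."""
    index = bisect.bisect_right(CUE_TIMES, movie_time) - 1
    return CUES[index][1] if index >= 0 else None


for t in (5.0, 12.0, 60.0, 120.0):
    print(t, layout_at(t))
```

Because the cues are keyed to movie time, seeking or rewinding only requires repeating the lookup, which is the synchronization benefit of carrying the metadata inside the stream.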
Now, in order to provide the metadata we're using ID3 tags. Pretty well known standard. And this exists in the movie as a separate elementary stream. So in our MPEG-2 transport stream it's a separate elementary stream-- except when we are doing an audio only stream. Then it's actually piggy-backed into the audio stream. Now, I can add this with one of our tools mediafilesegmenter and with mediastreamsegmenter. I'll talk about that more later on. And it's supported starting in iOS 4 in both MPMoviePlayerController and in AVPlayerItem. And you find this using the timedMetadata property.
Now, we also had some things with the encryption keys. Now, it's tricky to get certificates to work right, especially on iOS and a number of our clients wanted more secure key delivery than just HTTPS. So we decided to add private protocols for keys. Now, how does that work? OK, we're using a custom URL scheme. Some of you may have used this in some of your apps for other reasons. It uses the NSURLProtocol class and if you're not familiar with it, the URL Loading System Programming Guide gives you a good explanation.
In terms of the way it looks in a playlist, it looks like this. There is my key and I've got, I just specify my protocol and what happens is the player, this framework when it sees that my protocol, it says OK, I'll go and ask the app and your code is responding and say, oh yeah, yeah I know how to handle my protocol and you go off and fetch that key however you want.
Now, the only gotcha there is because you are giving that key to us, right? That key is going to be one of these 128-bit integers, so you've got to abide by the rules. You got to give us the same key every time or at least that key has to apply to subsequent segments just like it would if it was a file being fetched.
Now, we've made a number of performance improvements. In particular, faster stream switching. So when I want to pop up to a higher bit rate because my network's gotten better that transition happens much faster in iOS 4 than it did previously. We also get faster startup, in that the movie starts up initially much faster when you have a fast connection.
We also added, because some of our clients who were delivering really long video on demand playlists found that the playlist was taking a little bit longer than they liked to download. So what we did was we added support in the client to un-gzip compressed playlists and you can turn that on in your server.
For example if you're using Apache, you just turn on the mod_deflate module. [Noise] OK, so now I've got to my odds and ends. Failover is the first one. Now when you're delivering a variant playlist, you're not required to just supply one variant at each bit rate. You can supply multiple at each bit rate.
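To get a feel for why compression helps here, the sketch below builds a long, repetitive playlist and gzips it with Python's standard library. The segment count and file names are arbitrary assumptions, and the code only illustrates the size difference, not any server configuration.

```python
import gzip


def long_playlist(segment_count):
    """Build a long video on demand playlist with hypothetical file names."""
    lines = ["#EXTM3U", "#EXT-X-TARGETDURATION:10", "#EXT-X-MEDIA-SEQUENCE:0"]
    for i in range(segment_count):
        lines.append("#EXTINF:10,")
        lines.append(f"fileSequence{i}.ts")
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


# 720 ten-second segments is roughly a two-hour movie.
raw = long_playlist(720).encode("utf-8")
compressed = gzip.compress(raw)

# The client-side step: un-gzip and recover the identical playlist.
restored = gzip.decompress(compressed)

print(len(raw), "bytes raw,", len(compressed), "bytes gzipped")
```

Playlists compress well because nearly every line repeats the same tag or the same file name pattern.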
So in this case, I've got two variants at a lower bit rate and two variants at a higher bit rate. Now the client when he comes in, he's just going to pick the first one. So the client's going to be fetching from server 1. Now what happens if server 1 goes down, because, let's face it, I mean even the best servers aren't a hundred percent up time. So if the client tries to fetch the Playlist from server 1 and server 1 doesn't respond, the client's going to failover to server 2.
Now you'll notice that in my higher bit rate example, I'm actually getting from different servers. There's no requirement that it'd be the same set of servers serving the different copies of the same stream. There's no requirement that each bit rate have the same number of variations. I could have 3 for the low and 2 for the high. I could even only have 1 for one of the others.
Now, one point I want to make is this only fails over if the server doesn't supply the file. Now if the server supplies the file but something's gone wrong, and the server isn't updating the file anymore it's not going to failover in that case but hopefully we can fix that at some future date.
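The failover behavior described above can be modeled as trying the alternatives for one bit rate in listed order. This is a simplified sketch: the server names, bandwidth values, and the `fake_fetch` stand-in for a network request are all assumptions, and the real client's logic is not documented in the talk.

```python
def alternatives_by_bandwidth(variants):
    """Group playlist URLs by bandwidth, preserving the listed order."""
    groups = {}
    for bandwidth, url in variants:
        groups.setdefault(bandwidth, []).append(url)
    return groups


def fetch_with_failover(urls, fetch):
    """Try each URL in order; move on only when the server supplies nothing.

    A server that answers with a stale playlist still counts as a success,
    matching the limitation mentioned in the talk.
    """
    for url in urls:
        try:
            return url, fetch(url)
        except OSError:
            continue
    raise OSError("no listed server supplied the playlist")


# Hypothetical variants: two servers per bit rate, different servers per rate.
VARIANTS = [
    (200000, "http://server1.example.com/lo/prog_index.m3u8"),
    (200000, "http://server2.example.com/lo/prog_index.m3u8"),
    (500000, "http://server3.example.com/hi/prog_index.m3u8"),
    (500000, "http://server4.example.com/hi/prog_index.m3u8"),
]


def fake_fetch(url):
    """Stand-in for a network request: server1 is down."""
    if "server1" in url:
        raise ConnectionError("server1 is not responding")
    return "#EXTM3U\n"


low_rate_urls = alternatives_by_bandwidth(VARIANTS)[200000]
used_url, body = fetch_with_failover(low_rate_urls, fake_fetch)
print("fetched from", used_url)
```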
Now the last thing I want-- new feature I want to point out is actually something that's been around. Program Date-Time is a tag that allows you to associate a wall clock time, real calendar time like, you know, June 8th, 2010 at 11 o'clock in the morning and it associates that with the start of a segment.
So it's associating a wall clock time with a point in the movie and that association is going to carry forward as you go through subsequent segments in your movie. And in iOS 4, AVPlayerItem lets you seek to dates. And the seeking is very straight forward. It's just an NSDate that you pass.
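A sketch of how the date association carries forward: each segment's start date is the tagged date plus the durations of the segments since the tag, and a discontinuity clears the association until another date tag appears. The playlist, segment names, and function names are assumptions; this models the bookkeeping only, not AVPlayerItem itself.

```python
from datetime import datetime, timedelta

# Hypothetical playlist fragment with one date tag and one discontinuity.
SAMPLE = """#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXT-X-PROGRAM-DATE-TIME:2010-06-08T11:00:00-07:00
#EXTINF:10,
seg0.ts
#EXTINF:10,
seg1.ts
#EXTINF:10,
seg2.ts
#EXT-X-DISCONTINUITY
#EXTINF:10,
seg3.ts
"""


def dated_segments(playlist_text):
    """Return (uri, start_date_or_None, duration_seconds) per segment."""
    current = None
    duration = 0.0
    result = []
    for line in playlist_text.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-PROGRAM-DATE-TIME:"):
            current = datetime.fromisoformat(line.split(":", 1)[1])
        elif line == "#EXT-X-DISCONTINUITY":
            current = None  # the date is no longer known to be valid
        elif line.startswith("#EXTINF:"):
            duration = float(line[len("#EXTINF:"):].split(",")[0])
        elif line and not line.startswith("#"):
            result.append((line, current, duration))
            if current is not None:
                current = current + timedelta(seconds=duration)
    return result


def locate(segments, when):
    """Find (uri, offset_seconds) for a wall clock date, or None."""
    for uri, start, duration in segments:
        if start is not None and start <= when < start + timedelta(seconds=duration):
            return uri, (when - start).total_seconds()
    return None


segments = dated_segments(SAMPLE)
target = datetime.fromisoformat("2010-06-08T11:00:14.500000-07:00")
print(locate(segments, target))
```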
Now one thing I want to point out is if you use this and you have a discontinuity, the discontinuity said, well things have changed. Well one of the things that changed is it says I don't know that the program date is still valid at a discontinuity. So after each discontinuity, if you want that program date association, you're going to have to re-insert one of those tags. Now at this point, I'd like to invite Roger back up on stage to give you a demo that ties together some of the things that we've been talking about.
[ Applause ]
Thank you Eryk.
Just before I do the demo, one thing I'd like to mention is that with regard to seek to date, if you've had experience of HTTP streaming and seeking before, you'll know that the seek is a little bit rough and it'll seek to the beginning of the first segment. One difference with seek to date is its subsecond accuracy. You can seek very, very finely within a-- so that's [laughs] some folks here are happy about that.
Great, because it's hard to write. So what we have for you today for the demo, it's a very simple little application and it is designed to show you how you can use two of the features that Eryk talked about just now, the discontinuity tag and Timed Metadata to stitch together two different types of content and use that to support kind of a custom playback user interface. So let's launch the app here, OK now let's play the movie.
So what we've got here is a video that's taken at a park and what we've done is drop in some bonus content into the middle of it and so what you can see here, we have a custom controller and the bonus content is marked with those different colors, the red, the green, and the purple. So what we'll do is start the playback here. So there's my cat.
The first thing you'll notice is as the controller reaches the first discontinuity, the red area, you'll see a transition take place and so here it comes. And so, here I am there was a discontinuity tag between that last segment that had the cat in it and me over here. The next thing you'll notice is we've disabled seeking while you're in some of this bonus content.
Obviously you could implement any kind of policy you wanted but that's kind of a simple one to show as an example. You can still, you know, the play-pause still works but seeking is disabled. The next thing we have here is some back to back bonus content. And what's happening is that every-- as it plays forward, your application is getting a callback which is synchronized to playback and it's-- the callback carries a little bit of Timed Metadata. In this case, all it is, is a URL to kind of an imaginary ad server. And so the application here is using that callback to trigger the enabling and disabling of that playback controller.
So we can seek, we hit the bonus content, there's me again. You must listen to me, you cannot seek away from me. But OK, now we're back and so now we can run back here and we can seek again. We can seek back in the bonus content and there we go. So, that's it. This sample code is actually available. It's associated with the session. You can find it through the WWDC site.
So the content is up there on a public server, in case you want to download and take a look at how the metadata is embedded into it. And we'll be available in the lab tomorrow if you'd like to come by or even maybe for a little bit later today, if you want to come by and ask us questions about that sample code. So I'll hand it back to Eryk.
[ Applause ]
OK, so I want to talk some about the Tools that we used to create that sample, particularly to create the streams that are in that sample.
So we have a set of Tools and I'm happy to announce today that we've added a fifth tool into our Tool set and these Tools as always are available at connect.apple.com in the downloads iPhone folder. So I'm going to talk about each of these Tools, and to create the content for this demo, we used mediafilesegmenter and the id3taggenerator.
So the first point I want to make about mediafilesegmenter is it's really easy to use. I mean, if you want to get started with HTTP live streaming, use mediafilesegmenter. I mean honestly, this is how easy it is to use. All you need is a movie file that's already H.264 and AAC and you pass it to mediafilesegmenter. Boom, you're done. You've created a playlist and segments. Because mediafilesegmenter will do the MPEG-2 transport stream wrapping for you. And now some people get a little scared of mediafilesegmenter because it's got a few options, you know, 20 or so. Not that many.
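For reference, the playlist produced by a segmenting step like this is a short text file. The Python sketch below writes one by hand using tags from the HTTP Live Streaming specification; it does not invoke mediafilesegmenter, and the `build_media_playlist` name, the segment durations, and the `fileSequence` file prefix are illustrative assumptions.

```python
def build_media_playlist(durations, prefix="fileSequence"):
    """Return an on-demand playlist for segments with whole-second durations."""
    lines = [
        "#EXTM3U",
        # The target duration must be at least as long as any segment.
        f"#EXT-X-TARGETDURATION:{max(durations)}",
        "#EXT-X-MEDIA-SEQUENCE:0",
    ]
    for index, seconds in enumerate(durations):
        lines.append(f"#EXTINF:{seconds},")   # duration of the next segment
        lines.append(f"{prefix}{index}.ts")   # URI of that segment
    # ENDLIST marks the playlist as complete (on-demand, not live).
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


playlist = build_media_playlist([10, 10, 7])
print(playlist)
```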
In fact they break into four categories. So it's really not as complicated as it might look. So there's the main options, really important ones like if I want to create just an audio only stream or if I'm generating Variant Playlists in particular, what's my target duration going to be.
The next set of options is-- those are associated with names and locations. This is where I want to put the files on the file system. What URL they're going to be located in? If it's an absolute URL, what's the prefix? Things like that. The third set is encryption tags.
These are things that specify how often I want to rotate my Initialization Vector. How often I want to rotate my key and also the same sorts of names and location things that I have with the segment files I also have for the key files. What's the prefix going to be on the key files? Where am I putting the key files on my file system? That sort of thing. And the fourth set of options are those associated with metadata. And I'll talk about those a little bit more when I talk about the id3taggenerator. Basically, there's a few odds and ends that aren't important but those are the basic options on mediafilesegmenter.
Now once I've messed around with mediafilesegmenter, I really want to try and find out about how to create Variant Playlists. So if I'm creating a variant playlist, then I want to have several variants of my movie at different data rates, right? So in this case, I'm starting out with one variation of my movie. I'll make a directory, I'll tell mediafilesegmenter to put the files in that subdirectory and by passing the generate variant playlist option, I'm telling mediafilesegmenter to create a plist that describes that variant. So it's going to create that on the side.
It's going to use the name and in fact the location of the movie file. So if that movie file was in some relative directory, the plist is going to be created alongside it. Now I have another variation. In this case, I'm just going to have two variations. This one let's say it's my cellular.
I'm going to make a subdirectory and call mediafilesegmenter again to segment that version and then it's going to again create a plist. Now I can call Variant Playlist Creator and what I do is I tell it for each variant, where's the playlist for that and what's the plist that describes it? I give those in the order I want them to be in my variant and it creates it.
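The result of that last step is a variant playlist: one entry per variation, each giving a data rate and the location of that variation's own playlist. The sketch below builds such a file directly in Python rather than calling Variant Playlist Creator; the function name, the bandwidth figures, and the directory names are assumptions chosen for illustration.

```python
def build_variant_playlist(variants):
    """variants: list of (bandwidth_bits_per_second, playlist_uri) in play order."""
    lines = ["#EXTM3U"]
    for bandwidth, uri in variants:
        # Each STREAM-INF tag describes the playlist URI on the line after it.
        lines.append(f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth}")
        lines.append(uri)
    return "\n".join(lines) + "\n"


variant_playlist = build_variant_playlist([
    (400000, "wifi/prog_index.m3u8"),      # hypothetical Wi-Fi variation
    (110000, "cellular/prog_index.m3u8"),  # hypothetical cellular variation
])
print(variant_playlist)
```

The order of the list is preserved in the output, which matters because the client starts with the first entry.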
So Variant Playlist Creator is great. Once you've gotten started with that, you can work your way up to mediastreamsegmenter. Now mediastreamsegmenter is very similar to mediafilesegmenter. The big difference is it's not taking it from a file. It's taking it from a pipe or a UDP port and it's not expecting to get a movie, it's expecting-- or not an H.264 and AAC, it's expecting to get a transport stream. That's what it wants as input.
And it has even more options than mediafilesegmenter but once you understand mediafilesegmenter, you'll understand most of the options because you've got that same basic set that you had. With the exception, you don't have generate variant playlists anymore. Now you're going to have to create your variant playlist on your own and there's some slight differences in the way the metadata options work. But the big add-ons are Playlist Structure.
OK so, Playlist Structure is-- because with mediastreamsegmenter I'm going to be creating Live or Event Playlists, I need to be able to tell it: is this a Live Playlist? Is this an Event Playlist? And how big is my sliding window of content, and how soon do I want to start dropping playlists? You know, do I want to wait until I have a whole window of content or, you know, will I start once I have a minute or even 30 seconds of content? And also if I'm doing that rolling window, what do I want to do with the files after I get rid of them?
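The sliding window can be pictured with a small model. This Python sketch is not how mediastreamsegmenter is implemented; the `LivePlaylist` class and its parameters are hypothetical. It shows the behavior the specification requires: when the oldest segment leaves the window, the media sequence number goes up by one, and a live playlist carries no end marker.

```python
from collections import deque


class LivePlaylist:
    """Hypothetical model of a live playlist with a sliding window."""

    def __init__(self, window_size, target_duration):
        self.window = deque()
        self.window_size = window_size
        self.target_duration = target_duration
        self.media_sequence = 0   # sequence number of the first listed segment
        self.removed = []         # files that left the window (delete or archive)

    def add_segment(self, uri, seconds):
        self.window.append((uri, seconds))
        if len(self.window) > self.window_size:
            old_uri, _ = self.window.popleft()
            self.media_sequence += 1
            self.removed.append(old_uri)

    def render(self):
        lines = [
            "#EXTM3U",
            f"#EXT-X-TARGETDURATION:{self.target_duration}",
            f"#EXT-X-MEDIA-SEQUENCE:{self.media_sequence}",
        ]
        for uri, seconds in self.window:
            lines.append(f"#EXTINF:{seconds},")
            lines.append(uri)
        # No #EXT-X-ENDLIST: the client keeps reloading a live playlist.
        return "\n".join(lines) + "\n"


live = LivePlaylist(window_size=3, target_duration=10)
for n in range(5):
    live.add_segment(f"fileSequence{n}.ts", 10)
print(live.render())
```

What happens to the entries in `removed` is the policy question raised above: the files can be deleted or kept for later use.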
The last group of options of mediastreamsegmenter is what I call Actions. Particularly important ones there are: because I'm getting my data through a UDP port or a pipe, I could get a timeout. I could not get data. What do I want to do when I don't get data? So there, I've created my streams.
The next thing I really kind of want to do is validate my streams. Now if you're using our tools, you don't really need to validate them 'cause we already went through a lot of effort to make sure that they do the right thing. But if you're creating your own playlist, you can use the Media Stream Validator to validate your playlist.
If you use the parse option, what it does is it simply looks at the playlist, not at the segments, and checks to see whether it's following the rules. If you use the validate option, what it's doing is it will actually look at the segments and in fact if it's a variant playlist that you're passing it, it will look at all the individual variants and check them as well.
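To give a feel for what a playlist-only check involves, here is a small Python sketch that tests three rules from the specification. It is not the Media Stream Validator and covers only a fraction of what that tool checks; `check_playlist` and the sample playlists are made up for the example.

```python
def check_playlist(text):
    """Return a list of problems found in a media playlist (empty if none)."""
    problems = []
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]

    # Rule 1: the file must start with the #EXTM3U tag.
    if not lines or lines[0] != "#EXTM3U":
        problems.append("first line must be #EXTM3U")

    # Rule 2: a media playlist must declare a target duration.
    target = None
    for ln in lines:
        if ln.startswith("#EXT-X-TARGETDURATION:"):
            target = int(ln.split(":", 1)[1])
    if target is None:
        problems.append("missing #EXT-X-TARGETDURATION")

    # Rule 3: no segment may run longer than the target duration.
    for ln in lines:
        if ln.startswith("#EXTINF:"):
            seconds = float(ln.split(":", 1)[1].split(",", 1)[0])
            if target is not None and round(seconds) > target:
                problems.append(f"segment of {seconds}s exceeds target {target}s")
    return problems


good = "#EXTM3U\n#EXT-X-TARGETDURATION:10\n#EXTINF:10,\na.ts\n#EXT-X-ENDLIST\n"
bad = "#EXT-X-TARGETDURATION:10\n#EXTINF:12,\na.ts\n"
good_problems = check_playlist(good)
bad_problems = check_playlist(bad)
```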
The last Tool I want to talk about is the ID3 Tag Generator. This is a new tool that we just added. It creates ID3 files and you use it with the mediafilesegmenter with the meta-macro file option. What does a meta-macro file look like? Well there's a sample and basically you're saying at this point in time, in seconds, I want you to pull in this content, this file. So it's either an ID3 file that I generated with the generator or it can be a picture.
Now with mediastreamsegmenter, it's a little bit different. With mediastreamsegmenter-- the mediastreamsegmenter is actually listening on a port for metadata and you can tell ID3 Tag Generator to send it to a port. And what it's going to do, it's going to send right at that moment in time. So you're actually-- can insert the metadata wherever you want.
Now some tips and tricks. OK, so for variant playlists, you need to remember that the first alternative is the one that's going to play initially, and when you're delivering over both cellular and Wi-Fi, it's really a good idea to create two variants of your variant playlists. One that you'll deliver on cellular and one that you'll deliver on Wi-Fi and you use the Reachability APIs to decide.
The reason why you do that is because it's going to play the first variant initially and you want to have it be a good data rate for whatever network you're going over, whether it be cellular or Wi-Fi. Now the set of variations that you should have in that playlist should be identical between cellular and Wi-Fi, the only difference should be which one is first.
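One way to produce the two playlists is to keep a single list of variations and only change which one comes first. The Python sketch below assumes a hypothetical `order_variants` helper and invented bandwidth figures; picking the variation closest to a preferred starting rate is just one possible policy.

```python
def order_variants(variants, preferred_first_bandwidth):
    """Move the variation closest to the preferred data rate to the front.

    variants: list of (bandwidth_bits_per_second, playlist_uri).
    """
    first = min(
        range(len(variants)),
        key=lambda i: abs(variants[i][0] - preferred_first_bandwidth),
    )
    # Same set of variations, only the starting entry changes.
    return [variants[first]] + variants[:first] + variants[first + 1:]


all_variants = [
    (110000, "low/prog_index.m3u8"),
    (400000, "mid/prog_index.m3u8"),
    (800000, "high/prog_index.m3u8"),
]
cellular_order = order_variants(all_variants, 110000)
wifi_order = order_variants(all_variants, 800000)
```

The application would then hand the player one playlist or the other depending on what the Reachability APIs report.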
The reason you want them identical is because you're going to move around. Your client is going to move around networks. I might start off in here on Wi-Fi and go outside in the street and now I'm on cellular or vice versa and then I come back in, right. The network's going to be changing all the time.
So you want to have the full range of possibilities available to the client. Now, if you're delivering these movies via web delivery, you can use makerefmovie which is a tool that we make available and that can target cellular or Wi-Fi and it can also target desktop versus iPhone and iPad. Now, encoding.
File size is very important over mobile, right? And if you look at our recommendations, I'll give you a pointer to the tech note that has our recommendations a little later, you'll see that we're pretty conservative about what data rates we think you can support. And when you're doing that, don't forget the container overhead. The transport stream is going to add some overhead into your-- on your data rate and also now that you've got metadata, the metadata is going to add some overhead as well.
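As a rough guide to the container overhead, an MPEG-2 transport stream packet is 188 bytes, of which at least 4 bytes are header. The Python sketch below uses only that fact, so it gives a lower bound: PES headers, program tables, padding, and any timed metadata all add more. The audio and video rates are assumed figures, not recommendations from the session.

```python
TS_PACKET_BYTES = 188   # fixed MPEG-2 transport stream packet size
TS_HEADER_BYTES = 4     # minimum header carried by every packet


def min_transport_rate(elementary_bps):
    """Lower bound on the stream's data rate after transport stream wrapping."""
    payload_bytes = TS_PACKET_BYTES - TS_HEADER_BYTES
    return elementary_bps * TS_PACKET_BYTES / payload_bytes


video_bps = 400_000   # assumed video rate
audio_bps = 64_000    # assumed audio rate
wrapped_bps = min_transport_rate(video_bps + audio_bps)
print(f"at least {wrapped_bps:.0f} bits per second on the wire")
```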
And you don't need to encode to the full screen dimensions. You can encode-- we've got a very good video scaler on our iOS devices, so you can encode at 2/3 or 3/4 of the screen size and still get a very, very good experience. Now because you're trying to minimize your data rate, you can trade off frames per second versus video quality. You've got this option: do I make the image a little worse and keep the frame rate up, or do I decrease the frame rate and keep the quality of the images up?
People have different opinions about how they should make that trade off. Now when you're doing this, you want to have multiple IDR frames per segment. The more IDR frames you have-- if you have more IDR frames per segment, we're going to do a better job of stream switching and I want to reinforce the point that the audio needs to be identical across all the variants so that you won't get audio artifacts.
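The number of IDR frames in a segment follows from the segment length, the frame rate, and the encoder's keyframe interval. The Python sketch below assumes each segment begins on an IDR frame and that keyframes are evenly spaced; the function name and the sample figures are illustrative only.

```python
def idr_frames_per_segment(segment_seconds, fps, keyframe_interval):
    """Count IDR frames in one segment, assuming it starts on an IDR frame.

    keyframe_interval is the distance between IDR frames, in frames.
    """
    total_frames = segment_seconds * fps
    # Ceiling division: IDR frames sit at frame 0, interval, 2 * interval, ...
    return -(-total_frames // keyframe_interval)


one_per_segment = idr_frames_per_segment(10, 30, 300)   # keyframe every 10 s
several = idr_frames_per_segment(10, 30, 90)            # keyframe every 3 s
```

A shorter keyframe interval gives the client more points at which to switch streams, at the cost of somewhat larger files for the same quality.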
Now if you're doing your encoding with something like QuickTime Player 7, you want to use the movie to MPEG-4 exporter because it gives you more control over the encoding. If you're just using export to web, it gives you a very restricted set of options. Now the 3 important things I want you to take away from this: We're continuing to evolve HTTP streaming.
We've made a bunch of changes over this year. We're anticipating making more changes in the future. So you want to stay current. You want to go and check on connect.apple.com if we've updated the tools. We try and announce that on the dev forums but sometimes we miss. And also, give us your feedback.
The changes we made in key delivery were the result of feedback from people who were trying to use this, trying to do various things with the system. Again, my name is Eryk Vershen. I'm the media technologies evangelist. So if you have any questions about HTTP live streaming, you can send me e-mail. The big points on the documentation: you can just go to the iPhone developer site and search for HTTP live streaming and you'll find these. The first one is the HTTP Live Streaming Overview.
The second one is our best practices for creating and deploying HTTP live streaming for the iPhone and iPad which outlines what we recommend in terms of data rates, in terms of resolutions. And lastly, I want to mention that we've made the specification for HTTP live streaming public. It's available. We do update it. We've been through three versions in the last year. We'll probably go through more.
And lastly, the dev forums are a great place to go. The engineers who work on HTTP live streaming do answer questions on the dev forums. That pretty much wraps it up for us today. We're not going to do a stand-up Q&A. If you have questions, you can come up and talk to us for the few minutes we have before we have to vacate the room and I invite you to come to the labs tomorrow. Thank you.
[ Applause ]