
China Tech Talks #10

Working with Video in iOS

2011 • 39:16

The iOS SDK provides several ways of playing and recording video. Get started with video. Learn about functionality you can add to improve your user's experience. Get an overview of the more advanced video APIs.

Speaker: Eryk Vershen

Unlisted on Apple Developer site

Transcript

This transcript was generated using Whisper and has known transcription errors. We are working on an improved version.

Hi, I'm Eryk Vershen, Media Technologies Evangelist. In this video, you'll learn all about video playback in iOS, how to handle remote control events, and a brief introduction to AV Foundation, essential knowledge for incorporating video in your iOS app. This talk is going to get you started on video if you're new to it.

It's going to get you to add functionality if you're already using video in your app. And it's going to introduce you to some of the more advanced APIs. You're going to learn how to do basic movie playback and recording. You're going to learn about notifications, remote control, AV output, and a brief introduction to AV Foundation.

So let's look at the technology frameworks that we're going to be talking about here. There's three frameworks in particular. The first one is the Media Player Framework. The Media Player Framework is responsible for simple basic movie playback as well as playback of songs that you might have in your iPod library.

Now we're not going to talk about playing back songs. We're just going to talk about the movie playback. Now underneath the Media Player Framework is the UIKit framework. The UIKit framework is a very large framework and has to do with everything that presents UI in iOS.

A small portion of UIKit has to do with presenting media, and I'll be talking about those classes in this talk. Underneath that at a level below is AV Foundation. AV Foundation is another very large framework. It was introduced in iOS 4.0 and it's what you need to use if you're going to do more advanced things with video.

In the diagram you'll see that I've mentioned several other frameworks below AV Foundation: Core Audio, Core Media, Core Animation. I won't be talking in detail about those frameworks, but I wanted to give you a sense that AV Foundation is built on a lot of other technologies. So these three frameworks on the top, Media Player, UIKit, and AV Foundation, are what we're going to talk about. First, let's talk about playback.

Basic movie playback is when you're taking over the entire screen to play back your movie. And in this case, you're going to be using Media Player Framework. The first class you want to know about is MPMoviePlayerViewController. MPMoviePlayerViewController is what you're going to use if you're going to do simple movie playback, where you want to take over the entire screen of your iOS device.

Now, MPMoviePlayerViewController is a UIViewController. And you use it modally. That is, it's going to be the only thing interacting on your iOS device. But rather than using the standard present and dismiss methods, you have to use special methods that are customized for MPMoviePlayerViewController. Now, this is going to use the entire screen. So, what you're going to do is you're going to add a method to your UIViewController subclass.

This method would be called something like playMovieAtURL:. You'll pass it a URL of the content that you want to play. Obviously the first thing you're going to do is you need to allocate that MPMoviePlayerViewController and init it with your content. The next thing you're going to do is you're going to register for the playback did finish notification.

Now, one of the things you want to notice here is that the object you're going to get notifications on is not your view controller. Instead, it's the movie player that's associated with the view controller. And the notification you're going to register for is MPMoviePlayerPlaybackDidFinishNotification. And this is what you're going to get when your movie stops playing.

Now once you've done that, you simply tell the view controller to present the movie player view controller. You're calling that method which goes ahead and has this new view controller take over the screen. And at that point, all you do is initiate movie playback. Now notice that this is asynchronous. The movie doesn't start playing immediately, but the method does return immediately.

Now the other piece of code that you need to create is the method that's going to be called by that notification. In this case I've called it myMovieFinishedCallback. Now it's going to get a notification object like any notification, and the first thing I'm going to do is remove the notification observer that I registered. And notice that in this case what I'm getting back as the object is not the view controller but is instead the movie player that's associated with that view controller. Once I've removed the observer, I dismiss the modal view and then release the instance.
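Pieced together, the two methods described above might look like this rough sketch, assuming a UIViewController subclass, manual retain/release (this is the pre-ARC era), and a hypothetical retained property named moviePlayerVC:

```objc
#import <MediaPlayer/MediaPlayer.h>

// Assumes: @property (nonatomic, retain) MPMoviePlayerViewController *moviePlayerVC;

- (void)playMovieAtURL:(NSURL *)movieURL
{
    self.moviePlayerVC = [[[MPMoviePlayerViewController alloc]
                              initWithContentURL:movieURL] autorelease];

    // Register for the finish notification on the embedded movie player,
    // not on the view controller itself.
    [[NSNotificationCenter defaultCenter]
        addObserver:self
           selector:@selector(myMovieFinishedCallback:)
               name:MPMoviePlayerPlaybackDidFinishNotification
             object:self.moviePlayerVC.moviePlayer];

    // Special full-screen presentation method for movie playback.
    [self presentMoviePlayerViewControllerAnimated:self.moviePlayerVC];

    // Asynchronous: this call returns immediately; playback starts when ready.
    [self.moviePlayerVC.moviePlayer play];
}

- (void)myMovieFinishedCallback:(NSNotification *)notification
{
    MPMoviePlayerController *moviePlayer = [notification object];

    // Remove the observer we registered, then dismiss and let go of the instance.
    [[NSNotificationCenter defaultCenter]
        removeObserver:self
                  name:MPMoviePlayerPlaybackDidFinishNotification
                object:moviePlayer];

    [self dismissMoviePlayerViewControllerAnimated];
    self.moviePlayerVC = nil;
}
```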

Now, the other class that we're going to be interested in is MPMoviePlayerController. MPMoviePlayerController is the one I use when I want to only take over part of the screen of my iOS device. Now, in this case, rather than being a view controller, it has a view property associated with it. And that means we need to install this view into our view hierarchy, and the frame of that view needs to match the bounds of its parent view.

So let's look at the method we're going to create. Again, let's call it something like playMovieAtURL:. Now in the beginning here, I'm going to do very similar things. I'm going to go and create my MPMoviePlayerController. I'm going to save it away somewhere in my class. I'm going to set the content to be the URL that I'm interested in. And then the important piece that I've highlighted here is I'm going to add the view that's associated with that movie player controller into the view hierarchy that already exists.
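A rough sketch of the embedded case, assuming a retained moviePlayer property and an existing movieContainerView already in the view hierarchy (both hypothetical names):

```objc
- (void)playMovieAtURL:(NSURL *)movieURL
{
    // Create the player and keep it around (retained property, pre-ARC).
    self.moviePlayer = [[[MPMoviePlayerController alloc]
                            initWithContentURL:movieURL] autorelease];

    // Install the player's view into the existing view hierarchy and
    // size it to match the bounds of its parent view.
    self.moviePlayer.view.frame = self.movieContainerView.bounds;
    [self.movieContainerView addSubview:self.moviePlayer.view];

    [self.moviePlayer play];
}
```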

[Transcript missing]

So that's our two classes that exist in Media Player. MPMoviePlayerViewController. It's a view controller. You use it modally and it takes over the whole screen. And MPMoviePlayerController instead has a view property. You install it in your view hierarchy and make sure that the bounds match. Now one thing I want to mention is prior to iOS 3.2, MPMoviePlayerController worked differently. In that case it took over the whole screen and the modal playback wasn't presented with a special method. Instead it happened automatically when you played and was dismissed automatically when the movie finished.

Now, the only reason I mention this is if you have to port some code that's pre-3.2, or if you feel you need, for some reason, to still support users who are on iOS 3.1 or earlier. This behavior is deprecated and doesn't occur in newer versions of the OS.

Now I'd like to turn to what you can do in terms of enhancing your support for movies. I'm going to talk about five different things here: what you can do with your user interface in terms of customization, the various notifications that exist around movies, how to handle remote control events, how you can get video output from iOS, and AirPlay.

So let's talk about user interface. Now when I'm playing a video, say with MPMoviePlayerController, it's in an embedded view and I have a very limited set of controls: a play/pause button, an AirPlay button, and a button that allows me to go into full screen.

Now once I go into full screen, I have a lot more controls. I have a volume slider, a next/fast-forward button, a previous/rewind button, a done button, the scrubber bar and playhead, and a button that allows me to change the scaling of the video. Now for certain kinds of media, file-based media, I have a few other controls that will show up. There's a control for the subtitle and alternate audio menu and a control that shows the chapter list menu.

In terms of customizing the UI for movie playback, really your only choice is to turn off the controls. You do that with MPMovieControlStyleNone. Now, you can re-create most of the controls yourself. The exceptions are the AirPlay control, the subtitle and alternate audio button, and the chapter list.
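As a rough sketch, hiding the stock controls and then using the MPVolumeView described in the next paragraph to get back a volume slider and AirPlay button might look like this (the moviePlayer and controlsContainerView properties are hypothetical placeholders):

```objc
// Turn off the built-in playback controls.
self.moviePlayer.controlStyle = MPMovieControlStyleNone;

// MPVolumeView provides a volume slider plus an AirPlay route button.
MPVolumeView *volumeView =
    [[[MPVolumeView alloc] initWithFrame:self.controlsContainerView.bounds] autorelease];
[self.controlsContainerView addSubview:volumeView];
```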

It is possible to get AirPlay by using the MPVolumeView class. This gives you a volume slider and an AirPlay button. Using MPVolumeView is pretty straightforward. You allocate an instance with a frame matching the particular parent view's bounds and add it as a subview.

Now I'd like to turn to the various notifications that are associated with movies. There are quite a number of notifications, so I've broken them into several groups. The first group I'd like to talk about is those that are associated directly with movie playback. The first one you've seen before, MPMoviePlayerPlaybackDidFinishNotification. That's the notification you get when your movie's done.

The other ones here have to do with notifications that you'd like to get when some part of your app isn't directly involved with movie playback. For instance, if you want to use the same movie player to play multiple different URLs, that is, different content, you can have part of your app register for the MPMoviePlayerNowPlayingMovieDidChangeNotification. That part of your app will get this notification when you change the URL that's associated with the player.

The next group of notifications I'd like to talk about all have to do with the movie content. The first four have to do with values that are associated with a movie that aren't necessarily available when you first initialize your movie. For instance, the system may not know exactly the duration of your movie when you initialize the object. If some part of your application needs to know these values exactly, you should register for one of these notifications so that you can find out once the system knows.

The next two notifications here have to do with things that can change that are part of the presentation state of the movie. For instance, the scaling mode did change notification. If the user is allowed to change the scaling of the movie, that is, how the movie is fitted to the screen as it's playing, your application will get notified with a scaling mode did change notification. The last one in this group is the timed metadata updated notification.

Now, this notification only occurs on HTTP live streams. Timed metadata is metadata that's associated with particular points in time in the movie, say 35 seconds in or 42 seconds in. The timed metadata is sent as a notification because this makes it easier for your application to react to that metadata.

The next group of notifications all have to do with going into and out of full screen. Now this is a very typical pattern in iOS. You get will enter, did enter, will exit, did exit. And these should be quite familiar from other parts of the system. The last notification I want to talk about is the thumbnail image request did finish notification. You use this notification when you want to get a still image for one or more frames in your movie. These things have to be done asynchronously by the system on your behalf. So the system's going to send you a notification when those thumbnails are completed.
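A sketch of that thumbnail flow, assuming an existing MPMoviePlayerController and pre-ARC memory management:

```objc
- (void)requestThumbnailsFromPlayer:(MPMoviePlayerController *)moviePlayer
{
    // The request is asynchronous, so register for the completion notification first.
    [[NSNotificationCenter defaultCenter]
        addObserver:self
           selector:@selector(thumbnailRequestDidFinish:)
               name:MPMoviePlayerThumbnailImageRequestDidFinishNotification
             object:moviePlayer];

    // Ask for stills at a couple of playback times (in seconds).
    NSArray *times = [NSArray arrayWithObjects:
                         [NSNumber numberWithDouble:5.0],
                         [NSNumber numberWithDouble:15.0], nil];
    [moviePlayer requestThumbnailImagesAtTimes:times
                                    timeOption:MPMovieTimeOptionNearestKeyFrame];
}

// Called once per requested time, when the system has the still image ready.
- (void)thumbnailRequestDidFinish:(NSNotification *)notification
{
    UIImage *thumbnail =
        [[notification userInfo] objectForKey:MPMoviePlayerThumbnailImageKey];
    if (thumbnail) {
        // Use the still image here (display it, cache it, and so on).
    }
}
```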

Now I'd like to talk about remote control. Why do you want your app to respond to remote control events? Well, principally you want to respond to remote control events because of the Apple earphones. The Apple earphones have a remote control on them and this control allows the user to play, pause, switch to the next track or go back to the previous track, among other things.

Why do you want to respond to these? Well, if your user is listening to and watching the video in your app while they're on the subway or riding in a car with other people, they don't want to disturb their neighbors with the audio. And if they've got that remote control available to them, they want to be able to use it. Now there's a couple other reasons that you might want to support the remote control events.

If you decide to have the audio portion of your movie play in the background, then there are two ways that the user can control it. When the screen is in the locked state, if the user double-clicks, they can get the playback controls. Or if the user is using another app and goes to look at their most recently used apps list, they can swipe over and actually have another set of controls that we call the Now Playing controls. So how do you support remote control events? Well, first of all, your view or view controller class needs to implement the canBecomeFirstResponder method and return YES.

And then you need to do something when your view appears and when your view disappears. When your view appears, you need to tell UIApplication that you're ready to begin receiving remote control events. And you need to become the first responder. Only the first responder gets the remote control events. Now when your view disappears, you need to undo these things. Tell UIApplication that you no longer want to receive remote control events and resign first responder.
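A minimal sketch of that setup inside a view controller:

```objc
- (BOOL)canBecomeFirstResponder
{
    // Only the first responder receives remote control events.
    return YES;
}

- (void)viewDidAppear:(BOOL)animated
{
    [super viewDidAppear:animated];
    [[UIApplication sharedApplication] beginReceivingRemoteControlEvents];
    [self becomeFirstResponder];
}

- (void)viewWillDisappear:(BOOL)animated
{
    [[UIApplication sharedApplication] endReceivingRemoteControlEvents];
    [self resignFirstResponder];
    [super viewWillDisappear:animated];
}
```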

Now when you get a remote control event, your remoteControlReceivedWithEvent: method is what's going to be called. You're going to look and see that the received event is a remote control event. And you're going to look at its subtype to see what kind of remote control event it is.

The one that I've highlighted here is UIEventSubtypeRemoteControlTogglePlayPause. This is the most important one to respond to. The play/pause button in all of these interfaces is going to give you this event. On the slide, I've listed a couple of other events, the previous and next track events, but there are several other events that you should handle, and you need to refer to the documentation for those.
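A sketch of the event handler itself; the togglePlayPause, skipToNextItem, and skipToPreviousItem helpers are hypothetical stand-ins for your own playback code:

```objc
- (void)remoteControlReceivedWithEvent:(UIEvent *)receivedEvent
{
    if (receivedEvent.type != UIEventTypeRemoteControl) {
        return;
    }

    switch (receivedEvent.subtype) {
        case UIEventSubtypeRemoteControlTogglePlayPause:
            [self togglePlayPause];      // the most important one to handle
            break;
        case UIEventSubtypeRemoteControlNextTrack:
            [self skipToNextItem];
            break;
        case UIEventSubtypeRemoteControlPreviousTrack:
            [self skipToPreviousItem];
            break;
        default:
            // Several other subtypes exist; see the UIEvent documentation.
            break;
    }
}
```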

The next thing I'd like to talk about is how you can get video output from your application. Now, with the iOS devices, you're not actually getting a video output. Instead, you're getting a second screen. And this happens whether you're using the VGA connector or the composite or component AV cables. In terms of your application, what you're seeing is a second screen. And that screen is just another view.

To handle UIScreen, you need to register for screen connection and disconnection. And then when you want to display content on that screen, you create a window and simply associate the screen, that external screen, with that window. You add whatever views you're interested in and show the window to the user. So let's take a look at that. Registering for screen connection and disconnection should look a lot like what we've seen before.

We're just adding another observer, in this case for the UIScreenDidConnect notification, and there's no object associated with this. We'll also register for screen disconnection. Now let's look at our method that we're going to use for handling the screen connection. The notification object associated with the notification is going to be that new screen that's been connected.

We're going to allocate a window associated with that screen and set the screen as the screen object for that window. Once we've done that, all we need to do is tell our application to adjust the UI for having that second screen. Now, when we get a screen disconnection notification, we essentially do the inverse. We need to hide that window, get rid of it, and update our UI to reflect that that window is no longer there.
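A sketch of that flow, assuming a retained externalWindow property and an updateUIForExternalScreen helper of your own (both hypothetical names):

```objc
- (void)registerForScreenNotifications
{
    NSNotificationCenter *center = [NSNotificationCenter defaultCenter];
    [center addObserver:self selector:@selector(screenDidConnect:)
                   name:UIScreenDidConnectNotification object:nil];
    [center addObserver:self selector:@selector(screenDidDisconnect:)
                   name:UIScreenDidDisconnectNotification object:nil];
}

- (void)screenDidConnect:(NSNotification *)notification
{
    // The notification object is the newly connected screen.
    UIScreen *externalScreen = [notification object];

    // Create a window that lives on the external screen, then show it.
    self.externalWindow =
        [[[UIWindow alloc] initWithFrame:externalScreen.bounds] autorelease];
    self.externalWindow.screen = externalScreen;
    // ...add whatever views you want to present on the second screen...
    self.externalWindow.hidden = NO;

    [self updateUIForExternalScreen];
}

- (void)screenDidDisconnect:(NSNotification *)notification
{
    // Hide and drop the window, then bring the UI back to single-screen mode.
    self.externalWindow.hidden = YES;
    self.externalWindow = nil;
    [self updateUIForExternalScreen];
}
```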

Next, I'd like to talk about AirPlay. Now, AirPlay was something that we added in iOS 4, but now in iOS 4.3, your application has the ability to send video over AirPlay. Now, one thing that trips some people up is that you always will see the AirPlay button whenever there's an AirPlay device around, even if that's only an AirPlay audio device.

AirPlay audio is something that you don't opt in or out of. It's always there. You do, however, have to opt in to AirPlay video. You do this with the allowsAirPlay property on MPMoviePlayerController. You simply set it to YES. Now, do remember that AirPlay in your app requires updated software on your Apple TV. You can use AirPlay with web content by setting the appropriate attributes in your HTML5. And while it's supported with Media Player, it's not supported in AV Foundation at this time.
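The opt-in itself is a single line, sketched here against a hypothetical moviePlayer property:

```objc
// Opt the MPMoviePlayerController in to AirPlay video (iOS 4.3; the property
// was renamed allowsAirPlayPlayback in later iOS releases).
self.moviePlayer.allowsAirPlay = YES;
```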

You can use it with HTTP live streams, but if your stream is using encryption, that's not going to work over AirPlay. And one thing you have to remember is it's streaming the movie, not the screen. If you're doing overlays or backgrounds, those things are not going to show up on your AirPlay device. So, I've talked about five different enhancements about how you can customize your user interface, about the extensive movie notifications, handling remote control events in your app, how you can use a second screen, and AirPlay.

Now I'd like to make a few digressions. First of all, I'd like to go back to our technology framework and talk about another framework that I didn't mention before. And that's the Audio Toolbox framework. Now the Audio Toolbox framework lives underneath the AV Foundation framework and Media Player. And the reason I'm mentioning Audio Toolbox is because of what's called the Audio Session category.

The audio session category. What is it? It's an identifier that tells the system how your application uses audio. It tells the system what kind of behavior your application expects. Do you want to receive audio input or produce audio output? Do you want to honor the Ring/Silent switch? And do you want to mix with any other audio that may be on the system? You have to remember that you're in an audio environment, whether you like it or not. The user may have been playing some other audio before they switched to your app, and it may be appropriate for your app to allow that audio to continue playing.

So there are six audio session categories. And remember, you set these based on the role that audio takes in your app. The first one is playback. If you're doing audio or video playback, that's where you should be. If you're simply recording, you want to use the record category. Now, if you're doing something where you're doing a combination of playback and record, for instance, a voice over IP app, you'd want to use the play and record category.

If you happen to be doing something just with audio, some offline processing of audio, you'd use the audio processing category. And the last two categories, Ambient and Solo Ambient, are mostly for games and productivity apps. Now these are the only two categories, Ambient and Solo Ambient, that honor the Ring/Silent switch.

And the reason they do so is by using one of these categories, you're telling the system that audio is not that important to your app and it's all right if it gets silenced. The important takeaway here is you need to understand the audio session category because the default is Solo Ambient. And if you're playing movies, that's probably not the right thing for you.
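A sketch of setting the category up front. The talk refers to the underlying Audio Toolbox C API; this uses the AVAudioSession wrapper over the same audio session machinery, which exposes the same categories:

```objc
#import <AVFoundation/AVFoundation.h>

- (void)configureAudioSession
{
    // The default category is Solo Ambient; for movie playback you usually
    // want Playback, which is not silenced by the Ring/Silent switch.
    NSError *error = nil;
    AVAudioSession *session = [AVAudioSession sharedInstance];
    [session setCategory:AVAudioSessionCategoryPlayback error:&error];
    [session setActive:YES error:&error];
}
```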

The next thing I want to mention is differences in movie containers have effects that percolate all the way up to the API. Now, there are really two different kinds of formats. There's HTTP live streams and other formats. Now, rather than saying something that long, I typically talk about HTTP live streams just as streams and other formats as files.

Now, this is a bit of a misnomer, but it's good enough for our purposes. If you haven't used HTTP live streaming previously, I recommend that you visit our page dedicated to HTTP live streaming, developer.apple.com/resources/http-streaming. This has pointers to all of the documentation, videos, and sample code, and the specs related to HTTP live streaming.

Now, streams and files have a lot in common, but there are some differences. For instance, streams and files can be in different locations. Streams have to be fetched with HTTP or HTTPS. Files, on the other hand, can be local on your device or fetched over the network. There are also differences in features. The kinds of tracks and other aspects of the movie are different between streams and files. And lastly, these differences do percolate into your code and make differences in how you code.

Let's talk about the commonalities between streams and files. First of all, the preferred codecs. For video, it's H.264, and for audio, it's AAC. Yes, there are other codecs that we support, but these are the sweet spot, if you will. In your video, you can go up to 720p and up to 30 frames per second. And the audio, 48 kilohertz sample rate and up to 160 kilobits per second data rate.

In terms of differences, with streams you can have timed metadata, that is metadata that's associated with particular moments in your video. This doesn't exist in files. On the other hand, with files you can have multiple soundtracks, you can have subtitles, in fact multiple subtitles, and you can have chapter markers. These things don't exist with HTTP live streams. Now in both streams and files you can have closed captioning.

The next digression I want to talk about is you need to think asynchronously. In video, lots of things are happening behind your back, being done by the system. For instance, when you get to AV Foundation, you're going to use something called asynchronous key value loading, where you ask the system to fetch a value out of your movie. You're going to get a reply asynchronously with the information. As I mentioned when I was talking about notifications, if you're generating still images from your movie, that's going to happen asynchronously. You're going to get an asynchronous notification that those still images have been created.
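As a sketch of the asynchronous key-value loading just mentioned (AVURLAsset and the rest of the AV Foundation playback classes are covered in more detail below; movieURL is a placeholder):

```objc
- (void)loadDurationOfMovieAtURL:(NSURL *)movieURL
{
    AVURLAsset *asset = [AVURLAsset URLAssetWithURL:movieURL options:nil];

    // Ask the system to load the duration in the background; the handler runs
    // once the value is available (or loading fails).
    NSArray *keys = [NSArray arrayWithObject:@"duration"];
    [asset loadValuesAsynchronouslyForKeys:keys completionHandler:^{
        NSError *error = nil;
        AVKeyValueStatus status = [asset statusOfValueForKey:@"duration" error:&error];
        if (status == AVKeyValueStatusLoaded) {
            CMTime duration = asset.duration;
            // Use the duration (note: the handler is not called on the main thread).
        }
    }];
}
```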

If you're making another copy of your movie, exporting it in some different bit rate, again, you're going to ask the system to do that on your behalf, and it's going to notify you asynchronously. And in AV Foundation, you're going to be using key value observing, which by its very nature is asynchronous. The other piece of asynchrony is multitasking. Your app is going to live with other apps on the system, and your app may be pushed into the background by the user.

You have to realize that if the user starts up another app that does playback, that's going to interrupt any export you may have asked the system to start on your behalf. Even if you're in the foreground, if you get a phone call, that's going to interrupt any exports you have.

Remember that when your app is in the background, you're not going to be able to record or process video from the camera. Now, some things won't be interrupted. For instance, if you're using asynchronous key value loading, that's not going to be interrupted by multitasking switches. Now I mentioned before that the audio portion of your movie can play in the background.

If you have a movie that's actually audio only, then that will play automatically when you switch into the background. Any other movie that contains some video will have to be restarted once your app switches into the background. There's a very good Technical Q&A about this, QA1668, which is specifically about how to play audio in the background with MPMoviePlayerController.

Now let's talk about advanced playback. This means we're talking about AV Foundation. Remember, AV Foundation is a lower level framework. In AV Foundation, your movie is represented by an AVAsset. So rather than the case with Media Player, where presentation state and everything was wrapped up with a single object, we now have an object that's just there to represent our movie.

Because movies consist of a number of tracks, there's also an AVAssetTrack object for each particular track inside the movie. When we want to play back a movie using AV Foundation, we're going to use AVPlayer. With AVPlayer, we of course start with an AVAsset, which has AVAssetTracks. But these don't have any of the presentation state. So we have another pair of objects, AVPlayerItem and AVPlayerItemTrack, which control the presentation state for the movie.

This means that if I want to show the movie in two different ways, I would have two different player items, one for each way I'm presenting the movie. Now the AVPlayerItem is what I give to the AVPlayer object, which is the object that controls the entire playback. That's the object that you'll be telling to play. But once I've given the AVPlayerItem to AVPlayer, it's still not on my screen. I need to have an AVPlayerLayer, which is a kind of Core Animation layer, which is where AVPlayer is actually going to display the movie.

But I'm still not done because in order for the user to see that movie, it has to be inside my view hierarchy. Now, if you look at any of our sample code around AV Foundation, you'll see that we typically make AVPlayerLayer the layer class for a subclass of UIView.
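A sketch of that object graph, with the player layer hosted by a small UIView subclass (the PlayerView name is illustrative, not from the talk):

```objc
#import <AVFoundation/AVFoundation.h>

// A view whose backing layer is an AVPlayerLayer, as in Apple's sample code.
@interface PlayerView : UIView
@end

@implementation PlayerView
+ (Class)layerClass
{
    return [AVPlayerLayer class];
}
@end

// Elsewhere (say in a view controller), wiring asset -> item -> player -> layer:
- (void)playMovieAtURL:(NSURL *)movieURL inPlayerView:(PlayerView *)playerView
{
    // For HTTP live streams, create the AVPlayerItem directly from the URL
    // instead of building an AVAsset first (see the considerations below).
    AVURLAsset *asset = [AVURLAsset URLAssetWithURL:movieURL options:nil];
    AVPlayerItem *item = [AVPlayerItem playerItemWithAsset:asset];
    AVPlayer *player = [AVPlayer playerWithPlayerItem:item];

    // The player item carries presentation state; the player drives playback;
    // the AVPlayerLayer is where the video is actually drawn.
    [(AVPlayerLayer *)playerView.layer setPlayer:player];

    [player play];
}
```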

So there's some important considerations when you're using AV Foundation. The first one is if you're using HTTP live streaming, you can't create an AVAsset directly from an HTTP live stream URL. You need to create an AVPlayerItem from that URL, and then you can extract the asset from that. If you're familiar with Core Animation, don't try to use renderInContext: with an AVPlayerLayer. It's not going to work.

And AVPlayer can only have one layer associated with it. Yes, you can have your layer tree there, but remember that only one of those layers is actually associated with the AVPlayer layer. So to summarize playback, I've talked about two simple classes, MPMoviePlayerViewController and MPMoviePlayerController, and I've mentioned a few of the classes in AV Foundation, AVPlayer, AVPlayerLayer, AVPlayerItem, and AVPlayerItemTrack.

Now let's talk about recording. So the Media Player Framework doesn't have anything to do with recording. Instead, UIKit hosts the classes that are associated with recording. And of course, AV Foundation has a number of classes around recording. So when we're recording from our iOS device, say an iPhone, we of course have a camera and a microphone. And remember because you have a microphone, you need to pay attention to your audio session category and make sure you're setting that correctly.

Now, you can record video with audio on iOS devices since 3.0. And remember that the recording is under user control. You give the user control and you wait for the user to give you back the result. So the basic way to record video is with UIImagePickerController.

And what you're going to do is you're going to allocate your UIImagePickerController object. And you're going to tell it that your source type is the camera. If you look at this slide, you'll see that we are specifying the media types. We give it an array, but in this case just a single item in that array, which tells the system that we just want a movie.

You can tell it that you want either a movie or a still image, or both. We indicate that we want to allow editing and we tell the system what our delegate class is. At that point, all we have to do is tell our controller to present this view modally.
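A sketch of that setup (pre-ARC; kUTTypeMovie comes from MobileCoreServices, and the era-appropriate modal presentation method is used):

```objc
#import <MobileCoreServices/MobileCoreServices.h>

- (void)startRecording
{
    if (![UIImagePickerController isSourceTypeAvailable:UIImagePickerControllerSourceTypeCamera]) {
        return;   // no camera on this device
    }

    UIImagePickerController *picker = [[[UIImagePickerController alloc] init] autorelease];
    picker.sourceType = UIImagePickerControllerSourceTypeCamera;

    // Ask for movies only; add kUTTypeImage as well if you also want stills.
    picker.mediaTypes = [NSArray arrayWithObject:(NSString *)kUTTypeMovie];
    picker.allowsEditing = YES;
    picker.delegate = self;

    [self presentModalViewController:picker animated:YES];
}
```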

There's two methods that will be called. The first is imagePickerController:didFinishPickingMediaWithInfo:. This is what's called when the user's actually giving you back a movie as opposed to canceling. The first thing you're going to do is you're going to look in that info dictionary for the media type and make sure they've given you back what you're expecting. If you have been given a movie, you're then going to look in the info dictionary for the URL that's associated with that movie and do whatever it is you want to do with the movie.

Once you've done that, you dismiss the modal view that was there and release the picker. Now, if the user cancels, your imagePickerControllerDidCancel: method is going to be called. In this case, all you need to do is dismiss the modal view controller and release the picker.
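And the two delegate methods, as a sketch (the picker was presented autoreleased above, so no explicit release is needed here):

```objc
- (void)imagePickerController:(UIImagePickerController *)picker
        didFinishPickingMediaWithInfo:(NSDictionary *)info
{
    // Make sure we actually got a movie back, then fetch its URL.
    NSString *mediaType = [info objectForKey:UIImagePickerControllerMediaType];
    if ([mediaType isEqualToString:(NSString *)kUTTypeMovie]) {
        NSURL *movieURL = [info objectForKey:UIImagePickerControllerMediaURL];
        // ...do whatever you want with the recorded movie at movieURL...
    }
    [self dismissModalViewControllerAnimated:YES];
}

- (void)imagePickerControllerDidCancel:(UIImagePickerController *)picker
{
    [self dismissModalViewControllerAnimated:YES];
}
```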

Now, AV Foundation has a number of classes associated with recording: AVCaptureSession, AVCaptureDevice, AVCaptureInput, AVCaptureOutput, and AVCaptureConnection. Let's look at a picture that will give us a better idea of what's going on here. So I have my iPhone and I want to capture some video. I'm going to want to see that on my display as I'm capturing it.

And I may want to capture some still images. I may want to look at the frames as they go by. I may want to save the frames out to a movie file. But I'm also going to have some audio, and I might want to either save that audio by itself or save it as part of my movie file.

Now the class that ties everything together is AVCaptureSession. AVCaptureSession controls the entire recording and organizes all the other classes. The input devices, my camera and my microphone, are going to have AVCaptureInput subclasses associated with them. The view that I'm getting on my screen is going to have an AVCaptureVideoPreviewLayer associated with it. This is another Core Animation layer.

And the various kinds of outputs will all be subclasses of AVCaptureOutput. The important considerations with recording in AV Foundation are: You can only use one of AVCaptureMovieFileOutput and AVCaptureVideoDataOutput at the same time. The system isn't capable of handing you the video frames in real time and writing them out to a movie file at the same time. The other point is that the documentation currently states that AVCaptureVideoDataOutput can support compressed data. This is not the case in iOS 4.3. It's still only uncompressed data.
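A sketch of wiring those classes together, with error handling trimmed; the captureSession and previewContainerView properties are hypothetical, and self is assumed to adopt AVCaptureFileOutputRecordingDelegate:

```objc
- (void)startCapture
{
    self.captureSession = [[[AVCaptureSession alloc] init] autorelease];

    // Inputs: the camera and the microphone, wrapped in AVCaptureDeviceInput.
    AVCaptureDevice *camera =
        [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
    AVCaptureDevice *microphone =
        [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeAudio];
    [self.captureSession addInput:
        [AVCaptureDeviceInput deviceInputWithDevice:camera error:nil]];
    [self.captureSession addInput:
        [AVCaptureDeviceInput deviceInputWithDevice:microphone error:nil]];

    // Output: write video plus audio straight to a movie file.
    AVCaptureMovieFileOutput *movieOutput =
        [[[AVCaptureMovieFileOutput alloc] init] autorelease];
    [self.captureSession addOutput:movieOutput];

    // Preview: another Core Animation layer, hosted in one of our own views.
    AVCaptureVideoPreviewLayer *previewLayer =
        [AVCaptureVideoPreviewLayer layerWithSession:self.captureSession];
    previewLayer.frame = self.previewContainerView.bounds;
    [self.previewContainerView.layer addSublayer:previewLayer];

    [self.captureSession startRunning];

    NSString *path =
        [NSTemporaryDirectory() stringByAppendingPathComponent:@"capture.mov"];
    [movieOutput startRecordingToOutputFileURL:[NSURL fileURLWithPath:path]
                             recordingDelegate:self];
}
```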

Okay, to summarize the classes I've talked about for recording, UIImagePickerController is your main class for basic recording. And of course you have a delegate associated with that, your UIImagePickerControllerDelegate. Now there are several classes in AV Foundation. I mentioned just a few of them: AVCaptureSession, AVCaptureDevice, AVCaptureInput, AVCaptureOutput, and AVCaptureConnection. Now let's talk about editing. So again with editing, the classes are in UIKit and in AV Foundation. For basic editing, what we're really only doing is trimming the movie. That is, we're trimming portions from either the beginning or end of the movie.

What this does is it allows the user to trim those frames. Your program isn't making any decision about where that cut happens. Now, this same class also allows you to re-encode your video at another quality level. You need to look for UIImagePickerControllerQualityType. That's how you control that. The two classes here are UIVideoEditorController and UIVideoEditorControllerDelegate.
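A sketch of presenting that trim/re-encode UI with UIVideoEditorController; the delegate callbacks follow the same pattern as the image picker, and videoPath is whatever local movie file you want to edit:

```objc
- (void)trimVideoAtPath:(NSString *)videoPath
{
    if (![UIVideoEditorController canEditVideoAtPath:videoPath]) {
        return;
    }

    UIVideoEditorController *editor = [[[UIVideoEditorController alloc] init] autorelease];
    editor.videoPath = videoPath;
    editor.delegate = self;   // UINavigationControllerDelegate + UIVideoEditorControllerDelegate

    // Optional: re-encode the trimmed result at a different quality level.
    editor.videoQuality = UIImagePickerControllerQualityTypeMedium;

    [self presentModalViewController:editor animated:YES];
}
```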

I'm not going to talk about the interface here in detail because it's very, very similar to UIImagePickerController. Now, you can also do editing as part of your UIImagePickerController. There's a property on UIImagePickerController, allowsEditing. Set that to YES and the user will be able to do trims.

The interface is very simple for the user: a couple of controls that allow them to move where the beginning and end trim points are, and a button to either accept or cancel. Now, AV Foundation has quite a few classes used for editing. I'm going to mention each of these very briefly. So if I want to generate still images from my movie, I'm going to use AVAssetImageGenerator. If I want to re-encode my movie or export a version in a different aspect ratio or something, I'm going to use AVAssetExportSession.

If I simply want to read frames from an existing movie file, I'll use AVAssetReader. If I want to create a new movie file, I'll use AVAssetWriter. Now, while I'm outputting my movie, if I want to change the sound level, I'm going to use AVAudioMix. If I want to do straight cuts in my movie, no transitions or fades or wipes, then I'm going to use AVComposition. It allows me to do simple cuts-only editing.

I'll use AVVideoComposition if I do want to have some sort of transition, a fade or a wipe. Now, I can also include Core Animation content in my movie in AV Foundation. Core Animation can be used to generate titles or other images. And when I do that, I need to use another Core Animation layer subclass, AVSynchronizedLayer. And further, when I'm exporting a movie that's used Core Animation, I need to use the AVVideoCompositionCoreAnimationTool.
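As an example of how these classes get used, here's a sketch of the export path with AVAssetExportSession; the preset and output path are arbitrary choices for illustration:

```objc
- (void)exportAsset:(AVAsset *)asset
{
    AVAssetExportSession *exportSession =
        [[[AVAssetExportSession alloc] initWithAsset:asset
                                          presetName:AVAssetExportPresetMediumQuality] autorelease];

    NSString *path = [NSTemporaryDirectory() stringByAppendingPathComponent:@"export.mov"];
    exportSession.outputURL = [NSURL fileURLWithPath:path];
    exportSession.outputFileType = AVFileTypeQuickTimeMovie;

    // The export runs on your behalf and completes asynchronously.
    [exportSession exportAsynchronouslyWithCompletionHandler:^{
        if (exportSession.status == AVAssetExportSessionStatusCompleted) {
            // The re-encoded movie is now at exportSession.outputURL.
        }
    }];
}
```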

The important consideration when editing with AV Foundation is that AVAssetReader is not a real-time class. Don't expect to be able to read frames from a movie and deliver them to the user in real time using this class. The other consideration is AVComposition and AVVideoComposition are meant for two very different sorts of situations. AVComposition is when you're doing simple cuts-only editing. AVVideoComposition is when you want to make some sort of transition.

Okay, now I'd like to wrap up. If you want more depth on AV Foundation, you should listen to the talks that we gave at WWDC 2010. They're excellent talks on AV Foundation and specifically on recording and editing. So there you have it. I'm Eryk Vershen and my email address is [email protected]. I look forward to seeing your app really taking advantage of video in iOS.