Media • iOS, OS X • 48:14
Rich metadata, such as time, date, location, and user-defined tags, is particularly useful for time-based media. Discover how to harness static and timed metadata in your AVFoundation apps. See how to write metadata into media and read it during playback. Gain knowledge of best practices for privacy and protecting user data.
Speakers: Shalini Sahoo, Adam Sonnanstine
Transcript
This transcript has potential transcription errors. We are working on an improved version.
Hello, good afternoon, welcome to session 505, "Harnessing Metadata in Audiovisual Media". I've heard we are competing with the "Intro to Swift" talk so get intimate with your neighbors here. My name is Adam Sonnanstine. I'm an engineer on the AVFoundation team and today we are going to talk about, of course, metadata.
So what do I mean by metadata? Well, for the purposes of this talk we're going to define metadata to mean any data stored in movie files, streaming presentations, or any other sort of audiovisual presentation that describes the primary data, like the audio and video we usually think of when we think of those sorts of presentations.
Some examples are always helpful. One you should be familiar with is iTunes metadata: the song names, the artists, and the album artwork in your iTunes library. All of these things are stored, in the files in your iTunes library, as the sort of metadata I'm talking about today.
Besides iTunes metadata, we also have things like location information. If you have a movie that you took with your iPhone or some other location-enabled device and you play it in QuickTime Player, the location will show up in the info window to tell you where you were when you took that movie.
That's also stored as the kind of metadata we're talking about today. As for new features, we know that you're not always standing still when you're taking your videos. So new in iOS 8 and OS X Yosemite, we have features that support things like dynamic location, that is, a location that changes over time.
So these are some new features that we're going to be talking about later on that we're pretty excited about and, in addition to location, this really applies to any sort of metadata you might want to add that changes over time in your movie. This is a screenshot of a demo app we'll show you later; the circle and the annotation text are all stored as the same sort of timed metadata as a timed location.
So hopefully that whets your appetite a little bit. Here's what we're going to cover today: I'll start with an intro to metadata in AVFoundation, including some of the classes that have been around for a while for describing all sorts of metadata, how to inspect that metadata, and how to author it. We'll talk more about those new timed metadata features, and then I'll give you some best practices, including some privacy things to keep in mind.
So our first topic: metadata in AVFoundation. What kind of classes are we going to be using to describe our metadata? Well, our primary model object for describing both movie files and HLS streams is AVAsset. An AVAsset can contain any number of metadata objects, and each AVMetadataItem instance represents a single piece of metadata: your track name, your album artwork, even your location data are going to be separate pieces of metadata in our runtime environment.
So a closer look at AVMetadataItem: at its core, it has two properties. The first is identifier, which is actually a new property, and that describes the kind of metadata that you have. In this example we have the song name, and it's represented by this long symbol name, AVMetadataIdentifieriTunesMetadataSongName. And then you have the value, which is the actual payload of the metadata item. So for song name, it's the name of the song as a string.
As an example for cover art, you can see that the value doesn't have to be a string; it can be an image or any other object that supports both the NSObject and NSCopying protocols. Now, if you've used AVMetadataItem in the past, you might be familiar with the key and keySpace properties.
Well, I mentioned that identifier is new; it's new because it's a combination of the old properties, key and key space. So I'm not going to be talking much about key and key space today; going forward, I'm mostly going to be talking about identifier as the way to describe your metadata.
Take a look at some of the built-in identifiers we have; this is just a tiny sampling. There are a lot of them. You can find them in AVMetadataIdentifiers.h. I've arranged them here roughly according to the old notion of key space, so that's just a sampling of the kinds of metadata we already know you might want to represent.
Going back to the metadata item itself, we have a property that's also new called dataType, which describes the native data type your metadata is stored in. So in the case of our song name, it's stored as a string, so we see that the data type is a UTF-8 string; these string constants that represent the different data types are all defined in CMMetadata.h. And besides the dataType property, we also have several type coercion properties you can use when you know you want your payload in the form of a certain type of Objective-C object. You have stringValue, numberValue, dateValue, and dataValue, and those are going to give you exactly what you'd expect.
For the case where our native payload is a string, only stringValue is going to give you an interesting answer. The rest will give you nil. For our artwork example, where the payload is a JPEG image, the top three are going to give you nil, and dataValue is going to give you the NSData you're looking for to grab the bytes of the JPEG image.
There are examples where you can have more than one non-nil type coercion property. One is creation date, if you have the date represented in a standard string format for dates. You can either get the actual string that was stored that way, or you can ask the metadata item to give you an instance of NSDate that describes the same thing in a more convenient representation.
So that's your brief intro to AVMetadataItem; we're going to be talking a lot about it throughout the talk. Let's go back to AVAsset so we can see how we actually get these metadata items. The easiest way is just to ask for all of the metadata that applies to the entire asset. There are types of metadata that apply to just parts of the asset, but this is how you get the metadata, like the location and the song title, that applies to the entire asset.
There's also a way to get just a subset of the metadata. We have this notion of metadata format, which I'm not going to talk about too much today, but you can use it to get just that subset of the metadata. So for our example, where we are getting iTunes metadata, we're going to use AVMetadataFormatiTunesMetadata and grab all of it using the metadataForFormat method.
And then from there we can use this filtering method, metadataItemsFromArray:filteredByIdentifier:, to get just the items that correspond to the song-name identifier. You might be wondering why you could have more than one song name in a single asset. We'll get back to that in just a little bit, but first I want to talk about how you load the payload of your metadata items.
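To make that concrete, here's a minimal sketch of that lookup in Objective-C; movieURL stands in for whatever file URL you're working with:

```objc
#import <AVFoundation/AVFoundation.h>

AVAsset *asset = [AVAsset assetWithURL:movieURL]; // movieURL: a placeholder NSURL for your movie file

// All of the metadata stored in the iTunes format...
NSArray *iTunesMetadata = [asset metadataForFormat:AVMetadataFormatiTunesMetadata];

// ...filtered down to just the song-name items.
NSArray *songNameItems =
    [AVMetadataItem metadataItemsFromArray:iTunesMetadata
                      filteredByIdentifier:AVMetadataIdentifieriTunesMetadataSongName];
```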
AVMetadataItem conforms to the AVAsynchronousKeyValueLoading protocol. This is a protocol we define ourselves in AVFoundation, and a lot of our core model objects conform to it, because a lot of the time when you get an AVAsset or an AVMetadataItem we haven't actually loaded the data behind it yet. So you can use the loadValuesAsynchronouslyForKeys method to load the specific values you want, and it'll do that asynchronously so you're not blocking your main thread with some sort of synchronous I/O.
So for this case we have our metadata item. We're looking for the value, so we just load the value key and, when we get our completion handler, we check the status to make sure that the loading succeeded and, assuming it did, we can then go ahead and grab the value and use it however we see fit.
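A sketch of that loading pattern, continuing with the songNameItems array from above:

```objc
AVMetadataItem *item = songNameItems.firstObject;

[item loadValuesAsynchronouslyForKeys:@[@"value"] completionHandler:^{
    NSError *error = nil;
    if ([item statusOfValueForKey:@"value" error:&error] == AVKeyValueStatusLoaded) {
        id<NSObject, NSCopying> value = item.value; // for a song name, this will be an NSString
        NSLog(@"Song name: %@", value);
    } else {
        NSLog(@"Failed to load value: %@", error);
    }
}];
```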
So back to that whole multiple-titles-in-one-asset thing. Well, one reason we might have that is if we have the asset localized in multiple languages. So an asset can have the same metadata item in multiple languages. The example that we're going to talk about is the QuickTimeUserDataFullName identifier.
If you use this identifier in your files, then QuickTime Player, for example, can pick up the title and display it in the title bar. So this is just yet another example of how metadata is used in our applications. This particular movie actually has the title available in both English and Spanish, so here, with English as the system language, QuickTime Player picks up the English title; but if we set our system language to Spanish, it will pick up the Spanish localization of that title instead. These are represented as two distinct pieces of metadata within the file, and the way you distinguish between them is that they'll have different values for these final two properties of AVMetadataItem: locale and extendedLanguageTag.
ExtendedLanguageTag is new in this release. It's a BCP 47 language tag, and it's particularly useful when you want to distinguish written languages. So that's one reason why you might have more than one metadata item with the same identifier. I mentioned before that not all metadata applies to the entire asset. Well, one example of that is metadata that only applies to a particular track. For this example, we have a special label attached to our subtitle track called SDH; that stands for Subtitles for the Deaf or Hard of Hearing, and it's basically just a richer form of subtitles that includes things like labeling who's talking and mentioning sound effects that are vital to understanding.
We talked a little bit more about SDH and accessibility in general last year in our "Preparing and Presenting Media for Accessibility" talk so check that one out for more details. For the purposes of this talk, just know that to get this SDH label here, it involves setting track-specific metadata.
So let's talk about how you actually find out if your track has this metadata in it. Well, you're going to use AVAssetTrack, and it has pretty much the exact same API as AVAsset for reading metadata. You have your metadata property; you have your metadataForFormat method. So if we want to find all the tagged characteristics that are in an asset track, we ask the track for its metadata for the format AVMetadataFormatQuickTimeUserData.
Once we have that, we use that same filtering method we saw before to get all of the items that have the identifier QuickTimeUserDataTaggedCharacteristic. So SDH, which I just talked about, is one example of a tagged characteristic, and it's the payload of the metadata item that tells you which tagged characteristic you're dealing with. We'll talk in a little more detail about SDH and how you author it in just a little bit.
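Here's roughly what that looks like in code, assuming asset is the AVAsset from earlier and it has a subtitle track:

```objc
AVAssetTrack *subtitleTrack = [[asset tracksWithMediaType:AVMediaTypeSubtitle] firstObject];

// The user-data metadata attached to just this track...
NSArray *userData = [subtitleTrack metadataForFormat:AVMetadataFormatQuickTimeUserData];

// ...filtered down to its tagged characteristics.
NSArray *taggedCharacteristics =
    [AVMetadataItem metadataItemsFromArray:userData
                      filteredByIdentifier:AVMetadataIdentifierQuickTimeUserDataTaggedCharacteristic];
// The payload (stringValue) of each item names the characteristic, e.g. the SDH-related ones.
```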
So going back to our list of identifiers, you might have noticed some patterns if you were looking closely. Each of these groups has its own version of a title or a song name or something like that. We noticed that and came up with our own special kind of identifier called a common identifier, which can be used when you want to look up, say, for this example, a title without caring exactly how it's stored in your file. Same for copyright here; we also have a common identifier that represents copyright.
These are not the only common identifiers; there's a whole list of them, but these are just two examples. So if we go back to our example where we're looking for our iTunes song name: if we don't actually care that the title of our asset is stored as iTunes metadata and we just want a title so we can display it somewhere, you can ask the asset for its array of commonMetadata, and this is all the metadata items that can be represented using a common identifier.
Then you use that same filtering method we've been using to filter down to just the ones that have the CommonIdentifierTitle and you can go from there with your title. Also worth noting is that AVAssetTrack has the same property, commonMetadata, so you can do the same thing over there as well.
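For example, a quick sketch of pulling a displayable title without caring how it's stored:

```objc
// Every metadata item that can be represented with a common identifier, regardless of storage format.
NSArray *commonMetadata = asset.commonMetadata;

NSArray *titleItems = [AVMetadataItem metadataItemsFromArray:commonMetadata
                                        filteredByIdentifier:AVMetadataCommonIdentifierTitle];
AVMetadataItem *titleItem = titleItems.firstObject; // load its value asynchronously, as shown earlier
```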
So that is your brief introduction to inspecting metadata with AVFoundation. Let's talk a little bit about authoring. If you want to make your own files that have, say, location or iTunes metadata in them, we have several different classes that can write movie files: AVAssetExportSession, AVAssetWriter, and the capture movie and audio file outputs, and these all have the exact same read/write property called, simply, metadata. So you give it an array of metadata items and that will be written out to the file. Similarly, for track-specific metadata like those tagged characteristics, you can use an AVAssetWriterInput, which also has the exact same property.
Now, you're not limited to just writing out metadata that you got from somewhere else, like another file, through the APIs we've been looking at. You can also create your own metadata items with AVMutableMetadataItem, a mutable subclass of AVMetadataItem. And as you might expect, this just has read/write versions of all of the properties in AVMetadataItem.
So if we use the example of writing a subtitle track that is marked as SDH, it's actually two different tagged characteristics that you have to use, so we'll create two different metadata items and set both of their identifiers to the identifier we just saw, the QuickTimeUserData tagged characteristic, but one of them will have its value set to TranscribesSpokenDialogForAccessibility and the other to DescribesMusicAndSoundForAccessibility. Then we get the subtitle AssetWriterInput that's going to write our subtitle track and set that array of the two items on our asset writer input. So that's how you would author a subtitle track that is marked as SDH. Just one example of using tagged characteristics and AVMutableMetadataItem.
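As a sketch, assuming subtitleInput is the AVAssetWriterInput that's writing your subtitle track:

```objc
AVMutableMetadataItem *spokenDialog = [AVMutableMetadataItem metadataItem];
spokenDialog.identifier = AVMetadataIdentifierQuickTimeUserDataTaggedCharacteristic;
spokenDialog.value = AVMediaCharacteristicTranscribesSpokenDialogForAccessibility;

AVMutableMetadataItem *musicAndSound = [AVMutableMetadataItem metadataItem];
musicAndSound.identifier = AVMetadataIdentifierQuickTimeUserDataTaggedCharacteristic;
musicAndSound.value = AVMediaCharacteristicDescribesMusicAndSoundForAccessibility;

// subtitleInput: the AVAssetWriterInput writing the subtitle track.
subtitleInput.metadata = @[spokenDialog, musicAndSound];
```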
A special note about AVAssetExportSession: by default, the ExportSession is going to take any of the metadata that's in the source asset you're exporting and copy it over to the output file. That's not the case if you set metadata on its metadata property. That will be the signal telling the ExportSession to ignore the metadata in the source file and instead write just what you put on the property.
So if you want to do augmentation of the metadata or some other sort of modification you'll want to grab that array of metadata, make a mutable copy and do any adjustments that you want and then set that on the metadata property. So that's just a quick note about ExportSession.
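In code, that augmentation pattern might look something like this; sourceAsset, exportSession, and myNewItem are assumed to already exist:

```objc
// Start from the source asset's metadata, adjust or add items, then hand the result to the export session.
NSMutableArray *outputMetadata = [[sourceAsset metadata] mutableCopy];
[outputMetadata addObject:myNewItem]; // myNewItem: a hypothetical AVMetadataItem you created
exportSession.metadata = outputMetadata; // setting this replaces, rather than augments, the source metadata
```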
The last note about authoring metadata concerns HTTP Live Streaming. This is actually a new feature in iOS 8 and OS X Yosemite: you can use a new tag called the session-data tag in your playlist, which has a data ID, which is a lot like the identifiers we've been talking about; either a URI, which can point to the payload, or a value, which directly specifies the payload; and, optionally, some language information. So here's an example, very similar to what we saw before with the titles in two different languages, but this is the markup you'd use for HTTP Live Streaming.
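As an illustration of the syntax (the data ID here is a made-up reverse-DNS name), a master playlist might carry something like:

```
#EXT-X-SESSION-DATA:DATA-ID="com.example.movie.title",VALUE="The Movie",LANGUAGE="en"
#EXT-X-SESSION-DATA:DATA-ID="com.example.movie.title",VALUE="La Película",LANGUAGE="es"
```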
So for more information on reading and writing metadata, we do have some sample code; it's called AVmetadataeditor. And for more information about the details of writing HTTP Live Streaming metadata, see the documentation at this URL. All right, so that is your crash course in metadata in AVFoundation. Our next topic is timed metadata. Now, although I mentioned we have new features, timed metadata is not a new concept.
We've supported the notion of chapters for quite some time, and conceptually chapters are just an example of timed metadata. Each of these chapter markers is just a piece of metadata describing a particular range of the timeline of the movie. That's all timed metadata is: metadata associated with a range of time.
So similarly, with our dynamic location example, we have the path that's drawn here that's really just composed of a number of pieces of metadata indicating the current location, each one of them associated with a particular time in the movie's timeline. So to demonstrate QuickTime Player's features with dynamic location in Yosemite, I want to bring my colleague, Shalini, up to the stage for a demo.
Hi, I'm here to demonstrate how to read and play back metadata using QuickTime Player. Here I have a movie file which has both audio and video, and timed location data stored in a separate track. Now if I bring this up in QuickTime Player, this is the usual UI for audio and video. New in OS X Yosemite: in the Movie Inspector you can see a map view if your movie file has location data. The map view is presented along with the route where you recorded this video.
So here the blue line indicates the path where we recorded the video, and the red pin is an indication of the current location, that is, the location at the current point on the timeline of the movie. So if I zoom in a little bit and start playback, you can see that as the movie progresses the pin's location is updated to stay in sync with the video.
I can drag the scrubber around and you can see the pin moving back and forth. I can also click at any point on the map and you see the video seek to the time at which the recording was at that location. This is the map view in QuickTime Player on OS X Yosemite.
Thank you, Shalini. So let's talk about what we just saw there. So that location information was stored as timed metadata in the file and in order to have QuickTime Player draw that information on the map, we use AVAssetReader to read all of the location information from that asset.
And because timed metadata is stored in its own track, we use an AVAssetReaderTrackOutput to read that data, and we use a new class called AVAssetReaderOutputMetadataAdaptor that knows how to give us that data in the form of a class called AVTimedMetadataGroup. Then from there we can grab each location and draw that path on the map. So AVTimedMetadataGroup is a very simple class. It's really just these two properties: an array of metadata items combined with a time range that describes where in the movie that data applies.
So to see a little bit of code for using AssetReader for this purpose: the first thing you want to do is find the track that contains your location information, and we'll talk more about how to do that in just a second. Then you use that track to create an AssetReaderTrackOutput with nil output settings, and then you create your metadataAdaptor with that trackOutput.
And then, in a loop, you just take your metadataAdaptor and call the nextTimedMetadataGroup method over and over again, doing something with each piece of data, like drawing it on the map, until that method returns nil. Then you know there's no more data to draw.
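Here's a rough sketch of that reader setup and loop; locationTrack is the metadata track you'll locate in the next step:

```objc
NSError *error = nil;
AVAssetReader *reader = [AVAssetReader assetReaderWithAsset:asset error:&error];

AVAssetReaderTrackOutput *trackOutput =
    [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:locationTrack outputSettings:nil];
[reader addOutput:trackOutput];

AVAssetReaderOutputMetadataAdaptor *adaptor =
    [AVAssetReaderOutputMetadataAdaptor assetReaderOutputMetadataAdaptorWithAssetReaderTrackOutput:trackOutput];

[reader startReading];

AVTimedMetadataGroup *group = nil;
while ((group = [adaptor nextTimedMetadataGroup]) != nil) {
    // Each group has a timeRange and an items array; e.g. add each location to the map path here.
}
```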
In terms of finding the right track to read, you do that by examining the tracks in your asset and looking through the format descriptions of each track to find the identifiers you're looking for. You start by getting the tracks with MediaTypeMetadata, and then for each of those tracks you loop through all of its format descriptions (usually there's only one), and for each format description you grab its list of identifiers using this function and check whether that identifier array contains the identifier you're looking for.
In this case we're looking for the location ISO 6709 identifier. So once we've found it we're good to go and we can resume with the code on the previous slide. So that's how QuickTime Player draws the path on the map before you start playback.
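A sketch of that track-finding loop, using the Core Media accessor for the identifiers:

```objc
#import <CoreMedia/CMMetadata.h>

AVAssetTrack *locationTrack = nil;

for (AVAssetTrack *track in [asset tracksWithMediaType:AVMediaTypeMetadata]) {
    for (id formatDescription in track.formatDescriptions) {
        CMMetadataFormatDescriptionRef desc = (__bridge CMMetadataFormatDescriptionRef)formatDescription;
        NSArray *identifiers = (__bridge NSArray *)CMMetadataFormatDescriptionGetIdentifiers(desc);
        if ([identifiers containsObject:AVMetadataIdentifierQuickTimeMetadataLocationISO6709]) {
            locationTrack = track;
            break;
        }
    }
    if (locationTrack != nil) break;
}
```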
The other thing that QuickTime Player does, as you saw, is update the current location while you're doing playback or even scrubbing around. The way it does that, while it's already playing the asset using an AVPlayerItem, is with a new class called AVPlayerItemMetadataOutput that you attach to your PlayerItem, and which also knows how to vend this data in the form of TimedMetadataGroups. But unlike the asset reader, instead of getting all the data up front, you're going to be getting it piece by piece as the movie plays.
So a little bit of code: you first create your metadata output using the initWithIdentifiers method, and in this case we're only interested in metadata that has that location identifier, so that's all we're going to get by opting in this way. Then you create a delegate that you define, and that's what's going to receive the metadata during playback, and you set that delegate on your output and tell us what queue you want us to send the data on.
Then you create or grab your AVPlayerItem and call addOutput to attach your output to the playerItem, and finally make your player, associate your item with the player as the current item, and start playback. To get the smoothest playback experience possible, we highly recommend that you do all of this setup work before you start playback or even attach the item to the player.
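Putting those steps together, a sketch of the setup; the queue name is a made-up example, and self is assumed to adopt AVPlayerItemMetadataOutputPushDelegate:

```objc
AVPlayerItemMetadataOutput *metadataOutput = [[AVPlayerItemMetadataOutput alloc]
    initWithIdentifiers:@[AVMetadataIdentifierQuickTimeMetadataLocationISO6709]];

dispatch_queue_t metadataQueue = dispatch_queue_create("com.example.metadata", DISPATCH_QUEUE_SERIAL);
[metadataOutput setDelegate:self queue:metadataQueue];

AVPlayerItem *playerItem = [AVPlayerItem playerItemWithAsset:asset];
[playerItem addOutput:metadataOutput]; // attach the output before handing the item to a player

AVPlayer *player = [AVPlayer playerWithPlayerItem:playerItem];
[player play];
```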
So here's a look at what your delegate method might look like. There's only one delegate method; it's the metadataOutput:didOutputTimedMetadataGroups:fromPlayerItemTrack: method. And the first thing you want to do is grab an item that you can get your payload data from. In this case, to keep things simple, I'm just grabbing the first item from the first group, but keep in mind there could be multiple items; there could even be multiple groups.
One reason there could be multiple groups given to this method is that the metadata output will keep track of whether the metadata is coming faster than you're processing it and, if it is, it will start to batch that up and give you the metadata in batches when you're done with the previous batch of metadata.
So moving on with your item: you're going to do the loadValuesAsynchronouslyForKeys dance that we talked about before. In this case, we're interested in the value and dataType properties, so we're going to load both of those. I've omitted the error checking for brevity here, but you'll probably want to do that error checking like we had on the other slide.
And once we're in the completion handler we can ask the item for its data type and make sure it's a data type we're prepared to handle. In this case my code only knows how to handle location information in ISO 6709 format, so we've got to make sure that's the right data type, and from there we dispatch to the main thread the code that will update our UI.
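A sketch of that delegate method; as in the session, most error checking is omitted:

```objc
- (void)metadataOutput:(AVPlayerItemMetadataOutput *)output
    didOutputTimedMetadataGroups:(NSArray *)groups
    fromPlayerItemTrack:(AVPlayerItemTrack *)track
{
    AVTimedMetadataGroup *firstGroup = groups.firstObject;
    AVMetadataItem *item = firstGroup.items.firstObject;
    if (item == nil) {
        return;
    }

    [item loadValuesAsynchronouslyForKeys:@[@"value", @"dataType"] completionHandler:^{
        NSString *iso6709DataType = (__bridge NSString *)kCMMetadataDataType_QuickTimeMetadataLocation_ISO6709;
        if ([item.dataType isEqualToString:iso6709DataType]) {
            NSString *iso6709Location = item.stringValue;
            dispatch_async(dispatch_get_main_queue(), ^{
                // Update the map pin (or other UI) with iso6709Location.
            });
        }
    }];
}
```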
So that's how QuickTime Player is updating the location metadata during playback. Of course this is not the first API that we have offered for reading timed metadata during playback. There is an existing property called timedMetadata on AVPlayerItem but I'm here to say that the AVPlayerItemMetadataOutput replaces that property for all of these use cases.
Now, we're not deprecating the property yet, but we do recommend, if you're new to timed metadata, that you just adopt the metadataOutput and not worry about the property. If you're already using the property, we do recommend that you move over, but just know that you should make sure your code is working properly after that transition. In particular, I'll point out that for certain kinds of HLS content the metadataOutput will give you more specific identifiers than the old property did. So just make sure your code is prepared to handle that.
The last topic on reading timed metadata is chapters. Chapters, like I said, have been supported for some time; they even have their own API: chapterMetadataGroupsBestMatchingPreferredLanguages. This is on AVAsset. It gives you an array of timed metadata groups that contain items with the identifier QuickTimeUserDataChapter. We've supported this for some time for QuickTime movie files and M4Vs and, new in iOS 8 and OS X Yosemite, is support for chapters in HTTP Live Streams as well as MP3 files. I'll tell you more about how to author those HLS chapters in just a little bit.
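As a quick sketch, fetching chapters for the user's preferred languages looks roughly like this:

```objc
NSArray *chapterGroups =
    [asset chapterMetadataGroupsBestMatchingPreferredLanguages:[NSLocale preferredLanguages]];

for (AVTimedMetadataGroup *chapter in chapterGroups) {
    // chapter.timeRange is the chapter's extent; its items include the chapter title metadata.
}
```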
So for more information, we have some sample code that does approximately what QuickTime Player is doing, where it can show your location during play back. We also have a previous session about AssetReader that goes into much more detail than I did here, called "Working with Media in AVFoundation" from 2011. So that's how you read and play back timed metadata.
Our next timed metadata topic is how you can create your own movies that contain timed metadata. We saw the screenshot before and I mentioned that these annotations are stored as timed metadata and, to show you this demo app, I'd like to invite Shalini back up on stage to demo it.
This time let's look at an app that shows how to author your own custom metadata movie files. Here I have a video, and if I would like to share some notes with my friend, who is good at fixing colors in a movie, I can now do that within the app. To add annotations, I use a two-finger gesture; I can use a pinch gesture to resize and then add a comment, which is enough for whoever looks at the video later to fix the colors there; and then I begin playback.
And as playback progresses, I move the circle to track the area I want to be fixed. Now that I have this annotation, I can write it out along with the audio and video; to do that, I hit "Export", and now we see an AVPlayerViewController which shows the exported movie along with the metadata that was written to it. So if I start playback you see the annotation moving along the path that I traced. And if I scrub back in time you can see the annotation moving.
You might wonder whether the annotation is baked into the video frames; it is not. It is being rendered in real time using AVPlayerItemMetadataOutput, and you can change the color or the font of the annotation. So if I begin playback, you see the rendering is happening in real time. That's AVTimedAnnotationWriter; we have this available as sample code as well. Thank you.
So that was a great demonstration of not only the playback part of it but also how to write that data into the file, so let's take a look at how that was accomplished. We're going to use an AVAssetWriter to write the file, and we're going to use an AVAssetWriterInput to write that metadata track to the file. Just like the reader side, the writer has a new class that's a metadataAdaptor, and that class knows how to interpret instances of AVTimedMetadataGroup and write them into the file. To see a little bit of code: the first thing we're going to do is create our AssetWriterInput.
We're going to use the media type AVMediaTypeMetadata, once again with nil outputSettings, and we're going to have to provide a hint about the source format, that is, the format of the data we're going to be appending. We'll talk more about this and why it's required on the next slide. Then you simply create your metadataAdaptor with a reference to that input and, as you generate or receive your timed metadata groups, you use the appendTimedMetadataGroup method to append them and write them to the file.
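Here's roughly what that looks like; assetWriter and the timed metadata groups you append are assumed to come from elsewhere, and metadataFormatDescription is the source format hint discussed next:

```objc
AVAssetWriterInput *metadataInput =
    [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeMetadata
                                       outputSettings:nil
                                     sourceFormatHint:metadataFormatDescription];
[assetWriter addInput:metadataInput];

AVAssetWriterInputMetadataAdaptor *adaptor =
    [AVAssetWriterInputMetadataAdaptor assetWriterInputMetadataAdaptorWithAssetWriterInput:metadataInput];

// As each AVTimedMetadataGroup is generated (or read from another source), append it.
[adaptor appendTimedMetadataGroup:timedGroup];
```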
So what's the deal with that source format thing? Well, it turns out that in order for AVAssetWriter to be able to write your metadata in the most efficient way possible, it needs to know up front exactly what kind of metadata it is going to be writing. This results in the lowest storage overhead in terms of the number of bytes your file takes up, and it also has an effect on how efficient it is to play this kind of content back. You don't want to be using too much power when you're playing this kind of content. So you have some options in terms of how you actually construct one of these format hints.
If you're reading from AVAssetReader, you can actually ask the track you are reading from to give you its list of format descriptions and use one of those. If you're creating the metadata groups yourself, or getting them from some other source, then you can use a new method called copyFormatDescription that will give you back an instance of CMMetadataFormatDescription that will do this job for you.
It's important to note that if you go this route you need to make sure that the contents of your metadata group are comprehensive, in terms of containing every combination of identifier, data type, and language tag that you are going to be appending. That is, it contains an item with each of those combinations. Of course, since the CM format description is a CF type, you'll need to CFRelease it when you're done.
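A quick sketch of that approach, assuming timedGroup is a representative AVTimedMetadataGroup:

```objc
// timedGroup must contain at least one item for every identifier/data type/language-tag
// combination you plan to append.
CMMetadataFormatDescriptionRef metadataFormatDescription = [timedGroup copyFormatDescription];

// ...create the AVAssetWriterInput with this as its sourceFormatHint, then release it.
CFRelease(metadataFormatDescription);
```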
Of course, there's one more way you can do this: you can create the format description directly using Core Media APIs. Here you use this long-named CMMetadataFormatDescriptionCreateWithMetadataSpecifications function. You're going to pass in the boxed metadata type; that's the sort of metadata we've been talking about this whole time with timed metadata.
And the metadata specifications: that's just an array of dictionaries. Each dictionary contains one of those combinations I was talking about before: the identifier, the data type, and optionally the extended language tag. So you want to make one of these metadata specification dictionaries for each combination you plan to append.
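A sketch of building the format description that way, here for a single location-identifier combination:

```objc
#import <CoreMedia/CMMetadata.h>

NSArray *specifications = @[
    @{ (__bridge NSString *)kCMMetadataFormatDescriptionMetadataSpecificationKey_Identifier :
           AVMetadataIdentifierQuickTimeMetadataLocationISO6709,
       (__bridge NSString *)kCMMetadataFormatDescriptionMetadataSpecificationKey_DataType :
           (__bridge NSString *)kCMMetadataDataType_QuickTimeMetadataLocation_ISO6709 }
];

CMMetadataFormatDescriptionRef formatDescription = NULL;
OSStatus status = CMMetadataFormatDescriptionCreateWithMetadataSpecifications(
    kCFAllocatorDefault,
    kCMMetadataFormatType_Boxed,                 // the boxed metadata type used for timed metadata
    (__bridge CFArrayRef)specifications,
    &formatDescription);
// Check status, use formatDescription as the sourceFormatHint, then CFRelease it when done.
```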
Now, one thing that was not obvious about that demo is that we're actually writing timed metadata that describes one particular other track. For the example of these annotations, we're really just talking about the video track of the movie and not the sound or anything else. So just like we had a way of making track-specific metadata that applied to an entire track, with those tagged characteristics we saw before, you also have the ability to formally mark your metadata track as describing one particular other track. You do that with the addTrackAssociationWithTrackOfInput method, using as the parameter the AssetWriterInput that you're using to write your video track. The receiver is the input you're using to write your metadata track, and you use the association type MetadataReferent.
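In code, that association is a single call; metadataInput writes the metadata track and videoInput writes the video track it describes:

```objc
[metadataInput addTrackAssociationWithTrackOfInput:videoInput
                                               type:AVTrackAssociationTypeMetadataReferent];
```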
So that's how you create metadata that's timed but also specific to a particular track. The next thing we did that was interesting in that demo is that we actually used our own custom identifiers. We had that big list of built-in identifiers, but you don't have to use those; you can build your own. As I mentioned before, an identifier is just a combination of key space and key, and it's a string in a particular format, so to help you make your own custom identifiers we have the identifierForKey:keySpace: class method on AVMetadataItem. There are some rules to follow: your key space needs to be four characters long if you want to use it for timed metadata, so we actually recommend you use our built-in key space, the QuickTimeMetadata key space.
We also highly recommend you use reverse-DNS notation for your custom keys to avoid collisions with other kinds of metadata. In a brief code snippet you can see that you simply use this method to make your custom identifier and then set that on the identifier property of your mutableMetadataItem.
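For example, a sketch using a made-up reverse-DNS key in the QuickTime metadata key space:

```objc
NSString *circleCenterIdentifier =
    [AVMetadataItem identifierForKey:@"com.example.annotation.circle-center" // hypothetical key
                            keySpace:AVMetadataKeySpaceQuickTimeMetadata];

AVMutableMetadataItem *centerItem = [AVMutableMetadataItem metadataItem];
centerItem.identifier = circleCenterIdentifier;
```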
In addition to custom identifiers, you can also create your own custom data types. We're all familiar by now, through this presentation, with some of the built-in data types that we define; there are a lot more than these, but we've been using these quite heavily already. These are really useful, but sometimes you want your data type information to express more, maybe about the domain you're working in. So if you're doing a serial number or a barcode kind of thing, you might want to define a data type like "serial number as string" or "barcode image as JPEG", so you have more specific information about what your metadata actually contains.
The way this works is that you have to tell us exactly how to serialize your custom data type, and the way you do that is by telling us that your custom data type conforms to one of our built-in data types. So in this case the serial number conforms to the UTF-8 data type; under the hood it's a UTF-8 string, but we know it really represents a serial number, and the same goes for the barcode image.
The way you do this is you register your data type using the CMMetadataDataTypeRegistryRegisterDataType function defined in Core Media. You can't create your own custom base types, but you can create a custom type that conforms to our raw-data built-in type if your data type really is just a custom sequence of bytes.
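A sketch of registering a made-up "serial number" type that conforms to the built-in UTF-8 data type:

```objc
#import <CoreMedia/CMMetadata.h>

// "com.example.serial-number" is a hypothetical custom data type.
OSStatus status = CMMetadataDataTypeRegistryRegisterDataType(
    CFSTR("com.example.serial-number"),
    CFSTR("Serial number stored as a UTF-8 string"),
    (__bridge CFArrayRef)@[ (__bridge NSString *)kCMMetadataBaseDataType_UTF8 ]);
```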
So there are some rules to using AVAssetWriter for writing timed metadata. Most importantly, every metadata item that you append has to have non-nil values for identifier, data type and value. Your identifier has to conform to the format that we specify, so we highly recommend using that utility method that we just talked about.
The value has to be compatible with the data type, so you can tell us that your NSString value is a UTF-8 string, but don't try telling us that your custom class is a UTF-8 string, because we won't know how to serialize that properly and the AssetWriter will fail.
As I mentioned before, you have to create your AssetWriterInput with a format hint, and that hint must be comprehensive in the way we just described. So the last topic about AssetWriter and timed metadata is a recipe for creating your own movies that have the same sort of dynamic location that we've seen a couple of times already.
To do this, you can use AVCapture audio and video data outputs and target that data at twin instances of AssetWriterInput and, at the same time, grab information from Core Location that represents the location information and write that to its own AssetWriterInput. For more detail about how to do that we've actually implemented that and made it available as sample code, so see AVCaptureLocation if you want to make your own movies that contain dynamic location.
We also have sample code, as Shalini mentioned, for the demo we just showed you; that's called AVTimedAnnotationWriter. And, of course, for more information about AssetWriter in general, see that same talk I referenced earlier: "Working with Media in AVFoundation". Two last quick topics about timed metadata. First, ExportSession: just like we said, the AssetExportSession will by default pass through any of your metadata that applies to the entire asset or an entire track and copy it to the output file. It will do the same thing with timed metadata that exists in the source file, provided that your destination file type is QuickTime movie. We'll talk more about file types in just a little bit, but basically ExportSession behaves exactly as you would expect.
Our last timed metadata authoring topic is HTTP Live Streaming chapters. If you want to author chapters in your HLS stream, you can use the session-data tag we talked about earlier with the special data ID com.apple.hls.chapters. Your URI should point to a JSON file that describes the chapter information for that stream and, of course, for more detail on this, see that same link I referenced earlier for HTTP Live Streaming.
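In the master playlist, that looks something like this (the URI is a placeholder; the JSON format itself is described in the HTTP Live Streaming documentation):

```
#EXT-X-SESSION-DATA:DATA-ID="com.apple.hls.chapters",URI="https://example.com/chapters.json"
```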
All right, so that is timed metadata; our next topic is privacy. Why is privacy important in this context? Well, any time you are writing your users' data to a file, you need to be at least considerate of their privacy and be aware that the metadata you write out to these movie files can contain user-identifiable information; the most obvious example of that is location.
And because movie files can be distributed and we want to protect the privacy of our users, for our built-in sharing services we do our best to strip out any potentially user-identifiable information, such as location, and we recommend that you do the same. So we've given you a utility for that called AVMetadataItemFilter.
Right now there is only one filter that we make available, but it is geared toward privacy: it's the metadataItemFilterForSharing, and that will strip out any of the sort of user-identifying information we're talking about; location is only one example. It will also strip out anything it doesn't recognize, because it doesn't know whether that might contain user-identifiable information. That includes any metadata that uses identifiers you define yourself. It will leave in some things, like metadata that's important to the structure of the movie (chapters are the best example of that), and also commerce-related data like your Apple ID.
So to use the MetadataItemFilter, you're going to first create your filter and feed it your original array of metadata items using the metadataItemsFromArray:filteredByMetadataItemFilter: method. This is a companion to that other identifier-based filtering method we've been using all day. Then, once you have your filtered array of metadata items, just set that on your AssetWriter or ExportSession as you normally would.
Actually, I mentioned ExportSession, but things can be simpler if you're using the ExportSession and only want to copy the metadata from the source asset rather than add your own. You just set the filter on the ExportSession and it will do the filtering for you; this will filter both static and timed metadata, but it will only filter the metadata from the source asset. If you set your own metadata on the metadata property, it won't filter that for you; you'll need to do the filtering yourself, as I just described.
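Both paths in a quick sketch; originalItems, assetWriter, and exportSession are assumed to exist:

```objc
AVMetadataItemFilter *filter = [AVMetadataItemFilter metadataItemFilterForSharing];

// When writing metadata yourself, filter the array before setting it...
NSArray *filteredItems = [AVMetadataItem metadataItemsFromArray:originalItems
                                   filteredByMetadataItemFilter:filter];
assetWriter.metadata = filteredItems;

// ...or, when just copying from the source asset, let the export session filter for you.
exportSession.metadataItemFilter = filter;
```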
The only other thing to keep in mind is that the export may take more time when the filter is being used because it has to go through and examine all of the metadata items. So that's privacy. Our last section of the talk today is some assorted best practices when you are writing your own files that contain metadata.
First up: what if you're writing timed metadata and you have multiple streams of metadata that use different identifiers? How do you get those into the same file? Well, we actually have that situation in the demo app: the circle is composed of two different pieces of information, the position and the radius. So we're representing these in the demo app as two distinct streams of metadata. And the most obvious way to get this into a file is to use two different AVAssetWriterInputs, which results in having two metadata tracks in the output file; pretty simple.
But there is another way you can do it: you could instead combine those two different types of metadata into one timed metadata group and write that to a single AssetWriterInput, and that will result in only one metadata track in the output file that contains multiple different kinds of identifiers. There are some advantages to this approach, not least of which is that it can result in lower storage overhead and therefore, as we've seen, more efficient playback.
But there are of course pros and cons to everything. So you'll definitely want to consider combining into one track your different metadata if they are used together during playback and they have identical timing. This is definitely the case with the example we just saw with the circle center and the circle radius.
If these aren't true, then you might not want to combine. In fact, one case where you definitely do not want to combine is when you have one type of metadata that's associated with another track in the file (our annotations are associated with the video track, for instance) but another type of metadata, like location, that's associated with the entire asset. You don't want to combine those into one track; otherwise your location, in that example, would become mistakenly associated with just the video track, and that's not what you want. So that's how to deal with multiple streams of timed metadata.
The next topic is the duration of your timed metadata groups. When you get a timed metadata group from AVFoundation, it's always going to have a fully formed time range. That means it will have a start time and a duration. We actually recommend, when you make your own timed metadata groups for appending with AVAssetWriter, that you don't bother giving us a duration.
And to see how that works, here's an example of a group that starts at time 0 but it doesn't have a duration so how do we know when it ends? Well, of course we'll wait until you append the next one and then we'll say that, "Okay, the end time of the first group is the same as the start time of the next one." So this ensures that your metadata track is going to have a continuous stream of contiguous metadata and we think that for most cases this is the best way to store your metadata.
The way you accomplish this is, when you're making your time range, you just use kCMTimeInvalid for your duration and we'll take care of the rest. We do recognize that there are cases where you might not want contiguous metadata; you might want to author an explicit gap into your metadata stream and so, for that, our recommendation is that you give us, in the middle there, a group that contains zero items. This is the best way to author a gap in the metadata. And you can see we do that by just providing an empty array when we're creating our timed metadata group.
Notice that we're still using kCMTimeInvalid for our duration here. Just tell us when the metadata "silence", so to speak, begins, and we'll figure out how long it lasts based on when you append your next non-empty group. So that's how you write gaps in your metadata.
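A sketch of both cases; locationItem, the times, and the adaptor are assumed to come from your own code:

```objc
// A group with no explicit duration: it lasts until the next group is appended.
AVTimedMetadataGroup *group =
    [[AVTimedMetadataGroup alloc] initWithItems:@[locationItem]
                                      timeRange:CMTimeRangeMake(currentTime, kCMTimeInvalid)];
[adaptor appendTimedMetadataGroup:group];

// An empty group marks the start of a deliberate gap in the metadata.
AVTimedMetadataGroup *gap =
    [[AVTimedMetadataGroup alloc] initWithItems:@[]
                                      timeRange:CMTimeRangeMake(gapStartTime, kCMTimeInvalid)];
[adaptor appendTimedMetadataGroup:gap];
```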
For our last best practice, I mentioned output file types before, and here's the longer explanation. AssetWriter and AssetExportSession support writing to a wide variety of file types: you've got QuickTime movie, MPEG-4, and all sorts of other file types, and those file types can carry different kinds of metadata; some have more restrictions than others about what kinds of metadata can go into that file type. So the easiest situation is, say, if you have an ExportSession and your output file type is the same as your source's; in this example they're both QuickTime movie files. This is the easiest way to ensure that all of that data is actually going to make it into the output file.
If instead you're using a different output file type, like MPEG4 in this example, then some different things are going to have to happen. You notice those last few items didn't quite make it into the output file; it's because they have no equivalent representation that works with an MPEG4 file.
If you're looking closely, you'll also notice that those top two items have changed; although they sound very similar, they are slightly different identifiers, because that's the kind of identifier that works with MPEG-4. So both AssetExportSession and AssetWriter will do this sort of three-step process: first, they'll try to pass the data through directly if possible; if not, they'll try to convert the identifier into an equivalent representation in the output file type.
If neither of those works, we have no choice but to drop that piece of metadata on the floor. So in terms of guidance on how to choose an output file type, my two recommendations are: if you're using, say, an ExportSession to copy all of the metadata, timed or otherwise, from the source asset to your destination file, the best approach is to use the same file type you started with, and if you don't know what the file type is, you can use NSURLTypeIdentifierKey to find out.
You can also always use the QuickTime movie file type, because that is going to have the greatest chance of supporting your metadata no matter where it came from. If AVFoundation supports it, there's a good chance it will be supported by the QuickTime movie file format. And if you're writing timed metadata, this is the only way to get it into a file: the QuickTime movie file is the only file format that supports it right now.
Of course, good advice is always to check the results, no matter what. Check that your output files contain all the metadata you expect; you can use some of the APIs we've already talked about if you want to do that at runtime.
Some guidance if that doesn't end up being the case: if you don't get all the metadata that you expect, well, you can try to do the conversion yourself. Especially if you have a custom identifier and are going to a file type that doesn't support your custom identifier, take a look at that long list of built-in identifiers we have and see if there is something that's roughly equivalent to what you're trying to store and you can do that conversion yourself.
One particular example I want to call out that involves only built-in identifiers is when you're trying to go from ID3 to iTunes, well AVFoundation currently isn't going to do that conversion for you. But there's no reason you couldn't do that yourself, so once again just take a look at our long list of identifiers and match them up and do the conversion in your own code.
So that's the end of the talk. Here's what we covered: we obviously talked a lot about metadata in AVFoundation; we talked about all of the different classes you can use for inspection, about AVAsset and AVMetadataItem and how they work together, and also about authoring: AssetWriter, AssetExportSession, and even briefly the capture audio and movie file outputs. We dove into timed metadata, including all the new features that enable things like dynamic location and your own timed metadata, like the annotation demo.
We also talked about privacy considerations and some best practices, like how to choose the right file type. For more information, you can contact our evangelism team or see our programming guide, and there are some other related sessions you might be interested in. If you missed this morning's presentation on "Modern Media Playback", you can catch it on the video recording. Tomorrow there is also a camera capture talk focusing on manual controls. And on Thursday we'll have a talk about direct access to video encoding and decoding, which I'm sure a lot of you will be interested in.