WWDC14 • Session 501

What's New in Core Audio

Media • iOS, OS X • 57:39

See what's new in Core Audio for iOS and OS X. Be introduced to the powerful new APIs for managing audio buffers, files, and data formats. Learn how to incorporate views to facilitate switching between inter-app audio apps on iOS. Take an in depth look at how to tag Audio Units and utilize MIDI over Bluetooth LE.

Speakers: Michael Hopkins, Eric Johnson, Torrey Walker, Doug Wyatt

Unlisted on Apple Developer site

Transcript

This transcript has potential transcription errors. We are working on an improved version.

Good morning everyone. And welcome to Session 501, "What's New with Core Audio". I'm the first emcee on the mic today. My name is Torrey. And we have been very busy. We have a lot of interesting things to share with you today. We're going to start off by talking about some enhancements that we've made to Core MIDI and how that affects you and your apps.

Then we'll move on to Inter-App Audio views, and then we will have a large section on the new and enhanced AV Foundation audio. That will include a talk about the Audio Unit Component Manager, AVAudioSession, some utility classes, and that last bullet point there, AVAudioEngine, which is such a large topic that it gets a session all to itself directly following this one in the same room, starting at 10:15 a.m.

So without further ado, let's talk about what's new with Core MIDI. If you have a studio, a music studio, that you use to make music, it may look something like this. Maybe there's a Mac at the center of it, or an iOS device; they're also very capable of being the center of your studio. And connected to it may be several USB MIDI devices, controllers, breakout boxes that are connected by 5-pin DIN to legacy equipment, musical instruments, and then also you may have a network session going.

Well, beginning in iOS 8 and OS X Yosemite, your studio can start to look like this. Imagine making a very quick Bluetooth connection, sitting down on a couch on the other side of your studio, and controlling all of your music. That's what you'll be able to do with MIDI over Bluetooth. So starting in iOS 8 and OS X Yosemite, you'll be able to send and receive MIDI data over Bluetooth Low Energy connections on any device or Mac that has native Bluetooth Low Energy support.

The connection you establish is secure, meaning that pairing is enforced. No one can connect to your devices without your explicit consent, and after the connection is established, it just appears as an ordinary MIDI device that any application that knows how to communicate with a MIDI device can talk to. So to explain a little more about how this connection works over Bluetooth, I want to talk about the two key roles involved in a Bluetooth connection.

There's the Central and the Peripheral. You already have some familiarity with this. Maybe not with these names. You can view your Central as like your iPhone and your Peripheral as like your Bluetooth earpiece. The Peripheral's job is to become discoverable and say, "Hey, I can do something. You can connect to me." So for Bluetooth MIDI, the peripheral side will advertise the MIDI capabilities. It'll say, "Hey, I can do MIDI. You can connect to me now." And then that side waits. The Central can scan for a device that says they can do MIDI and then establish a connection.

After that Bluetooth connection has been established, MIDI data can be shuttled bi-directionally between both of these. Now in order to have a Bluetooth connection you have to have one Central, and you have to have one Peripheral. And we allow Macs and iOS devices to play either role. So you can connect Mac to Mac, iOS to iOS, Mac to iOS, and vice versa.

So what does this mean for you and your application? If you are writing a Mac OS X application, the good news is you are already ready. This is the MIDI Studio panel from Audio MIDI Setup, which I'm sure you're all familiar with. If you look there you'll see a new icon, the Bluetooth Configuration icon. If you double-click that icon, you are going to get a new window.

And this window will allow you to play either the Central or the Peripheral role. If you look at the top third of the window, you'll see a button that says Advertise. You click Advertise to become discoverable as "Fresh Air." That's a name that you can modify. Fresh Air is the name of my MacBook Air because it's fresh.

Then the bottom two thirds of it is the Central view. If someone is advertising, "Hey, I can do MIDI," it will show up in the bottom, and you click Connect to establish the connection. The pairing will happen, and then a new MIDI device will appear in the setup that any application that uses MIDI devices can see and communicate with.

Now on iOS, there is no Audio MIDI Setup. So how do you manage your Bluetooth MIDI connections? You'll be using new CoreAudioKit view controllers. There are two new CoreAudioKit view controllers that you can add to your application: one that allows you to play the role of the Central, which means you scan and connect, and another that allows you to play the role of the Peripheral, which means you advertise and wait. If you establish a connection between two devices over Bluetooth MIDI and it goes unused by the applications for a while, after several minutes we will terminate the Bluetooth connection to save power.

So what does all of this look like in practice? I'm going to show you a short UI demo of how users would use this in their studios. OK. I've got my demo machine ready here. And what I'm going to do is launch Audio MIDI Setup.

This is the audio window. We'll close this, and we will go to the MIDI window. Now if you'll notice here in the MIDI window there's this new Bluetooth Configuration panel. So if I double click this, then I will see that there are currently no advertising Bluetooth MIDI devices. I want my Mac to play the role of Central. So I'm going to wait for someone to become available to connect to. And I'm going to use my iPad for that.

So here's my iPad. And this is a little test application that we wrote to implement the CoreAudioKit View Controllers that I talked about earlier. I am going to go to Advertisement Setup, and this will give me the Peripheral view. If you look here at the top, you see the name of this iPad Air is iPad Air MIDI.

If I want to change this name I could tap the "i", but I'm OK with that name. And then I will say Advertise MIDI Service. Now after I'm advertising the MIDI service, back on the Mac OS X machine you'll see iPad Air MIDI has shown up here. If I click Connect, after a few moments you'll see a new device appear in the MIDI setup.

I'm going to launch MainStage because MainStage can receive MIDI notes and play back audio. Go into Performance mode [music playing]. OK. So a big confession here, I don't play keys. But I do have an application that plays keys really well called Arpeggionome Pro. So I'm going to launch that, and I'm going to use it to send MIDI data over to MainStage 3. OK. Now one thing I want to do really quickly here is check my connection status because I left it inactive for quite a while. So I'm going to go back and make myself advertise one more time.

[ Music Playing ]

So now this is live MIDI data being sent over Bluetooth. If I could get that volume a little louder please. Thank you. So if I wanted to do this preset, it's called Epic Fall. And it is epic. So that's MIDI being sent over Bluetooth. And this sends not only controller data, but also any SysEx data you may have, or any other type of MIDI. A few final words before I turn the mic over.

These Bluetooth MIDI connections will work on both OS X Yosemite and iOS 8 using those view controllers that I told you about. And they will work on any Mac, iPhone, or iPad that has native Bluetooth Low Energy support. So now I'm going to tell you which ones those are. For Macs, any Mac that was manufactured in 2012 or later, and a mixed bag of Macs that were released in 2011, have native Bluetooth Low Energy support.

For the iPhone, the iPhone 4S and later have Bluetooth LE. For the iPad, the iPad with Retina display and later, and all iPad minis, have native Bluetooth Low Energy support. So this will work on all of those systems. Also, the connection is really low latency and very responsive. And the Bluetooth LE bandwidth greatly exceeds MIDI's minimum bandwidth requirement of 3,125 bytes per second.

Standardization is in the works. We're working with standards bodies to standardize this so more people can get in on it. And the key takeaway for you is: if you're making iOS applications, please start adding these Bluetooth MIDI view controllers to your applications immediately so that users can manage Bluetooth MIDI connections using your app. And the person who is going to show you how to do that is my colleague and homeboy Michael Hopkins. I'll turn the mic over to him.

Thank you very much, Torrey. I'd like to talk to you this morning about a new framework for iOS called CoreAudioKit. This framework provides standardized user interface elements for you to add to your application to do things like show the MIDI over Bluetooth LE UI that Torrey just demonstrated as well as some new views for those of you that are doing Inter-App Audio.

We've designed these so that we do all the heavy lifting so that you don't have to worry about rolling your own UI, and you can just concentrate on what makes your app unique. Therefore, they are very easy to adopt with a minimal amount of source code, and they work on both iPhone and iPad.

Looking specifically at these interface elements for MIDI over Bluetooth LE, as Torrey showed you, we have separated these into two different view controllers so that you can choose which one is appropriate for your own application, or you can use both. For example, if you use a UISplitViewController you can have both visible at the same time. The first one is the CABTMIDILocalPeripheralViewController.

That's quite a mouthful this early in the morning. If you want to advertise your iOS device as a Peripheral, you use this class. The source code for adopting this is very straightforward. You create a new instance of that view controller, get the navigation controller object for your app and push that view controller onto the stack.
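
Here's a minimal Objective-C sketch of that, assuming a UIKit app with a UINavigationController; the class and action names other than the CoreAudioKit ones are illustrative, not from the session.

```objc
#import <UIKit/UIKit.h>
#import <CoreAudioKit/CoreAudioKit.h>

@interface MyMIDISettingsViewController : UIViewController
@end

@implementation MyMIDISettingsViewController

// Advertise this iOS device as a Bluetooth MIDI peripheral.
- (IBAction)showPeripheralSetup:(id)sender {
    CABTMIDILocalPeripheralViewController *peripheralVC =
        [[CABTMIDILocalPeripheralViewController alloc] init];
    [self.navigationController pushViewController:peripheralVC animated:YES];
    // CABTMIDICentralViewController (for scanning and connecting) is
    // created and pushed in exactly the same way.
}

@end
```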

The CABTMIDICentralViewController is required if you want to discover and connect to Bluetooth Peripherals. And you use that in the same way: you create the view controller and push it onto your view controller stack. Now I'd like to switch over and talk about Inter-App Audio. For those of you that weren't present last year at WWDC, we had a session talking about this new technology that we released with iOS 7. In review, Inter-App Audio allows you to stream audio between one or more apps in real time. A host application can discover available node apps even if they are not running. And please refer to last year's session, "What's New in Core Audio," Session 206, for further details.

But looking at how this works with the Host App and a connected Node App, the Node App can be an instrument, an effect, or a generator. And the Host App and Node App can send audio back and forth. In the case of an instrument, the Host App can also send MIDI to that instrument app and receive audio back.

The two user interface elements that we provide in iOS 8 are, firstly, the Inter-App Audio Switcher View, which provides an easy way to see all the Inter-App Audio apps that are connected together and switch between them using a simple tap gesture. We also provide an Inter-App Audio Host Transport View. This displays the transport of the host you're connected to in your Node App and allows you to control the transport's playback, rewind, and record, in addition to displaying where you are in the Host's transport via that numeric time code.

And I'd like to show a demo of this in action. I have 3 different applications here that we'll be using together in our Inter-App Audio Scenario. The first of which is GarageBand, which is the current version of that application that I've downloaded from the iTunes store. I also have a Delay application and a Sampler. Let's take a look at the Sampler first. This allows me to trigger sample playback via the keyboard.

[ Music ]

So now let's go ahead and connect this to GarageBand. I'm going to launch GarageBand. I'm going to connect to that Sampler app, and now this is connected to GarageBand. So the first thing I'd like to demonstrate is the Inter-App Audio Switcher View in action, which this application has made visible via a button. I press that, and you can see now that we have two nodes shown: the Host, as well as our current application. And I can switch over to GarageBand simply by tapping.

I'm going to add in an additional Inter-App Audio app, the Delay effect. And now if I were to switch over to this application without using the Switcher View, I could double-tap on my Home key, look, and try to find that application. Where is it? It's difficult to find. And that's why we've provided the Inter-App Audio Switcher View. In this application, the Delay, you can see it in the lower right-hand corner. And now it is showing our Host, the Sampler, as well as our current effect.

So it's very easy to switch back and forth. And you can see that it just showed up there. So that's the first view I'd like to demonstrate. And if I play back on my keyboard, we can hear that we're now getting that Delay. And this is interesting because we're sending audio from our Sampler through the Delay effect and then back to the host. Now the second view, the Transport View, you'll see just above that view, let me hide that for you, and that allows me to control the transport of the Host [music playing]. I can do recording.

[ Music Playing ]

Sorry. I'm no Dr. Dre. It's too early in the morning for that, but you get the idea. And these are the views that we're providing for your benefit. So please adopt them to add this functionality to your application. OK. So the goal of these user interface elements is to provide a consistent experience for your customers. You do have some flexibility in controlling the visual appearance of those controls.

They support a number of different sizes. So if you want a ginormous UI you can have that, or if you want them very small you can do that. The source code, as I'm going to show you, is very easy to add to your application. And because these are subclasses of UIView, you can choose to create a view controller if you want to add them to a popover on your iPad, or, as the example demonstrated, you can embed them directly in the content of your app as well. Let's take a look at the code.

We import the umbrella header. In this case, I'm demonstrating how to add the Switcher View from a nib file. So you go into Interface Builder, drag out your UIView, assign its class to be CAInterAppAudioSwitcherView, create an outlet for that view, and then in the viewDidLoad method we specify the visual appearance of that view. And then we need to associate an audio unit with that view so that it can automatically find the other apps that are connected.

And that's all there is to it. And creating the Transport View programmatically, as this example shows: we create the view, specify the initial size and location of that view, configure its visual appearance, associate an output audio unit with the view, and then finally add that transport view as a subview of our main content.
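
A rough sketch of both snippets just described, assuming `outputAudioUnit` is the app's I/O audio unit already set up for Inter-App Audio and `switcherView` is the nib outlet whose class was set to CAInterAppAudioSwitcherView in Interface Builder; the appearance property shown is an assumption about CoreAudioKit's styling API, and the view controller class is illustrative.

```objc
#import <UIKit/UIKit.h>
#import <AudioUnit/AudioUnit.h>
#import <CoreAudioKit/CoreAudioKit.h>

@interface MyNodeViewController : UIViewController
@property (nonatomic, weak) IBOutlet CAInterAppAudioSwitcherView *switcherView;
@property (nonatomic, assign) AudioUnit outputAudioUnit;
@end

@implementation MyNodeViewController

- (void)viewDidLoad {
    [super viewDidLoad];

    // Switcher view loaded from the nib: hand it our output audio unit so it
    // can discover the other connected Inter-App Audio apps.
    [self.switcherView setOutputAudioUnit:self.outputAudioUnit];

    // Transport view created programmatically, sized, styled, and added as a
    // subview of the main content.
    CAInterAppAudioTransportView *transportView =
        [[CAInterAppAudioTransportView alloc]
            initWithFrame:CGRectMake(20.0, 20.0, 200.0, 40.0)];
    transportView.playButtonColor = [UIColor whiteColor]; // assumed appearance property
    [transportView setOutputAudioUnit:self.outputAudioUnit];
    [self.view addSubview:transportView];
}

@end
```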

OK. Now I'd like to switch gears a little bit back to AV Foundation. The rest of the presenters, including myself, will be focusing on this framework and some of the new enhancements and capabilities that we've added for you to use in your application. The first new feature is for Audio Unit management. And that's the AVAudioUnitComponentManager.

This is an OS X Yosemite API, and it's Objective-C based. It's primarily designed for Audio Unit host applications. However, as you'll see, we do have some end-user features as well. We provide a number of different querying methods, which enable your host to find the Audio Units on the system given some criteria, for example, the number of supported channels.

We have a simple API for getting information about each individual Audio Unit. We have some new tagging facilities that I'll demonstrate in a moment. And finally we have a centralized Audio Unit cache so that if you have multiple host applications on your system, once one host has scanned Audio Units, and for a lot of people they have a large number of them so this can take quite some time, all the other hosts on the system share that information so they don't have to perform that exhaustive scan again.

Let's take a look at the API in more detail. As I said, these are in AV Foundation, and they're new. The first class is the AVAudioUnitComponentManager. And this provides three different search mechanisms for finding Audio Units. The first is based on NSPredicate. You can use a SQL-like query string, which I'll show you in a source code example later, for finding audio units matching the given criteria.

We also have a block-based mechanism for finer programmatic control. And for those of you with older host apps using our current audio component API, we have a backwards-compatible mode as well. Each of these search methods returns an NSArray of AVAudioUnitComponents. And that class can be used to get information about the audio unit.

Now using our prior API, if I wanted to do something like find all stereo effects that support two-channel input and two-channel output, I'd have to write a great deal of code. That's OK. But now with this new API we can reduce all that to four simple, elegant lines of code. The first of which is retrieving an instance of the shared Audio Unit Component Manager.

And here I'm using the block-based search mechanism to find all components that pass a specific test. And in this block I'm checking to see if the type name of that audio unit is equal to the preset string AVAudioUnitTypeEffect. And then furthermore we're checking to see if that Audio Unit supports stereo input and output. You'll notice there is a stop parameter, so if you wanted to return only the first audio component matching this criteria, you could set stop and return YES, and the search would stop immediately.
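
A minimal sketch of that block-based search in Objective-C (OS X Yosemite); the wrapping function name is illustrative.

```objc
#import <AVFoundation/AVFoundation.h>

static NSArray *StereoEffectComponents(void) {
    AVAudioUnitComponentManager *manager =
        [AVAudioUnitComponentManager sharedAudioUnitComponentManager];

    // Find every effect Audio Unit that supports stereo input and output.
    // To stop after the first match, set *stop = YES before returning YES.
    return [manager componentsPassingTest:
        ^BOOL(AVAudioUnitComponent *component, BOOL *stop) {
            return [component.typeName isEqualToString:AVAudioUnitTypeEffect] &&
                   [component supportsNumberInputChannels:2 outputChannels:2];
    }];
}
```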

OK. Now I'd like to move on to talk about tagging. A lot of people, especially Dr. Dre in his studio, have a large number of Audio Units. So finding the right one can be a bit challenging because they're sorted alphabetically or by manufacturer. And there's now a much easier way for users to find these Audio Units: tagging. It's very similar to what we did with the Finder in the previous OS X release. Users can now associate their own tags with an audio unit in order to create broad categories or even specific categories of how they want to organize their audio units.

They can apply one or more tags in two different categories. The first of which is a system tag. This is defined by the creator of the audio unit. And, for example, in Mavericks, excuse me, in Yosemite, I have to get that name in my head, I personally liked Weed, but I didn't get to vote.

The system tags are defined by the creator. And we at Apple have added standard tags to all the Audio Units that we feel would be useful to most of our users. You can also have user tags. These are specified by each individual user on the system. So if you have three users they can each have their own set of tags.

A tag is a localized string in the user's own language. Swedish, Swahili, it doesn't matter. They can be arbitrary, or they can be a pre-defined type. And these are all in AudioComponent.h. They can be either based on the type of Audio Unit, for example a filter or a distortion effect, or they can be based on the intended usage, for example an audio unit useful in a guitar or vocal track.

Now I'd like to show a demo of tagging in action using a modified version of AU Lab. So in AU Lab we can look at all the tags associated with all the built-in audio units. And here you see that, for example, the AUTimePitch has two standard tags associated with it, Time Effect and Pitch. And those are defined by us. In addition you can see that this distortion effect has two user tags, one specifying that it's useful for a drum track and another one for a guitar track.

The API also provides developers the ability to get a list of all the system-defined tags localized in the language of the running host, as you can see here. And I can also see all of the user tags that the user has assigned to all the Audio Units on this system. Adding a tag is as simple as typing a new one.

Now that's been added to that Audio Unit. And I can do a search using the predicate-based and other search mechanisms. And it will search all the audio units looking for that particular tag. So this is something that is really exciting, and we hope that you'll use this API to add tagging functionality to your own host application. Let's take a look now at the API.

To find an Audio Unit with a specific tag, in this example I'm going to use the NSPredicate filtering mechanism. Here I'm defining a predicate. It says that the component's allTagNames property has to contain a particular string, in this case "My Favorite Tag," and this is identical to the search you just saw in my demo. Once you've defined the predicate, you get an instance of the shared AU Manager and then call componentsMatchingPredicate, which returns an array.
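
A sketch of that predicate-based tag search; the tag string and the wrapping function are illustrative.

```objc
#import <AVFoundation/AVFoundation.h>

static NSArray *ComponentsWithFavoriteTag(void) {
    // Match components whose allTagNames array contains the given tag.
    NSPredicate *predicate =
        [NSPredicate predicateWithFormat:@"allTagNames CONTAINS %@",
                                         @"My Favorite Tag"];
    AVAudioUnitComponentManager *manager =
        [AVAudioUnitComponentManager sharedAudioUnitComponentManager];
    return [manager componentsMatchingPredicate:predicate];
}
```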

To get a list of the tags associated with a particular AVAudioUnitComponent, you use the userTagNames property. You can assign to that as well, and in this example I'm adding two tags to the audio unit. You can get all tags for a specific component, including the user tags as well as the system tags, via the allTagNames property.

We can get a localized list of all the standard system tags by getting the Component Manager and then calling the standardLocalizedTagNames property. This is what I was displaying in the pop up in my demo. And finally I can get a list of all the localized tags that this user has assigned across all the audio units on the system. And that, again, you saw in my demo.
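
A sketch of the tag accessors just described; `component` is assumed to be an AVAudioUnitComponent returned by one of the search methods, and the tag strings are illustrative.

```objc
#import <AVFoundation/AVFoundation.h>

static void InspectTags(AVAudioUnitComponent *component) {
    // Tags this user has assigned to this component (readable and writable).
    NSArray *userTags = component.userTagNames;
    component.userTagNames =
        [userTags arrayByAddingObjectsFromArray:@[@"Drums", @"Guitar"]];

    // All tags for the component: system tags plus this user's tags.
    NSArray *allTags = component.allTagNames;

    AVAudioUnitComponentManager *manager =
        [AVAudioUnitComponentManager sharedAudioUnitComponentManager];

    // All standard system tags, localized for the running host.
    NSArray *standardTags = manager.standardLocalizedTagNames;

    // Every localized tag this user has assigned across all audio units.
    NSArray *everyUserTag = manager.tagNames;

    NSLog(@"%@ %@ %@ %@", userTags, allTags, standardTags, everyUserTag);
}
```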

For those of you that ship Audio Units and want to add your own built-in tags to those Audio Units, you need to go into your AudioComponent bundle. And in your Info.plist, look at your Audio Component dictionary and add a tags section. The first two items are examples of using standard tags, and the third item is a custom tag.

So you can have that be something meaningful to your own company, for example, if you have like the Silver Effect Package, you could add that tag. If you do so, you can also localize that tag by adding an AudioUnitsTag.strings file into your bundle and then adding localizations for each language that you wish to support. And please do not localize any of our standard system tags. We've already done so for you.

So, in summary, if you're a host developer please adopt the AVAudioUnitComponentManager API so your users can tag all their Audio Units. And if you're an Audio Unit developer, please add system tags to your audio units. So without further ado I'd like to turn this session over to Eric Johnson. He'll be discussing tips and tricks and new functionality in AVAudioSession. Eric?

[ Applause ]

Good morning. So I'll be continuing on with the AV Foundation framework. This time we're on iOS only with AVAudioSession. So today we're just going to spend a few minutes talking about some best practices focusing on managing your session's activation state, and then also talking about just a little bit of new things in iOS 8.

Before we dive in I wanted to call your attention to an updated Audio Session Programming Guide that's available on developer.apple.com. Since we saw you all last year at this time, this guide has been rewritten in terms of AVAudioSession, so it's no longer referring to the deprecated C API. That's a really great update. And for those of you who are maybe not that familiar with Audio Session, there was a talk from two years ago where Torrey talked about Audio Session and also Multi-Route Audio in iOS.

All right. So let's dive into talking about managing your session's activation state. So there's your application state. And then there's Audio Session state. And they're separate things. They're managed independently of each other. So if you've been doing development for iOS you are probably familiar with app states. So this is whether your app is running or not, whether it's in the foreground or the background, if it's been suspended. Your Audio Session activation state is binary. It's either inactive or active.

Once you've made your session active you do need to be prepared to handle interruptions, and we'll talk about what that means. So let's look at an example of how an Audio Session state changes over time. So here we're on an iPhone. We have our application on top, our Audio Session.

Let's say that we're developing a game app. And then on the bottom we have the phone's Audio Session. And right now the user is not in a phone call, and they haven't launched their app yet, so both sessions are idle, inactive. So now the user launches our app.

When we first come into the foreground our Audio Session is still inactive. And because we're a game app, we want to make our session active when we're in the foreground so that we can be ready to play audio. So we'll do that. And we're going to just play some music, so we're now happily playing music in the foreground with an active Audio Session.

So then the phone starts ringing. We get interrupted: the system sends us an interruption event. The phone's Audio Session becomes active and plays the ringtone. And the user decides to accept the call. So the phone's Audio Session stays active, and our Audio Session has been interrupted, so we're inactive.

And then the user ends the call, hangs up, says goodbye, and now the system is going to deliver an end-interruption event to our Audio Session. And we're going to use that as a signal to make our session active again and resume playback. And we continue in this state. So this is a typical example of how something like a game application interacts with the Phone app's Audio Session on an iPhone.

So the way that you need to manage your application's Audio Session state is actually going to depend on how you use audio. We've identified a number of different types of applications that commonly use audio on iOS. And we don't have time to talk about all of these this morning, and you'd probably be bored to death if we did. So we're just going to talk about a few of these. So let's continue on with the idea that we're developing a game app. So for game apps usually what we recommend is that when you're in the foreground, you'll want to have your Audio Session active.

So a good place to make your Audio Session active is in the app delegate's applicationDidBecomeActive method. That will cover the case when you're being launched, if you're coming from the background into the foreground, or if you are already in the foreground and the user had swiped up Control Center and then dismissed it; you'll be covered in each of those cases.

So once you've made your session active you can leave it active, but you do need to be prepared to deal with interruptions. So if you get a begin interruption event, you should update your internal state so that you know that you're paused. And then if you get an end interruption event, that's your opportunity to make your session active and to resume audio playback. And this is just like the example that we looked at just a few minutes ago.
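
A minimal sketch of that pattern for a game-style app, assuming the app delegate handles both activation and interruptions; the class name and the `paused` flag are illustrative, not from the session.

```objc
#import <UIKit/UIKit.h>
#import <AVFoundation/AVFoundation.h>

@interface GameAppDelegate : UIResponder <UIApplicationDelegate>
@property (nonatomic) BOOL paused;
@end

@implementation GameAppDelegate

- (void)applicationDidBecomeActive:(UIApplication *)application {
    AVAudioSession *session = [AVAudioSession sharedInstance];

    // Observe interruptions; remove first so repeated foregrounding does not
    // register duplicate observers.
    [[NSNotificationCenter defaultCenter] removeObserver:self
                                                    name:AVAudioSessionInterruptionNotification
                                                  object:session];
    [[NSNotificationCenter defaultCenter] addObserver:self
                                             selector:@selector(handleInterruption:)
                                                 name:AVAudioSessionInterruptionNotification
                                               object:session];

    // Covers launch, returning from the background, and dismissing Control Center.
    [session setCategory:AVAudioSessionCategoryAmbient error:nil];
    [session setActive:YES error:nil];
}

- (void)handleInterruption:(NSNotification *)notification {
    AVAudioSessionInterruptionType type =
        [notification.userInfo[AVAudioSessionInterruptionTypeKey] unsignedIntegerValue];
    if (type == AVAudioSessionInterruptionTypeBegan) {
        self.paused = YES;   // remember that our audio stopped
    } else {
        // End of interruption: reactivate and resume game audio.
        [[AVAudioSession sharedInstance] setActive:YES error:nil];
        self.paused = NO;
    }
}

@end
```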

Media playback apps need to manage their Audio Session state a little bit differently. So I'm talking about applications like the built-in music app or podcast or streaming radio. And these are the types of applications that we usually have a play/pause button, and they're what we refer to as non-mixable meaning that they'll interrupt the audio of other non-mixable Audio Sessions.

So for these types of applications we recommend that instead of making your session active immediately when you enter the foreground, you wait until the user presses the Play button. And the reason we give you that advice is sometimes the user brings your app into the foreground just to see if they have a particular podcast episode downloaded, or to see if they have a song in their library, and they don't necessarily want to interrupt other audio that was playing. So it's good to wait until they press Play.

So like in the case of a game app once you've made your session active you can leave it active. But, again, you need to be prepared to handle interruptions. So if you get a begin interruption event, you should update your UI. So if you have a play/pause button it's a good time to change that state and also keep track of your internal states so that you know that you're paused. One thing you do not need to do is you do not need to make your session inactive because the system has already done that for you. That's what the interruption is.

So then if you get an end interruption event, we ask that you honor the ShouldResume option. So if this option is part of the info dictionary that's part of that notification, that's the system giving you a hint that it's OK to make your session active and to immediately resume playback. If you don't see that option as part of the notification, then you should wait for the user to press play again.
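
A sketch of an interruption handler for a playback-style app that honors the ShouldResume option as described above; the controller class and its helper methods are illustrative.

```objc
#import <AVFoundation/AVFoundation.h>

@interface PlaybackController : NSObject
@end

@implementation PlaybackController

- (void)pausePlaybackAndUpdateUI { /* illustrative stub */ }
- (void)resumePlayback           { /* illustrative stub */ }

- (void)handleInterruption:(NSNotification *)notification {
    NSDictionary *info = notification.userInfo;
    AVAudioSessionInterruptionType type =
        [info[AVAudioSessionInterruptionTypeKey] unsignedIntegerValue];

    if (type == AVAudioSessionInterruptionTypeBegan) {
        // The system has already deactivated our session; just update state and UI.
        [self pausePlaybackAndUpdateUI];
    } else {
        AVAudioSessionInterruptionOptions options =
            [info[AVAudioSessionInterruptionOptionKey] unsignedIntegerValue];
        if (options & AVAudioSessionInterruptionOptionShouldResume) {
            [[AVAudioSession sharedInstance] setActive:YES error:nil];
            [self resumePlayback];
        }
        // Otherwise, wait for the user to press Play again.
    }
}

@end
```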

OK. So we talked about when you would make your session active for game apps and media playback apps. What about making your session inactive? If you are something like a navigation or a fitness app, you're typically going to be playing short prompts of audio. And you're going to be using the DuckOthers category option, which will lower the volume of other audio applications on the system. So it's important when you're done playing your short prompts that you make your session inactive so that the other audio is able to resume at full volume.

If you're a Voice over IP app or a chat app, or maybe one of these apps that has a kind of browser view where you're playing short videos, then you are usually going to be what we refer to as non-mixable, meaning that you're going to interrupt other audio.

And so it's important that when you're done playing audio that you make your session inactive so that other sessions are able to resume. And it's a good idea to use the NotifyOthersOnDeactivation option when you make your session inactive. And that way the system is able to tell an interrupted Audio Session that it's OK for them to resume. All right.
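
A sketch of that deactivation step; the function name is illustrative, and it would be called when the short prompt or call finishes.

```objc
#import <AVFoundation/AVFoundation.h>

// Deactivate our session and let interrupted apps know they may resume.
static void DeactivateSessionAndNotifyOthers(void) {
    NSError *error = nil;
    BOOL ok = [[AVAudioSession sharedInstance]
                    setActive:NO
                  withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation
                        error:&error];
    if (!ok) {
        NSLog(@"Failed to deactivate audio session: %@", error);
    }
}
```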

So now let's shift gears a little bit and talk about managing your secondary audio in response to other audio on the system playing. So first let me explain what I mean by secondary audio and primary audio. So let's say we're developing a game application. Our primary audio is going to be our sound effects, our explosions, beeps and bloops, short bits of dialog. And it's the kind of audio that really enhances the gameplay.

And it's also the kind of audio that, if the user was listening to music when they launched your app, you still want it to play. And it's OK that it mixes in with the other music. By secondary audio, I am really talking about your soundtrack. This is audio that also enhances the gameplay, but if the user was previously listening to their music or their podcast, you'd just as soon have your soundtrack be muted. And then if the user stops their music or podcast playback, you'd like to have your soundtrack resume.

So in iOS 8 we've added a bit of new API to help you do this. We've added a new property and a new notification. The property is called secondaryAudioShouldBeSilencedHint. As the name implies, it's a hint that the system is giving you that it's a good time to mute your secondary audio. So this is meant to be used by apps that are in the foreground. And we recommend that you check this property in applicationDidBecomeActive.

Going along with the new property is our new notification. This is the AVAudioSessionSilenceSecondaryAudioHintNotification. Another mouthful for this early in the morning. So this notification will be delivered to apps that are in the foreground with active Audio Sessions. And it's kind of similar to our interruption notification in that it's two-sided.

There's a begin event and an end event all wrapped up in the same notification. So when you get a begin SilenceSecondaryAudioHint notification, that means it's a good time to mute your secondary audio. And if you get the end event, it's a good time to resume your soundtrack. So let's look at what this looks like in action.
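
A sketch of the new hint property and notification; the controller class and the soundtrack methods are illustrative, not API.

```objc
#import <AVFoundation/AVFoundation.h>

@interface SoundtrackController : NSObject
@end

@implementation SoundtrackController

- (void)muteSoundtrack   { /* illustrative stub */ }
- (void)resumeSoundtrack { /* illustrative stub */ }

// Call when coming to the foreground (e.g. from applicationDidBecomeActive).
- (void)checkSecondaryAudioHint {
    AVAudioSession *session = [AVAudioSession sharedInstance];
    if (session.secondaryAudioShouldBeSilencedHint) {
        [self muteSoundtrack];
    }
    [[NSNotificationCenter defaultCenter]
        addObserver:self
           selector:@selector(handleSecondaryAudioHint:)
               name:AVAudioSessionSilenceSecondaryAudioHintNotification
             object:session];
}

- (void)handleSecondaryAudioHint:(NSNotification *)notification {
    AVAudioSessionSilenceSecondaryAudioHintType type =
        [notification.userInfo[AVAudioSessionSilenceSecondaryAudioHintTypeKey]
            unsignedIntegerValue];
    if (type == AVAudioSessionSilenceSecondaryAudioHintTypeBegin) {
        [self muteSoundtrack];     // another app started playing primary audio
    } else {
        [self resumeSoundtrack];   // other audio stopped; bring the soundtrack back
    }
}

@end
```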

So on the far right we have the built-in music app, and it's currently in the background. It's not playing audio. On the far left we have our game app that we're developing. So we're playing our primary audio, the sound effects, and we're also playing our soundtrack because there was no other music playing. And in the middle we have iOS helping to negotiate things.

So the user has his headphones plugged in, and he presses that middle button. And the music app responds to remote control events. So it uses that as a signal to begin playback. The music app also informs iOS that it's using its audio output. And so then the system is able to send a begin notification to our app that's in the foreground. And in response to that we can mute our soundtrack. So our app is still in the foreground. The only thing that's really changed is that the user used their middle button to play their music. And we've responded to the notification that we got from the system.

So now some time passes. We're in this state for a while, and the user presses the middle button again. So the music app responds by pausing its playback and telling the system that it's pausing its audio output. And then the system is able to send the end notification to our app that's still in the foreground. And in response to that, we resume our soundtrack. So hopefully this will be pretty easy to use. There's one new property and then the two-sided notification that you can use to manage your soundtrack.

So kind of on a similar thread, in the past we've given advice about how you could manage your secondary audio based on the isOtherAudioPlaying property. And we had given advice about choosing between the ambient category or solo ambient based on the state of this property. What we're recommending now is that if you're this type of application, that you just use the ambient category and then use the new property and the new notification to manage your soundtrack. All right. I'm going to hand things over to Doug Wyatt. He's going to tell us about some new utility classes in AV Foundation.

Thank you. Good morning. I'm Doug Wyatt. I'm an engineer in the Core Audio group, and I'd like to talk to you about some new audio classes in the AV Foundation framework. We'll start out with some background about what we're up to and why. Then we'll look through these classes one by one. And I'll tie things up at the end with an example.

So in the past our CoreAudio and AudioToolbox APIs, they're very powerful, but they're not always easy for developers to get their hands around at the beginning. We've tried to work around this by providing some C++ utility classes in our SDK, and that's helped to some extent, but example code gets copied around. It evolves over time. And we think it's best in the long run if we sort of solidify these things in the form of API, and that's what we're providing now with these classes in the AV Foundation framework starting with Mac OS X 10.10 and iOS 8.

So our goals here, we don't want to make a complete break with the past. We want to build on what we've already got. So we're going to, in many cases, wrap our existing low-level C structures inside Objective-C objects. And in doing so, these lower level C structures become easier to build. But we can also extract them from our Objective-C objects and pass them to the low-level APIs we might already be using.

And this is a philosophy we used also with AVAudioEngine, which we'll be examining in more detail in the next session here. And I should also mention an overriding goal here is for us to stay real-time safe, which isn't always easy with Objective-C. We can't call methods or access properties on the audio rendering thread. So we've taken great care in our implementations, and as we go I'll give you a couple of examples of places where you need to be aware of real-time issues when you're using these classes.

OK. So here are the classes we'll be looking at today in this session. At the bottom in green we've got AVAudioFormat, which has an AVAudioChannelLayout. In blue we have AVAudioPCMBuffer, which has an audio format. Every buffer has a format describing it. And finally we'll be talking about AVAudioFile, which uses AVAudioPCMBuffer for I/O and as you would expect the file also has format objects describing the file's data format.

So first let's look at AVAudioFormat. This class describes the format of data you might find in an audio file or stream, and also the format of the audio you might be passing to and from APIs. So our low-level structure here for describing an audio format is an AudioStreamBasicDescription, which in retrospect might have been called "audio stream not so basic" or "audio stream complete description" because there are a lot of fields there, and it can be a little challenging to get them all set up consistently, especially for PCM formats.

But, you know, the beauty of this structure is that it describes just about everything we would want to use to describe an audio format. But, again, it's a little challenging. But, in any case, you can always create an AVAudioFormat from an AudioStreamBasicDescription, which you might have obtained from a low-level API. And you can always access a stream description from an AVAudioFormat.

But now we can move on to other ways to interact with AVAudioFormat. So in the past we've had this concept of canonical formats. And this concept goes back to about 2002, Mac OS X 10.0 or 10.1 or so. That format was floating-point, 32-bit, de-interleaved. But then we came along to iOS, and we couldn't really recommend using float everywhere because we didn't have the greatest floating-point performance initially. So for a while canonical was 8.24 fixed-point.

But because of that schism we want to reunite under something new now. We've deprecated the concept of canonical formats. Now we have what we call a standard format. We're back to non-interleaved 32-bit floats on both platforms. So the simplest way to construct an AVAudioFormat now: you can create a standard format by specifying just a sample rate and a channel count. You can also query any AVAudioFormat you might come across and find out if it is a standard format using the standard property.

We've also provided for using common formats with AVAudioFormat. And we define common formats as formats you would often use in signal processing, such as 16-bit integers if you've been using those on iOS or other platforms. We also provide for 64-bit floats. And it's very easy to create an AVAudioFormat in one of these formats by specifying which one you want, the sample rate, the channel count, and whether it's interleaved. You can query any format to see if it is some common format or something else using the commonFormat property.
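
A sketch of those format constructors and queries; the wrapping function and its ASBD argument (standing in for one obtained from a lower-level API) are illustrative.

```objc
#import <AVFoundation/AVFoundation.h>

static void FormatExamples(const AudioStreamBasicDescription *asbd) {
    // Standard format: deinterleaved 32-bit float, 44.1 kHz, stereo.
    AVAudioFormat *standardFormat =
        [[AVAudioFormat alloc] initStandardFormatWithSampleRate:44100.0
                                                       channels:2];

    // Common format: interleaved 16-bit integer, 48 kHz, stereo.
    AVAudioFormat *int16Format =
        [[AVAudioFormat alloc] initWithCommonFormat:AVAudioPCMFormatInt16
                                         sampleRate:48000.0
                                           channels:2
                                        interleaved:YES];

    // Wrapping a low-level AudioStreamBasicDescription, and getting the
    // stream description back out for use with C APIs.
    AVAudioFormat *wrapped =
        [[AVAudioFormat alloc] initWithStreamDescription:asbd];
    const AudioStreamBasicDescription *desc = standardFormat.streamDescription;

    NSLog(@"standard? %d, common format: %ld, wrapped: %@, rate: %f",
          int16Format.standard, (long)int16Format.commonFormat,
          wrapped, desc->mSampleRate);
}
```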

OK. So that's AVAudioFormat. Let's look at AVAudioChannelLayout. Briefly here, this describes the ordering or the roles of multiple channels, which is especially important, for example, in surround sound. You might have left, right, center, or you might have left, center, right, and so on. It's important to know the actual order of the channels.

So every AVAudioFormat may have an AVAudioChannelLayout. And, in fact, when constructing the AVAudioFormat, if you are describing three or more channels you have to tell us what the layout is. So it becomes unambiguous to any place else in the system that sees that AVAudioFormat what the order of the channels is. The underlying AudioChannelLayout structure is pretty much exposed the way it is here; you can go look at that in CoreAudioTypes.h, but we have wrapped that up in AVAudioChannelLayout for you. OK. Moving on, let's look at AVAudioPCMBuffer.

So buffer can be a sort of funny term when we're dealing with de-interleaved audio because of the AudioBufferList structure, but that aside, you can think of it simply as memory for storing your audio data in any format, including non-interleaved formats. And here are the low-level structures, which in particular can be a bit of a bother to deal with because AudioBufferList is variable-length. So you can simply create an AVAudioPCMBuffer; it'll create an AudioBufferList for you of the right size. And you can always fetch it back out of the AVAudioPCMBuffer.

Here's the initializer. So to create a buffer you specify the format and a capacity in audio sample frames. You can always fetch back the buffer's format and the capacity with which it was constructed. And unlike AudioBufferList, which has a simple byte size for every buffer, here we've separated the concepts of capacity and length. So there's the fixed capacity it was created with and the frame length, which expresses the number of currently valid frames in the buffer.

Some more methods here. To get to the underlying samples we provide these simple type-safe accessors. And this is a good time to say a word about real-time safety, because these are properties. And as useful as they may be for actually getting to the data, since they're properties they may involve a method lookup, which can, in principle, take a miss on the lookup and cause you to block. So if you're going to be using AVAudioPCMBuffers on audio real-time threads, it's best to cache these members in some safe context when you're first looking at the buffer and use those cached members on the real-time thread.
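
A sketch of creating a buffer and caching its channel pointers outside the real-time thread, per the advice above; the function name is illustrative.

```objc
#import <AVFoundation/AVFoundation.h>
#include <string.h>

static AVAudioPCMBuffer *MakeSilentBuffer(void) {
    AVAudioFormat *format =
        [[AVAudioFormat alloc] initStandardFormatWithSampleRate:44100.0
                                                       channels:2];
    AVAudioPCMBuffer *buffer =
        [[AVAudioPCMBuffer alloc] initWithPCMFormat:format frameCapacity:1024];
    buffer.frameLength = 1024;   // number of currently valid frames

    // Cache these in a non-real-time context; on the render thread, use only
    // the cached values rather than the Objective-C properties.
    float * const *channels = buffer.floatChannelData;
    AVAudioChannelCount channelCount = buffer.format.channelCount;
    AVAudioFrameCount frames = buffer.frameLength;

    for (AVAudioChannelCount ch = 0; ch < channelCount; ++ch) {
        memset(channels[ch], 0, sizeof(float) * frames);
    }
    return buffer;
}
```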

OK. That's the PCM buffer. Now we can look at AVAudioFile, which ties all these other classes together. So here we let you read and write files of any Core Audio supported format. This ranges from .m4a, .mp4, .wav, .caf, .aiff, and more I can't think of right now. So in accessing the file, we give you a single way to read and write the file completely independent of the file's actual data format.

So if it's an encoded format like AAC or Apple Lossless or MP3, if there's a codec on the system, and in most cases there is, we will, transparently to you, decode from that format as you read the file. Similarly when you're writing an audio file we will encode from PCM into that encoded format if we have an encoder.

So to do this, the file has this concept of the processing format. And the processing format is simply the PCM format with which you will interact with the file. So you specify the PCM format when you create the file, and it has to be either a standard or common format.

The only limitation here is that we don't permit sample rate conversion as you read from or write to a file. Your processing format needs to be at the same sample rate as the file itself. Now, if you're familiar with the Audio Toolbox Extended Audio File API, this is functionally very similar, and it's just a bit simpler to use.

So I'm looking now at the initializers and some accessors for AVAudioFile. Here's the initializer for reading from a file. If you don't specify a processing format, you get the default behavior, which is that your processing format will be a standard format. Creating an AVAudioFile for writing is very similar; the only extra information you need to give us is a settings dictionary.

This is the same settings dictionary passed to AVAudioRecorder. And in there are keys which specify the file format you want to use, and in the case of, for example, AAC, you can specify the bit rate and any other encoder settings. Those are in the settings dictionary.
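
A sketch of creating an AVAudioFile for writing AAC, using the same keys as AVAudioRecorder's settings dictionary; the function name, URL parameter, and chosen bit rate are illustrative.

```objc
#import <AVFoundation/AVFoundation.h>
#import <AudioToolbox/AudioToolbox.h>

static AVAudioFile *CreateAACFile(NSURL *outputURL) {
    NSDictionary *settings = @{
        AVFormatIDKey         : @(kAudioFormatMPEG4AAC),
        AVSampleRateKey       : @44100.0,
        AVNumberOfChannelsKey : @2,
        AVEncoderBitRateKey   : @128000
    };
    NSError *error = nil;
    AVAudioFile *file = [[AVAudioFile alloc] initForWriting:outputURL
                                                   settings:settings
                                                      error:&error];
    if (!file) {
        NSLog(@"Could not create file: %@", error);
    }
    return file;
}
```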

So once you've built a file you can always access back the actual file format on disk. So that might be, for example, AAC, 44 kHz, two channels. But you can also query the processing format with which you created the file. And in the case of the two simplest initializers, this would be floating-point, 32-bit, same sample rate as the file. Same channel count as the file.

OK. So to read and write from AVAudioFiles there's a simple method, readIntoBuffer, and that will simply fill the AVAudioPCMBuffer to its capacity, assuming you don't hit the end of the file. writeFromBuffer is a little different in that it looks at the buffer's frame length rather than its capacity, so it writes all of the valid frames from that buffer to the file.

And you can do random-access I/O when reading from audio files. So this is like the standard C library's seek and tell functions, fseek and ftell. You can query the frame position to see where you are when reading an audio file. And you can also seek to a different position in the file by setting the framePosition property before you read. And the next read will proceed sequentially from that point.

OK. I'd like to tie all these classes together now with a short example. I've got four screens here. We'll see what it's like to open an audio file, extract some basic information from it, and read through every sample in the file. So here we have initForReading. We simply pass the URL. I'm using the explicit variant here, passing PCM Float32, but I could have left that off and gotten a standard format.

I'm going to fetch some basic information from the file and print it, including the file's on-disk format and the processing format. I can query the audio file's length in sample frames. And I can convert that length in frames to a duration by dividing by the file's sample rate.

OK. Next I'm going to create a PCM buffer to read into. Since the file might be large, I don't want to try to read it all into memory at once. So I'm going to loop through it 128K sample frames at a time. So I'm going to create a buffer with that capacity. And notice I'm just going to pass the audio file's processing format when allocating this buffer, and that ensures that the buffer is in the same format that the file will be giving me.

And here I'm ready to start reading through the file. I'm going to read one buffer at a time until I get to the end, so I can query the current frame position to see if it's less than the length I discovered earlier. I read into the buffer, which will again fill the buffer to capacity. I can double-check whether I'm done by seeing if I got a zero-length buffer.

And this is a lot of code, but it boils down to two for loops. The outer one walks through all of the channels in the buffer if it's a multichannel file. And then the inner loop looks at every sample in that buffer. So given every sample, I can look at its absolute level and see if it's louder than the loudest sample I've found so far, and if so, I can record that level and where I found it in the file. So there, in about four screens of code, I opened an audio file and read through the whole thing one sample at a time.
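
A hedged reconstruction of the example just described: open a file, print its formats and duration, then read it 128K frames at a time while looking for the loudest sample. Variable names and the logging are illustrative, not the session's exact code.

```objc
#import <AVFoundation/AVFoundation.h>
#include <math.h>

static void FindLoudestSample(NSURL *fileURL) {
    NSError *error = nil;
    AVAudioFile *file =
        [[AVAudioFile alloc] initForReading:fileURL
                               commonFormat:AVAudioPCMFormatFloat32
                                interleaved:NO
                                      error:&error];
    if (!file) { NSLog(@"open failed: %@", error); return; }

    NSLog(@"file format: %@", file.fileFormat);
    NSLog(@"processing format: %@", file.processingFormat);
    double duration = file.length / file.processingFormat.sampleRate;
    NSLog(@"%lld frames, %.2f seconds", file.length, duration);

    // Read 128K sample frames at a time, in the file's processing format.
    AVAudioPCMBuffer *buffer =
        [[AVAudioPCMBuffer alloc] initWithPCMFormat:file.processingFormat
                                      frameCapacity:128 * 1024];

    float loudest = 0.0f;
    AVAudioFramePosition loudestPosition = 0;

    while (file.framePosition < file.length) {
        AVAudioFramePosition bufferStart = file.framePosition;
        if (![file readIntoBuffer:buffer error:&error]) break;
        if (buffer.frameLength == 0) break;   // double-check for end of file

        // Outer loop: each channel. Inner loop: each sample frame.
        for (AVAudioChannelCount ch = 0; ch < buffer.format.channelCount; ++ch) {
            const float *samples = buffer.floatChannelData[ch];
            for (AVAudioFrameCount i = 0; i < buffer.frameLength; ++i) {
                float level = fabsf(samples[i]);
                if (level > loudest) {
                    loudest = level;
                    loudestPosition = bufferStart + i;
                }
            }
        }
    }
    NSLog(@"loudest sample %f at frame %lld", loudest, loudestPosition);
}
```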

OK. So moving on, I'd like to just foreshadow the uses of these classes in the AVAudioEngine session, which will follow this one. So at the bottom we see AVAudioFile and AVAudioPCMBuffer. And those are both used by something called AVAudioPlayerNode, which will be your basic mechanism for scheduling audio to play back. The AVAudioPlayerNode is a subclass of a more generic AVAudioNode class, which is some unit of audio processing, and we'll see how AVAudioFormats are used when describing how to connect AVAudioNodes.

So that brings us to the end of my section of this talk. We saw the AVAudioFormat, AVAudioChannelLayout, AVAudioPCMBuffer, and AVAudioFile classes. You can use these without AVAudioEngine, alongside your existing code with the Core Audio, Audio Toolbox, and Audio Unit C APIs. If you're careful to be real-time safe, you can use those accessor methods to extract the low-level C structures. And, again, we'll be seeing how these are used in more detail in the next session on AVAudioEngine.

And that's the end of our hour here. We've looked at MIDI over Bluetooth, the Inter-App Audio UI Views, lots of features of AV Foundation audio, and we hope you'll stick around for the next session on AVAudioEngine. If you need more information, Filip is our Evangelist, and there are the developer forums. Here's the next session I keep talking about.