
WWDC09 • Session 315

Audio Development for iPhone OS

iPhone • 55:35

iPhone features a state-of-the-art audio engine, enabling the most innovative mobile music and audio applications available. Get introduced to the range of powerful audio APIs provided in the iPhone SDK and understand how the audio system works with popular audio formats. Learn the recommended practices for handling audio interruptions, responding to user actions, and playing multiple sounds simultaneously.

Speaker: Bill Stewart

Unlisted on Apple Developer site

Downloads from Apple

SD Video (132.1 MB)

Transcript

This transcript has potential transcription errors. We are working on an improved version.

Welcome to the iPhone OS Audio session. My name is William Stewart. We're going to be talking about - taking drinks of water in between your sentences. We're going to be talking about audio on the iPhone, how it works, how it integrates in with the user experience. And we thought that we would delve deeply into how the user perceives the phone or the iPod Touch as an audio device, and how you as an application can understand what the user is doing, understand the user's expectations for how they should use your application, and the kinds of behaviors I'd like to see from your application.

And the API that we use to express all of this is AudioSession, as you may have heard from some other sessions. And there's a few concepts in AudioSession that we'll go into in detail: category, routes, what to do about interruptions, and then just an overall sort of view of managing the state.

So before we go into that, I'd like to just give you a very brief overview of the APIs that you have for both playing and recording audio. These are a collection of different frameworks and these are the primary objects that you have, MPMusicPlayerController, AVAudioPlayer and Recorder from AVFoundation, AudioQueue, OpenAL, and then AudioUnits.

So how do these look in the system? The first thing is a little bit divorced from the rest of the audio frameworks which is the MediaPlayer Framework, and that just gives you an object MPMusicPlayerController, and it allows you to access the iPod library that's on the user's system. And it plays with the same rules as the iPod application.

So if you're running on a current phone, the user could be playing the iPod in the background, and MPMusicPlayerController behaves the same way. And when we talk about background or iPod playing in the background in relationship to your application's audio, MPMusicPlayerController has exactly the same behavior, so you can sort of think about that as well as iPod. So then here's a sort of a system view of the audio frameworks, or the frameworks that have substantial APIs for audio. AVFoundation is the top level framework.

This is an Objective-C framework, it's built on AudioToolbox objects, AudioQueue and AudioFile, that we'll look at in a moment. And it's the simplest API that you can use for playing and recording files. AVAudioRecorder is new in 3.0, AVAudioPlayer was provided in the 2.2 OS release. In the next session we'll be going into these objects in more detail.

But just to give you a very brief overview, the intention of these objects is to be utilitarian, they're provided to give you a utility model, I want to play this file, I want to stop it, I want to loop it this many times, this kind of basic sort of solution. It also has delegates for handling state changes and for interruptions.

So if you just need to play or record files, AVFoundation is all you need to really know. If you're doing a game, OpenAL is the one-stop shop for this. OpenAL is an API that's available across various platforms, and it gives you 3D source positioning, so you can move sources around the listener, the game player.

You can do independent rate control so you can have different sample rates on sources if you want to do that. You can do "Doppler" effects so that you can simulate sound sort of going past you, you get that sort of ambulance fading sound. You can do looping, the basic set of controls that you need for mixing separate sound sources into one mix and outputting that mix.

Now AudioToolbox is a general collection of audio services. AudioQueue and AudioFile are the two APIs that are used to implement AVFoundation, which you can sort of see from the way the frameworks are structured here. But there's also a collection of other utilities, and some of these utilities are used by AudioQueue or AudioFile, and AudioConverter. But you can use these APIs directly as well, so if you want to get down into more detail and have more control, you can use these things directly.

ExtendedAudioFile combines AudioFile, which is reading and writing to the disk, and AudioConverter, which is transforming data formats. And then AudioFormat gives you information about audio formats. Is MP3 a variable bitrate format? What decoders do I have on the system at the moment? And then if you're just doing simple beep sounds, there's the AudioServices play calls, and that's the general set of APIs for AudioToolbox. Now AudioQueue itself uses AudioUnits to render the audio that you're providing to the AudioQueue, and it uses output units, it uses mixers, and you can use these same objects yourself.

Now we're going lower down, there's some more rules about how you can interact with these objects, the kinds of things that you can and shouldn't do, because the I/O services that are embodied at this level are deadline driven and you have a limited time in which you've got to fill a buffer of audio in order for it to be heard, otherwise you'll get glitching and bad sort of behavior. The mixing AudioUnits here, there's a 3D mixer, that's the mixer that's used by OpenAL.

Then of course at the bottom of all of this is the AudioHardware, and the AudioHardware, there's no direct access to it. On the desktop system you have the CoreAudio framework, which is not available on the iPhone, so the access you do have is through the remote I/O AudioUnit or the voice processing AudioUnit.

And the main thing about the AudioHardware is that there's also a state associated with AudioHardware like do you have input available, do you have output, etc? And you manage this state through AudioSession, and that's what we'll be looking at in the rest of this talk. So AudioSession really describes an application's interaction with the audio system. So it represents a snapshot, a current state of the audio on the device.

You can have settings to establish your preference, like I want this sample rate, or I want to do input or output. And AudioSession also gives you a way to handle state transitions, so it has a notion of being active, that you're an active client, you're really using the audio system or you're not. It has explicit notions about being interrupted. A phone call will interrupt your application's audio.

High priority items besides phone calls could be alarms. So all of this is embodied in the AudioSession object. Now the AudioSession API itself is in AudioToolbox, it's a C API, and all the implementation and detail is here, there's a collection of properties, there's property listeners. And in 3.0 we introduced AVAudioSession, which is in AVFoundation, and this is a wrapper, a utility class built on top of AudioSession. Now it's convenient for you because you're writing your application in Objective-C. But whether you use AVAudioSession or the underlying AudioSession APIs, it's exactly the same thing, you don't get any benefit by doing something in AudioSession that you can do with AVAudioSession.

So in the rest of the talk, what I'm going to do is where possible I'll talk about AVAudioSession because it's a little bit cleaner and simpler, and then we'll look at some specific AudioSession things that AVAudioSession doesn't give you. So the first thing that I want to go through is categories.

AudioSession categories really describe the basic role and the basic set of services that you want from the audio system in your application. Now some of these categories will define priorities. They'll decide whether you allow mixing with the iPod or not. And because a category is going to set a collection of priorities, you may interrupt the iPod if it's playing back and you go to play audio.

So that's another thing an AudioSession category is going to establish for you. And it also dictates fundamental behaviors about your audio in relationship to the ringer switch, in relationship to the screen lock, and also in relationship to whether you have input or output. So using AVAudioSession to set a category: there isn't really an AudioSession object you allocate, there's a singleton pattern, it's a single instance that's global for your application, for your process.

So in the AVAudioSession you just have the shared instance method on AVAudioSession that gives you back an AVAudioSession object, but that's it, there's no way to allocate an AVAudioSession object. So in the rest of the slides I'm just going to refer to it as the session. So you get the session object and then you make a call on that which is to set the category.
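
A minimal sketch of that call in Objective-C, assuming the 3.0-era AVFoundation names:

    #import <AVFoundation/AVFoundation.h>

    AVAudioSession *session = [AVAudioSession sharedInstance];
    NSError *error = nil;
    // Establish this application's basic audio role for the whole process.
    if (![session setCategory:AVAudioSessionCategorySoloAmbient error:&error]) {
        NSLog(@"Could not set category: %@", error);
    }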

And we'll look at the different types of categories. And this is basically all you need to do to establish your priorities with the AVAudioSession object. So the collection of features that a category defines, we've got this down to five columns; whether you mix with others, whether your audio will obey the ringer switch or not if the ringer switch is set to silent, is the audio silent or does it play through? When the screen is locked, when the phone is locked, does your audio go silent? Does your application require input? Does your application require output? So this is what the table looks like, the rows are the categories, and the columns are the behaviors.

So the top row is Ambient, and the second row is SoloAmbient. And the difference between these two categories is the SoloAmbient goes solo, it doesn't mix with the iPod or MPMusicPlayerController in the background, whereas Ambient does. But both categories will go silent with the ringer switch. Both categories will go silent when you lock the screen.

Both categories only do audio output. Now there's the Playback category, and the rest of the categories, as you can see, will disobey both the ringer switch and the screen lock, so they'll keep playing when the screen is locked or the ringer switch is set to silent. And of those three categories, one is for playback, one is for record, one is for both playing and recording. And then they have an override mark in the mix with others column, and we'll look at this in more detail in a moment.

So what are some examples for the categories? What type of categories would you use for what type of application? Now for games we would recommend either Ambient or SoloAmbient, because typically the audio in a game is incidental or it's not critical to the application. You can play the game without hearing the audio if you're in a movie theater, if you're in a WWDC session talking about AudioSession and you're getting bored and you want to play a game, you don't want everybody else to know that you're playing a game, so you want the game to be quiet so that you don't disturb the speaker.

Anybody there? No, okay. So Ambient is a great category for this reason, and also if you're playing a game and you're really interacting with the device, you want the audio to go quiet when the screen is locked because there's really nothing for the user to do at that point. For the playback category, you can imagine the iPod is this category, if you've got an app that's similar to that in intention that it's going to play music or do something with audio that you'd want even if the ringer switch is silent.

So it's an intentional music application. PlayAndRecord. Same kind of thing except now you've got record as well. So if you're doing a chat application, that would be an example of that. Now you don't need to do both playback and recording at the same time, but if you want to do either one at different points, you might want to use this category, and we'll get into that a little bit more later.

And then of course the record category, if you're doing recording and you just want to get input, and this will mean you don't have any output at all. In 2.2 we had another two categories, UI Sounds and Live Audio, and we've decided to deprecate them to try and make this a simpler set of categories and concepts to understand. So UI Sounds is equivalent to Ambient, and Live Audio is equivalent to playback, there's no difference between those two.

And if you're using them, just use Ambient or playback. So there's a couple of behaviors that have got to do with categories, not directly, but it kind of fits in with the notion of categories. And the first one is, is other audio playing on the system? Now an example of this we've seen with some games: if the user is playing their iPod already and the game launches, then the game can use this to determine if there's other audio playing, and it can set its category to Ambient so that it'll mix in with the iPod and won't play its own background music.
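
A sketch of that decision point, using the C property from AudioToolbox (constant names as in the 3.0-era headers):

    #import <AudioToolbox/AudioToolbox.h>
    #import <AVFoundation/AVFoundation.h>

    // Is other audio (the iPod, say) already playing?
    UInt32 otherAudioIsPlaying = 0;
    UInt32 size = sizeof(otherAudioIsPlaying);
    AudioSessionGetProperty(kAudioSessionProperty_OtherAudioIsPlaying,
                            &size, &otherAudioIsPlaying);

    // Mix in with the iPod if it's playing; otherwise use our own background music.
    NSString *category = otherAudioIsPlaying ? AVAudioSessionCategoryAmbient
                                             : AVAudioSessionCategorySoloAmbient;
    [[AVAudioSession sharedInstance] setCategory:category error:NULL];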

So that could be used as a decision point to provide a user experience that's a little bit polished and sensitive to what the user is currently doing. Another category sort of behavior that's interesting is OtherMixableAudioShouldDuck. So what this means is that when your application plays audio, any other audio on the system should be attenuated, should go quiet. An example of this is Nike, you know, you're running along, you've got your 120 beats per minute music, you're pumping up the hill and it's like keep going, you're there. Well you want to hear the Nike app saying you've been running for ten miles and you're nearly there.
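
Opting into that ducking behavior is a single set-property call; a sketch, assuming the 3.0 constant:

    // While our audio plays, other mixable audio (e.g. the iPod) is attenuated.
    UInt32 duckOthers = 1;
    AudioSessionSetProperty(kAudioSessionProperty_OtherMixableAudioShouldDuck,
                            sizeof(duckOthers), &duckOthers);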

So it ducks the iPod down, it makes its announcement, then it comes back up. So you can set this property so that the system will do that for you. So with that I'd like to bring out Michael [inaudible] to give us some demos, and we're going to look at some category behaviors and switch to the Wolfson. Thank you.

Thanks, Bill.

So as Bill mentioned, what I want to show you is a couple of applications that exhibit the behaviors of various categories and setting them using the AudioSession. The first one I'm going to show is a program called AVTouch, and this is using the media playback category; as he mentioned, this would be for an application where media is critical, or something like an internet radio station.

So I'm going to start the application.

[ Music ]

And we've got the music playing, and now you'll see what will happen if I screen lock. It's going to keep playing through.

[ Music ]

Similarly I can set the ringer and the application is just going to blow through that, this is just going to continue playing.

The application really has no idea as to the state changes being performed on it because, as we mentioned, this audio is critical for the application function. To contrast that, I'm going to use another application called OALTouch, so this is something more similar to what a game would use, OpenAL with a SoloAmbient category.

And you'll see I can play this very boring game. But now you'll notice if I screen lock, we're actually going to pause the audio. Again, the application doesn't have any indication this is happening, the audio is just being paused from underneath it. I can restore this and we get the audio back just as we were before.

Now the ringer switch similarly is going to mute the audio. I can put this back and we'll continue just as normal. The other thing I want to show you, as Bill mentioned, we have this OtherAudioIsPlaying property. And where that's helpful as he mentioned, is if we have -

[ Music ]

-- music playing.

Now I'm going to launch the app again, and what we do here is we see that the other audio is playing, so instead of setting the SoloAmbient category, we set just the Ambient category, which allows the iPod to continue playing. But I can still get my game sounds.

To contrast that, I've got a bad version of this application which is just setting the SoloAmbient no matter what. And what will happen here is you see now that we've lost the background audio, which the user was kind of expecting to continue, but we don't have anything aside from the regular game audio.

Which is kind of disruptive to the user experience. And so you can see what you really want to do is make sure you don't interrupt what the user intends to do, and that OtherAudioIsPlaying property will let you do that. Back to you, Bill.

And of course if you want to, you can use MPMusicPlayerController to play the background music as well while your game is playing.

Now what about mix with others, how does this work? So mix with others is a characteristic of categories and it dictates the behavior of two things. First, can a background application like the iPod play or record while your app is active? And this really defines whether your application is going to interrupt another app. So as we saw with the demo just then, when we launched with SoloAmbient, it interrupted the iPod application, so that's that characteristic of mix with others, and SoloAmbient of course is not going to mix with others.

And in the previous case, he used Ambient where it did mix with others. Now the second characteristic that this controls is whether you have access to hardware or software codecs. So what do we mean by hardware and software codecs? Hardware codecs are codecs that use some chip on the iPhone or the iPod Touch to take some of the load off the main application processor in order to decode the audio.

Now this gives an advantage in that we take cycles off the CPU so that you have more for your game, more for your application. And it also has low power usage, it's less power to do the decode using this chip than it is to run on the main application processor.

But there's some limitations to that, and that is that you can only run one of them at a time, it has to be a negotiated resource, you can't just have everybody doing all kinds of things with it because it's fairly restricted in what it can do. So the software versions of codecs are flexible, that's their main feature, is that you can run more than one of them. You can have the iPod using the hardware and you can be using the software codecs, and that will all be just fine.

But the cost to you for using software is that we will take CPU or application processing time away from your application. So if you're doing a game and you're really pushing it to the limit and you start to use a software codec to decode your backing track in MP3, you'll probably lose frames in your game. So this is a tradeoff that you would need to understand. So it can affect application performance.

So what about the formats, what are the formats that are used for hardware? In the decoders there's MPEG-4 AAC, which is the main format that we use for the music store content, for ripping CDs, etc. There's MP3, Apple Lossless. With the 2.2 release there's an HE-AAC codec.

HE-AAC is a lower bit rate version that's really optimized for sort of internet streaming and kind of places where low bit rates are good. So you can get a pretty good reproduction of audio at 64 kilobits per second, so that's about a 20 to 1 compression ratio or something, so it's pretty good.

Now with the iPhone 3GS, there is also an AAC encoder. If you've seen the keynote talking about capturing video and you're recording the audio into AAC, you can access this AAC encoder yourself and you can record and use AudioQueue or AVAudioRecorder to encode into AAC on the iPhone 3GS. For software with 3.0, we've also added software versions of the hardware codecs, MPEG-4 AAC, MP3, and Apple Lossless, these are all new with 3.0.

So they will take a lot of time on the application processor to do work, and that varies of course with the model and how fast the clocks are on the CPUs, etc. And then there's software for all of the other formats, like ulaw and alaw, and iLBC which is a speech codec, etc. So to access the hardware codecs you need to have mix with others set to false. That is, I'm not going to mix with others, I'm not wanting to go to any party, this is me, this is mine.

So that's every category by default except for the Ambient category. So if you are doing any kind of application and you're not Ambient, then by default you are going to turn off other people, and you are not going to mix with them. Now you can override this in the Playback category and the PlayAndRecord category so that you can allow mixing with others and get the other behavior that you would want for Playback and PlayAndRecord.

For example, if you wanted to mix with the iPod but you wanted your music to keep playing when the screen was locked or the ringer switch was silent, then you can use the Playback category and you can override the mix with others characteristic of the category. And the hardware codecs are described in more detail in the AudioFormat header. And also that gives you details about how you can explicitly control when you make an AudioQueue for instance, whether you use hardware or software.

And then if you use a format that's supported in hardware but have mix with others set to true, then you will default to the software codec. So if you go to use AAC and you're in the Ambient category, then you're going to default to the software codec. So you really need to understand something about what you're doing here if you're using the Ambient category or if you're using one of these overrides.

So how you do the override: it's an AudioSession property, you just do the AudioSessionSetProperty call, and it defaults to false, which means that the Playback and Play/Record categories will not mix with others by the default setting. Now, you set this and you may not get it; we may decide to change the behavior, but at the moment we have no plans to do so. But you should be prepared to just not get it. If your application changes the category, you need to reassert this if you want it.
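
In code, the override looks something like this (a sketch; remember it is not sticky across category changes):

    // Let the Playback or PlayAndRecord category mix with other audio.
    UInt32 allowMixing = 1;
    AudioSessionSetProperty(kAudioSessionProperty_OverrideCategoryMixWithOthers,
                            sizeof(allowMixing), &allowMixing);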

So if you go from Playback to PlayAndRecord, you need to reassert if you've done this override. So that's that. We're going to move on now to audio routes; that covers the audio categories, I hope that's clear. We'll go a little bit back into this later in the session and see how that all fits together with the flow in your application.

So what about audio routes, what do we mean by audio routes? Well it's where does your audio go, where does your audio come from. And this looks like a simple device, doesn't it? There it is, it fits in your pocket and you think oh great, this should just work. Well it's not quite so simple.

The iPhone, at least, has a microphone, it has a speaker, and it has another speaker which we call a receiver, to distinguish between something that is a speaker and something that you put to your ear, so we call it the ear speaker, the receiver. You can also plug in headphones. You can also plug in headphones that actually have a microphone, and we call these headsets to distinguish between headphones and headphones with microphones. You can also use a hands-free device like a Bluetooth device that has both a speaker and a microphone.

New with 3.0 you can use A2DP Bluetooth headphones or speakers. We're not even finished yet. When you turn it around to the side you can also get line out through the 30-pin connector. You can interface to car kits through USB outputs or through A2DP Bluetooth. So that's all of the places audio can go in and come out of, and then there's a collection of controls.

There's volume key controls, there's the ringer switch, and there's the screen lock. So really this is not at all a simple device, this is quite a complex device to understand what's going on. And when we looked at this, we thought, well, what are the expectations that the user has for how this device should behave? That was our fundamental question.

And our answer to that was the last in wins. So last in could be last out. If I plug headphones in, then my intention is the audio goes to the headphones. If I pull the headphones out, then the audio goes to whatever I had on before the headphones. So if I just had nothing else connected on the iPhone it would go to the speaker.

But what happens when you get into using A2DP, where you don't have a concept of plugging something in, you just walk into a room with A2DP speakers on and suddenly they're there? So we had to describe some interface to allow the user to actually specify behavior when the default might not be what they want, and this is the audio device picker, and it looks like this.

And this just occurs as something that the user sees, but I just wanted to bring this up in this context so that you understand that these things can change and you have no control over these changes, this is the user's device and it is the user controlling where the audio is going to or coming from based on what they've plugged in. So how do you know what the current state of these connections is?

We're going into AudioSession now, there's an AudioSession property called AudioRoute, and that tells you the current route. And there's also a property that you can listen to for when that route changes, when the user plugs headphones in for example, or they pick a different device from the picker in the case of A2DP on the Bluetooth. And this is a notification, and the notification tells you why the route changed, and what the old route was.

And you can always find the current route through AudioRoute. So when you're using AudioSession as a C API, you can get the current route through the AudioRoute property, and this describes the current route as a CFString. You can get an empty string if there's no current route; on an iPod Touch, I guess, if you had nothing plugged in, there's no speaker and you'd be silent.
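
A sketch of that get-property call; the route comes back as a CFString you release when done:

    // Ask for the current route, e.g. "Speaker" or "Headphone".
    CFStringRef route = NULL;
    UInt32 size = sizeof(route);
    if (AudioSessionGetProperty(kAudioSessionProperty_AudioRoute,
                                &size, &route) == noErr && route != NULL) {
        NSLog(@"Current route: %@", (NSString *)route);
        CFRelease(route);
    }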

So if the route changes, you listen to the AudioRouteChange property, and you listen to this property by adding a property listener on the AudioSession. And we'll have a look at what this property listener might look like. So this is the prototype for the property listener. The property ID that's going to come into this listener that we're interested in is AudioRouteChange, so we check to see if that's it.

And then the payload for this property, the void star data, is a CFDictionary, so we're going to cast it to that. And the CFDictionary has two keys in it; one is the reason why the route changed, and that's a number, and it will either be that a new device appeared, so headphones plugging in is a new device appearing, or the existing device went away, so that would be the user pulling out headphones or taking the iPhone off a dock or a 30-pin type connection.

And then you'll get a CFString, which is the same kind of string format as the current route, but in the AudioRouteChange payload this tells you what the old route was. And then of course in this case, you can get the current route using AudioSessionGetProperty on AudioRoute to see what is my new route, what's my now current route, given that I've lost whatever my route was previously.
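
Putting that together, a sketch of the whole listener; the reason constants and dictionary keys are from the 3.0-era AudioToolbox headers, and the pause hook is hypothetical:

    static void MyRouteChangeListener(void *inClientData,
                                      AudioSessionPropertyID inID,
                                      UInt32 inDataSize,
                                      const void *inData)
    {
        if (inID != kAudioSessionProperty_AudioRouteChange) return;

        // The payload is a CFDictionary carrying the reason and the old route.
        CFDictionaryRef routeChange = (CFDictionaryRef)inData;
        CFNumberRef reasonRef = (CFNumberRef)CFDictionaryGetValue(routeChange,
            CFSTR(kAudioSession_AudioRouteChangeKey_Reason));
        SInt32 reason = 0;
        if (reasonRef) CFNumberGetValue(reasonRef, kCFNumberSInt32Type, &reason);

        if (reason == kAudioSessionRouteChangeReason_OldDeviceUnavailable) {
            // Headphones were pulled out: pause, the way the iPod app does.
            // PausePlayback();  // hypothetical hook into your app
        }
        // For kAudioSessionRouteChangeReason_NewDeviceAvailable (headphones
        // plugged in), just keep playing; the audio follows the new route.
    }

    // Register once, after AudioSessionInitialize:
    AudioSessionAddPropertyListener(kAudioSessionProperty_AudioRouteChange,
                                    MyRouteChangeListener, NULL);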

So as with the category, there's an override for AudioRoute, and the override exists to give you the facility to make a choice when we don't really know what the right choice is. So if you're on an iPhone and you have nothing plugged in and you're sitting there, you have two speakers.

Now in some categories you could imagine that the earpiece speaker, the receiver, is the correct speaker to use. In other use cases, you can imagine that the speaker itself is the right route to use. And they're both there, it's not that the user has plugged something in or out, they're both just there, so the override exists for you to make a choice.

And the typical way you'd make a choice would be to present some kind of speaker icon like you do with the speakerphone. So this is how you implement the speakerphone type button. And the override is a set property and you can set it to speaker or you can set it to none, and it's not sticky. So when the route changes, you're going to lose your override, and if it's appropriate for you to reassert the override, you can attempt to do so.
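
A sketch of the speakerphone-style toggle:

    // Send audio to the main speaker instead of the receiver...
    UInt32 override = kAudioSessionOverrideAudioRoute_Speaker;
    AudioSessionSetProperty(kAudioSessionProperty_OverrideAudioRoute,
                            sizeof(override), &override);

    // ...and back to the default route for the category.
    override = kAudioSessionOverrideAudioRoute_None;
    AudioSessionSetProperty(kAudioSessionProperty_OverrideAudioRoute,
                            sizeof(override), &override);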

But remember, the route changes because the user has done something, they have plugged something in or they have pulled something out. So it's really better to try and respond to what the user has done rather than try to dictate behavior and feel like you can really control what this thing should do, because it's the user's device, and that's the view we took on all of this.

So with that, I'd like to get Michael to come back up and demonstrate AudioRoutes. Thank you.

So as Bill mentioned, what I want to show you is using the AudioRoute properties to actually mimic the iPod's behavior. I'm sure you guys know, if you have an iPod and you unplug the headphones while you're playing back audio, it'll actually pause.

Whereas if you're playing the audio through the speaker, you plug it in and it'll continue going. And what I want to show you is how we can actually mimic that behavior.

[ Music ]

So we have the application going, and I've got my handy dandy Apple headset here.

So we're going to play, and then while it's playing I'm going to plug this in. You'll see that the audio is going to continue playing, the last in rule will make the audio now go through the headset. Basically we receive the route notification here and we've checked the reason and said well the new device is available, I don't really care about that, so just keep going.

However, if I unplug, you'll see that now the device is actually paused. What we've done is we checked the reason again when we got that property notification and seen that the old device went away, so what I want to do is update the UI so that we pause the application. And this way what we manage to do is kind of streamline the user experience so that the behavior of your app is what the user would expect, just as they get from the iPod.

So what are some of the other things that AudioSession can tell you? AudioSession can tell you about the hardware format, what is the current state of the AudioHardware, how many channels do you have for input and output, what's the sample rate? Now when you ask these questions, the answers reflect whatever state the device is in at the moment, whatever its active category is, whatever is going on on the device. Now before you're active, you can say - sorry, with AVAudio I'm jumping one slide ahead.

So with AVAudioSession, these are the calls you make, you just get the session object and you just say what's the current hardware sample rate and the number of input or output channels? Now before you're active, and we'll look at that in a minute. I'm planting all these seeds and hope they'll sprout into understanding by the end of the session.
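
Something like this, assuming the 3.0 AVAudioSession property names:

    AVAudioSession *session = [AVAudioSession sharedInstance];
    double sampleRate = session.currentHardwareSampleRate;             // in hertz
    NSInteger inputs  = session.currentHardwareInputNumberOfChannels;
    NSInteger outputs = session.currentHardwareOutputNumberOfChannels;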

So before your session is active, before you have asserted your ownership of the audio system or your use of the audio system, you can express your preferences. You can say, well, when I'm active, when I'm using the audio system, I'd like the sample rate to be at 44.1 kilohertz or at 8 kilohertz, whatever is possible.

I would also like to have an I/O size, because I'm a chat application, of 5 milliseconds, because I want very low latency I/O. Or, I'm really concerned about battery life and I'm just playing back, so I'm going to use a much larger I/O size. So these are settings that have to do with hardware and the way that the hardware works. And based on your category or based on some other things that may be going on on the device, you may not get these settings, so we call them preferences. And you express them through AVAudioSession, setPreferredHardwareSampleRate, setPreferredIOBufferDuration, and the duration is in seconds or typically milliseconds.
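
A sketch of expressing those preferences before going active; the values are examples, not recommendations:

    AVAudioSession *session = [AVAudioSession sharedInstance];
    NSError *error = nil;
    // Preferences, not guarantees; check the current values once you're active.
    [session setPreferredHardwareSampleRate:44100.0 error:&error];
    [session setPreferredIOBufferDuration:0.005 error:&error];  // 5 ms, chat-style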

And the sample rate of course is in hertz. Now what about hardware volume? So hardware volume is the volume that you set with the volume keys. And hardware volume is a lot more complicated than you would think, and we don't try to make it more complicated, we try to understand the complications of this so that it doesn't look complicated to the user. So we collect volumes for a whole different set of use cases, and the volumes are set per category, per route.

So here are a couple of examples: there's a volume for a phone call and headphones. So when you've got headphones plugged in and you're on a phone call, there's a volume for that. When you're in the playback category, or your iPod is playing back or something, and you're going out to the speaker, there's a volume for that.

There's a volume for the ring tone for the speaker, there's a volume for the ring tone for headphones. So there's a whole collection of volumes, each for a pair of route and category. And not all routes can have volumes, we don't currently support a volume for A2DP, so there's no volume for that route. So we understand that there's a use case there, but we don't have a volume stored for it.

Now you can get the current volume at any time, and once again, the volume that you get here will be based on the current category and the current route. So if you're in the Ambient category in your game and you're active and you're playing out to the speaker, then you'll get the volume for speaker in Ambient. Now, to set it, there is no AudioSession set-property for current hardware output volume.
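
You can read it like this (a sketch; note there is no matching set call):

    // The volume for the current category and route, from 0.0 to 1.0.
    Float32 volume = 0.0f;
    UInt32 size = sizeof(volume);
    AudioSessionGetProperty(kAudioSessionProperty_CurrentHardwareOutputVolume,
                            &size, &volume);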

We decided that this was really a user action and we wanted the user to control this, and not to have volume set sort of behind their back, if you like. And with all of the devices except for the original iPod Touch, they have volume keys, so there really isn't a need to have a hardware volume control here as well because the keys are there. And you can also bring out the slider, and the slider can be used to set the hardware volume.
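
That slider is MPVolumeView from the MediaPlayer framework; a sketch, where parentView is a hypothetical view in your UI:

    #import <MediaPlayer/MediaPlayer.h>

    MPVolumeView *volumeView =
        [[MPVolumeView alloc] initWithFrame:CGRectMake(20.0, 20.0, 280.0, 20.0)];
    [parentView addSubview:volumeView];  // the user drags it; we never set volume
    [volumeView release];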

Now this is the hardware volume, this is the volume across the whole device that affects everything, the audio going out. You can of course, with AudioQueues and AVAudioPlayer, set individual volumes for those objects, so you can do mixes that set different sound sources relative to each other and so forth. So this is just the overall volume of the audio coming out of the device.

Another property that's interesting for the hardware is whether you have input available. On the current iPod Touch, if you plug in a headset, you have a microphone now and you have input available. If you do not have a headset plugged in, you have no input available. So if you're doing a recording app, or you're doing some kind of input and output, you'd probably want to know, is the device running with input, should I allow my application or that feature of my application to run? If I need input and input is not there, it would be kind of difficult for the user to understand what's going on.
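
A sketch of the query, plus registering for the change notification described next (recordButton and MyInputAvailableListener are hypothetical):

    // Do we have any input device right now (e.g. a headset mic on an iPod Touch)?
    UInt32 inputAvailable = 0;
    UInt32 size = sizeof(inputAvailable);
    AudioSessionGetProperty(kAudioSessionProperty_AudioInputAvailable,
                            &size, &inputAvailable);
    recordButton.enabled = inputAvailable ? YES : NO;

    // And get told when that changes (headset plugged in or pulled out):
    AudioSessionAddPropertyListener(kAudioSessionProperty_AudioInputAvailable,
                                    MyInputAvailableListener, NULL);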

And you can instantiate a property listener so that if they're on an iPod Touch and they plug in a headset, your property listener will fire up and say hey, you've got input now, and you could change your UI. So in order to see some of that in action, I'm going to get Michael to come up again and give us a demo of that.

So what I have here, as Bill mentioned, is a second generation iPod Touch. And what I'm going to do is I'm going to start the recording app we have here, and you'll see that what it's telling me is I can't record because there's no input device available.

And I'm guessing you're probably wondering why we would demo a recording app that can't record. But what we can do is bring back the headset, and as soon as I plug that in, you'll see that now we've actually enabled the record button and found the characteristics of the input hardware. What we've done is just added a property listener for that "is input available" property. When that changes, we use it to update the UI. So now I can continue and I can actually record something here.

Welcome to WWDC 2009. And I'll stop that, actually I'm going to record that again. Welcome to WWDC. Now of course as Bill mentioned also, the last in rule is going to play this back through the headphones, so to show you guys, I'm going to unplug that and I'll play it back now.

Welcome to WWDC.

And you can see that basically, using that "is input available" property, we've made a recording application still relevant for the iPod Touch, and most importantly allowed the user interface to update to what the available system devices are.

Thank you, Michael. Okay, so we're getting near the end, I hope you're all still awake.

There will be a test at the end of this session. We'll let you leave if you pass. So how do we fit all of this together, what's the order of operations, how are we going to get all of this to work in your application? That's a lot of stuff. And maybe you don't need to know all this stuff, but it's good to know that it's there so that you can really try to integrate your behavior and your application with the user and what the user is doing with their device.

So the typical use case that you do with AudioSession is that you initialize it. You establish some basic state. Typically this is your category, you may want some preferred hardware settings and so forth. Then you make yourself active, and then you have to respond to changes of state. Let's have a look through this. With AVAudioSession, initialization is implicit.

Any AVAudioSession call will initialize the AudioSession object for you. So this is the line of code we saw at the start of my session. AVAudioSession has a delegate, and you should implement those delegate operations that you are interested in. Most importantly, you should implement the interruption methods on the delegate, and we'll go through that. And you can also use the delegate to get notifications of changes in hardware, like sample rate or something like that. So after you've got your session object, then you can set your category, in this case we're doing SoloAmbient.

I'm not doing any error checking here, obviously you might want to do that in your code, but this should be pretty straightforward. And then at this point you might want to establish any other state; if you're interested in changing behavior based on changes in routes or doing overrides or something, these are the kinds of things you either set here, or you would establish the listeners for, you know, is input available, here, something like this.

So this is where you do all of this kind of setup. Then you're ready. Finally, after 45 minutes, we get to the AudioSession set active. Yes, that's all you need to do. What that says is hey, I'm around and I'm going to use the audio system. And we have a category for you, so you assert a category and the behaviors associated with that category are now active.
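
In AVAudioSession terms, a minimal sketch of that one call:

    NSError *error = nil;
    // Hey, I'm around and I'm going to use the audio system.
    if (![[AVAudioSession sharedInstance] setActive:YES error:&error]) {
        NSLog(@"Could not activate the audio session: %@", error);
    }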

And this is where we will apply those settings and make the device behavior conform to this. So you have use of the audio system based on the limitations of your category. You can use the hardware codecs if you have a category where mix with others is set to false. You can get input, you can get output, based on the category settings. Now you don't just lose access to the audio system by accident.

You only lose it from one of three actions: the session is interrupted by a high priority item, you explicitly deactivate yourself, or the application is quit. Why would you want to explicitly deactivate yourself? Well, the only real case we can think of is that if you're doing recording and you want to do your best to make sure the recording is not interrupted, then you would set the record category. That means that you've turned off output, there is no output available to the device at this point.

So the user is happily recording and they get an SMS message; they're not going to hear the SMS alert because there's no output device available on the system at the moment, because your session owns it. So at the end of the recording, if you want to make the system kind of give it back a little bit, you might want to do SetActive with NO to give up your ownership of the system, to give that back to the system. And so then if they need to play an alert or something like this, then the system can do that. Otherwise, you probably don't need to do the SetActive.
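
A sketch of that hand-back at the end of a recording (recorder is a hypothetical AVAudioRecorder):

    // Done recording; relinquish the session so system alerts can sound again.
    [recorder stop];
    [[AVAudioSession sharedInstance] setActive:NO error:NULL];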

So one of the common misunderstandings or misconceptions about AudioSession that I'm hoping to clear up with the talk today is what do we mean by categories and AudioSession active, and how do I understand this and how do I use this? It's not a change-on-play type of property. I don't play one sound as an Ambient sound and play another sound as a playback category sound.

Okay, the categories and you being active describe your application's basic role, it's your application's basic personality. Now if you have distinct phases in your application where you might want to change categories, or you might want to turn audio off and on, then of course you can do that.

But you really I think need to have a pretty good understanding if you are going to do that. And I think by default, the idea that you just set things up and make yourself active is really 90 percent of the use cases where you really need to worry about it.

So if you're just doing playback, you can be active all the time, just get going, set your category appropriately, make yourself active, you're done. If you're doing recording and playing back, pretty much the same thing. There's no need to kind of fiddle around with the categories or set active false or whatever because the system can still function as you'd think the user would expect, and everything should be just pretty straightforward.

If you're doing record only, this is maybe the only real exception at the sort of top level where you may want to at least manage the active state so that alerts and things can come through. Because once you do AudioSessionSetActive, you're potentially interrupting the user; you're potentially, as we saw with Michael's demo, interrupting the iPod playback, or interrupting something that the user was doing.

So once you kind of set things up, you make yourself active then you're done, it's a very understandable experience for the user to know how your app behaves and what's going to happen with their device while they're using your application. And of course all of this gets into priorities and interruptions, and the one thing that can upset the Apple cart here for your app is getting interrupted.

And this happens basically with a phone call. So when you're active, you have a priority, and a phone call or a clock alarm is going to be at a higher priority every time than your application can be. So a phone call coming in can interrupt you, and your application of course can interrupt iPod playback if you've got mix with others set to false. And so the phone call will also interrupt iPod playback, it interrupts basically everything, as does a clock alarm.

So what happens when you get interrupted? The system is going to make your session inactive, it's not going to ask you to become inactive, it's not polite about it, that's why we call it an interruption. It just makes your session inactive, and any audio that's playing on the system is stopped. And furthermore, because you've been interrupted, you can't become active again, you can't start playing again, until the high priority item has completed.

So we notify you through these two delegate methods on AVAudioSession. Begin interruption: we've interrupted you. And we're not telling you that we're going to interrupt you, we're not telling you hey, would you mind kind of stopping some time soon? We've stopped you, and you can't start playing again. So this is a notification to you that we did this to you, we're bad people and we've done this.

So you are inactive, you have lost control of the system. What you should do here is to change your state. You should reflect to the user anything that you would want to change to say that hey, I've lost audio; your stop button could become a play button, whatever you want to do to show the user that you've been interrupted and your app is no longer making sound.

Now if the interruption completes and your application is still running, for instance, if you've got a clock alarm, the user gets the interruption alert and they said yeah, yeah, I know, and you could come back and your application is still there, it hasn't been terminated. In that case, we'll send an end interruption notification to you.

This means that we have made your session active, we're giving it back to you, and you can go ahead and do whatever it is that you want to do. And you can make your session active at this point, because the interrupting party has gone away. Now you might decide, well actually, you know, you've kind of ruined my recording, so I'm not going to make my session active again here, because the user has to click the record button. So your response may not be to make the session active, but you can do it at this point, and you could resume whatever activity you think is appropriate.
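
A sketch of those two delegate methods; the player object and UI update are hypothetical pieces of your app:

    // In the class acting as the AVAudioSession delegate:
    - (void)beginInterruption
    {
        // The system has already stopped our audio and deactivated our session;
        // just bring the UI in line, e.g. turn the stop button into a play button.
        [self showInterruptedState];
    }

    - (void)endInterruption
    {
        // Our session has been made active again; resume if that makes sense.
        [player play];
    }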

So let's have a look at a demo of how you can do this.

What I'm going to show you here is the application I had up before which is the game application. I'm going to launch the application, I'm going to play the audio, and you'll see it's operating as expected.

[ Video game sound ]

I can play with the background music.

[ Music ]

I can keep going. And you see now I have an incoming call. I'm a little busy right now so I'm going to go ahead and decline that, probably against my better judgment. And you'll see that the application resumes as it should just kind of right back from where it started. Now to contrast that, I have a version of the application, which you shouldn't use, that's why it's bad.

[ Video game sound ]

It's really eager. So I'm going to launch the bad version and you'll see it operates just like the normal one, but I've removed the interruption handling code from this. And you'll see what happens when an actual interruption comes in.

[ Phone ringing ]

Still going to decline it.

Now I've actually lost audio to my system because as Bill mentioned, the audio system has been paused, but since the application is not handling it, what we've actually done is hit a pause state, but the UI doesn't know that. So here I've just kind of put my system in a state that the user is not going to know what's going on and the UI is out of sync with the actual application audio, so it doesn't really know what to do. And as you can imagine, you've kind of disrupted the user experience here because they don't really know why there's no audio, it just isn't there anymore. Now I'm going to run off stage before he actually tracks me down.

Okay, so that's pretty much it. So if you want to use AudioSession to do some more fine-grained control over the behaviors, you can; you just use it in conjunction with AVAudioSession, there's no sort of this-or-that type of thing, you just use whichever is most appropriate for you. With AudioSession, you just use those get and set property calls to manage state. You instantiate property listeners for the various properties, as we've seen in some of the slides.

And that's really it, that's all I wanted to cover. So use AVAudioSession, and I think the key points here are to understand the category, what that means to you in terms of the features that you need. If you want to interact with the audio environment there are of course a collection of APIs to do that.

And really the fundamental thing about AVAudioSession is you should at least understand the interruption delegate, you should understand begin interruption, end interruption, otherwise as we saw with Michael's last demo, you can get your application in a confused state where the user just doesn't really know what's going on, just because their boss called them to get back to work, it's a shocking thing.

So a little bit of work in your app can make a vast difference to the user experience, and that's why we thought we would spend all this time tonight talking about AudioSession in some detail. If you need more information, you can contact Allan Schaffer, and I'll leave this detail up here so you can get Allan's email and spam him.