Graphics, Media, and Games • iOS • 50:39
iOS provides a powerful engine for playing, recording, and processing audio in your applications for iPhone, iPad or iPod touch. Gain a thorough understanding of audio session management, and learn the recommended practices for handling background audio, dealing with interruptions, and playing multiple sounds simultaneously.
Speaker: Eric Johnson
Unlisted on Apple Developer site
Transcript
This transcript was generated using Whisper; it has known transcription errors. We are working on an improved version.
Good morning. How's everybody doing? Come on, a little more energy. All right. So today we're gonna be talking about audio session management for iOS. And what this is really all about is making sure that your applications use audio in a predictable way, the way users expect audio to work, and provide the best user experience.
So we're going to be talking about the managed audio experience on iOS and explaining what that's all about. And then the bulk of the talk is going to be about using audio session. We're going to be talking about some new things like modes that are new in iOS 5. We're going to be talking about background audio and what it means for your application to be mixable or non-mixable.
We're also going to be talking about some new routing properties that are available in iOS 5, and one new behavior that's related to routing. And then we're going to be talking about the voice processing audio unit in conjunction with one of the new modes, and how we think that using the voice processing audio unit can make your application that much better. And then finally, I'm going to spend a few minutes talking about codecs. Codecs are not directly related to audio session, but they're a really important kind of bread and butter technology in digital audio.
So before I get into the in-depth discussion of audio session, it can be a little bit daunting. And so we wanted to talk about if you're just wanting to do some basic playback or some basic recording, what are the kind of bare minimum steps you need to know about when using audio session? So we're going to look at a little bit of code and try and help you feel like it's not that daunting. So for a basic application that wants to do playback, there's three steps: preparing your session, handling the beginning of interruptions, and handling the end of interruptions.
So preparing your session, we can see that it's really not a lot of code. In the first line, we're using an AV audio session class method called shared instance to get a pointer to our application's global instance of AV audio session. Then we're choosing a category. For many applications that want to do basic playback, Ambient is going to be a good choice for things like mini games and productivity apps. And then once we've done that setup, we're going to make our session active.
I should note that if you're using something like the AV Audio Player or AV Audio Recorder classes, setting your session active may not be absolutely necessary. But it's a good idea to go ahead and do it here, because some of the other APIs, like OpenAL, will require you to set your session active. And then you're going to get ready to play audio, and depending on which API you're using, the steps are going to be a little bit different.
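In code, those three steps might look roughly like this; a minimal sketch using the Objective-C API described above, with error handling trimmed for brevity:

```objc
#import <AVFoundation/AVFoundation.h>

// Minimal playback setup, roughly as described above.
- (void)prepareAudioSession
{
    NSError *error = nil;

    // 1. Get the app-wide shared session instance.
    AVAudioSession *session = [AVAudioSession sharedInstance];

    // 2. Choose a category. Ambient suits games and general apps
    //    that only need basic playback.
    [session setCategory:AVAudioSessionCategoryAmbient error:&error];

    // 3. Make the session active. Not strictly required for
    //    AVAudioPlayer/AVAudioRecorder, but APIs such as OpenAL need it.
    [session setActive:YES error:&error];
}
```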
So then once you're up and using audio in your application, your session is active, you want to be prepared to handle interruptions. The system is going to notify you when your audio has stopped. So we're looking here at the code. This is a delegate method that's part of the AV audio session delegate protocol.
And so when the system notifies you that an interruption has occurred, it's telling you that playback has stopped and that your session is inactive. So if you had a user interface element, something like a play button, this would be a good time to change that to reflect that audio has been stopped.
If you're using an API like OpenAL or using AudioQueues, AVAudioSession provides a global notification for events like the beginning and ending of interruptions. If you're using AVAudioPlayer or AVAudioRecorder, there are delegate methods, and these are per instance. So if you had, say, three or four AVAudioPlayer instances, each one can receive a begin and end interruption event.
When the interruption ends, the system will notify you. So looking at the code, this is an AV audio session delegate protocol method, end interruption with flags. And here, this is a good chance to resume recording or playback and to update your user interface. Again, if you had something like a play button, you would update your interface to show that audio is playing again. And this is where you could reactivate your session.
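Put together, the two delegate methods might look something like this sketch; the player and play button here are hypothetical app state, not part of the API:

```objc
// AVAudioSessionDelegate interruption methods (iOS 4/5 era).
// self.player and self.playButton are hypothetical app state.
- (void)beginInterruption
{
    // Audio has already stopped and the session is already inactive;
    // just update your state and UI to reflect that.
    self.playButton.selected = NO;
}

- (void)endInterruptionWithFlags:(NSUInteger)flags
{
    if (flags & AVAudioSessionInterruptionFlags_ShouldResume) {
        // Reactivate the session and resume playback if appropriate.
        [[AVAudioSession sharedInstance] setActive:YES error:nil];
        [self.player play];
        self.playButton.selected = YES;
    }
}
```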
And again, AV Audio Session Delegate provides a global notification for APIs like OpenAL or Audio Queues. And then with AV Audio Player and AV Audio Recorder, each instance will get a notification. So those are the basic steps if you want to do simple playback or simple recording. It's not a lot of code. It's not too scary. But it's hiding quite a bit of the complexity that's actually going on in the system. And on iOS, we have a managed audio experience. So let's talk about what that means.
So the managed audio experience on iOS comprises a few parts. The main thing is that we know that users carry their iOS devices everywhere. If you think about where you see people with their iPod touches, their iPads, and their iPhones: you have them with you in business meetings, the phone's in your pocket when you're at home eating dinner, and they're with you at your place of worship or on your morning commute. These devices are really with us everywhere.
And we know that audio can be a disruptive thing. So users expect that things like the ringer switch and the screen lock and the volume keys behave in a predictable way. So the system is helping you to manage that. And the goal is to have a consistent user experience, and the best user experience possible.
Another aspect of this is that the mobile device market is really fast moving. If you think about just in the last year at WWDC 2010, we were talking about iOS 4 and iPhone 4. On the software side, we've seen 4.1, .2, .3 updates, and we're talking about iOS 5 this week. On the hardware front, we saw the Verizon iPhone 4 come along, a new iPod touch, and of course, iPad 2.
So with these fast moving changes on the hardware and the software side, part of the managed audio experience is making sure that applications continue to behave the way that users expect them to. So our message for developers is that we want you to choose the right APIs and use those APIs to communicate to the system how you want to use audio.
So let's get into audio session management, the bulk of today's talk. Audio session is the primary way that you communicate to the operating system how you want to use audio. So we're going to be talking about how to make your app's sounds behave according to users' expectations and be consistent with built-in applications, things like the iPod app, Voice Memos, and YouTube.
We're going to be talking about how to pick the best category for your application. We're going to be talking about mixing with background audio and what that's all about. And as we saw earlier, we're going to talk about responding to interruptions and I'm going to talk about that in more depth. And we're going to talk about handling routing changes.
So audio session is actually two APIs. If we look at the bottom layer in green, that's the audio session services part, and that's a C callable API. And this is where all of the implementation for audio session lives. On top of that, in AV Foundation, we have AV Audio Session. This is an Objective-C class that provides a lot of the common functionality.
Because AV Audio Session is built on top of Audio Session Services, there's no loss in functionality. It's using Audio Session Services directly. So we encourage you to start with AV Audio Session, and then if you need a little more control over your session, you can use the API that lives in Audio Session Services. And it's perfectly okay to mix and match. They're designed to work together.
So even if you need to use some of the lower level details in Audio Session Services, go ahead and use AV Audio Session for the things that you can. So there are five tasks that you need to be aware of when you're using Audio Session. The first is to set up the session and delegates.
Next, you're going to choose and set a category. In iOS 5, there's a new optional step that goes along with setting the category, and that is choosing and setting a mode. And we're going to be talking about each of these steps in more detail. Once you've got your session configured, you want to make your session active. And then once you're up and running using audio, you want to be able to handle interruptions and handle route changes appropriately.
So the first step, setting up the session. This is going to look really similar to what we saw a few minutes ago. The first step is the same. We're using the shared instance class method to get a pointer to the global single instance of AV audio session in our application. If you're familiar with design patterns, you can think of this as a singleton. The next step is setting a delegate for notifications. So AVAudioSessionDelegate is a protocol that provides several methods.
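As a sketch, that first step might look like this, with a hypothetical controller class adopting the delegate protocol:

```objc
#import <AVFoundation/AVFoundation.h>

// A hypothetical controller that adopts the delegate protocol
// so it can receive interruption and route notifications.
@interface MyAudioController : NSObject <AVAudioSessionDelegate>
@end

@implementation MyAudioController

- (void)setUpSession
{
    // Step 1: grab the shared session and register for notifications.
    AVAudioSession *session = [AVAudioSession sharedInstance];
    session.delegate = self;
}

// The AVAudioSessionDelegate methods (beginInterruption,
// endInterruptionWithFlags:, and so on) would be implemented here.

@end
```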
The next step, step two, is to choose and set a category. This is a really important step, so we're going to explain what each of these categories are used for and help you to decide which category is going to be the best for your application. In many cases, for many applications, you're going to pick a single category, set it once, and forget about it. For some of you, you may need to switch back and forth between two or possibly three categories. So the six categories that we provide are playback, record, play and record, audio processing, and then ambient and solo ambient.
The first three we can kind of group together and think of these for audio applications. That is, applications where audio is really the forefront and a very important part of what the application is all about. Things like audio players and video players, voice over IP, voice chat, audio recorders.
What these categories have in common is that they do not obey the screen lock, nor do they obey the ringer switch. Now, that may seem a little counterintuitive, so let's talk about that. If you think about something like the built-in iPod application, it's going to use the playback category. And you really want users to be able to start their music and then hit the lock screen to save battery life, but continue to listen to their music.
And likewise with the ringer switch: the user may have it set to the silent position because they don't want to hear ringtones. But since the user is in control, it's the user who's pressing the play button, so we want them to be able to hear their music even though the ringer switch is set to the silent position.
These three categories also share the property that they're allowed to be used in the background. So if you think about something, again, like the iPod app, you want users to be able to start playing their music, send iPod to the background, and then bring up something like Safari so they can surf the web while they're listening to their music.
So the playback category is used for output only. The record category is used for input only. And it's important to note that when you use the record category and your session is active, all output audio in the system is going to be muted. The play and record category combines the first two, so it allows you to do playback or recording or simultaneous play and record.
In the middle of the chart, we see this Mix with Others column, and I'm going to be explaining that more in detail, but let's just notice that the playback and play and record categories, by default, they're not going to mix with other audio, but there's an override you can set to make them mix with other audio.
The second grouping of categories is for applications that are games and general applications. So these are the types of applications where audio enhances the experience, but it's not critical. So if you think about something like the yellow sticky notes application that is on every phone, hearing those key clicks enhances the experience, but if you have your ringer switch off and you're not hearing those key clicks, it's not a big deal. You can still use the application. So these are the types of applications where users are very hands-on with them, or something like a video game where, again, the audio enhances the experience, but you can still play the game even without sound.
So these categories do obey the screen lock, and they do obey the ringer switch, and they are not allowed to use audio in the background. And the reason that these categories are not allowed to use audio in the background is because of the fact that we expect users to be hands-on with them.
So if you send this application to the background, the user is no longer interacting with it, so it makes sense that the audio will stop. The difference between ambient and solo ambient is that the ambient category will always mix with other audio. So this is for things like a video game where you just have kind of incidental sound effects, but you don't have your own music soundtrack playing in the background.
The final category is for offline processing. The name of this category is audio processing, and it's for doing offline conversions or offline processing. And looking at the chart, we can see that it does neither input nor output. And this one also does not obey the screen lock or the ringer switch, and it is allowed in the background. So for example, you want to be able to continue processing audio even if the screen is locked.
So those are the six categories. I just want to reiterate that you really want to spend some time thinking about what is the best category for your application, because that's a big part of making sure that the user experience is the best possible. So looking at the code, following along here, we saw that we got the pointer to our AV audio session. We set up our delegate, and now we are setting the category. In this example, I'm showing the play and record category.
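That step is roughly the following sketch; applications that need to switch between categories would simply call setCategory:error: again with a different constant:

```objc
- (void)configureCategory
{
    NSError *error = nil;

    // Step 2: choose and set a category. Play and record allows
    // simultaneous input and output, e.g. for voice chat.
    BOOL ok = [[AVAudioSession sharedInstance]
                  setCategory:AVAudioSessionCategoryPlayAndRecord
                        error:&error];
    if (!ok) {
        NSLog(@"Could not set category: %@", error);
    }
}
```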
So now in iOS 5, we're introducing some new functionality that we call modes. Modes go along with categories, and they're a way to specialize your category and tell the OS a little bit more about how you want to use audio. And this is going to unlock some capabilities to behave more like built-in applications and to do some things that you just couldn't do in iOS 4. So we have four modes-- voice chat, video recording, measurement, and then the default mode. And let's look at each one of these.
Voice chat mode, this is for things like voice over IP or video games where you want players to be able to talk to each other over the network. Because it's two-way communication, this works with the play and record category. So when you set this mode, and you've set the play and record category, the system is going to pick the best microphone choice for the current route.
So let's look at the way that people hold phones when they're on, say, voice over IP calls. The first orientation is to hold the phone up to your ear. So at the top of the phone, we have a speaker that we refer to as the receiver. And then at the bottom of the device, we have a microphone. So this is the input and output routes that we'll use in this orientation.
The second common orientation is speakerphone. So here we see a man who's chatting with someone. He could be in a FaceTime call or it could be a voice over IP call, but this is the speakerphone orientation. So here we're going to be playing audio out of the bottom speaker and using the top microphone for input. We want to use the output and input devices that are the farthest apart from each other to eliminate feedback and those sorts of things.
When you set the voice chat mode, the system will also optimize the signal processing for voice applications. The system will also help to manage routing by restricting the allowed routes to those that make sense for voice chat types of applications. And by default, the system will allow Bluetooth headsets to be used in your audio routes. So if you think about voice over IP applications or video games where players are chatting with each other, you want them to be able to use those Bluetooth headsets. If you decide that you don't want that, you can set an override to turn that off.
So the final thing I want to mention about this mode is that we want to encourage you to use the voice processing audio unit. In a few minutes, I'm going to go and talk about the voice processing audio unit. The next mode is video recording mode. The use case for this is actually pretty easy to explain.
A lot of our new iOS devices have great HD video cameras, and so this is for applications where you want to have the best audio experience to go along with using that video camera. So there are two categories that support this mode, play and record, and then the record only category.
And like with the voice chat mode, the system is going to pick the best microphone choice for the current usage. So let's look at a typical way that someone might be holding the phone when they're doing a video recording. We see, if we're looking at the diagram of the iPhone 4, that the top microphone and the camera lens are located pretty close to each other.
So it makes sense that we'd want to use the top microphone because it's close to that lens. And if you look at the way she's holding the phone, oftentimes users will end up covering up the bottom microphone with their hand. And so that's another reason that we choose the top microphone for this type of application. And likewise, the system is going to choose the best output route for the current usage.
The third mode is Measurement Mode. This is for applications that want to do calibration or measurements, things like SPL meters or audio analysis tools. And then on the output side, maybe applications that aren't so concerned about having the nicest quality sound, but the simplest audio with the least amount of signal processing applied. This mode, when you're in this mode, the system is going to use the primary microphone. On iPhone 4, that's the bottom mic.
And in terms of the signal path, the system is going to apply some minimal EQ to give the microphone a flat frequency response. And otherwise, it's going to provide very minimal signal processing. And that's what this mode is really all about. The final mode is default mode. And this is the mode you get if you do not explicitly set one. This works with all categories.
If you want to set it explicitly, there are constants in the API for doing so. On the input side, the system is going to select the primary microphone, and the device is going to be configured for general usage. So it's just like what you would have gotten in earlier versions of iOS.
Okay, so looking at the code: in the previous step, we set the play and record category, and now we're choosing voice chat mode. So since we're setting the voice chat mode in the play and record category, this is a good opportunity to talk about the voice processing audio unit.
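A sketch of that step, assuming the iOS 5 setMode:error: API on AVAudioSession; the same thing can also be done at the Audio Session Services layer through the kAudioSessionProperty_Mode property:

```objc
- (void)configureVoiceChatMode
{
    NSError *error = nil;
    AVAudioSession *session = [AVAudioSession sharedInstance];

    // The mode specializes the category set in the previous step
    // (play and record), so the system can optimize routing and
    // signal processing for two-way voice.
    [session setMode:AVAudioSessionModeVoiceChat error:&error];
}
```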
So what is the voice processing audio unit, and why should I use it? The voice processing audio unit, as the name implies, is an audio unit, so it's one of the lower-level audio APIs in the software stack. It is a Remote I/O audio unit with a built-in acoustic echo canceler, and that's the important part that we're going to be talking about today. It's designed for high-quality chat, with two configurations available: highest quality and lowest complexity.
The word "complexity" here really is talking about CPU usage. So if you know that your application is really doing a lot of processing, and you need every last CPU cycle, then you may want to choose the lowest complexity configuration. Otherwise, why not choose the highest quality? So this is available on iOS. It's been on iOS since version 3. But it's new and available on OS X Lion.
So let's look at what happens in a voice chat scenario where you do not have an acoustic echo canceler in the signal path. So let me explain the diagram a little bit. On the left side in blue is a user at the far end. So maybe they're in the next room, or maybe they're on the other side of the world. That wavy line in the middle represents the network. And then on the right side in red is the near end user. And you can think of the near end user as being the person using their device with your software running on it.
So let's look at what happens here. So the far end talker, the person in blue, speaks into his microphone. That signal goes into his iOS device and goes across the network. And then that signal is played out of the loudspeaker on the near end device. The near end user is going to be talking at the same time. And so the sound that's coming out of the speaker and the near end user's voice are going to be mixed together in the air. And so the signal that's being fed into the microphone is a combination of those two signals.
And that's what the purple line at the bottom represents. So now this signal is going to be sent across the network, and it's going to arrive at the far end device. So the far end talker is going to hear his own voice being echoed back to him, perhaps 100 to 200 milliseconds later. And that's going to be a really annoying echo. It's just going to really detract from the user experience.
In reality, the situation is even a little bit more complex. There are other applications on the near-end device that can also be producing audio output. So perhaps the near-end user is playing his iPod in the background, and things like SMS notifications or a voicemail notification, all of those things can be making sounds that are also gonna be coming out of the loudspeaker. So coming out of the loudspeaker on the near-end device, that light blue line is a combination of the voice signal coming over the network as well as any other output audio on the near device.
So now those sounds are going to be mixed in the air with the voice signal from the near end speaker, and that combined signal is going into the microphone. Again, it's being sent over the network. So now the far end user, not only is he hearing an echo of his own voice, which was irritating, but he's also hearing the music that was playing on the near end device.
[Transcript missing]
So I'm just going to direct your attention to the AudioUnitProperties.h header file. There are five properties here that you can use to fine-tune and configure the use of the voice processing audio unit.
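For context, here's a rough sketch of creating that audio unit and touching one of those properties; the setup shown is generic Audio Unit boilerplate rather than guidance specific to this session:

```objc
#import <AudioUnit/AudioUnit.h>
#import <AudioToolbox/AudioToolbox.h>

// Create the voice processing I/O unit (Remote I/O with built-in AEC).
static AudioUnit CreateVoiceProcessingIOUnit(void)
{
    AudioComponentDescription desc = {0};
    desc.componentType         = kAudioUnitType_Output;
    desc.componentSubType      = kAudioUnitSubType_VoiceProcessingIO;
    desc.componentManufacturer = kAudioUnitManufacturer_Apple;

    AudioComponent comp = AudioComponentFindNext(NULL, &desc);
    AudioUnit vpio = NULL;
    AudioComponentInstanceNew(comp, &vpio);

    // One of the AUVoiceIO properties from AudioUnitProperties.h:
    // 0 leaves voice processing enabled; 1 bypasses it (e.g. for A/B tests).
    UInt32 bypass = 0;
    AudioUnitSetProperty(vpio,
                         kAUVoiceIOProperty_BypassVoiceProcessing,
                         kAudioUnitScope_Global, 0,
                         &bypass, sizeof(bypass));

    AudioUnitInitialize(vpio);
    return vpio;
}
```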
Okay, so we've talked about choosing a category, and choosing a mode if it can enhance your application. So now it's time to make your session active. And so that's the code that we're seeing here highlighted. Once you've activated your session, there may be some additional setup, things like setting up your AV audio players, OpenAL, or audio queues.
Okay, so now that we're at the point where our application is active, it's up and using audio, it's a good time to talk about background audio and what that's all about. So with the introduction of iOS 4 last year, third-party applications were now given the ability to do multitasking. This was a great thing, and it introduced a way for third-party applications to play audio in the background. So it gets interesting when you start thinking about multiple applications that want to do playback at the same time.
So before we get into that, let's talk about how you enable background audio. So the first thing is to pick a category that supports background audio. So that's going to be all of them, except for ambient and solo ambient. Once you've selected a category that supports background audio, you're going to go to your Info.plist, and for required background modes, you're going to add the audio flag. So those are the two steps to get started with background audio.
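The Info.plist entry looks like this; the underlying key name is UIBackgroundModes, which Xcode displays as "Required background modes":

```xml
<!-- Info.plist: declare that the app plays audio in the background. -->
<key>UIBackgroundModes</key>
<array>
    <string>audio</string>
</array>
```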
So now let's talk about what happens if there's more than one application that wants to play at the same time. So the question is, what is going to be heard? And the answer is that it depends on both applications, whether each application is mixable or non-mixable. So looking at the chart here, we see that the ambient category will always mix with others.
The playback and play and record categories, by default, will not mix with others. But there's an overriding that you can set to allow them to mix with others. So that's what we refer to when we're talking about mixable. Either your category automatically supports mixing with others, or you've set that override to make it mixable.
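Setting that override happens at the Audio Session Services layer; a minimal sketch, assuming the session has already been initialized and a category set:

```objc
#import <AudioToolbox/AudioToolbox.h>

// Allow a Playback or PlayAndRecord session to mix with other audio.
static void MakeSessionMixable(void)
{
    UInt32 mixWithOthers = 1;
    AudioSessionSetProperty(kAudioSessionProperty_OverrideCategoryMixWithOthers,
                            sizeof(mixWithOthers),
                            &mixWithOthers);
}
```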
A couple of notes about if your application is non-mixable, which is, again, going to be the default if you're using the playback category or play and record. If you're non-mixable, your application is going to have access to hardware resources, things like setting the sample rate or setting audio buffer sizes.
Going along with that, modes are closely related to hardware resources, because a mode affects things like routing and microphone selection. So you need to be a non-mixable app to apply a mode. And then finally, if your application is non-mixable, when you go active, you may interrupt other audio. So let's look at that in more detail.
So in the first scenario, our application is in the foreground, and it's the only application that wants to use audio. We've set that override to make it mixable, and so we're happily streaming audio. It's being fed into the system's mixer and then sent out to playback hardware. So now let's talk about if there was another application that was already running in the background using audio when your application launches and goes active. So in the first case, the background app was also mixable, so there's no conflict. Both applications can continue to play. Both streams of audio will be fed into the mixer and then sent to the playback hardware.
If the background application was non-mixable, recall that that application has access to hardware resources, things like setting the sample rate. And there's no conflict in this case either. Both applications can happily play audio at the same time. Okay, so now let's take a step back and look at the case where our application is non-mixable. So we've set the playback category and we did not set the override. So now this foreground application is going to have access to those hardware resources, the buffer size and the sample rate.
So the question is, what's going to happen with the audio in the background app when our application goes active? Well, if the background application was mixable, there's no conflict, and both streams will continue to play. The more interesting case is if the background application was also non-mixable. So the red X is telling us that their audio has been interrupted now, so that background application is going to stop playing. Don't be afraid of the red X.
It doesn't mean that anything has gone wrong. In many cases, this is absolutely the right behavior. It's what you want. So if, let's say, your application streams radio from the Internet, then you want it to interrupt iPod when the user brings your application to the foreground. So this may be exactly the right behavior.
Okay, so let's talk about, for more specific types of applications, some kind of details about going active and when you might want to go inactive. So for most applications, you can just go active at the beginning, set it, and forget it. But there are a few classes of applications where you want to kind of manage your active state a little more.
So recorders, voice over IP applications, turn-by-turn navigation apps, and then non-mixable apps. For recorders, as I mentioned earlier when we were talking about categories, if you've set the record-only category, all the output audio on the system is going to be muted. So you only want to be active when you're actually recording audio.
As we just looked at with non-mixable applications, your application can interrupt other audio in the system. So you may want to think about managing your active state so that you're only active when you're actually using audio. We're going to look more closely at voice over IP and turn-by-turn navigation apps.
So let's start with Voice over IP or voice chat applications. So if we're a Voice over IP application and a phone call comes in, that's when we want to go active. And that's going to interrupt other audio that was playing on the system. When the call ends, we want to go inactive to allow other audio to resume.
So looking at the code here, we have a callback method, myCallDidFinish. So this is when the user has hung up. So we're going to be calling setActive:withFlags:error:, and we're going to be passing NO to deactivate. And we're going to be using the NotifyOthersOnDeactivation flag to let the system know that it can tell the other audio that had been playing that it can now resume.
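Roughly, that callback looks like this sketch; myCallDidFinish is just the hypothetical callback name from the slide:

```objc
// Hypothetical VoIP callback: the user has hung up.
- (void)myCallDidFinish
{
    NSError *error = nil;

    // Deactivate, and tell the system that other audio
    // (e.g. the iPod app) may now resume.
    [[AVAudioSession sharedInstance]
        setActive:NO
        withFlags:AVAudioSessionSetActiveFlags_NotifyOthersOnDeactivation
            error:&error];
}
```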
Turn-by-turn GPS applications are a really interesting case. So let's look at the setup first when you're setting up your session. We're going to choose the playback category, because we want to be able to play audio if the screen has been locked or if the ringer switch is off.
We're going to set the mix with others flag, and this is that override that I was talking about that can make your application mixable. Because if you're a turn-by-turn GPS application, you're giving directions like turn right in 500 meters, but you don't really need to interrupt other audio on the system. You want it to mix in.
There's another override called other mixable audio should duck. And what this means is that when you're giving those directions, like turn right in 500 meters, that the other audio on the system is going to be lowered in volume so that those directions stick out in the mix. So let's look at that.
So if you're this type of application, you want to be registered to receive location updates, and you want to be registered to use background audio. So the user is in their car, they're driving, and they've arrived at a new location where you want to give them some directions to turn right.
So we see our method getting called here, play the preloaded instructions. So now we're going to set our session active, and this is going to be the trigger to the system that it should lower the volume of other audio. And then we're going to use our AVAudioPlayer object here and tell it to play.
When the AV audio player is done playing the instruction, the audio player did finish playing method is going to be called. And this is where you're going to set your session inactive, and that's going to be the trigger to the system that it can go ahead and raise the volume of the other audio that had been playing.
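Putting the turn-by-turn pattern together as one sketch; the instructionPlayer property and method names are hypothetical, and the duck override is the Audio Session Services property mentioned above:

```objc
// One-time setup: mixable playback that ducks other audio while active.
- (void)setUpNavigationAudio
{
    [[AVAudioSession sharedInstance] setCategory:AVAudioSessionCategoryPlayback
                                           error:nil];
    UInt32 trueValue = 1;
    AudioSessionSetProperty(kAudioSessionProperty_OverrideCategoryMixWithOthers,
                            sizeof(trueValue), &trueValue);
    AudioSessionSetProperty(kAudioSessionProperty_OtherMixableAudioShouldDuck,
                            sizeof(trueValue), &trueValue);
}

// Called when the user reaches a point where an instruction should play.
- (void)playPreloadedInstruction
{
    // Going active is the trigger for the system to duck other audio.
    [[AVAudioSession sharedInstance] setActive:YES error:nil];
    [self.instructionPlayer play];   // hypothetical AVAudioPlayer
}

// AVAudioPlayerDelegate: the instruction finished playing.
- (void)audioPlayerDidFinishPlaying:(AVAudioPlayer *)player successfully:(BOOL)flag
{
    // Going inactive lets the system bring other audio back up.
    [[AVAudioSession sharedInstance] setActive:NO error:nil];
}
```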
Okay, so we've talked about when you're active, if you're doing certain types of applications, how you may want to kind of manage your active state a little bit more. So now let's assume that we're up and running, we're using audio, and we want to be prepared to handle interruptions. At the very beginning of the talk, I just talked very briefly about this, so let's look at this in more detail now.
So your session can be interrupted by higher priority audio, things like a phone call, a clock alarm, or a non-mixable other application coming into the foreground and interrupting your session. The interruption makes your session inactive, and any currently playing or recording audio is going to be stopped. When the interruption is over, it's a good time to reactivate certain state, and that's going to depend on exactly which API you're using for playback or recording, and then to become active again if it's appropriate to do so.
So let's look at the methods that are part of the AV audio session delegate protocol that are related to interruptions. On a begin interruption, this is the system notifying you that audio has been stopped and that your session is inactive. I should clarify that it's not the system asking if you want to be inactive. It's telling you you're already inactive. You need to just deal with it. So this is a good opportunity to change the state of your UI. So again, if you had something like a play button, this is where you'd change it to reflect that audio has stopped.
When the interruption ends, so this would be like if the user had taken a phone call and they've now ended the phone call, that would be an end interruption. Or if an alarm went off and the user pressed the OK button, that would be the end. Or if a phone call came in and the user decided that they wanted to just decline the phone call, that would also end the interruption. So the system is going to notify you by one of these delegate methods. So this is when you would make your session active, update your user interface, and then resume playback or recording.
Just quickly looking at if we were using an AV audio player, the delegate protocol for AV audio player has very similar methods for audio player begin interruption and audio player end interruption with flags. As I noted earlier in the talk, if you're using AV audio players or AV audio recorders, each instance is going to receive these notifications. So again, if you had like four or five or six AV audio players, each instance is going to receive a notification.
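Those per-instance AVAudioPlayerDelegate methods look roughly like this sketch:

```objc
// AVAudioPlayerDelegate: each AVAudioPlayer instance gets its own callbacks.
- (void)audioPlayerBeginInterruption:(AVAudioPlayer *)player
{
    // Playback on this particular player has been stopped by the system;
    // update any per-player state or UI here.
}

- (void)audioPlayerEndInterruption:(AVAudioPlayer *)player withFlags:(NSUInteger)flags
{
    if (flags & AVAudioSessionInterruptionFlags_ShouldResume) {
        [player play];   // resume this particular player
    }
}
```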
Okay, so then the final step in dealing with audio session is to handle route changes. And this is all about the user's expectations. On iOS, we have a rule that we call the last in wins rule. So this means that if the user plugs in a headset or headphones, the user expects that audio is going to be automatically routed to the headphones, and they expect that that audio is going to continue playing without pausing, because they're putting the earbuds in their ears and the audio can continue playing.
It's not going to be disturbing anyone else. On the other hand, when you're unplugging headphones or a headset, the user does expect that audio is going to be routed back to wherever it was before, but they expect that audio is going to pause, because now they're going to be broadcasting audio into the room, and that can be disruptive.
So there's a new behavior in iOS 5 that we just want to make you aware of. The nice thing is that as a developer, there's not really much you need to do, but we did want to make you aware of this. So in this example, we see that a person is using their iPad, and they've started their iPod music. It's playing in the background, and they're taking advantage of AirPlay, and that audio is being sent to their Apple TV, which is connected to a television set, and so their music is playing.
So they sent the iPod application to the background, and their music continues to stream, and they brought up the yellow sticky notes application, and they're editing a recipe, a cookie recipe, I think it is. So the key click sounds now are going to stay on the local device. They're going to stay with the iPad. Whereas the iPod audio is going to be streamed to the television. So this is going to apply when you're using AirPlay, or you are connected via an HDMI cable.
The other type of audio, aside from system sounds, that will stay on the local device is voiceover. That's an accessibility feature for visually impaired users. So what are the things that as a developer you need to be aware of with regards to audio routing? So there's two things, querying the route, and then listening for route changes.
Querying the route is asking the question, what is the current audio route? So in iOS 5, we have a new property. In previous versions of iOS, you could do this, and the property was a little bit different. So take a look at videos from previous years to hear about that. But in iOS 5, the new property, audio route description, is going to give you a CFDictionary. And that CFDictionary is going to have two keys, one for audio inputs and one for audio outputs.
In addition, in iOS 5, we're now enumerating what all of the possible inputs are. So things like built-in microphone or Bluetooth HFP. I see some applause here. And we're also enumerating all of the possible outputs. You have a few more on the output side, things like Bluetooth A2DP, headphones, AirPlay. So with the new property, you'll be using these constants to tell which audio input or output is being used.
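A sketch of querying that property, assuming the iOS 5 Audio Session Services constants for the dictionary keys:

```objc
#import <AudioToolbox/AudioToolbox.h>

// Query the current route with the new iOS 5 property.
static void LogCurrentAudioRoute(void)
{
    CFDictionaryRef routeDescription = NULL;
    UInt32 size = sizeof(routeDescription);

    OSStatus err = AudioSessionGetProperty(kAudioSessionProperty_AudioRouteDescription,
                                           &size, &routeDescription);
    if (err == noErr && routeDescription != NULL) {
        // The dictionary has two keys: an array of inputs and an array
        // of outputs; each entry identifies a route such as the built-in
        // mic, headphones, Bluetooth, or AirPlay.
        CFShow(CFDictionaryGetValue(routeDescription,
                                    kAudioSession_AudioRouteKey_Inputs));
        CFShow(CFDictionaryGetValue(routeDescription,
                                    kAudioSession_AudioRouteKey_Outputs));
    }
}
```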
So the second question is, where did the route go? So when a user plugs in an accessory or unplugs an accessory, what happened? How did the route change? So for this, you're going to use AudioSessionAddPropertyListener, and you're going to add a listener for the AudioRouteChange property. So this is going to tell you the reason why the route changed, what was the old route, and in iOS 5 we're also telling you what's the new route.
The reason has always been available on earlier versions of iOS. In iOS 5, we're giving you the previous route description, and it's going to be the same format as the property that we just looked at for getting the current route. It's going to give you separate inputs and separate outputs, and we're enumerating what each of the possibilities are for inputs and outputs. And then we're also giving you the current route description.
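A sketch of registering for and handling that property change; the reason key shown is the one from AudioSession.h, and as described above, iOS 5 adds previous and current route description entries to the same dictionary:

```objc
#import <AudioToolbox/AudioToolbox.h>

// Property listener: called whenever the audio route changes.
static void MyAudioRouteChangeListener(void                   *inClientData,
                                       AudioSessionPropertyID  inID,
                                       UInt32                  inDataSize,
                                       const void             *inData)
{
    if (inID != kAudioSessionProperty_AudioRouteChange) return;

    CFDictionaryRef routeChange = (CFDictionaryRef)inData;

    // Why did the route change (e.g. the old device became unavailable)?
    CFNumberRef reasonValue =
        (CFNumberRef)CFDictionaryGetValue(routeChange,
                                          CFSTR(kAudioSession_AudioRouteChangeKey_Reason));
    SInt32 reason = 0;
    CFNumberGetValue(reasonValue, kCFNumberSInt32Type, &reason);

    if (reason == kAudioSessionRouteChangeReason_OldDeviceUnavailable) {
        // e.g. headphones were unplugged: pause playback, per the
        // "last in wins" expectations described above.
    }
    // The same dictionary also carries the previous and current
    // route descriptions in iOS 5.
}

// Registration, typically during session setup.
static void RegisterForRouteChanges(void)
{
    AudioSessionAddPropertyListener(kAudioSessionProperty_AudioRouteChange,
                                    MyAudioRouteChangeListener,
                                    NULL);
}
```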
Okay, so we've been talking for a little bit now about audio session. We've looked at the five steps: setting up the session and delegate, choosing a category, and setting a mode if you're doing voice chat or video recording or some type of measurement. And then we talked about making the session active and managing your active state for certain types of applications. And then we talked about handling interruptions and handling route changes once your application is up and running and using audio.
So the final topic today is audio codecs. Audio codecs are separate from audio session, but they're a really critical technology that many of you will be using, and so it's important to talk about it. So what is a codec? Well, the term codec comes from encoder and decoder.
It's all about taking linear PCM signals, audio signals, and compressing them into some sort of compressed format and then decompressing them back to linear PCM. There are two broad categories of codecs, lossy and lossless, and we're going to go into each of those. And it's, like I said, a core technology in digital audio these days.
So lossless audio codecs. The beauty of a lossless audio codec is that there's no loss of information. If you take an audio signal, you compress it, and then later decompress it, the signal that you get back is going to be bit for bit identical. The disadvantage is that the compression factor is not all that high, typically 1.5 to 2. So that's why some really smart people came up with lossy audio codecs.
Many of the popular modern lossy codecs, like MP3 and the AAC variants, rely on a perceptual model, a psychoacoustic model of human hearing. The quality of lossy codecs are going to vary with the bit rate, and the payoff is that you get a much higher compression factor, typically 6 to 24.
On iOS 5, we have many of the popular codecs available: MP3, the Apple Lossless Audio Codec, otherwise known as ALAC, and then various flavors of AAC. On the decoder side, all of these are available, and on the encoder side, with the exception of MP3 and the AAC high-efficiency variants, encoders are available. And in iOS 5, we're also adding AAC Enhanced Low Delay plus SBR.
Let's just take a quick look at the AAC variants. AAC stands for Advanced Audio Coding. It's not Apple Codec; it's Advanced Audio Coding. So the first form of AAC is AAC Low Complexity. And this is the core technology for the AAC family, what all the other variants are built upon.
This provides very high quality audio. So the audio that users download from the iTunes Music Store, that's all using AAC. And it's for general use. The high efficiency variants are designed for streaming audio, and they provide lower bit rates. And then the low delay variants of AAC are for voice over IP and conference applications, and their key feature is the low delay.
Okay, how do I use a codec? At the bottom layer of the software stack are audio converters. Moving up a layer into the Audio Toolbox, we have audio queues and the Extended Audio File API. And then moving up a layer further into AV Foundation, we have AV audio player and AV audio recorder. So these are some of the ways that you can use codecs. Each successively higher layer in the software stack is going to wrap some common functionality and make it easier to use audio converters, which are at the base.
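As one concrete example at the AV Foundation layer, recording straight into AAC with AVAudioRecorder, so the encoder runs underneath; the file URL and settings here are just illustrative:

```objc
#import <AVFoundation/AVFoundation.h>

// Create a recorder that encodes to AAC (MPEG-4) as it records.
- (AVAudioRecorder *)makeAACRecorderAtURL:(NSURL *)fileURL
{
    NSDictionary *settings = [NSDictionary dictionaryWithObjectsAndKeys:
        [NSNumber numberWithInt:kAudioFormatMPEG4AAC], AVFormatIDKey,
        [NSNumber numberWithFloat:44100.0],            AVSampleRateKey,
        [NSNumber numberWithInt:1],                    AVNumberOfChannelsKey,
        nil];

    NSError *error = nil;
    AVAudioRecorder *recorder = [[AVAudioRecorder alloc] initWithURL:fileURL
                                                            settings:settings
                                                               error:&error];
    [recorder prepareToRecord];
    return recorder;
}
```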
Okay, so thanks for putting up with me for the last 50 minutes. We've talked about a lot of things. We've talked about the managed audio experience on iOS, and we talked about what that means. We talked pretty extensively about audio session, and we talked about the new things in iOS 5, like modes. We talked about using background audio and what it means for your application to be mixable or non-mixable.
We looked at some of the new properties that are available in iOS 5 related to routing, and we talked about a new behavior related to routing. And then we talked about how you can use the voice processing audio unit if you're using the voice chat mode, and why that's so important, and how it can take your application to the next level. And then finally, we talked about codecs, since they are a really important digital audio technology.
For more information, I want to direct you to Eric Verschen, who's our Media Technologies Evangelist. Check out the programming guides that are available on developer.apple.com. I recommend you just go and search for audio and take a look at all the great documentation that's there. And then finally, the Apple Developer Forums are always available. Thank you, and have a great week.