WWDC06 • Session 205

Core Audio Surround Sound

Graphics and Media • 1:10:42

Discover how to use surround sound in your application or game. Learn about the powerful multichannel audio services (such as panning and mixing) provided by Core Audio, the Mac OS X built-in OpenAL implementation, and the surround Audio Unit.

Speakers: Bob Aron, Michael Hopkins, James McCartney

Unlisted on Apple Developer site

Transcript

This transcript was generated using Whisper; it may contain transcription errors.

So welcome to the last session in this room for the day. This session is going to be on Core Audio and how we deal with surround and multi-channel in general. My name's William Stewart and I manage the Core Audio group, but I'm not going to be really talking in this session. I just thought I'd introduce the topics and then bring the first presenter up on stage. So there's a few different ways that surround sort of goes through Core Audio. We have general services for dealing with surround and we'll be covering that in the Audio Toolbox multi-channel section of this talk. They're just general concepts of what surround is and how we publish its capabilities and so forth in the system. We'll begin the talk with an overview of OpenAL 1.1. OpenAL is an API for dealing with games and audio. And 1.1 is a recently released spec that we're now supporting with 10.4.7 and in Leopard, too. So the 1.1 overview is just to go through that, go through some of the custom extensions that we've made because our implementation of OpenAL is based on Core Audio, so that's why it's here. We also have an application called AU Lab, and AU Lab is provided to support audio unit development and usage. And in particular, AU Lab has a new version in Leopard that supports multi-channel and surround, both audio units and document configurations and so forth. We're also going to be going through panner audio units. These are audio units that are specifically focused around the task of panning audio spatially; it's not just a volume-type pan, and we'll be going through that in some detail. And then on a completely unrelated topic, just because we had nowhere else to put it, we're also adding MIDI output capability to audio units in Leopard, so we're going to finish this talk with that. So without any more ado, I'll get Bob Aron to come up and talk about OpenAL. Thank you. Thank you.

Thanks, Bill. Yes, my name is Bob Aron. I'm a member of the Core Audio team here at Apple, and I'm going to talk about the current state of OpenAL on Mac OS X. So just for a quick review, those of you that don't know, OpenAL, as Bill mentioned, is sort of an open source API set for doing spatial audio, and it's used mostly in games, and it's sort of a complement to OpenGL. So with Tiger, we shipped our first implementation as a framework on the system. And we actually resubmitted those sources that built that implementation back into the open source repository that Creative Labs hosts. There are many applications out there that are either using the framework installed in the system or delivering their own implementations that are built on those same sources. And as Bill mentioned, it's based on some Core Audio pieces, primarily the 3D mixer audio unit.

So what's new for OpenAL? Well, since we shipped in Tiger, the OpenAL community finished up the 1.1 specification. And basically, there were three things that got accomplished with this spec. It was an opportunity to clean up some of the existing APIs, better document them so they might be more consistently implemented. It was an opportunity to add some new features. And also, it was a chance to take some of the existing extensions that were common in several implementations that a lot of developers were using and roll those actually into the API, the core API set.

So again, as Bill mentioned, what's new for Apple is that we shipped our 1.1 spec implementation with our latest software update, 10.4.7. And that's a full 1.1 spec. We added some new extensions. We're now using the same OpenAL headers that all the other implementations are using. There are no custom things in there. And we added the cone support, which is actually part of the 1.0 spec, but we didn't implement it the first time around. And those sources are available, again, at the Creative repository. So you can go get those and build them yourself if you want to.

So let's run through some of the 1.1 features. As I mentioned, there was a chance for some cleanup. So here's a list of a few things that got done in this cleanup. Some better error code conditions were defined. The close device API was changed to return a Boolean so you know if it was a successful call. One to note is the AL_VERSION attribute. Now this was not implemented consistently across the various implementations. Some were returning a spec version. Some were returning an implementation version. This is documented now to be the specification version. Some pitch shift limits and a couple of new types were defined.

So if you're familiar with OpenAL or OpenGL, you know there's a nomenclature for getting or setting properties on the various objects of the library. In the 1.0 spec, there were get and set source and listener properties for the float type, but not for integer. So those were added here. And then integer and float variants were added for the buffer object. Those didn't exist in 1.0.

In the 1.0 spec, there's a notion of a Doppler mechanism. It's not true Doppler, where you've got a pitch shift based on some movement of your objects over some amount of time. But it's a pitch shift that allows your application to get kind of this Doppler effect. Now, the Doppler velocity, the alDopplerVelocity that's up there, that was the 1.0 call, and it was very inconsistently implemented. And the behavior across implementations was to the point where it wasn't very predictable, so it wasn't really that useful. That's been deprecated in the 1.1 spec and replaced with the alSpeedOfSound API. And there's the formula there. It's still similar in that you get pitch shift based on the direction and velocities of your source and your listener objects.
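
Roughly, the 1.1 controls look like this in code; a minimal sketch, with the pitch-shift formula from the 1.1 spec in the comment, and the velocities shown are just illustrative values:

    #include <OpenAL/al.h>

    /* Assumes an OpenAL context is already current; 'source' is an existing source. */
    void configureDoppler(ALuint source)
    {
        /* 1.1 deprecates alDopplerVelocity in favor of alSpeedOfSound (default 343.3). */
        alSpeedOfSound(343.3f);
        alDopplerFactor(1.0f);

        /* Pitch shift comes from the source and listener velocities projected on the
           source-to-listener vector (per the 1.1 spec):
             f' = f * (SS - DF * vListener) / (SS - DF * vSource)                       */
        ALfloat sourceVelocity[3]   = { 10.0f, 0.0f, 0.0f };
        ALfloat listenerVelocity[3] = {  0.0f, 0.0f, 0.0f };
        alSourcefv(source, AL_VELOCITY, sourceVelocity);
        alListenerfv(AL_VELOCITY, listenerVelocity);
    }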

OK, one of the big new features for 1.1 is the addition of some capture APIs. Now, these are set up so your application can grab audio data from the user's default input device on their system. You can grab that audio data and then pass it to a buffer object that can go and attach to a source and play it back as output. So as you'll see here, the first two APIs are for opening and closing your device, similar to opening or closing the output device in OpenAL. The main difference with the capture device open call is that you also indicate the format of the data that you want to receive when you grab the bytes from the library, and also the size of the ring buffer that you want OpenAL to set up to write data into.

Now, once you have that device, there's some calls for starting and stopping the device. There's really no need to be writing into that ring buffer if you aren't going to be grabbing the data, so you don't want to be using that processing time. There's an API for capturing the samples, so you can fill a buffer with that audio data. And then also you can use the alcGetIntegerv API with the ALC_CAPTURE_SAMPLES property to discover how many samples are available for you to grab.

So I'll just walk through a real simple scenario of how you'd use these APIs. As I mentioned, the first thing you'll do is open a device. And so if I walk through those parameters, we pass null at the beginning because the Apple implementation is always going to use the default input device on your user's system. Now, your user can set that in System Preferences or Audio MIDI Setup. And so it always uses that, so you don't have to designate a device by name. Now, we're also telling the capture device that we want data that's at a sample rate of 44,100 and that we want mono 16-bit integer data. Now, that last guy there, the 1024, that's how many samples we want the ring buffer to be, and keep in mind that's a number in samples, not in bytes. So if that comes back successful, then we want to start capturing; we want to have the library start writing data from the device into our ring buffer. So it kind of goes merrily along, chugging and filling up the ring buffer, and at some point you'll want to discover how many samples are available. So we'll use the alcGetIntegerv call and ALC_CAPTURE_SAMPLES, find out how many samples are there for us to grab, and then once we have that value, then we know how many we can request. So we'll use alcCaptureSamples. Again, we pass it our device. We give it a buffer that's appropriately sized for the amount of samples that we're requesting, and we tell it how many samples we want, and then the bytes are written into that buffer. So it's a pretty simple mechanism. Thank you.
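
Put together, that scenario looks roughly like this; a minimal sketch with error handling omitted, and remember the 1024 is in sample frames, not bytes:

    #include <OpenAL/al.h>
    #include <OpenAL/alc.h>

    void captureExample(void)
    {
        /* NULL device name: the Apple implementation always uses the default
           input device chosen in System Preferences / Audio MIDI Setup. */
        ALCdevice *capture = alcCaptureOpenDevice(NULL, 44100, AL_FORMAT_MONO16, 1024);
        if (capture == NULL) return;

        alcCaptureStart(capture);              /* library starts filling the ring buffer */

        /* ... later, ask how many samples are waiting ... */
        ALCint available = 0;
        alcGetIntegerv(capture, ALC_CAPTURE_SAMPLES, 1, &available);

        /* Grab them into our own, appropriately sized buffer (16-bit mono samples). */
        ALshort samples[1024];
        if (available > 1024) available = 1024;
        alcCaptureSamples(capture, samples, available);

        alcCaptureStop(capture);
        alcCaptureCloseDevice(capture);
    }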

So the next thing we have that's new for 1.1 are some new distance models. Now the 1.0 specification had one distance model that you could pass to the alDistanceModel API, and that was an inverse distance model. And with 1.1, there have been two new curves that were added. First, an exponential curve. Now it's quite similar to the inverse curve. Really the main difference in the effect of attenuation with distance with this curve is that as your roll-off factor increases, your attenuation occurs more quickly as your source moves away from your listener object. In addition to the exponential model, there's also a linear distance model. Now that's just a straight attenuation line from no attenuation at the reference distance to full attenuation at the max distance.

So here we've got a graph of the three curves. And the reason I wanted to show you this is so you can see the similarities between the inverse and the exponential model and the linear model. So what you'll see about the inverse and the exponential model is that when they reach that maximum distance there, that line on the right, the vertical line on the right, you'll notice that the curve stops. It stops at that point and it flatlines. Well, basically what that means is that any distance past that maximum distance that your source is in relation to the listener object, there won't be any further attenuation. It'll stop at that distance. By contrast, if you look at the linear model, it's just a straight line to that maximum distance. And once you get there, you're fully attenuated.
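
For reference, this is how the models are selected in code and what their gain curves are in the 1.1 spec; a minimal sketch, and the parameter values are just examples:

    #include <OpenAL/al.h>

    /* Gain formulas from the 1.1 spec, where d is the source-listener distance,
       ref = AL_REFERENCE_DISTANCE, max = AL_MAX_DISTANCE, roll = AL_ROLLOFF_FACTOR:
         inverse:      gain = ref / (ref + roll * (d - ref))
         exponential:  gain = pow(d / ref, -roll)
         linear:       gain = 1 - roll * (d - ref) / (max - ref)                    */
    void chooseDistanceModel(ALuint source)
    {
        alDistanceModel(AL_LINEAR_DISTANCE_CLAMPED);   /* or AL_INVERSE_DISTANCE_CLAMPED,
                                                          AL_EXPONENT_DISTANCE_CLAMPED, ... */
        alSourcef(source, AL_REFERENCE_DISTANCE,   1.0f);
        alSourcef(source, AL_MAX_DISTANCE,       250.0f);
        alSourcef(source, AL_ROLLOFF_FACTOR,       1.0f);
    }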

All right, another feature of 1.1 are some offset abilities for your application to set the playhead or get the playhead of your OpenAL source object's buffer queue. So you can do this while it's rendering or while it's not rendering, and you can get or set values in terms of milliseconds or bytes or samples. So for instance, here we've got a really simple buffer queue. It's just got two buffers, a short one that's one second, that happens to be 44,000 samples, and a second one that's two seconds and 88,000 samples. And so your application may-- the source may be chugging along playing, and the play head may be where that red arrow there indicates.

And we could discover that value by using the alGetSourcei APIs and then the appropriate version, whether we want the samples or the millisecond offset, to get that value. Now, we may want to jump, let's say, to 2 1/2 seconds in the buffer. So we'll use the alSourcei APIs and then give it a value that's appropriate, and then it'll jump there. So that'll occur whether you're rendering or not. So those are really the main new features of 1.1. So let me show you-- let me have a demo-- have the demo machine up, and I'll show you some of these in action. No, wrong machine. Where's that?
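
In code, the offset calls just described look something like this; a small sketch, where 'source' is an existing source with buffers queued:

    #include <OpenAL/al.h>

    void offsetExample(ALuint source)
    {
        /* Read the play head position of the source's buffer queue, in samples. */
        ALint sampleOffset = 0;
        alGetSourcei(source, AL_SAMPLE_OFFSET, &sampleOffset);

        /* Jump to 2.5 seconds in; this works whether or not the source is playing. */
        alSourcef(source, AL_SEC_OFFSET, 2.5f);

        /* A byte-based variant is available as well. */
        alSourcei(source, AL_BYTE_OFFSET, 0);
    }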

Oh, A. Here we go. OK, I switched, but I don't see anything. Oh, there we go. There we go. Yep, got it. Thanks. In the core audio SDK, you actually can find a project called OpenAL Example. Sorry about that. And... This is just a simple OpenAL application that creates a context and adds some source objects. All these objects have buffers attached. All these red guys and yellow guys are sources. We can move them around. We've got surround in the room so you should be able to hear these things. We can move our listener around. We can orient our listener.

All right, so we've got our context. It's a top-down view, so we're looking at x and y here, not z-- I mean, not-- x and z, not y, sorry. So I'm going to turn a couple of these off so we can demonstrate a couple of these features. So here we've got a source, and he's playing.

As I mentioned, we have cone support now in this implementation. So here we've got some cones. The way the cones work in OpenAL is there's a notion of an inner cone, an outer cone, and an outer cone gain. So here we've got our inner cone. We can change the angle of that cone. And the outer cone, we can change the angle of that guy.

And then we have an outer cone gain, and I'll talk about that more as we go. So the way that the API works is as your listener moves around, as long as the listener is within the inner cone of that source, there's no attenuation occurring at all. So I can move this here. And as I move it-- I'll shut up in a second here. As I move it, you can hear there's no change of attenuation. Alright, so now as the listener moves outside of that inner cone gain and toward the outside of the outer cone gain, we start getting some attenuation.

Now the volume of the source that will be heard by the listener once you're outside of that outer cone gain is based completely on whatever the outer cone gain setting is. So if I change that and raise that outer cone gain, you'll hear that we have some gain.

So those are cone supports. As you can see, it doesn't matter whether your listener is moving around your source object. Your source is moving around your listener. Your source has a different direction. So that's cone support. We'll turn that off. Okay, another thing that I mentioned that's new in 1.1 is speed of sound. So why don't I turn a different sound on here that's kind of good for this. So we've got a car sound. As you can hear, as I move it around the listener, there's no pitch change.

So if we apply some velocity or some speed of sound setting to our source object, It will be indicated by... Oh, wrong guy, sorry. You can see the direction based on that little nose that's coming out of there. You can hear that there's a pitch effect. And this is going to change as we change the direction of that speed of sound.

So the pitch is completely based on the direction of the listener and the source objects and the vectors. As the vectors change with the movement of the objects, you will get changes in pitch. But they're not based over some time distance over some time period like real Doppler would be. All right, so that's Doppler. OK, let's turn that guy down.

Okay, another feature that I mentioned about 1.1 were some new distance models. So let's turn on this guy here. Alright, so what we've been listening so far are all these objects that have been attenuating using the default distance model, which is an inverse model. So you hear as I move the source away from the listener, you'll hear some attenuation by distance.

to be hearing front to back. So that's distance attenuation. So we can switch that to one of the other formulas. Let's go to the exponential model. And right now, you'll hear it's-- pretty similar. As I mentioned, the exponential and the inverse model are similar. Let me go ahead and change the roll-off factor of this guy. We'll change it to 3, let's say. You'll hear that it's quite quiet.

So our roll-off is a lot... Our attenuation occurs much quicker with that formula. So that's exponential. And then we have our linear guy here. Before I do that, let me change the max distance of our object. We'll change it to 250. And we'll switch to a linear model.

As you can see now that we get to 250 away from our listener, we're fully attenuated. All right, so that's the distance models. And one of the big new features, as I mentioned, was capture. So I've got this cool little snowball USB microphone here. And it's the device that's being used for capture. And the way this application works is as soon as it launches, it opens the capture device. And it starts capturing. And this whole time, it's been writing whatever this guy's been capturing into the ring buffer. So I can now-- I'll go ahead and click on this button, and it'll capture what I've been saying here as I've been talking.

It has then copied that data directly into one of the buffer objects and then attached it to a source so we can now play it and move it around our context. capture what I've been saying here. I can now go ahead. Okay, so that's capture. So if we could go back to slides. So those are some of the features as they're working.

Thanks. OK, so OpenAL extensions. The way that extensions work in OpenAL, it's a mechanism for extending the API and discovering whether an extension is there while your application is running. So it's got basically three parts that you do. First, you query for an extension by name. If it's available, then you can go and get proc pointers or constant values, again, by name using the get proc address or the get enum value APIs.

Now, some of the new features I mentioned in 1.1 were some of those features that had existed as extensions in various 1.0 implementations. Now, they weren't on the Apple one, but they were on some various other ones, and they were valuable for the developer community, and that's why they got rolled into the 1.1 spec. So the capture, the distance models, the offset, those were all extensions in a previous life, a 1.0 spec life. In the 1.1 implementation that we've just delivered, we can also get access to those features through that mechanism. So if you have a 1.0 application, OpenAL application that was using these features through the extension mechanism, you can also do that with this latest implementation as well as get at those through the new APIs that have been added.

But enough of the old ones, let's go to some new ones. We've added three new extensions with this implementation. We have a static buffer extension, a Mac OS X extension for doing Apple Mac OS X-specific kind of control of the engine, and an ASA extension, which allows your applications to get at some things like reverb and occlusion. So let me walk through those one at a time.

So the static buffer extension is something that we got a request from a lot of developers for, and this allows the application to own the memory that the audio data lives in. The way that OpenAL works using the alBufferData call is that your application passes some data in memory to the library, and then the library copies that and sticks it in its own buffers and manages those resources. So every time you want to have new data to play, the library has to do this mem copy. So the buffer static version of this call looks exactly the same. It's the same parameters. The one difference is that your application owns that memory that the data lives in, and the library then will play that data. Now, you have the same contract that you would have using the alBufferData call. In other words, that means there are certain times when it's not safe or you're not allowed to change the buffer's audio data. It could be in a queue that's either processing or pending to be processed, and you're not allowed to make that change at that time. So basically, it's the same contract. You're going to use the same mechanism for discovering when buffers are safe to modify that you're already doing with the alBufferData call.
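
A sketch of how that might look through the extension mechanism; the extension string, proc name, and typedef below (AL_EXT_STATIC_BUFFER, alBufferDataStatic) are assumptions based on Apple's OpenAL extension header, so verify them against the header on your system:

    #include <OpenAL/al.h>

    /* Assumed signature; the real typedef lives in Apple's OpenAL extension header. */
    typedef ALvoid (*alBufferDataStaticProcPtr)(ALint bid, ALenum format,
                                                ALvoid *data, ALsizei size, ALsizei freq);

    void useStaticBuffer(ALuint buffer, void *myAudioData, ALsizei byteSize)
    {
        if (alIsExtensionPresent("AL_EXT_STATIC_BUFFER"))
        {
            alBufferDataStaticProcPtr alBufferDataStatic =
                (alBufferDataStaticProcPtr) alGetProcAddress("alBufferDataStatic");

            /* Same parameters as alBufferData, but no mem copy: the application keeps
               ownership of myAudioData and must not touch it while the buffer is
               queued or being processed. */
            if (alBufferDataStatic != NULL)
                alBufferDataStatic(buffer, AL_FORMAT_STEREO16, myAudioData, byteSize, 44100);
        }
    }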

Okay, so the next thing we have is a Mac OS X extension, and this is really just to expose some of the specific Core Audio underpinnings of the implementation. So we can deal with the mixer sample rate, forced stereo rendering, rendering quality, mixer buses, and let me walk through these one at a time.

So getting or setting the mixer output rate is important for you, or may be important for you, depending on the sample rate of the sources you're playing in your application and what sample rate the hardware is running on your user's system. It's a little easier to explain if I show you this little diagram. This is basically the Core Audio stack in the implementation of OpenAL. When your device is open and your context is made, basically what happens is you'll look at the bottom box there, it's the HAL device. That's your user's hardware. Now it's going to be running at some particular sample rate, and that sample rate gets propagated down through the audio units that are used for rendering in OpenAL. So the thing that connects to that device is a default output unit. Now that 48k in this example gets propagated down, and the 3D mixer then is connected to the default output unit. So by default, what happens is all of those red boxes there, those represent OpenAL sources that, for instance, might be playing 22 kilohertz data. What's going to happen is every one of those gets sample rate converted to 48k, and then those are mixed together to whatever stream format we're rendering out to the hardware, and then that gets passed down the chain.

Well, that's a little inefficient if you know that you're going to be rendering sources that are that particular sample rate. So you can use this API then, in this example or others, to set the mixer output rate to 22k. Then what happens is all of those sources get mixed at their native sample rate. And then when that 22k data is passed to the default output unit, the sample rate conversion gets done there on two or four or five streams, whatever you happen to be rendering to the hardware. So this just gives you a little bit of control to be smart and efficient and make your application just run that much better.
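
As a hedged sketch of what that call might look like, with the extension string, proc name, and typedef all assumed from Apple's OpenAL extension header (check the header on your seed for the exact spellings):

    #include <OpenAL/al.h>
    #include <OpenAL/alc.h>

    /* Assumed typedef; the real one is in Apple's OpenAL extension header. */
    typedef ALvoid (*alcMacOSXMixerOutputRateProcPtr)(ALdouble value);

    void matchMixerRateToSources(void)
    {
        if (alcIsExtensionPresent(NULL, "ALC_EXT_MAC_OSX"))      /* name assumed */
        {
            alcMacOSXMixerOutputRateProcPtr setMixerRate =
                (alcMacOSXMixerOutputRateProcPtr)
                    alcGetProcAddress(NULL, "alcMacOSXMixerOutputRate");

            /* If all our sources are 22.05 kHz, mix at that rate and let the default
               output unit do the single sample rate conversion at the end of the chain. */
            if (setMixerRate != NULL)
                setMixerRate(22050.0);
        }
    }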

All right, the next thing we have is the rendering quality, set and get rendering quality. And the reason we added this is the 3D mixer audio unit that does really the bulk of the work in this implementation has a notion of various rendering qualities, so you can make a tradeoff between CPU usage and quality. If we're running, if your user's running a system that has four or five speakers, basically this is a no-op. You're always going to be using the low or the normal rendering quality. But if your user is using headphones or a stereo system, you have the ability to give them a high quality HRTF rendering mode. And of course, this is a trade-off again. The HRTF is more expensive, but it may be worth it for your user if they're on a system that can handle those extra cycles.

All right, the next thing we have is a render channel count. The reason you might want to do this is the Apple implementation of OpenAL, what it does is it goes and discovers how many channels are on your user's hardware when the context and the device are set up. So if your user has a system that has five or more channels, OpenAL is going to render to 5.0. If your user has four channels connected to their hardware, if their hardware is running four channels, OpenAL is going to render to quad. And then by default, it renders to stereo.

Well, you might have a circumstance where your user has maybe a 5.0 system, but they want to plug some headphones in. And in that case, you wouldn't want the library to be rendering to 5.0 if your user wasn't going to get all of those channels. So you can force the rendering to stereo regardless of your user's hardware by using these two functions.

Okay, and then lastly, the maximum mixer buses. By default, the 3D Mixer Audio Unit has 64 input buses, and that's sort of the limitation that gets passed into OpenAL for the amount of simultaneous sources that you can have rendering. Now, this is a settable property. The 3D Mixer can have more buses than 64. That just happens to be the default. So if you need to render to more than 64 sources at a time, you can then go ahead and make that setting. Now whenever you make it, you should then also get the maximum mixer buses to confirm that the setting that you wanted was actually possible and see how many mixer buses were available.

Okay, now lastly we'll talk about the ASA extension. This is kind of the big new thing for us by adding reverb and occlusion and obstruction effects to OpenAL. Now this will be available whenever your application is running on a system that has the 2.2 3D mixer audio unit present, and that also shipped with 10.4.7 along with our 1.1 implementation of OpenAL. So right out of the box you should get that.

The extension's pretty simple. It's basically four APIs and a bunch of constants, and I'll talk about the constants in a bit, the properties. There's a get and a set listener property call and a get and a set source property call. They all take a property, which is an integer, some data, data size, and, of course, the source variants of those APIs also take a source ID, an OpenAL source ID.

So let's talk a little bit about the source properties. First, we have a reverb send level. That's a per source property. It's a wet/dry mix level where the default value of 0 means that there's no reverb being applied to your source. And a value of 1.0 means all you're hearing is the reverb return, no actual direct source signal.

Next we have occlusion. Occlusion is a low-pass filter that gets applied to the source's direct signal to the listener. So you can emulate your source being in a different physical space by using this property. It's a setting in dB. It takes a float from 0 to minus 100. And for occlusion, the low-pass filter is also applied to the signal that's sent out to the reverb. So the reverb send, the reverb return, and the direct signal all get filtered. Okay, and now the last source property is the ASA obstruction. This is also a low-pass filter. It gets applied to the direct signal of your source.

Again, it's a float value that's in dB from 0 to minus 100. And the difference here is that the signal that's sent out to the reverb does not get the low-pass filter applied. So all of the sparkly transients of your reverbs will still be heard, even though you're applying obstruction to your source object.

So we have some listener properties. They're pretty standard, what you would expect. We have to be able to turn our reverb on and set a global level, and then just like we have the rendering quality in the 3D mixer, we have the ability to set a reverb quality so you can make a trade-off between the quality of that reverb and how much CPU is needed. So there's a reverb quality.

The reverb also has some EQ settings so that you can apply some EQ to the reverb signal. It's basically a parametric EQ, so if you're familiar with parametric EQ you know that there's a gain which is a cut or boost, a bandwidth, and a frequency for the center of that bandwidth. These properties are settable in real time. They're also storable in AU preset files, and I'll talk a little bit more about that in just a second.

So there's a couple of ways that you can get a particular reverb sound into OpenAL. First is using the reverb room type. We've defined a bunch of constants that you can pass as values for there. A lot of what you might expect, various small, large rooms, chambers, cathedral, a lot of, you know, they're pretty self-explanatory. And those are just constant values that you can pass in.

But more interestingly is the ability for you to load AU preset files at runtime. And so those AU preset files are things that you can save by running some signal through the matrix reverb and saving that as a preset and then loading it at runtime. And I'll show you how to do that in just a second.

So just a little bit of a code snippet here about how the extension mechanism gets used, for example, with the ASA extension. So you see that one line of code that's in yellow. We're querying to see if the extension that we want is present at runtime. So we're going to ask for it by name. If that happens to come back true, we're on a system that has it, then we go and, for instance, go get the proc pointer for the ASA set listener call. Now, once this is done, now you can use this function to access that API. So here's a couple of really simple boxes on how you can then set up one of your reverb types. So we've got one here where we're using the reverb room type property. We're passing in the reverb room type cathedral constant, and just by making this call then, that's the reverb that will get used by OpenAL. By contrast, the box below, that little tiny function there, accepts a path to your AU preset file that gets passed on to the setListenerProc call using the ASA reverb preset property.
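
Expanding that snippet a little; a hedged sketch where the extension string, proc names, and property constant spellings (ALC_EXT_ASA, alcASASetListener, alcASASetSource, ALC_ASA_*) are assumptions to verify against Apple's OpenAL extension header:

    #include <OpenAL/al.h>
    #include <OpenAL/alc.h>

    /* Assumed typedefs; the real ones are in Apple's OpenAL extension header. */
    typedef ALenum (*alcASASetListenerProcPtr)(ALuint property, ALvoid *data, ALuint dataSize);
    typedef ALenum (*alcASASetSourceProcPtr)(ALuint property, ALuint source,
                                             ALvoid *data, ALuint dataSize);

    void setCathedralReverb(ALuint source)
    {
        if (!alcIsExtensionPresent(NULL, "ALC_EXT_ASA"))
            return;

        alcASASetListenerProcPtr asaSetListener =
            (alcASASetListenerProcPtr) alcGetProcAddress(NULL, "alcASASetListener");
        alcASASetSourceProcPtr asaSetSource =
            (alcASASetSourceProcPtr) alcGetProcAddress(NULL, "alcASASetSource");
        if (asaSetListener == NULL || asaSetSource == NULL)
            return;

        /* Constants are looked up by name, as the extension mechanism prescribes;
           the spellings here are assumptions. */
        ALuint  on   = 1;
        ALint   room = alGetEnumValue("ALC_ASA_REVERB_ROOM_TYPE_Cathedral");
        ALfloat wet  = 0.4f;   /* per-source reverb send level, 0.0 (dry) to 1.0 (all reverb) */

        asaSetListener(alGetEnumValue("ALC_ASA_REVERB_ON"), &on, sizeof(on));
        asaSetListener(alGetEnumValue("ALC_ASA_REVERB_ROOM_TYPE"), &room, sizeof(room));
        asaSetSource(alGetEnumValue("ALC_ASA_REVERB_SEND_LEVEL"), source, &wet, sizeof(wet));
    }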

So with that, if we could go back to the demo machine, I'll give you a little demo on how you could go ahead and make those custom reverbs just with things that are already on your system. So what I've got here is a document. This is an AU lab document.

And Michael's going to talk a little bit more about AU in a bit. But this is a document that has a sound file attached. And in this channel strip here, we also have a Matrix Reverb audio unit installed. So we've got a reverb here. So let me play this sound. Hello, Mac.

All right, so there's our dry signal. Here's our reverb. Now we can go ahead and make some reverb settings here. I'll just try and do something drastic here so you can hear it. There we go, that's pretty drastic. Now if you notice these last three parameters at the bottom here, the filter frequency, filter bandwidth and filter gain, these equate to those EQ properties that you can set on your listener reverb EQ. So these can be saved along with the preset. There, we'll boost it, do something drastic there. Now we can save this preset out. We're going to save this preset.

Right? So we've saved the preset. I don't need to save the document. Now we have a preset file on our system. Now you can use that preset file to load at runtime. So if I launch my OpenAL example application again, Maybe I spotted this guy before I covered it up. So here's the ASA extension parameters that are exposed in the API.

So let me get just one again so that we can kind of hear the reverb. So what we have right now is no reverb being applied. So let's turn on our reverb. Set the quality to max, that's fine. And let's set it to the cathedral. Okay, so you don't hear anything yet because we haven't actually sent, changed the dry/wet mix of that particular source object. So now here we have OpenAL, source object, sending to a reverb. Let me bring that back and I'll show you occlusion.

all the way down to you can't hear anything. So there's occlusion, now I'll show you obstruction. So as you notice, with occlusion and obstruction, both are being, a low pass filter is being applied to both. But there's no reverb playing so you don't hear a difference in how these behave.

So let me change to a reverb preset that I had made earlier. You know, it sounds like you have a lot of stuff to do before you do any stuff. It's my airport terminal preset. Let me know when you're ready. All right, so now I'll apply some occlusion.

So as you can hear, both the direct sound and the reverb return is being filtered. But as I apply obstruction, if you listen carefully, it may be a little difficult in the hall. If you listen carefully, the direct signal is going to be filtered, but the reverb trails will remain the same.

So really, obstruction is used to emulate something that's really between your listener and your source object. Something that's between your source and your listener object, but still in the same physical space. All the reverb characteristics will stay the same. Just a couple other little things to show you. As I change my custom presets, you'll notice that the EQ settings will change.

Again, because those are stored along with the AU preset files. So that pretty much sums up OpenAL for Mac OS X. I'm going to pass it over to Michael Hopkins now, and he's going to talk about AU Lab, which you just saw, in a little bit more detail. Thank you, Bob. Could I go back to slides, please? Yes. Before I jump right into new features in AU Lab, I thought I'd start by providing a bit of context by talking about audio units and a little bit about host applications as well.

Audio units are the plug-in specification for audio on Mac OS X. The audio unit is packaged as a component bundle, and a host application uses the component manager to locate a specific audio unit and then open it. A component bundle can also contain more than one audio unit. The host application, when it loads the audio unit, can then present a custom user interface, whether it be Carbon or Cocoa.

There are many different types of audio units on Mac OS X. For example, an effect such as a reverb that you heard a lot of today, an instrument such as a physically modeled organ, which can take input from a MIDI device, and then based on the note that's played, generate audio. A music effect, which is similar to an effect, except that it can be controlled via MIDI. And generators, such as the AU file player, which loads audio from a file, or a sine wave generator, which creates the audio programmatically. Mixers, output units, and format converters, as Bob showed you earlier, are used, for example, by OpenAL to take many different input sources, convert them to a common format, and then use the 3D mixer to mix them together and then output them to the default output unit; offline units, which perform processing that can't be accomplished in real time, such as reversing the contents of an audio stream; and finally, the panner unit, which is new to Leopard, which James will be talking about in great detail later on in the presentation.

There are many different host applications on Mac OS X, and this list is by no means complete. And these target a number of different users, such as the professional user, a DJ performing live in a hall, the consumer hobbyist, or even the developer, where applications such as AU Lab, which we'll talk about in a minute, are used for testing purposes by many third-party vendors.

Let me talk about this application now in more detail. As we mentioned, it's part of your development tools installation on Tiger. And now there's a new version that comes with your Leopard Seed. It supports mono, stereo, and now multi-channel audio units. And it's capable of displaying the Carbon and Cocoa user interfaces, or also providing a generic view if that custom view is not present.

New features in the AU Lab 2.0 version on your Leopard Seed include a patch manager, which allows you to group a number of different tracks together and quickly switch between these groups. A studio view that allows you to see an overview of your MIDI setup and the audio input and outputs for your document. A visibility feature that allows you to toggle the visibility of different tracks based on their type. And our marquee feature for AU Lab, which is improved audio unit support, including multi-channel support and support for the new panner unit type.

Now I'd like to focus a bit on patch management. As I mentioned, this feature allows you to create groups of tracks, which I'll subsequently refer to as patches. And these patches are saved directly in the document file. A default patch, which contains all the tracks in the document, is created for you by default. And you can switch between these patches simply by clicking. Any track that is not part of the active patch does not consume CPU resources since it's not active.

Let me look at an example here. As you can see, we have a rather complex document open. And the patch manager appears on the right-hand side as a drawer attached to the document window. You can imagine that this scenario would be useful for somebody performing in a live performance situation where they have a number of different tracks that are all connected to the same instrument and they want to quickly switch between them.

It's also useful for a developer who has several different tracks that each contain the same audio unit, but with a different preset. The developer could then quickly switch between them using the Patch Manager feature. So let's look at this in more detail. As you can see in the upper right-hand corner there, the default patch is listed in bold, meaning that it's the active patch. And therefore, you see all the tracks in the document. There are three additional patches defined, the first one being drum and bass.

And you'll see a number of bullets to the right of each track name. And these indicate whether that specific track is part of that active patch. The first item, being in gray, is not active, whereas the bass track is, and therefore appears in orange. So if I switch between the default patch and our first patch, the drum and bass patch, you'll see that now we have a different number of tracks that are active. Switching again changes those tracks again based on whether they're in that patch.

So it's a really nice new feature in AU Lab. Continuing on, we have a studio view, which is also in that drawer. And that is comprised of two components, a MIDI view and an audio view. The MIDI section shows an overview of all the sources that each instrument or music effect is using. And it's listed in a hierarchical fashion, allowing you to easily switch the source simply by dragging and dropping those items. The audio view provides a summary and editing of the audio settings such as the device and also shows you the channel assignments that each one of those input and output tracks are using. We'll see this in my demo in a bit.

And now our key feature, which is multi-channel audio unit support, which we define in AU Lab as being comprised of more than two channels. Each channel of audio goes to a specific speaker, and the speakers can be arranged in many different configurations. We've chosen three basic types to represent in AU Lab based on their popularity and how prevalent they are. The first one, the surround configuration, is typically used by home theater or cinema applications where you have five or more satellite speakers arranged in a circle around the listener, which is represented by the white couch here in the diagram. The listener would be facing the center speaker, which is located above the screen. In this application, we also can have an LFE channel where all the low frequency effects are sent.

An additional layout that we have in AU Lab is the geometric layout, featuring the ever popular quadraphonic layout, for those of you that remember that. And this type of layout allows a regularly spaced geometry, which in the diagram here you'll see is a hexagonal or six channel layout, each of which is separated by an angle of 60 degrees. And this type of layout is sometimes used in concert environments or concert halls.

Our final configuration is a constrained configuration, which you can imagine at a stadium where you're seeing Madonna or something like that. And it allows you to have three or more speakers arranged in an arc in front of the listener, where each speaker is equidistant, and the first and last speaker define a specific spanning angle, which in this case is 60 degrees.

In AU Lab now, every track, instead of being able to support mono or stereo, can now have up to eight channels of output. We do this by defining an audio channel layout, and you can have one per document, regardless of how many multi-channel outputs you have in your document. And this audio channel layout, which James will be talking about in more detail later, defines both the channel ordering and the speaker positions. Additionally, now we have support for the new audio-- excuse me, the panner unit. Or you can also choose to use the built-in surround panner in AU Lab. Now I'd like to switch the demo station, where we'll look at this in more detail.

Demo, please. Thank you. So I'm going to go ahead and launch AU Lab here. And once AU Lab comes up, you'll see that we're presented with a document configuration assistant that allows us to specify the input and output channels of the document, starting here with a stereo default output. I can choose to add or remove channels here or change the configuration of each output track. And you'll notice that the view always shows the total number of channels that the outputs will have.

For the purposes of this demo, I'm going to go ahead and add a single multi-channel output track. And the dialog expands to allow us to choose the configuration of that track. And you'll also notice that we have a diagram indicating the position of those speakers. We can determine whether we want to include a center channel or an LFE channel, for example. And you'll notice that as we choose those, the view updates to show you how many total channels your output will be using. And these allow me to choose the formats that I mentioned earlier, the constrained format, geometric layouts, as well as the surround layouts. And I'm going to go ahead, for the purposes of this example, and use a 5.1 surround layout where I'm using five channels, no LFE channel. I click Next and now configure the input by adding a single stereo input.

And now I'm ready to configure the device that we'll be using for our document. I'm going to switch from built-in to our FireWire device, which supports more than stereo. And you'll notice now that I have a five-channel output, but I need to specify the channel ordering for that, since this is not correct. In this hall, I have left assigned to channel one, right assigned to channel two, center assigned to three, and I have a gap between 5 and 6 where the LFE is here.

I'm going to go ahead and click Done to create that document. And what's important to notice is that this document works exactly the same way as the previous version of AU Lab. We've just extended all the features to provide support for multi-channel. So for example, you'll see five channels here in the meters instead of just two.

And if I open the new studio view, you'll notice now that we have a graphical representation of where our inputs are on the device as well as our outputs. And we're free to edit that if we want, but that's beyond the scope of what I'd like to show here.

Adding an insert is simply selecting that from the insert pop-up menu. So I can add a stereo delay, and then if I want I can choose to add a stereo to five channel effect such as a matrix reverb, etc. etc. etc. Again this works exactly the same way that it did previously with mono and stereo.

Not only do we support multi-channel effects, we also support multi-channel instruments and generators. So for example, if I add an audio file player, I can show the additional details and specify that that's a five-channel player. For the purposes of this example, I'd like to just do a mono one so I can demonstrate the panner a bit easier. So I'm going to create that generator, add it, set up my preset here. Let me reduce the volume so I don't blow your eardrums out. Okay, let me go ahead and play that.

And as you can see by the meters here, most of the sound is going towards the center channels because that's where my panner is aligned. So as I rotate that, you'll notice by the meters that's more left, more right. behind the user. I also have the capacity of specifying the distance from the listener to the source where the listener is in the center of the knob. closer, farther away. Okay, I'll now turn this up so you can hear the effect.

It's also important to note that if I so choose, I can bypass this default surround panner and use one of the panner units just by adding that into the track. But James will be talking about that in more detail. In fact, I'd like to now turn over the rest of the session to James, who will be speaking about multichannel and other topics. Could I go back to demo, please? Slides. Excuse me, slides.

Okay, my name is James McCartney. I'm going to talk about writing code for multi-channel audio in Core Audio. You primarily do this through the Audio Toolbox. The main thing you need to know to write code for multi-channel audio is what an audio channel layout is. An audio channel layout is a structure that's metadata on top of an audio stream basic description. Audio stream basic description is the structure that we use to describe audio throughout Core Audio. So an audio channel layout is composed of three main parts, and it describes, more specifically, the ordering of the channels. So this is what the audio channel layout structure looks like. You can see the channel layout tag, which is defined as an integer. Then there's the bitmap, and then a variable length array of channel descriptions.

The channel layout tags define many predefined formats. In CoreAudioTypes.h, you can find many of these formats listed. And you can always mask off the lower 16 bits of the channel layout tag to find out how many channels are in that tag, and then the upper 16 bits are a unique identifier for that tag.
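
That mask is simple enough to show inline (CoreAudioTypes.h also provides the AudioChannelLayoutTag_GetNumberOfChannels macro for exactly this):

    #include <CoreAudio/CoreAudioTypes.h>

    /* The low 16 bits of a layout tag carry the channel count;
       the high 16 bits uniquely identify the layout. */
    static UInt32 ChannelsInTag(AudioChannelLayoutTag tag)
    {
        return tag & 0x0000FFFF;   /* same as AudioChannelLayoutTag_GetNumberOfChannels(tag) */
    }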

So there's--you can look in CoreAudioTypes.h. There are very many of these defined. Oh, I should point out, there's a couple of special ones here. "Use channel descriptions" means that you need to refer to the array of channel descriptions. It's not a predefined layout, so it'll be defined by the ordering in the channel descriptions array. And then "Use channel bitmap" means it's a bitmap in WAV style, so you look in the bitmap field.

So, the channel descriptions in the array of channel descriptions are structures that look like this. There's a channel label, which is an integer that tells you which channel you're dealing with. And then there's flags and coordinates. And the flags, there's two flags currently defined which tell you which coordinate system you're working in, whether it's rectangular or spherical.

So the channel labels, there's also very many defined channel labels. You can see the most common ones defined here. Use coordinates means there's no name for the channel. You just use the coordinates that are specified there. And there's unknown and unused for describing like hardware layouts and things like that. So these are like a rough view of the channel labels that are defined in CoreAudioTypes.h. This is showing the abbreviations given in the SMPTE recommended channel abbreviations. So this shows the listener in the center, and this shows the basic orientation of many of the speakers. There's even more than this defined.

Now there's a lot of operations you can apply to audio channel layouts, and they're all done through the Audio Format API. There's properties in the Audio Format API for getting the number of channels in a layout, getting a full array of channel descriptions if you have a tag or a bitmap. And you can also get a matrix of down-mixing coefficients if you're trying to move from one channel layout to another and you want to cross mix in a standard way; you can use the Matrix Mix Map property.
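
A small sketch of a couple of those Audio Format API calls, using a layout built from one of the predefined tags (error handling omitted; the layout-name property mentioned next is included too):

    #include <AudioToolbox/AudioToolbox.h>
    #include <CoreFoundation/CoreFoundation.h>

    void inspectLayout(void)
    {
        /* Build a layout from a predefined tag; 5.0 in one of the MPEG orderings here. */
        AudioChannelLayout layout = {0};
        layout.mChannelLayoutTag = kAudioChannelLayoutTag_MPEG_5_0_A;

        /* Ask the Audio Format API how many channels that layout has. */
        UInt32 numChannels = 0;
        UInt32 size = sizeof(numChannels);
        AudioFormatGetProperty(kAudioFormatProperty_NumberOfChannelsForLayout,
                               sizeof(layout), &layout, &size, &numChannels);

        /* A human-readable name for the layout can be fetched the same way. */
        CFStringRef name = NULL;
        size = sizeof(name);
        AudioFormatGetProperty(kAudioFormatProperty_ChannelLayoutName,
                               sizeof(layout), &layout, &size, &name);
        if (name) CFRelease(name);
    }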

You can also get names for channel layouts or individual channels, and there's a number of other operations you can do as well.

So when you're using an audio unit, there are two properties on audio units for supporting audio channel layouts. For each input and output element of an audio unit, you can ask for its supported channel layout tags, and that will pass you back an array of the channel layout tags that are supported for that input or output of the audio unit. Then you use the audio channel layout property to set or get the channel layout that is set on that element.

So now, in some cases, you don't need to use audio channel layouts. If you're writing an audio unit that only does mono and stereo, one and two channels are implicitly assumed to be mono or stereo, so you don't need to support setting or getting the mono or stereo layout tags for those elements. They're just assumed. If you are writing a multi-channel audio unit, but it doesn't care about spatial location, such as a multi-channel parametric EQ, you're just going to equalize each channel. It doesn't matter which channel it is. Then you don't need to support channel layouts. However, if you're writing an audio unit where you do care about the spatial location, or you're writing an audio unit that is going to change the number of channels to a number that's other than one or two, then you need to support audio channel layouts so that the other audio units down the chain know what they're dealing with when they get the multi-channel audio in.

So if you're writing a host application, then you need to be aware of channel layouts, because you need to get the channel layouts for multi-channel audio unit output elements and propagate them down chains of audio unit inputs.
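
From the host side, using those two properties looks roughly like this; a minimal sketch that just takes the first supported tag, with error handling omitted:

    #include <AudioUnit/AudioUnit.h>

    /* Ask an opened audio unit which layout tags its output element 0 supports,
       then set a concrete layout on that element. */
    void negotiateLayout(AudioUnit unit)
    {
        UInt32 size = 0;
        Boolean writable = false;
        AudioUnitGetPropertyInfo(unit, kAudioUnitProperty_SupportedChannelLayoutTags,
                                 kAudioUnitScope_Output, 0, &size, &writable);

        AudioChannelLayoutTag tags[16];
        UInt32 count = size / sizeof(AudioChannelLayoutTag);
        if (count > 16) count = 16;
        size = count * sizeof(AudioChannelLayoutTag);
        AudioUnitGetProperty(unit, kAudioUnitProperty_SupportedChannelLayoutTags,
                             kAudioUnitScope_Output, 0, tags, &size);

        /* Pick one (here simply the first) and set it as the element's layout. */
        AudioChannelLayout layout = {0};
        layout.mChannelLayoutTag = tags[0];
        AudioUnitSetProperty(unit, kAudioUnitProperty_AudioChannelLayout,
                             kAudioUnitScope_Output, 0, &layout, sizeof(layout));
    }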

You also need to be aware of audio units that may not support channel layouts and continue propagating the currently valid layout down the chain. And if there's not a channel layout on a particular stream, then you need to clear the input layout of an audio unit along that chain.

When you're encoding audio, you need to use channel layouts to tell an encoder which layout of audio it's going to be encoding. There are two properties for this in the audio converter: the available encode channel layouts, which tells which channel layouts an encoder supports, and then you can set it with the codec channel layout property. In audio files, you can set channel layouts and get channel layouts from audio files. It's currently supported with AIFF, WAV, and CAF files. WAV defines the bitmap field for the channel layout, so you can only use channel orderings that are compatible with the WAV bit field layout.

Okay, so that's audio channel layouts. Next, I'm going to talk about a new kind of audio unit that we're shipping in Leopard called panner audio units. And panner audio units include a broader sense of spatialization, not just merely volume panning. We're shipping four panner units in Leopard: the sound field panner, a spherical head model panner, a vector-based panning algorithm, and then a head-related transfer function panner for 3D stereo.

All panner units are required to support six parameters. There's gain, which goes from zero to one. Azimuth and elevation are in degrees. Distance goes from zero to one. And then there's a coordinate scale in meters. So distance in meters equals distance times coordinate scale. And then there's a reference distance, which is the distance below which there's no more attenuation. It's a flat distance, a flat gain.

So here's a graphic that shows azimuth. Zero degrees is front center, going to positive 90 degrees to the right, and then negative 90 degrees is directly left, and plus or minus 180 is behind you. Then distance is a number from 0 to 1 that goes from the center to the maximum distance, which is defined by the coordinate scale. And then-- OK, this should have been done with Core Animation, but it was a lot harder probably to do in Keynote than it would have been in Core Animation. So then elevation is an angle in degrees where zero degrees is the horizon, and you go positive to go up towards the zenith, and then minus 90 would be directly below you. So the negative elevations are below the horizon.

Okay, so if you want to write your own panner unit, in the SDK with Leopard, there's an AUPannerBase class. It provides support for the required parameters, for bypass rendering, and for getting and setting channel layouts. So you just need to subclass that, and your subclass needs to provide an implementation of the panner render function and the distance attenuation data property, and then you can override get channel layout tags if that's necessary.

So panner render looks just like AUBase render; it has the same input and output parameters. Bypass has been handled for you in the panner base class, so you only have to deal with your own signal processing in here. The distance attenuation data property is a property that the view calls to find out what your distance attenuation graph looks like. You support this if your gain is a function of distance for your audio unit. And the value of the property is a structure, which is a variable length array of pairs where the view passes you a distance in meters and you fill in the output gain for that distance. Thank you.

OK, and this is the getChannelLayoutTags method that's defined in AUPannerBase. If you want to support different channel layout tags than the base class supports, you would override this method and change the output scope to return whichever layout tags you will support. And then you return the number that you're returning, the number of tags that you're returning. So AU Lab in Leopard provides a generic panner view for panning units. And you can, of course, provide your own custom Carbon or Cocoa view for that. And so I'm going to show quickly how a panner unit works. Can I go to the demo machine, please? OK. So I'll just start AU Lab.

All right, so this is an AU Lab document that has a file player in it. And I'm going to be bypassing the built-in panner. And I'm going to add a stereo to five channel panner. I'll do a sound field panner. So this is the generic panner view that will come up for any panner that you write in AU Lab. So if I play it, sound. I'm very attenuated because I'm very far away.

What's your big plan? I'd like to make a home movie or-- Sorry. I'd like to try my built-in camera. You can see as I'm controlling this, it's controlling the azimuth and distance properties here. Bring it up a little bit. And then down here I can control both the reference distance and the gain. So I can control the... The distance in the way things fall off is a function of distance. Maybe create a website, try out my building camera.

So what about you? Well, first I've got to do drivers, and I've got to erase the trial software that came on my hard drive. Sweet. And I've got a lot of manuals to read. So you can see these are the six required parameters, and then you have these more interesting controls here. Then elevation, if you have speakers at elevation, you can control that with this that goes from minus 90 to plus 90, like that. So that's the panel unit demo. create a website, try my built-in camera.

OK, go back to slides. OK. OK, and another new feature in Leopard is MIDI output for audio units. To support this, you support two new properties. There's a MIDI output callback info property and a MIDI output callback property. The MIDI output callback info property is a read-only property of audio units. The value of the property is a CFArray of CFStrings. The size of the array is the number of MIDI outputs. And each entry in the array is the name of a particular MIDI output supported by the audio unit.

And the array is the responsibility of the host to release. Then the host will provide a callback structure to the audio unit, which contains a function the audio unit will use to call back to send MIDI to the host. This is how the function pointer is declared. It takes the user data parameter that you are passed by the host. The timestamp is the timestamp that you are provided in your render call. You tell it which MIDI output number you're using, corresponding to the MIDI output names in the CFArray, and then you pass a MIDI packet list.
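
Sketched from the host's side, with the struct and property names following the Leopard headers; treat the exact spellings as assumptions to verify against AudioUnitProperties.h:

    #include <AudioUnit/AudioUnit.h>
    #include <CoreMIDI/CoreMIDI.h>

    /* Host-side callback: receives the MIDI an audio unit produces during render.
       Packet timestamps are sample offsets relative to timeStamp. */
    static OSStatus MyMIDIOutput(void *userData, const AudioTimeStamp *timeStamp,
                                 UInt32 midiOutNum, const MIDIPacketList *pktlist)
    {
        return noErr;
    }

    void hookUpMIDIOutput(AudioUnit unit)
    {
        /* 1. Read the CFArray of CFString output names; the host releases it. */
        CFArrayRef outputNames = NULL;
        UInt32 size = sizeof(outputNames);
        AudioUnitGetProperty(unit, kAudioUnitProperty_MIDIOutputCallbackInfo,
                             kAudioUnitScope_Global, 0, &outputNames, &size);
        if (outputNames) CFRelease(outputNames);

        /* 2. Hand the audio unit a callback it can deliver MIDI into. */
        AUMIDIOutputCallbackStruct cb = { MyMIDIOutput, NULL };
        AudioUnitSetProperty(unit, kAudioUnitProperty_MIDIOutputCallback,
                             kAudioUnitScope_Global, 0, &cb, sizeof(cb));
    }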

And this shows how you would call it. It looks pretty similar. You would store the MIDI output callback struct in your audio unit, and then you will call through the function pointer, passing it the user data, the audio time stamp, the MIDI output number, and the packet list. So some notes about this. The callback is only made inside the render function. So when your audio unit's called to render, that's when you can call the host's MIDI output callback and send the MIDI to the host. And the audio timestamp is what you were passed in render. And then the timestamps in the MIDI packet list are the sample offsets relative to the audio timestamp argument in render. OK, that's the end. Thank you.

Thank you. Twice now you've got me on that one. So thanks very much for coming to the session on Core Audio Surround. We've only really got five minutes, so I think we won't have a formal Q&A. We'll just be down here at the front if you want to come and ask questions. We will be in a lab tomorrow from 2pm till 6.30pm, so if you do have more questions there or you want to bring some sample code to look through, we've got an API mailing list that we run. We monitor that very closely. That's available. There's general development information for audio in the audio area of Apple's developer website. Then there's some new documentation as well, and a general overview for Core Audio, and some details on developing your own audio units. So that's available and we'll be adding more there over time, I hope. So thanks again very much for coming and have a good conference. Thanks for watching.