WWDC03 • Session 407

Audio Overview: Mac OS X Audio Rocks!

Application Frameworks • 1:09:55

With Mac OS X, professional-level audio is designed right into the OS, and features ultra-low latency, high resolution, and multichannel capabilities, with the ability to be flexible and extensible. This session presents an overview of Apple Audio Technologies, system services, drivers, and hardware. We discuss AudioUnit and MIDI, and provide insight into the design strategy and fundamental paradigms implemented throughout audio on Mac OS X. We address all APIs, so view this session, especially if you are new to audio on Mac OS X.

Speakers: Craig Linssen, Nick Thompson, Bill Stewart

Unlisted on Apple Developer site

Transcript

This transcript was generated using Whisper; it has known transcription errors. We are working on an improved version.

I'm Craig Linssen, Music and Audio Partnership Manager at Apple, and today's session, you're going to learn why Mac OS X audio truly does rock. So what we're going to cover today, we're going to go over some developer market opportunities. We at Apple, we're excited about our advancements in the music and audio space recently.

And we'd like you to be able to share in our success. The market is really large, and I'm going to go over a couple of slides to show you just how large that market is, and hopefully show you some new opportunities that you can tap into with your applications. We're going to cover some audio device driver changes in Panther. And we're going to go over the core audio architecture design objectives.

Finally, we're going to dive right into the APIs and show you how to get started. Whether you're a game developer, a music and audio developer, or you just want to input, output, or process music or audio with your application, you're going to learn about the APIs that you need to get started today.

So market. The following is a slide of the estimated number of music and audio creators in the United States today. At the top end of the spectrum is the pro market. There's approximately 50,000 aspiring, sorry, 50,000 professionals who are doing music and audio day in and day out and making a living at it.

100% of those individuals are using a computer in their creation process. Now, in the middle of the pyramid is the aspiring professional market. These are individuals who aspire to be at the top of the pyramid. They aren't quite making a living at it, but they spend a lot of money on their gear, and it's a very large market.

However, only about 30% of those use a computer in their music and audio creation process. And at the bottom end of the pyramid, you'll see the creative individuals: 55.9 million individuals in the United States who are creatively inclined and are going to be doing music and audio in some fashion or another. These could be guitar players, people who are just learning to play a musical instrument, students, DJs, hobbyists, etc.

But only about 5% of those individuals currently use a computer in their music and audio creation process. And one of the things we want you to be thinking about in today's session is how you can be using our core audio API services to create applications that are easy to use, and are going to show these individuals how they can use a computer in their music and audio creation process.

This represents about 26% of the United States population. That's one in four individuals, and that's your market. It's a really big market, and there's a lot of opportunities there for you to tap into. So let's drill down into it just a little bit further. The purple wedge that you see up here represents guitar players. Now, there's about 28 million guitar players in the United States. A lot of guitar players.

15 million keyboard players, 9 million DJs and remixers and producers, and the rest of the pie chart breaks down via brass, woodwind, percussion, orchestral, etc. And again, the point I want to drive home here is that not a lot of these individuals are even aware that they can use a computer in their music and audio creation process. It's never been easier than today to create a music and audio application on the Mac. Using our Core Audio API services, you're going to find out today just how easy it is.

So what were some of our design objectives in our Core Audio API services? Well, two things: ease of use and performance. We wanted simplified user configuration for the end user, and streamlined development for the developer. Performance that was built with the professional in mind. In a lot of cases, it was built with direct feedback from the professional music and audio developer and musician communities.

It's the most feature-rich audio platform out there. Core Audio is multi-channel and high resolution, with bit depths up to 24 bits and sample rates of 96 kHz and beyond. It's extremely low latency, and we have native MIDI support built in. If you've ever looked in your Utilities folder, you might have seen this little keyboard icon up here, which represents Audio MIDI Setup. You can configure your entire Audio and MIDI studio with this utility. Very cool stuff.

We have Class Driver Support, plug-and-play support for all of your hardware, spec-compliant USB and FireWire Audio and MIDI devices. We highly encourage all the hardware developers in the audience to really look at making your hardware devices spec-compliant. It's really important. It cuts down on your development time and costs, and it makes it much, much easier on the end user. It's plug-and-play device support. You just plug it into Mac OS X, and your device will work. No need to worry about device drivers.

And finally, AudioUnits and AudioCodecs: Apple platform standards which extend the capabilities of the operating system via DSP plugins, virtual instruments, and AudioCodecs, making it much, much easier in your development. Now, we're going to get into AudioUnits and AudioCodecs in a lot more detail in Bill's session coming up here in just a few minutes. But one of the biggest questions I get from developers about AudioUnits is, well, who uses this? The following slide should really speak for itself.

A really large development community has already sprung up around AudioUnits, and it's getting bigger every day. There's a lot of individuals out there who are very excited about the work that Bill Stewart's team has done, and he's going to get up here in just a second and get into a little more detail. So I'd like to bring up Nick Thompson. He's going to talk about audio device drivers in Mac OS X.

Thanks, Craig. Hi, my name's Nick Thompson, and I manage the audio driver team in Apple's hardware division. I wanted to kind of give you an update on, kind of talk about the drivers that are in Darwin, and how that's structured, and also give you a little bit of an update on what we've done for the Power Mac G5, and changes that you might need to make for your product.

And then I wanted to cover class drivers for USB and FireWire, and point you to some resources for when you're writing your own drivers. So looking at the structure, kind of a block diagram of kernel-based drivers on Mac OS X, they're all pretty much based on our audio family. When you look at the built-in drivers, you'll see there's basically a super-class Apple Onboard Audio, which encapsulates most of the common chip features for each chip.

And then we write plug-ins for each chip specific to the platform. So when you're looking at the code in Darwin, that's kind of the structure of it. And we also develop class drivers for USB and FireWire. And the source code for the USB driver is in Darwin currently. The source code for the FireWire driver isn't in Darwin currently. And we're planning on getting that in there after Panther.

For the Power Mac G5, there's a couple of new things that you've probably already heard about. I think the biggest one is this is the first computer from Apple with built-in digital I/O for audio. And we're really excited about this. We've also made some improvements to the analog section.

It's basically similar to previous computers, but we've added support for 24-bit data, as well as 16-bit data, and also added support for different sampling frequencies. So previously, it was basically CD quality output. Now you can have 44.1, 48, and 32 kHz at different depths. And I want to talk a little bit about some of the changes that are needed for devices and device drivers on the new computer platform.

The really cool thing about the digital section is that we do clock recovery on input. This means that you can do bit-accurate copies of original source material, which, if you're a musician, really matters to you. The other thing, as I mentioned, is that both on the analog and digital parts, we've got support for new sampling frequencies and different sample sizes. The connector that we use is basically a Toslink connector, and the specs that cover this are up there. It's IEC 608.74-617. It's basically the standard friction lock connector that you'll see on most consumer AV devices that have optical connectors on them.

So the back panel, you'll see the two connectors. We're testing with every piece of consumer AV gear that we can find. The cable is just the standard TOSLINK connector that you'll be familiar with if you've come across optical gear before. I wanted to talk a little bit about AC3 support. Obviously, because we support digital output, it's possible to either stream PCM data across there or encoded streams. There's an important thing to know about encoded streams.

You can either output data from all apps as PCM data, and it's mixed together in the same way that the analog section is today. Or you can have one single AC3-encoded stream, so if you have, say, a DVD player app. It's important to note that you won't hear system alerts if you're in this mode, which we've termed "hog mode." If you're developing PCI cards, there's some changes that you need to be aware of.

The biggest one is that we're no longer supporting 5-volt signaling on cards. And you can tell whether a card is keyed for 5-volt signaling. There's basically a notch at the back end of the card. If it has both notches removed, it's a 3.3-volt universal card, so you're OK. We're encouraging you to visit the compatibility lab, and make sure, if you have your cards, that they work in the new computer. And the G5 labs are downstairs.

There's also some changes that you'd want to make in a kernel-based device driver. Basically, we need you to make your drivers 64-bit safe, and we've seen a number of drivers from third parties that currently don't work. If you have any questions about this, come find me. We're happy to work with developers and make machines available if you can come to Cupertino, so that you can get your stuff working. But basically, there's a bunch of macros that are kind of ambiguous right now. They've been replaced with specific macros for the word size that you're using.

You also need to remove any assumptions you have in your driver about logical and physical addresses being the same. When we were porting the Apple onboard audio drivers, we basically came across places where we were making these assumptions and they weren't working. So, you know, we had a call in there, physical-to-KV, which maps physical addresses to kernel virtual. You need to basically work around these issues. And then finally, you also need to make sure that you prepare memory for I/O. If you do this, a DART entry is created and the world is good. If you don't do this, memory isn't where you think it is.

Talk about the class drivers a bit. First with USB. Basically, using our driver is going to save you development costs. So you should try and make sure that if you're developing a USB device, it conforms to the USB audio spec. We've basically implemented support for everything we've seen so far. And I think the message of this slide is, you know, if you have a device that you're working on with a mixer unit, for example, let us know, and we'll make sure that the driver works with that.

We're also working on the USB Device 2.0 specification, which is kind of different to USB 2.0. It gets a bit confusing here, but this is basically the specification for USB audio devices on both full-speed and high-speed USB. We're going to be making sure that we provide support for those new devices in the class driver. If you are working on a USB 2.0 audio device, let us know. We're interested in taking a look at it and making sure that we work with your hardware.

This year is going to be a big year for FireWire, I think. We're working with a number of developers doing FireWire products. Again, using the standards-based driver is great for both us and you. It reduces your development costs. And for us, it makes it easier for customers. When they buy a device, they can plug it into their Mac, and it should just work.

Basically, there's kind of two flavors of FireWire devices out there at the moment. There's AVC devices, and we're seeing silicon from BridgeCo and Oxford Semiconductor for those. And there's mLAN devices from a variety of manufacturers, including Yamaha, Korg, Otari, PreSonus, Kurzweil, Apogee, and the list is actually growing. And we're aiming to basically support any silicon that anyone comes up with that's basically class-compliant.

In Panther, we've made some changes. The driver that we shipped in Jaguar was essentially a driver for speakers. We are now adding music subunit support, so we're providing input and output support, different numbers of input and output streams, support for MIDI. We're also working on lowering the latency and jitter.

This is really important to us for a variety of reasons. The other thing that we're doing is, if you have a network of devices such as an MLAN network or a network of speakers, we're making some changes in how they're presented. I wanted to talk a little bit about what happens when you hot-plug a device.

So basically, when you hot-plug a device, it'll show up. We'll create a unit for it. And we'll start building the stack that we need. So we'll build a device reference. We'll build an engine reference and start linking that up. And then we'll start building stream references for input and output.

We're going to start building stream refs for the output stream and stream refs for the input stream. What happens when you plug in a new device right now is that we'll create a new engine, which is inefficient because you're going to start getting interrupts on each engine. So what we're going to do in Panther is basically start linking the whole thing together so the entire network is presented as a single device.

So a new device is hot-plugged. We'll actually create a new stream ref for the input, but the output will be the same output device. So the way that this is going to wind up getting presented is you'll see two input devices with four streams on them and a single output device with eight streams on it. So the goal here is essentially to present a network as a single block device, which is the intention of MLan networks, but it's also, for speakers, a much more intelligent way of doing things.

Finally, I wanted to point you at some developer resources. There's driver code in Darwin. You should definitely check out IOAudioFamily if you're doing a kernel-based driver. You should also really consider whether your driver needs to be in the kernel. It's possible to hook into Core Audio and write a driver that sits in user land, and that's often a better way of doing things. For PCI cards, you do want to be in the kernel.

For FireWire, maybe you want to be in the kernel, maybe you don't. You need to think about it on a device-specific basis. We've also got sample code in the SDK for the Phantom Audio Driver, which is a great place to start. And we're looking at, hopefully by the Panther timeframe, getting out a sample PCI driver for a couple of devices, because a number of developers have been asking for that. And as always, you should check out the Audio Developer page at developer.apple.com/audio. So that's it for drivers. I want to introduce Bill Stewart to talk about Core Audio. Thanks, Greg.

[Transcript missing]

As Craig discussed earlier, when we first started with Core Audio, we wanted to set a high standard for ourselves to reach. I think this was a very important initial step to take for us because it's extremely difficult if you aim low and then have to scale up. It's a lot easier if you aim as high as you need to aim and then you can appropriately scale back.

We really looked at what was required from the pro audio market. This is not to say that this is a set of APIs that can only be used in musical applications or pro audio workstations and things. It really can scale down to games and to lower sample rates and lower quality abstractions and so forth. We wanted to start at this point. We had requirements.

We had requirements for latency and jitter that were very important to us, both for MIDI and for audio. We wanted obviously no restriction on sample rate, which was a problem with previous versions of the OS. Multi-channel awareness throughout the system, not just with devices but also with codecs, with audio units, with the whole way that we think about audio in the system. And we didn't want to be limited to just some small subset of how we can represent audio data.

We wanted to have rich abstractions that can be applied throughout the system. And the session after this one will go into some more detail about how we represent data in the different subsystems of Core Audio.

So let's start at the bottom and we'll kind of work our way up. So I'm not going to spend a lot of time on the Core Audio HAL part. This is very much a low-level interface to devices. There's a lot of abstraction here for the devices, but there's also a lot of specific device state that you need to manage.

And if you are in a situation where you really need to interact very intimately with the device, this is the API that you use. This is in the CoreAudio framework. We affectionately call it the HAL, the hardware abstraction layer. And you'll get all of the characteristics of devices published here: configuration, system device preferences, management of the device status, and all that kind of thing.
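
As a rough illustration of what the HAL layer looks like from an application, here is a minimal sketch in C using the Jaguar/Panther-era calls (AudioHardwareGetProperty and AudioDeviceGetProperty); later systems replace these with the AudioObject APIs, and the property choices here are just examples.

    // Sketch: query the default output device and some of its published state.
    #include <CoreAudio/CoreAudio.h>
    #include <stdio.h>

    int main(void)
    {
        AudioDeviceID device = kAudioDeviceUnknown;
        UInt32 size = sizeof(device);
        if (AudioHardwareGetProperty(kAudioHardwarePropertyDefaultOutputDevice,
                                     &size, &device) != noErr)
            return 1;

        // The device name is published as a C string property.
        char name[256];
        size = sizeof(name);
        if (AudioDeviceGetProperty(device, 0, false, kAudioDevicePropertyDeviceName,
                                   &size, name) == noErr)
            printf("Default output device: %s\n", name);

        // The nominal sample rate is published as a Float64 property.
        Float64 sampleRate = 0;
        size = sizeof(sampleRate);
        AudioDeviceGetProperty(device, 0, false, kAudioDevicePropertyNominalSampleRate,
                               &size, &sampleRate);
        printf("Nominal sample rate: %.0f Hz\n", sampleRate);
        return 0;
    }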

And for MIDI, we've got the Core MIDI Framework, and these are really the APIs that are published for transporting MIDI data through the system, both in and out of the system through drivers, as well as into applications. In Jaguar's Core MIDI, we have a concept of virtual sources and destinations, and we've found from a lot of developers coming from Mac OS 9 and the MIDI services that OMS and FreeMIDI provided, that one of the things that has been missing from the way the system is used is an IAC bus, which is an Inter-Application Communication Bus.

And this is basically software that looks like a driver, but it actually gives you a way to kind of connect MIDI between different apps, and it looks like you're dealing with drivers, rather than sort of having to do extra work to look for virtual sources and destinations.
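
For orientation, a minimal sketch of the Core MIDI client side being described: create a client, create an input port, and connect every published source. The read proc body here is just a placeholder; the calls themselves (MIDIClientCreate, MIDIInputPortCreate, MIDIPortConnectSource) are the Core MIDI Framework API.

    // Sketch: listen to every MIDI source the system publishes.
    #include <CoreMIDI/CoreMIDI.h>
    #include <CoreFoundation/CoreFoundation.h>
    #include <stdio.h>

    static void MyReadProc(const MIDIPacketList *pktlist, void *readRefCon, void *srcRefCon)
    {
        const MIDIPacket *packet = &pktlist->packet[0];
        for (UInt32 i = 0; i < pktlist->numPackets; i++) {
            printf("packet: %u bytes\n", (unsigned)packet->length);
            packet = MIDIPacketNext(packet);
        }
    }

    int main(void)
    {
        MIDIClientRef client;
        MIDIPortRef inPort;
        MIDIClientCreate(CFSTR("WWDC Demo"), NULL, NULL, &client);
        MIDIInputPortCreate(client, CFSTR("Input"), MyReadProc, NULL, &inPort);

        // Each source endpoint carries a full 16-channel MIDI stream.
        ItemCount n = MIDIGetNumberOfSources();
        for (ItemCount i = 0; i < n; i++)
            MIDIPortConnectSource(inPort, MIDIGetSource(i), NULL);

        CFRunLoopRun();   // incoming MIDI is delivered asynchronously to the read proc
        return 0;
    }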

So this will be a new feature in Panther. And the other thing, of course, with Core MIDI is needing to configure devices and publish device characteristics and so forth. I'm going to go to the demo machine and just... very briefly walk through the Audio MIDI Setup application, just as a way to kind of give you some sense of how all this is put together. So this is the Audio tab. We've got the Mark of the Unicorn 896 box here. We use different devices each year. We've used Emagic and Delta M-Audio devices. We thought we'd use the MOTU one this year.

This top part of the section here is the System Settings, and these are the default input and output. They're typically the devices that are used by apps like iTunes, by QuickTime Player, by games. And the user can typically see in the Sound preferences, like, these are all the output devices I've got, and they can choose which one they want to use as their default output. And another thing we introduced in Mac OS X was a distinction between the device that you would use for playing audio on, and the device that you would use for things like sys beeps and so forth. So the system output device is the device where your beeps go.

[Transcript missing]

If I go to the MIDI section now, I've got one USB device here from Roland. It's an MPU64, and it has four MIDI ins and outs. I've got a studio set up here and -- that's not Dre. I'll leave that over there. Early software. So one of the things I wanted to talk about with this app is a little bit of confusion about exactly what it's meant to be doing, and this really reflects, as the audio side does, both of these apps, if you're developing and you're familiar with the API, they really reflect the -- a lot of the structures that are in the API themselves.

And so you can sort of see this as like this is what the user sees, but also as a developer, we can -- just with a couple of grafting concepts here, we can understand what you're seeing from an API point of view. So this is a driver. This driver has what is in MIDI is called MIDI entities, and it has four MIDI entities, which are these pairs of in and out ports, and the in and out ports are MIDI end points, and each MIDI end point takes a full MIDI stream, so that's 16 channels, and you're able to, you know, talk to whatever is at the end of that stream or get the data from the end of that stream. So this configuration here has four MIDI entities, and each MIDI entity has an end point for in and out, and what we're doing here is describing the driver.

Now, what is out here is just three devices that I've added by just doing this add device thing, and I can add a new external device, and I'm not actually creating any -- it's not like I'm going to really create like a whole patch flow here. All I'm doing is describing to the system what I've got plugged into what the system knows about, which is the actual driver.

You can't really do through-connections here in the sense that they will just automatically route; you really just have to route these with your cables. So this is a very simple way of describing to the system the keyboards or the control surfaces or the modules that you have on your system, and you can save different configurations. If I unplug the USB device, the driver will go offline, and you'll see that reflected in AMS. And what can be done here is, rather than sort of showing this as port 4 of, you know, the MPU64, an app can, by querying Core MIDI's APIs, actually present my synth module as the destination to the user.

Now the thing that's missing from this that I'm sure many of you who have dealt with OMS and FreeMIDI is some way to describe what is actually in the device itself. If it's a Roland U-20 or U-220 thing, what patches does it have, and what is the current patch banks that I've had, what are the capabilities of the device? We've worked with the MIDI Manufacturers Association to describe a document format for that using XML.

The spec has been ratified and has passed through all of the MMA processes for doing that. We wanted to do this through a body rather than as an Apple sort of only initiative, just to make this something that could be broadly adopted both by manufacturers and other computer companies besides ourselves, so that for the user, there's one data format.

The document's not available yet from the MMA site, though it will be soon. We're going through the final stage of actually making the documents clearer about what it contains, so that it's an easy to understand document. You'll be able to author these XML files yourself to describe custom patches, and we hope that there will be websites available that manufacturers will publish them for their devices.

And so then the user can see my synth module, and it's a Roland synth module, or yada, yada, and it's got these patches on it. And just like with OMS and FreeMIDI on Mac OS 9, it should be as easy an experience for the user on 10. If we can go back to slides, please. Right, so that's the bottom layers. Let's get into AudioUnits. So AudioUnits, we have a logo, and that's the logo.

It took some effort, I can tell you. So, audio units: I'm going to say that these are a generalized plug-in format, and by generalized I mean that they have very many different uses. It's not just for effects, it's not just for software instruments, and what we'll do today is, later on, a demo of some other audio units. I just want to give an overview generally of what audio units are, the types of audio units that you can have, and their kind of categories and their functionality.

So how does it work? An audio unit uses the Component Manager, and we use the Component Manager because it has some nice features about it. It has discovery mechanisms: FindNextComponent. You can just make this call, and you can specify the type of audio unit that you're looking for, and we have different types. Once you find the audio unit that you want, and you can either be doing this programmatically or you can present menus based on what you've discovered to the user, then you open that component, and once you've opened the component, you get back what's called a component instance.

So you can think of a component, in object-oriented terms, as a class: it specifies an API, it specifies what a member of that class can do. And the audio unit itself, the component instance, is like an object, an instance of the class. And so AudioUnit, as a typedef, is typedef'd to ComponentInstance.
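
A minimal sketch of the discovery-and-open step Bill just described, using the Component Manager calls of that era; the choice of Apple's Delay effect as the subtype is just an example.

    // Sketch: find and open an effect audio unit via the Component Manager.
    #include <CoreServices/CoreServices.h>
    #include <AudioUnit/AudioUnit.h>

    AudioUnit OpenDelayUnit(void)
    {
        ComponentDescription desc = { 0 };
        desc.componentType         = kAudioUnitType_Effect;      // 'aufx'
        desc.componentSubType      = kAudioUnitSubType_Delay;    // example subtype
        desc.componentManufacturer = kAudioUnitManufacturer_Apple;

        // FindNextComponent walks the registered components matching the description.
        Component comp = FindNextComponent(NULL, &desc);
        if (comp == NULL) return NULL;

        // Opening the component returns a ComponentInstance, which is what the
        // AudioUnit typedef refers to.
        AudioUnit unit = NULL;
        OpenAComponent(comp, &unit);
        return unit;
    }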

So you've opened your audio unit. So what do you need to do to it? Well, it can be as simple or as complex as the type of audio unit you're dealing with and what you want to do with it. And the first step, really, for an audio unit is to look at the properties that it has. And properties really represent, in this sense of the property mechanism, the state of the audio unit. And so it is a general mechanism.

It's extensible. We define property types. You can get information about the property. How big is the property? Can I read the property or write the property or both? I can use the property mechanism to find out the state that the unit comes up in, and I can change the state of it and so forth. And it's really the way that you manage that fundamental state of an audio unit.
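
A small sketch of the property mechanism in use: query a property's size and writability, read it, and write it back. The stream format property is just one example of the kind of state being described; the 44.1 kHz stereo values are arbitrary.

    // Sketch: get/set a property on an opened (not yet initialized) unit.
    #include <AudioUnit/AudioUnit.h>

    OSStatus ConfigureUnit(AudioUnit unit)
    {
        // How big is the property, and can it be written?
        UInt32 size = 0;
        Boolean writable = false;
        AudioUnitGetPropertyInfo(unit, kAudioUnitProperty_StreamFormat,
                                 kAudioUnitScope_Input, 0, &size, &writable);

        AudioStreamBasicDescription fmt = { 0 };
        size = sizeof(fmt);
        AudioUnitGetProperty(unit, kAudioUnitProperty_StreamFormat,
                             kAudioUnitScope_Input, 0, &fmt, &size);

        // Change part of the unit's state before initialization: 44.1 kHz stereo.
        fmt.mSampleRate       = 44100.0;
        fmt.mChannelsPerFrame = 2;
        return AudioUnitSetProperty(unit, kAudioUnitProperty_StreamFormat,
                                    kAudioUnitScope_Input, 0, &fmt, sizeof(fmt));
    }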

And then once you've sort of set up the state of the unit, then we have a second phase, which is initialisation. We split these up because there's often a lot of things you might want to discover about what an audio unit can do, particularly if you're in, like, a hosting environment, before you initialise it and make it able to render and able to do its job. And so AudioUnitInitialize does the allocation the unit needs in order to actually operate.

And once an audio unit is initialised, then it's considered to be in a state to be usable. And there is really one call that you do to use it, which is AudioUnitRender. And I'm not going to go into the specific details of the API for that, but there's a lot of arguments that you can pass to this and flags and so forth, and they're fairly well documented in the headers and you can ask questions on the list.
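
To make the lifecycle concrete, a sketch of the initialize/render step, assuming the caller has already set up the unit's input (callback or connection, shown a little further below) and supplies a valid buffer list and timestamp.

    // Sketch: initialize, then pull rendered frames from output bus 0.
    #include <AudioUnit/AudioUnit.h>

    OSStatus PullFrames(AudioUnit unit, UInt32 numFrames,
                        const AudioTimeStamp *ts, AudioBufferList *ioData)
    {
        AudioUnitRenderActionFlags flags = 0;
        // Render numFrames of audio from output bus 0 into ioData.
        return AudioUnitRender(unit, &flags, ts, 0 /* output bus */, numFrames, ioData);
    }

    // Typical lifecycle around that call:
    //   AudioUnitInitialize(unit);      // allocate resources, become renderable
    //   ... call PullFrames repeatedly ...
    //   AudioUnitUninitialize(unit);    // release those resources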

Now, we're ready to go. We're going to call AudioUnitRender, but where are we going to get input data from? You can get inputs for an audio unit from two different places. We wanted to have, with audio units, an idea that we can connect them up into processing graphs, or we can use them independently, either just one-off, or maybe two or three, and then I want to provide data to it, but I don't want to have to be an audio unit to provide data to another audio unit.

So you can also have a callback function, or you can have the connection. So there's two ways that you can provide data to an audio unit, and at this point we're talking about audio data to an audio unit. And so when you call audio unit render on an audio unit, it's going to go, if it wants input, and some audio units may not want input, and we'll have a look at those in a minute.

It'll call its input proc, or its connection for its input data. When that returns, it's now got input data. It's now got input data to process. It processes that data, and then you get that back in the buffer list that you provide to audio unit render, and you're done.
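
Here is a sketch of the two input paths just described: a render callback, or a connection from another unit. The property names are from the AudioUnit headers; the callback body just writes silence as a placeholder.

    // Sketch: feed an audio unit via a callback, or via a connection.
    #include <AudioUnit/AudioUnit.h>
    #include <string.h>

    static OSStatus SilenceProc(void *refCon, AudioUnitRenderActionFlags *flags,
                                const AudioTimeStamp *ts, UInt32 bus,
                                UInt32 numFrames, AudioBufferList *ioData)
    {
        for (UInt32 i = 0; i < ioData->mNumberBuffers; i++)
            memset(ioData->mBuffers[i].mData, 0, ioData->mBuffers[i].mDataByteSize);
        return noErr;
    }

    void WireUpInput(AudioUnit effect, AudioUnit upstream)
    {
        // Option 1: a callback function supplies the input for input bus 0.
        AURenderCallbackStruct cb = { SilenceProc, NULL };
        AudioUnitSetProperty(effect, kAudioUnitProperty_SetRenderCallback,
                             kAudioUnitScope_Input, 0, &cb, sizeof(cb));

        // Option 2: connect another unit's output bus 0 to this unit's input bus 0.
        AudioUnitConnection conn = { upstream, 0, 0 };
        AudioUnitSetProperty(effect, kAudioUnitProperty_MakeConnection,
                             kAudioUnitScope_Input, 0, &conn, sizeof(conn));
    }

In practice you would use one or the other for a given input bus, matching the "callback or connection" choice in the talk.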

Well, are we done? No, not quite. Because one of the things you want to do when you're processing audio is you want to be able to tweak it. You want to be able to set delay times differently. You may want to be able to set frequencies. If you're talking about volumes, you want to change volumes and so forth. So all of these are considered as real time operations on the audio unit, things that you can do to the audio unit while you're in the process of rendering it.

And we abstract that into what we call parameters. An audio unit using the property mechanism publishes a list of the parameters that it allows the user to manipulate. It publishes things like what's the range of the parameter, what kind of values does it have, maybe dB or hertz or maybe just a generic zero to one sort of parameter.

There's a whole bunch of different types of parameters that an audio unit may publish. And we've seen some third-party units that have a couple of hundred parameters. A lot of our units may be fairly simple; they may have just two or three parameters. It really depends on what the unit is doing and what the developer of the unit wants you to be able to manipulate.
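
A sketch of the parameter mechanism described above: ask for the published parameter list, read one parameter's info (range, unit), and set it in real time. Setting the first parameter to its minimum is purely illustrative.

    // Sketch: discover a unit's parameters and tweak one while rendering.
    #include <AudioUnit/AudioUnit.h>
    #include <stdlib.h>

    void TweakFirstParameter(AudioUnit unit)
    {
        // How many parameters does the unit publish on the global scope?
        UInt32 size = 0;
        AudioUnitGetPropertyInfo(unit, kAudioUnitProperty_ParameterList,
                                 kAudioUnitScope_Global, 0, &size, NULL);
        if (size == 0) return;

        AudioUnitParameterID *ids = malloc(size);
        AudioUnitGetProperty(unit, kAudioUnitProperty_ParameterList,
                             kAudioUnitScope_Global, 0, ids, &size);

        // Each parameter publishes its range and unit (dB, Hz, generic 0..1, ...).
        AudioUnitParameterInfo info;
        UInt32 infoSize = sizeof(info);
        AudioUnitGetProperty(unit, kAudioUnitProperty_ParameterInfo,
                             kAudioUnitScope_Global, ids[0], &info, &infoSize);

        // Real-time tweak: set the first parameter to its minimum, effective now.
        AudioUnitSetParameter(unit, ids[0], kAudioUnitScope_Global, 0,
                              info.minValue, 0 /* buffer offset in frames */);
        free(ids);
    }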

So we have effects units, and that's really kind of the meat of a lot of where we think the third parties will be developing units. In Jaguar, we ship various filters. There are high-pass, low-pass, band-pass, high-shelf, and low-shelf filters. We ship a reverb unit, and the reverb unit has quality settings that you can adjust.

The quality really determines how much CPU the unit's going to take at runtime, and it actually affects the quality of the rendering. And for things like games and stuff, they may be less concerned about a very high quality and more concerned about the load that the unit's going to take.

And we have a digital delay unit. We have a peak limiter. In Panther, we've added a multiband compressor unit. It's a four-band compressor, and it's pretty nice, actually. And we're not going to publish a couple of hundred audio units as a company ourselves. This is, for us, the big part of audio units is what developers are going to do with this. This is where it can get very interesting and very bizarre sometimes.

To sort of summarize that, we create one, we've got the state management, we've got the resource allocation, we've got the rendering, and we've got the control of the rendering. Well, is that all we need? No. Typically, if you've used Logic or if you've used Digital Performer or any of these sort of hosting environment type applications, you want to present some kind of view to the audio unit so the user can interact with them.

We've published a generic view which will just query the parameters that the unit tells us and we'll assemble like a bunch of sliders and that's, you know, it's not bad but it's probably not as good as what you can do if you really understand your DSP. And so a developer of an audio unit can publish a custom view and some of the views, if you've seen these, are pretty creative and pretty interesting. In Jaguar, we only had the ability to publish Carbon views. In Panther, we're adding support for Cocoa. So you can publish a class that implements a Cocoa protocol.

And it can be discovered from asking the audio unit for its Cocoa view. And a Carbon app can put a Cocoa UI up in a separate window and a Cocoa app can put a Carbon UI up in a separate window. So there's not a lot of extra work on the hosting side to deal with both of these and we think that probably a lot of developers will be very interested in the Cocoa UI so it's there now.

And the other side of this is communication and control mechanisms. In Jaguar we have a parameter listener API. It's in the AudioToolbox framework, in AudioUnitUtilities.h, and it is an API called AUParameterListener. I always get told off when I'm wandering around, but I just can't stand still so I'm going to keep wandering.

Excuse me. So the parameter listener: when we were looking at designing this, there's two ways that you can do this. You can have a UI or an app that is going to, you know, 30 times a second or whatever, poll for the values of the parameters and see if anything's changed, and that seemed to us a less than elegant way of dealing with this.

So we decided to do a notification service and the notification service is aimed at allowing anybody who wants to know about a parameter changing on an audio unit to be able to listen to that parameter, to see that that parameter has changed and then to react appropriately. And then when they want to set a value of a parameter, if they just use the standard audio unit set parameter call, that's not going to invoke the notification. You need to use the AUParameterSet call which is in this header file. And that basically then will tell the notification mechanism that, you know, if you've got anyone who's listening for this, you better tell them about it.
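
A minimal sketch of that notification service from AudioUnitUtilities.h: create a listener on a run loop, register a parameter to watch, and set values through AUParameterSet so other listeners are notified. The 0.1-second interval and the 0.5 value are arbitrary example choices.

    // Sketch: listen for, and broadcast, parameter changes on an audio unit.
    #include <AudioToolbox/AudioToolbox.h>

    static void MyListener(void *refCon, void *object,
                           const AudioUnitParameter *param, Float32 value)
    {
        // React to the change (update a slider, a control surface, etc.).
    }

    void WatchParameter(AudioUnit unit, AudioUnitParameterID paramID)
    {
        AudioUnitParameter param = { unit, paramID, kAudioUnitScope_Global, 0 };

        AUParameterListenerRef listener = NULL;
        AUListenerCreate(MyListener, NULL,
                         CFRunLoopGetCurrent(), kCFRunLoopDefaultMode,
                         0.1 /* seconds between notifications */, &listener);
        AUListenerAddParameter(listener, NULL, &param);

        // Setting through AUParameterSet (rather than AudioUnitSetParameter)
        // is what triggers the notification to everyone listening.
        AUParameterSet(listener, NULL, &param, 0.5, 0);
    }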

We've decided to extend this a little bit in Panther and to include the ability to notify state changes from the audio unit property. And so you can do both parameter changes and property changes using this notification mechanism. Now one of the important things about audio units with all of this kind of stuff is that we're never going to know everything that we need to know about every possible audio unit that every developer is ever going to write. And so there are mechanisms in place to be able to share private data with private property IDs and all this kind of thing between your audio units and your views or this kind of stuff.

And so this mechanism can be used to communicate that there's a need to go and get some state that may be in the audio unit if you're the view. And it can be done in a thread-safe manner. So you can do that. You can call this sort of API from your render thread as well as from a UI thread.

The other weakness that I think we had in Jaguar with the Carbon UI is that we had this idea if you're doing automation that you need to know like start and end state as you're doing automation. And we sort of put that into the view and that really wasn't the right place for that.

And it kind of restricted some of the use of it where the audio unit may know that it's a start and end state, not the view. And so we added to the Panther services for this, this idea of a begin and end of a parameter ramp state. So you could imagine if you've got a control surface, when the user touches a control, some of these control surfaces are sensitive to actual touch.

So you could touch the control and then that would be a signal to say, hey, I'm about to start ramping this parameter. And the UI could reflect that the user's touched that control, so we've got a control with the changing button in the UI.

And by putting this into the controller, this means that we can also support this with Cocoa UIs as well as Carbon UIs without having to add additional logic to the Cocoa UI. So I think this is a very nice addition to this. And as I said before, this is real time thread safe and we'll continue to work very hard to ensure that that remains true.

[Transcript missing]

One of the things that we're working on, this is to address some concerns that were raised by Waves, who are a very large developer of audio processing plugins. And they wanted to have a way to do offline rendering. Offline rendering is typically done when you want to actually process a file of data, and you want to look at the whole contents of the file, not just in real time. All of the AudioUnit development that we've done up until this particular unit is really aimed at working in real time.

And so it has constraints about having to work in real time. You need to report latency. You need to report how long it takes you to process sounds in terms of the time delay between input and output. With an offline unit, you need to be able to look at the data, all of the data that you're going to be asked to process before you process. So there are two render phases with an offline unit.

There's an analyze phase, and then there's a render phase. So if you think of reversing the sample order in a file, you need to actually start at the end and work your way back. If you think of an offline unit that may normalize, you need to look at the whole audio data before you start to process it so you can do the normalization.

There are not really any additional API changes for this. There are a couple of different flags. There are a couple of properties that are specific for an offline unit. There's a new audio unit type, 'auol', audio unit offline. It's not in your Panther headers yet because we're still revising and discussing this with Waves and some other companies.

This will be published in Panther and we're getting pretty close. If people are interested, they can contact me and I'll send them the spec and the code that we've got at the moment. We will ship an audio unit that does reversal in the SDK at some point as an example and there will be code there as well as to how you host these offline units.

Okay, so when I said generalized audio unit, we're still sort of like in the general field at the moment about normal types of audio units. Now, let's look at some abnormal type of audio units. And one of those types is mixer units. So in Jaguar, we had two mixers, a stereo mixer that takes mono or stereo inputs and gives you a mixed single stereo output.

And in Jaguar as well, we had a 3D mixer. The 3D mixer will take multiple inputs. It will have a single output, and it will be in either two-, four-, or five-channel output. And the four-channel is like a quad setup, which is pretty much what we've got in the room today, and 5.0, where we don't actually do the 3D stuff into the LFE channel, we just do the five channels.

And what you can do with this 3D mixer is quite a lot. You can pan and localize within a 3D space. You have a number of on/off options. And I'll get Chris Rogers to come up, and we're just going to give you a fairly quick demo of the 3D mixer.

Well, thanks, Bill, for setting me up there. Last year, I gave a more complete demonstration of the 3D mixer, but some things have changed since last year because developers have made some requests of the mixer. So we put some new features into the 3D mixer, and we can have a look right here.

What I have right here is a simple little app, a simple little user interface onto the 3D mixer, and it has a number of different controls for choosing the type of rendering for the source. In this demo, there are going to be three sources, or there can be up to three sources, and you can choose equal-power panning, a simple spherical head model, or an HRTF model; these first three are for rendering to stereo output. The last two, sound field and vector-based panning, can be used for stereo, quad, or 5.0. Then over here, we have some check boxes that let certain features of the mixer be enabled or disabled for individual sources.
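
The rendering choices Chris is showing map onto a per-input-bus property and a handful of parameters on the 3D mixer. A sketch, assuming `mixer` is an instance of Apple's 3D mixer (kAudioUnitSubType_3DMixer); the azimuth and distance values are arbitrary.

    // Sketch: pick a spatialization algorithm and position one source.
    #include <AudioUnit/AudioUnit.h>

    void SpatializeSource(AudioUnit mixer, UInt32 inputBus)
    {
        // Rendering type for this source (equal power, spherical head, HRTF,
        // sound field, vector-based panning, ...), set per input bus.
        UInt32 algo = kSpatializationAlgorithm_HRTF;
        AudioUnitSetProperty(mixer, kAudioUnitProperty_SpatializationAlgorithm,
                             kAudioUnitScope_Input, inputBus, &algo, sizeof(algo));

        // Localize the source in 3D space: 45 degrees to one side, 5 meters away.
        AudioUnitSetParameter(mixer, k3DMixerParam_Azimuth,
                              kAudioUnitScope_Input, inputBus, -45.0, 0);
        AudioUnitSetParameter(mixer, k3DMixerParam_Distance,
                              kAudioUnitScope_Input, inputBus, 5.0, 0);
    }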

Down here we have master volume control. And here this is kind of an obscure slider that controls distance attenuation. That is when sources -- when sound sources get farther away, they get quieter. But how much quieter do they get, say if they're 10 meters away or 100 meters away, how much quieter do they get? This can control what that curve is at the falloff. And that's a feature that developers have been asking for.

Down here in this part of the display, we have some meters, which the 3D mixer now supports live metering. You can meter the input levels and output levels, both RMS power and peak levels. And metering is something that we put into a couple of our audio units, and later on today we'll see that new audio unit matrix mixer also supports this, which is kind of interesting.

So maybe I should just bring a source in and

[Transcript missing]

Maybe I've turned all my sources off. Okay, here we go. So, I'm using vector-based panning right here, and I've turned these sources off, so we're not listening to those, we're just listening to this blue one here. But if I put this dot right here, then we're essentially just in the center channel.

Hold on, we have the helicopter coming in, I'm sorry. There he is. Let me turn him off. OK. I think now maybe our meters will show this a little better. OK. So, I'm about straight ahead here. I should be coming out of the center speaker. And the channel ordering down here is left, right, surround left, right, center.

[Transcript missing]

Okay. Now, let me change this sound.

[Transcript missing]

It looks like we're running a little bit short on time, so I'll try to wrap it up here. Basically, The new features that are the most important are the ability to turn off or on individual features here. And that affects the performance that you're going to get. This distance filter is a lowpass filter that makes sounds sound more muted as they're getting further away. And some developers had some comments about that, that they didn't necessarily want their sounds to get more muted. So there's a way that that can be turned on or off.

And any of these characteristics can be turned on or off separately. And as far as performance goes, we've made some optimizations to the mixer. And to give you an idea of the kind of performance that you can get for an individual source using equal power panning and stereo, I think on a pretty modest machine like an 800 megahertz G4, you can get a single equal-power source at, I think it's 0.18% of the CPU.

And HRTF, which would be our high-quality stereo panning mode, that's at about .55% of the CPU per source, so a little tiny bit more than half a percent. Now I have a... We'll just move on, because we're running out of time. Okay. Go back to the slides. Thank you, Chris.

The other demo that we were going to show, but we're running a bit short, is a Varispeed unit, and that's also new in Panther. You're able to have your sound come into the Varispeed, and it can go faster or slower. It can do kind of a chipmunk effect. We'll show you that later on. We'll be in a lab tomorrow from 1:00 to 6:00, the QuickTime lab, and we can give you a demo there if you like.

Okay. So in Panther as well, we have a matrix mixer. Matrix mixer is a very powerful beast, and we'll be going into some detail about that in the next session. All mixers have metering capabilities in Panther, so you'll see some of that in the next demo, next session. The other type of audio unit that we have is a converter unit. We'll be talking about the audio converter in the next session as well, and this brings some of the functionality of the audio converter into the audio unit world.

Essentially, all of the conversion operations to do with PCM audio, so sample rates, bit depths, channel mapping, and so forth. And all of this is configured with properties, and actually, there's very little configuration of it for you to do because just describing to the audio unit what your input and your output format is, is enough to tell the converter what work it should do.

The Converter Unit's functionality is included as a part of the Output Unit. The Output Unit is interfaced to a device. There is no additional latency, there is no runtime cost to you for dealing with the Output Units. You can manage how much work the Output Unit does for you by seeing the difference between the format of the device, which is on the Output Unit, and the format that you provide it.

In Jaguar, we only did output, and now our output units do input, so that's why they're rabbit-ears output units now. In Panther, the output units will do output as they do in Jaguar on bus zero, and on element one, or bus one, you can do input. And so what does this look like? It looks like this. So I'm going to use the slide. You know, they hate us using these things, but I love it.

So here's your output unit. On the output side, here is your device output, and this is on bus zero, and then this is either a callback or a connection that you make to the audio unit, and this is if you're using this unit, this is what you're doing today. In Panther, you can also see if there is input available. There's a property to query that, and then the device's input is actually on the input side of the output unit. Confused yet? It took me a while.

And then the application actually calls audio unit render for bus one, and that's where it gets the input from the device. And it can do the conversion for you as well. So if your device is 20 channels and you only want four channels, then it will remove those 20 channels and just give you the four channels, and you can tell the output unit which four channels that you want from the device, including rate conversion, all this sort of thing. And if your device just has input, and you need to know when to get input, then you can get a callback, and it will tell you, hey, the input's ready, go and get it. And so that's new in Panther. Thank you.
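
A sketch of the Panther-era arrangement just described, assuming an output unit of subtype kAudioUnitSubType_HALOutput: output lives on element 0, device input on element 1, and input is switched on with kAudioOutputUnitProperty_EnableIO.

    // Sketch: enable device input on an output unit (AUHAL-style).
    #include <AudioUnit/AudioUnit.h>

    void EnableInputOnOutputUnit(AudioUnit outputUnit)
    {
        UInt32 on = 1;

        // Element 1 = the input side of the output unit (the device's input).
        AudioUnitSetProperty(outputUnit, kAudioOutputUnitProperty_EnableIO,
                             kAudioUnitScope_Input, 1, &on, sizeof(on));

        // Element 0 = the output side (normally on by default for an output unit).
        AudioUnitSetProperty(outputUnit, kAudioOutputUnitProperty_EnableIO,
                             kAudioUnitScope_Output, 0, &on, sizeof(on));

        // After AudioUnitInitialize and AudioOutputUnitStart, the app pulls device
        // input by calling AudioUnitRender on bus 1, typically when the unit
        // signals that input is ready.
    }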

Okay, so audio units, we talked about connection, we talked about wanting to be able to connect all these up and we have a structure in the audio toolbox APIs called AUGraph and the AUGraph manages the connections between these units. It'll manage the state, it has a very abstract representation of the graph and then the graph itself has a couple of different states. You open the graph and that'll basically open the audio units that you've described as being a part of your graph, then you can initialize them and then you can actually start the graph running. And you can update the state of the graph while it's changing.

You can make or break connections, you can have a bunch of audio units sitting off to the side that may be one channel, one chain into a mixer or something and then you can just connect that chain in, play your sound, disconnect it and the graph will do all of this.
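
A small sketch of building such a graph. It uses the later AUGraphAddNode/AUGraphNodeInfo spellings of these calls (the Panther-era headers used AUGraphNewNode/AUGraphGetNodeInfo for the same steps); the 3D mixer feeding the default output unit is just an example topology.

    // Sketch: a two-node AUGraph, mixer -> default output, opened and started.
    #include <AudioToolbox/AudioToolbox.h>

    void BuildAndStartGraph(void)
    {
        AUGraph graph;
        NewAUGraph(&graph);

        AudioComponentDescription mixerDesc  = { kAudioUnitType_Mixer,
            kAudioUnitSubType_3DMixer, kAudioUnitManufacturer_Apple, 0, 0 };
        AudioComponentDescription outputDesc = { kAudioUnitType_Output,
            kAudioUnitSubType_DefaultOutput, kAudioUnitManufacturer_Apple, 0, 0 };

        AUNode mixerNode, outputNode;
        AUGraphAddNode(graph, &mixerDesc, &mixerNode);
        AUGraphAddNode(graph, &outputDesc, &outputNode);

        // Mixer output bus 0 -> output unit input bus 0.
        AUGraphConnectNodeInput(graph, mixerNode, 0, outputNode, 0);

        AUGraphOpen(graph);        // opens the underlying audio units
        AUGraphInitialize(graph);  // allocates their resources
        AUGraphStart(graph);       // starts rendering

        AudioUnit mixer;
        AUGraphNodeInfo(graph, mixerNode, NULL, &mixer);  // grab the unit to tweak it
    }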

It'll manage the different threads: the thread that you're doing the rendering in, the thread that you're doing the state management from. And it'll just make this a lot simpler to do. You can write your own code, but if you're going to do this, it's a good API to look at. Another API in the toolbox is the Music Sequence API. Music sequences are a collection of events, and there's really two different types of events, and we have two different types of tracks; they're still just basically a track.

A music sequence has some concept of tempo. Music sequences talk about their timing in terms of beats, and so a tempo describes how many beats per minute. If you want to deal in seconds, you can just make a sequence with a tempo event of 60 beats per minute and then you can deal in seconds.

And then you've got any number of events tracks and the events tracks can take any number of different types of events. You can have MIDI events, you can have user events where you can put your own data in there and you can have parameter events so you can talk directly to particular parameters on different audio units. And of course the connection is that you have a sequence and you connect that up to a graph and we'll show you that in a minute.

You can create a sequence from a MIDI file, or you can create a sequence programmatically. Once you've created a sequence, you can save the sequence off to a MIDI file. We'll only save the MIDI-type events that are in that sequence to the MIDI file, obviously. We won't do the other ones. And you can iterate over the events. You can loop, mute, and solo tracks when you're playing them. And you can copy, paste, insert, and edit them, and these can be edited while the sequence is actually being played in real time.

And you can target the tracks that are in a sequence to two different destinations. You can target them to a specific AudioUnit that's in a graph that you've attached to the sequence, or you can target a sequence's track to a MIDI endpoint, so you could be sending from a sequence that you're playing directly out to a MIDI device.
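
As a rough sketch of that workflow: load a standard MIDI file into a MusicSequence, attach an AUGraph, and play it with a MusicPlayer. MusicSequenceFileLoad is the later spelling of the load call (earlier systems used MusicSequenceLoadSMFWithFlags), and the assumption is that the caller already built a suitable graph, for example as in the earlier AUGraph sketch.

    // Sketch: play a MIDI file through an AUGraph with a MusicPlayer.
    #include <AudioToolbox/AudioToolbox.h>

    void PlayMIDIFile(CFURLRef midiFileURL, AUGraph graph)
    {
        MusicSequence sequence;
        NewMusicSequence(&sequence);
        MusicSequenceFileLoad(sequence, midiFileURL, kMusicSequenceFile_MIDIType, 0);

        // Attach the graph; tracks target units in it (an instrument/output chain).
        MusicSequenceSetAUGraph(sequence, graph);
        // A track can instead be sent to hardware with MusicTrackSetDestMIDIEndpoint.

        MusicPlayer player;
        NewMusicPlayer(&player);
        MusicPlayerSetSequence(player, sequence);
        MusicPlayerStart(player);
    }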

[Transcript missing]

Okay, so those are the four frameworks: Core Audio for drivers, Core MIDI for MIDI. The AudioUnit framework really just publishes the API for the extendable AudioUnits and Codecs, and then the toolbox is for formats and files. We'll be talking a lot about the format stuff in the next session.

Wrap-up. Roadmap, there's some of the sessions. We have the Audio and QuickTime session tomorrow, which is going to be a very interesting session, I think, for you to go to. Feedback forum for us is after then, and Nick and Craig and myself and others will be there. Who to contact? And for more information, we've got the Audio Technologies for Mac OS X website, developer.apple.com/audio.