Application Frameworks • 1:09:55
With Mac OS X, professional-level audio is designed right into the OS, featuring ultra-low latency, high resolution, and multichannel capabilities, and it is flexible and extensible. This session presents an overview of Apple audio technologies, system services, drivers, and hardware. We discuss Audio Units and MIDI, and provide insight into the design strategy and fundamental paradigms implemented throughout audio on Mac OS X. We address all the APIs, so view this session, especially if you are new to audio on Mac OS X.
Speakers: Craig Linssen, Nick Thompson, Bill Stewart
Unlisted on Apple Developer site
Transcript
This transcript was generated using Whisper; it may contain transcription errors.
I'm Craig Linssen, Music and Audio Partnership Manager at Apple. And today's session, you're going to learn why Mac OS X audio truly does rock. So what we're going to cover today, we're going to go over some developer market opportunities. We at Apple, we're excited about our advancements in the music and audio space recently. And we'd like you to be able to share in our success. The market is really large, and I'm going to go over a couple of slides to show you just how large that market is, and hopefully show you some new opportunities that you can tap into with your applications.
We're going to cover some audio device driver changes in Panther. And we're going to go over the core audio architecture design objectives. Finally, we're going to dive right into the APIs and show you how to get started. Whether you're a game developer, a music and audio developer, or you just want to input, output, or process music or audio with your application, you're going to learn about the APIs that you need to get started today.
So, market. The following is a slide of the estimated number of music and audio creators in the United States today. At the top of the pyramid is the pro market. There are approximately 50,000 professionals who are doing music and audio day in and day out and making a living at it. 100% of those individuals are using a computer in their creation process.
Now, in the middle of the pyramid is the aspiring professional market. These are individuals who aspire to be at the top of the pyramid. They aren't quite making a living at it, but they spend a lot of money on their gear, and it's a very large market. However, only about 30% of those use a computer in their music and audio creation process. And at the bottom of the pyramid, you'll see the creative individuals: 55.9 million individuals in the United States who are creatively inclined to do music and audio in some fashion or another. These could be guitar players, people who are just learning to play a musical instrument, students, DJs, hobbyists, etc.
But only about 5% of those individuals currently use a computer in their music and audio creation process. And one of the things we want you to be thinking about in today's session is how you can be using our Core Audio API services to create applications that are easy to use and are going to show these individuals how they can use a computer in their music and audio creation process.
This represents about 26% of the United States population. That's one in four individuals: your market. It's a really big market, and there are a lot of opportunities there for you to tap into. So let's drill down into it just a little bit further. The purple wedge that you see up here represents guitar players. Now, there are about 28 million guitar players in the United States. A lot of guitar players.
15 million keyboard players, 9 million DJs, remixers, and producers, and the rest of the pie chart breaks down into brass, woodwind, percussion, orchestral, et cetera. And again, the point I want to drive home here is that not a lot of these individuals are even aware that they can use a computer in their music and audio creation process. It's never been easier than today to create a music and audio application on the Mac. Using our Core Audio API services, you're going to find out today just how easy it is.
So what were some of our design objectives in our Core Audio API services? Well, two things-- ease of use and performance. We wanted simplified user configuration for the end user and streamlined development for the developer. Performance that was built with the professional in mind. In a lot of cases, it was built with direct feedback from the professional music and audio developer and musician communities.
It's the most feature-rich audio platform out there. Core Audio is multichannel and high resolution, with bit depths up to 24-bit and sample rates of 96 kHz and beyond. It's extremely low latency. And we have native MIDI support built in. If you've ever looked in your Utilities folder, you might have seen this little keyboard icon up here, which represents Audio MIDI Setup. You can configure your entire audio and MIDI studio with this utility. Very cool stuff.
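To give a concrete sense of what "24-bit, 96 kHz" means at the API level, here is a sketch of how such a stream format might be described. The struct below is an illustrative mirror of a few fields of Core Audio's AudioStreamBasicDescription, invented for this sketch; real code would include <CoreAudio/CoreAudioTypes.h> and fill in the full structure.

```c
#include <stdint.h>

/* Illustrative mirror of a few AudioStreamBasicDescription fields;
   real code uses the full struct from <CoreAudio/CoreAudioTypes.h>. */
typedef struct {
    double   mSampleRate;        /* frames per second, e.g. 96000.0      */
    uint32_t mChannelsPerFrame;  /* e.g. 2 for stereo                    */
    uint32_t mBitsPerChannel;    /* e.g. 24                              */
    uint32_t mBytesPerFrame;     /* channels * container bytes per sample */
} StreamFormatSketch;

/* Build a linear-PCM-style description. 24-bit samples are commonly
   carried in a 4-byte container, hence the separate container size. */
static StreamFormatSketch make_format(double rate, uint32_t channels,
                                      uint32_t bits, uint32_t containerBytes) {
    StreamFormatSketch f = { rate, channels, bits, channels * containerBytes };
    return f;
}
```

A 24-bit, 96 kHz stereo stream with 4-byte containers would then carry 8 bytes per frame.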
We have class driver support: plug-and-play support for all of your hardware-spec-compliant USB and FireWire audio and MIDI devices. We highly encourage all the hardware developers in the audience to really look at making your hardware devices spec compliant. It's really important. It cuts down on your development time and costs, and it makes it much, much easier on the end user. You just plug the device into Mac OS X and it will work. No need to worry about device drivers.
And finally, audio units and audio codecs. Apple platform standards which extend the capabilities of the operating system via DSP plugins, virtual instruments, and audio codecs, making it much, much easier in your development. Now, we're going to get into Audio Units and Audio Codecs in a lot more detail in Bill's session coming up here in just a few minutes. But one of the biggest questions I get from developers about Audio Units is, well, who uses this?
The following slide should really speak for itself. A really large development community has already sprung up around audio units, and it's getting bigger every day. There's a lot of individuals out there who are very excited about the work that Bill Stewart's team has done, and he's going to get up here in just a second and get into a little more detail. So I'd like to bring up Nick Thompson. He's going to talk about audio device drivers in Mac OS X.
Thanks, Craig. Hi. My name's Nick Thompson, and I manage the audio driver team in Apple's hardware division. I wanted to give you an update on-- kind of talk about the drivers that are in Darwin and how that's structured, and also give you a little bit of an update on what we've done for the Power Mac G5 and changes that you might need to make for your product. And then I wanted to cover class drivers for USB and FireWire and point you to some resources for when you're writing your own drivers.
So looking at the structure, kind of a block diagram of kernel-based drivers on Mac OS X: they're all pretty much based on IOAudioFamily. When you look at the built-in drivers, you'll see there's basically a superclass, AppleOnboardAudio, which encapsulates most of the common chip features, and then we write plug-ins for each chip specific to the platform. So when you're looking at the code in Darwin, that's kind of the structure of it. We also developed class drivers for USB and FireWire. The source code for the FireWire driver isn't in Darwin currently, and we're planning on getting that in there after Panther.
For the Power Mac G5, there are a couple of new things that you've probably already heard about. I think the biggest one is that this is the first computer from Apple with built-in digital I/O for audio. And we're really excited about this. We've also made some improvements to the analog section. It's basically similar to previous computers, but we've added support for 24-bit data as well as 16-bit data, and also added support for different sampling frequencies. So previously it was basically CD-quality output; now you can have 44.1, 48, or 32 kHz at different bit depths. And I want to talk a little bit about some of the changes that are needed for devices and device drivers on the new computer platform.
The really cool thing about the digital section is that we do clock recovery on input. This means that you can do bit-accurate copies of original source material, which, if you're a musician, really matters to you. The other thing, as I mentioned, is that on both the analog and digital parts we've got support for new sampling frequencies and different sample sizes. The connector that we use is basically a Toslink connector, and the spec that covers this is IEC 60874-17. It's basically the standard friction-lock connector that you'll see on most consumer A/V devices that have optical connectors on them.
So on the back panel, you'll see the two connectors third and fourth down from the top are the optical connectors. We've put a little door on there so that you don't see the red light. We're testing with every piece of consumer A/V gear that we can find, and the cable is just the standard Toslink cable that you'll be familiar with if you've come across optical gear before.
I wanted to talk a little bit about AC-3 support. Obviously, because we support digital output, it's possible to stream either PCM data or encoded streams across there. There's an important thing to know about encoded streams. You can either output data from all apps as PCM data, mixed together in the same way that the analog section is today, or you can have one single AC-3 encoded stream, if you have, say, a DVD player app. It's important to know that you won't hear system alerts if you're in this mode, which we've termed hog mode.
If you're developing PCI cards, there are some changes that you need to be aware of. The biggest one is that we're no longer supporting 5-volt signaling on cards. You can tell whether a card is keyed for 5-volt signaling: there's basically a notch at the back end of the card. If it has both notches removed, it's a 3.3-volt universal card, so you're okay. We're encouraging you, if you have cards, to visit the compatibility lab and make sure that they work in the new computer. The G5 labs are downstairs.
There are also some changes that you'd want to make in a kernel-based device driver. Basically, we need you to make your drivers 64-bit safe, and we've seen a number of drivers from third parties that currently don't work. If you have any questions about this, come find me. We're happy to work with developers and make machines available if you can come to Cupertino so that you can get your stuff working. Basically, there are a bunch of macros that are kind of ambiguous right now; they've been replaced with specific macros for the word size that you're using.
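The 64-bit-safety point boils down to never assuming a word size. As a minimal illustration of the idiom (not actual driver code, and the struct here is invented for the sketch): use the fixed-width types from <stdint.h> for anything with a defined hardware width, rather than word-size-ambiguous types like `long`, whose size differs between 32-bit and 64-bit kernels.

```c
#include <stdint.h>

/* A hardware descriptor field with a defined 32-bit width should be
   declared with a fixed-width type, not `long`: `long` is 4 bytes on a
   32-bit kernel but 8 bytes on a 64-bit one, silently changing layout. */
typedef struct {
    uint32_t controlBits;   /* always 4 bytes, on any kernel       */
    uint64_t dmaAddress;    /* room for a 64-bit physical address  */
} RegisterSketch;

/* The payload size a driver can rely on regardless of word size
   (summed per member to sidestep struct padding). */
static unsigned long register_sketch_size(void) {
    return sizeof(uint32_t) + sizeof(uint64_t);
}
```

With ambiguous types, the equivalent of `register_sketch_size()` would change between kernels; with fixed-width types it is 12 bytes everywhere.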
You also need to remove any assumptions in your driver about logical and physical addresses being the same. When we were porting the AppleOnboardAudio drivers, we came across places where we were making these assumptions and things weren't working. We had a call in there, physical-to-KV, which maps physical addresses to kernel virtual addresses. You need to work around these issues. And finally, you also need to make sure that you prepare memory for I/O. If you do this, a DART entry is created and the world is good. If you don't, memory isn't where you think it is.
Let's talk about the class drivers a bit, first USB. Basically, using our driver is going to save you development costs, so you should try to make sure that if you're developing a USB device, it conforms to the USB audio spec. We've implemented support for everything we've seen so far. And I think the message of this slide is: if you have a device that you're working on with a mixer unit, for example, let us know and we'll make sure that the driver works with it.
We're also working on the USB audio device 2.0 specification, which is different from USB 2.0; it gets a bit confusing here. But this is basically the specification for USB audio devices on both full-speed and high-speed USB. We're going to make sure that we provide support for those new devices in the class driver. If you are working on a USB 2.0 audio device, let us know. We're interested in taking a look at it and making sure that we work with your hardware.
This year's going to be a big year for FireWire, I think. We're working with a number of developers doing FireWire products. Again, using the standards-based driver is great for both us and you. It reduces your development costs. And for us, it makes it easier for customers. When they buy a device, they can plug it into their Mac, and it should just work.
Basically, there are kind of two flavors of FireWire devices out there at the moment. There are AV/C devices, and we're seeing silicon from BridgeCo and Oxford Semiconductor for those. And there are mLAN devices from a variety of manufacturers, including Yamaha, Korg, Otari, PreSonus, Kurzweil, Apogee, and the list is actually growing. And we're aiming to support basically any silicon that anyone comes up with that's class compliant.
In Panther, we've made some changes. The driver that we shipped in Jaguar was essentially a driver for speakers. We're now adding music subunit support. So we're providing input and output support, different numbers of input and output streams, and support for MIDI. We're also working on lowering the latency and jitter. This is really important to us for a variety of reasons. The other thing that we're doing: if you have a network of devices, such as an mLAN network or a network of speakers, we're making some changes in how they're presented. I wanted to talk a little bit about what happens when you hot-plug a device.
So basically, when you hot-plug a device, it'll show up, we'll create a unit for it, and we'll start building the stack that we need. We'll build a device reference, build an engine reference and start linking that up, and then start building stream references for the output stream and for the input stream.

What happens when you plug in a new device right now is that we'll create a new engine, which is inefficient, because you're going to start getting interrupts on each engine. So what we're going to do in Panther is basically start linking the whole thing together so the entire network is presented as a single device. When a new device is hot-plugged, we'll actually create a new stream ref for the input, but the output will be the same output device. So the way this is going to wind up getting presented is you'll see two input devices with four streams on them and a single output device with eight streams on it. The goal here is essentially to present a network as a single block device, which is the intention of mLAN networks, but it's also, for speakers, a much more intelligent way of doing things.
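The aggregation Nick describes can be sketched in a few lines. This is an illustrative model (the structs are invented for the sketch, not the IOAudioFamily types): each hot-plugged device keeps its own input presentation, but the output streams are pooled onto one shared output device.

```c
/* Illustrative model of Panther's FireWire network aggregation:
   inputs stay per-device, outputs merge into one shared device. */
typedef struct {
    int inputStreams;    /* streams presented on this device's input  */
    int outputStreams;   /* streams this device contributes to output */
} FWDeviceSketch;

/* Total streams on the single shared output device for a network. */
static int aggregate_output_streams(const FWDeviceSketch *devs, int n) {
    int total = 0;
    for (int i = 0; i < n; i++)
        total += devs[i].outputStreams;
    return total;
}
```

For the two-device example in the talk (four streams each), the network presents two 4-stream input devices but a single 8-stream output device.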
Finally, I wanted to point you at some developer resources. There's driver code in Darwin; you should definitely check out IOAudioFamily if you're doing a kernel-based driver. You should also really consider whether your driver needs to be in the kernel at all. It's possible to hook into Core Audio and write a driver that sits in user land, and that's often a better way of doing things. For PCI cards, you do want to be in the kernel. For FireWire, maybe you do, maybe you don't; you need to think about it on a device-specific basis. We've also got sample code in the SDK for the Phantom Audio Driver, which is a great place to start. And we're looking at, hopefully by the Panther time frame, getting out a sample PCI driver for a couple of devices, because a number of developers have been asking for that. And as always, you should check out the audio developer page at developer.apple.com/audio. So that's it for drivers. I want to introduce Bill Stewart to talk about Core Audio. - Thanks, Nick.
Okay, so thank you all for coming. So what is Core Audio? I hope I can tell you something about it. When we started on this, we had a number of goals, and we'll go through some of those. But basically, Core Audio is an API that applications can use to access audio and MIDI services. We're talking about access to devices. We present a unified API regardless of the device that you're dealing with, whether that's audio or MIDI. We wanted to provide extensibility mechanisms for doing software processing, and that covers both audio units as well as codecs. And we also wanted to provide some general audio sequencing sorts of services that could be used in games, could be used in audio music applications, and just to basically give you a set of services that can satisfy the kinds of things that you need to do.
That's what we're going to cover in this session. This diagram is probably a good picture of what Core Audio looks like to many of you now: a bunch of things that are kind of all over the place. What I want to try to do today is, by the end of this session, get to a point where you can see this diagram and have some sense of how it all fits together, so it's not just a bunch of weird names, some of which start with AU, some with Audio, and some with MIDI in them, all kinds of confusing terms. Okay, so there are four frameworks in the OS that you need to look at for headers and for linking against: the Core Audio framework, Core MIDI, Audio Unit, and Audio Toolbox. And then there are different modules that load dynamically: units, codecs, and MIDI drivers. This is all user-level space, and then of course you've got the kernel drivers from I/O Kit for audio.
As Craig discussed earlier, when we first started with Core Audio, we wanted to set a high standard for ourselves to reach. I think this was a very important initial step for us, because it's extremely difficult if you aim low and then have to scale up; it's a lot easier if you aim as high as you need to and then scale down appropriately. So we really looked at what was required by the pro audio market. And this is not to say that this is a set of APIs that can only be used in musical applications or pro audio workstations. It really can scale down to games, to lower sample rates, lower-quality abstractions, and so forth. But we wanted to start at this point. So we had requirements for latency and jitter that were very important to us, both for MIDI and for audio. We wanted obviously no restriction on sample rate, which was a problem with previous versions of the OS. And we wanted multichannel awareness throughout the system: not just with devices, but also with codecs, with audio units, with the whole way that we think about audio in the system.
And we didn't want to be limited to just some small subset of how we can represent audio data. We wanted to have rich abstractions that can be applied throughout the system. And the session after this one will go into some more detail about how we represent data in the different subsystems of Core Audio.
So let's start at the bottom and kind of work our way up. I'm not going to spend a lot of time on the Core Audio HAL part. This is very much a low-level interface to devices. There's a lot of abstraction here for the devices, but there's also a lot of specific device state that you need to manage.
And if you are in a situation where you really need to interact very intimately with a device, this is the API that you use. This is in the Core Audio framework. We affectionately call it the HAL, the Hardware Abstraction Layer. And you'll find all of the characteristics of devices published here: configuration, system device preferences, management of device status, and all that kind of thing.
And for MIDI, we've got the Core MIDI framework, and the APIs published through here are really for transporting MIDI data through the system, both in and out of the system through drivers, as well as into applications. In Jaguar's Core MIDI, we have a concept of virtual sources and destinations, and we found from a lot of developers coming from Mac OS 9 and the MIDI services that OMS and FreeMIDI provided that one of the things that has been missing is an IAC bus, an inter-application communication bus. This is basically software that looks like a driver, but it actually gives you a way to connect MIDI between different apps, and it looks like you're dealing with drivers rather than having to do extra work to look for virtual sources and destinations. So this will be a new feature in Panther. And the other thing, of course, with Core MIDI is the need to configure devices and publish device characteristics and so forth.

I'm going to go to the demo machine and very briefly walk through the Audio MIDI Setup application, just as a way to give you some sense of how all this is put together. So this is the audio tab. We've got the Mark of the Unicorn 896 box here. We use different devices each year; we've used eMagic and M-Audio Delta devices, and we thought we'd use the MOTU one this year. The top part of the section here is the system settings, and these are the default input and output. They're typically the devices that are used by apps like iTunes, by QuickTime Player, by games. The user can see in the Sound preferences all the output devices they've got, and they can choose which one they want to use as their default output. Another thing we introduced in Mac OS X was a distinction between the device that you would use for playing audio and the device that you would use for things like sys beeps and so forth.
So the system output device is the device where your beeps go.
And the bottom part here is device-specific configuration. You can see that the 896 has different clock sources. You also have, in Core Audio, this concept that a device can have different streams, and this is a way that a device can publish its capabilities. So on the 896, we have two streams. The first stream has eight-channel capabilities: this would be the analog 8-channel I/Os and the 8-channel digital I/Os on the device. The second stream gives us two channels, and that would be the S/PDIF input or output. And you can see here I've got sample rate options for the streams; typically a device will run at the same sample rate across the different streams.

One of the optimizations that we didn't do enough work on in Jaguar, but have rectified in Panther, is turning streams on and off. You can think of a stream as kind of a logical unit on a device. You can turn a stream off, and that can tell the driver to not do work on that stream. This can be done on an application-by-application basis, and it can make the load of running the system a lot less than if you're just running a device that may have 50 I/O streams. Another thing that we're doing for Panther is trying to address the problem of how to configure your setup for surround systems, different speaker orders, and so forth.
On your Panther CD you'll see there's this panel for speaker configuration. Now, you won't see this if all you're looking at is a device with two channels, because it's just two channels and everyone pretty much understands left and right. But if you have a device like the 896 or the Metric Halo devices or PCI cards or some of the USB devices with more than two channels, you'll see this speaker configuration utility, and I can basically tell the system the way that I've got this wired up in the studio or in the home or wherever: channel one is going to be this speaker, channel two is that speaker, and you can set these up for different types of surround.
Now, you can't have all these different surrounds active at the same time, so really you choose the surround speakers that you've got in your particular location, and then you basically say, okay, I've got left front here, and I can make a sound and get that sound out there. We're going to change that to pink noise rather than a sine wave. I can get the center sound and right surround and so forth. That can make the whole configuration a little bit simpler for the user. And then we will do mapping in the output units to remap channels, and there will probably need to be some API changes for that. We would push this into the existing output unit, but we're not sure at this point if we can do that without actually doing the wrong thing, and we'll be publishing some details about this later on. There are some more details about how we represent this stuff in the audio formats session.
If I go to the MIDI section now, I've got one USB device here from Roland. It's an MPU64, and it has four MIDI ins and outs, and I've got a studio set up here. One of the things I wanted to talk about with this app is that there's a little bit of confusion about exactly what it's meant to be doing. Like the audio side, if you're developing and familiar with the API, it really reflects a lot of the structures that are in the API itself. So you can see this as what the user sees, but also, as a developer, with just a couple of graphing concepts you can understand what you're seeing from an API point of view.

So this is a driver. This driver has what Core MIDI calls MIDI entities: it has four MIDI entities, which are these pairs of in and out ports. The in and out ports are MIDI endpoints, and each MIDI endpoint carries a full MIDI stream, that's 16 channels, and you're able to talk to whatever is at the end of that stream or get the data from it. So this configuration has four MIDI entities, and each MIDI entity has an endpoint for in and out. What we're doing here is describing the driver.

Now, what is out here is just three devices that I've added by doing this Add Device thing, and I can add a new external device. I'm not actually creating a whole patch flow here; all I'm doing is describing to the system what I've got plugged into what the system knows about, which is the actual driver. You can't really do thru connections here, in the sense that they will just automatically route; you have to route these with your cables. You're really just describing to the system the keyboards or the control surfaces or the modules that you have on your system. You can save different configurations.
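The hierarchy described above (driver/device, then entities, then in/out endpoints, each endpoint carrying a full 16-channel MIDI stream) can be modeled in a few lines. This is an illustrative data model invented for the sketch, not the Core MIDI API; real code would walk the hierarchy with calls like MIDIDeviceGetNumberOfEntities and MIDIEntityGetNumberOfSources.

```c
enum { kChannelsPerEndpoint = 16 };  /* one endpoint = one full MIDI stream */

/* Illustrative model of the Core MIDI hierarchy: a device has entities;
   each entity pairs an input endpoint with an output endpoint. */
typedef struct { int hasIn; int hasOut; } EntitySketch;
typedef struct {
    const char  *name;
    int          entityCount;
    EntitySketch entities[8];
} DeviceSketch;

/* Total addressable MIDI channels on the input side of a device. */
static int device_input_channels(const DeviceSketch *d) {
    int total = 0;
    for (int i = 0; i < d->entityCount; i++)
        if (d->entities[i].hasIn)
            total += kChannelsPerEndpoint;
    return total;
}
```

For the four-entity interface in the demo, that works out to 64 addressable input channels.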
If I unplug the device, the driver will go offline, and you'll see that reflected in AMS. And what can be done here is, rather than showing this as port 4 of the MPU64, you can actually, by querying Core MIDI's APIs, present "my synth module" as the destination to the user.
Now, the thing that's missing from this, as I'm sure many of you who have dealt with OMS and FreeMIDI know, is some way to describe what is actually in the device itself. If it's a Roland U-20 or U-220, what patches does it have, what are the current patch banks, what are the capabilities of the device? We've worked with the MIDI Manufacturers Association to define a document format for that using XML. The spec has been ratified and has passed through all of the MMA processes. We wanted to do this through a standards body rather than as an Apple-only initiative, to make this something that could be broadly adopted both by manufacturers and by other computer companies besides ourselves, so that for the user there's one data format. The document is not available yet from the MMA site, though it will be soon; we're going through the final stage of making the document clearer about what it contains so that it's easy to understand.
You'll be able to author these XML files yourself to describe custom patches, and we hope that there will be websites where manufacturers publish them for their devices. And so then the user can see: this is my synth module, it's a Roland synth module, and it's got these patches on it. Just as with OMS and FreeMIDI on Mac OS 9, it should be as easy an experience for the user on Mac OS X. If we can go back to slides, please.
Right, so that's the bottom layers. Let's get into Audio Units. So Audio Units, we have a logo, and that's the logo. That took some effort, I can tell you. So, audio units: I'm going to say that these are a generalized plug-in format, and by generalized I mean that they have very many different uses. It's not just for effects; it's not just for software instruments. Later on we'll do a demo of some other audio units. For now I just want to give a general overview of what audio units are, the types of audio units that you can have, and their categories and functionality.
So how does it work? An audio unit uses the Component Manager, and we use the Component Manager because it has some nice features. It has discovery mechanisms: FindNextComponent. You can just make this call and specify the type of audio unit that you're looking for, and we have different types. Once you find the audio unit that you want (and you can either be doing this programmatically, or present menus to the user based on what you've discovered), you open that component. Once you've opened the component, you get back what's called a component instance. So you can think of a component, in object-oriented terms, as a class: it specifies an API, specifies what a member of that class can do. The audio unit itself, the component instance, is like an object, an instance of the class. So AudioUnit as a typedef is a typedef of ComponentInstance.
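The discovery pattern can be sketched as a loop over a registry keyed by four-character type codes. This is an illustrative model, not the real Component Manager: the actual calls are FindNextComponent and OpenAComponent, the registry table and component names here are invented for the sketch, and the type codes shown ('auou' for output units, 'aufx' for effects, 'aumu' for music devices) follow the Audio Unit conventions.

```c
#include <string.h>

/* Illustrative stand-in for the system component registry. */
typedef struct { char type[5]; const char *name; } ComponentSketch;

static const ComponentSketch gRegistry[] = {
    { "auou", "DefaultOutput" },   /* output unit  */
    { "aufx", "MatrixReverb"  },   /* effect unit  */
    { "aufx", "LowPassFilter" },   /* effect unit  */
    { "aumu", "DLSSynth"      },   /* music device */
};

/* Model of FindNextComponent: return the next component of a given
   type after index `prev` (pass -1 to start the search), or -1. */
static int find_next_component(int prev, const char *type) {
    int n = (int)(sizeof gRegistry / sizeof gRegistry[0]);
    for (int i = prev + 1; i < n; i++)
        if (strcmp(gRegistry[i].type, type) == 0)
            return i;
    return -1;
}
```

Calling this repeatedly with the same type code is exactly how a host would build a menu of, say, all installed effect units.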
So you've opened your audio unit; what do you need to do with it? Well, it can be as simple or as complex as the type of audio unit you're dealing with and what you want to do with it. The first step, really, for an audio unit is to look at the properties that it has. Properties represent the state of the audio unit. It's a general mechanism, and it's extensible. We define property types, and you can get information about a property: how big is it, can I read it, write it, or both. I can use the property mechanism to find out the state that the unit comes up in, and I can change that state and so forth. It's really the way that you manage the fundamental state of an audio unit.
Once you've set up the state of the unit, we have a second phase, which is initialization. We split these up because there are often a lot of things you might want to discover about what an audio unit can do, particularly if you're in a hosting environment, before you initialize it and make it able to render and do its job. So initialization gives an audio unit the chance to do its allocations in order to actually operate. Once an audio unit is initialized, it's considered to be in a usable state, and the one call that you really make to use it is AudioUnitRender. I'm not going to go into the specific details of that API, but there are a lot of arguments and flags that you can pass to it. They're fairly well documented in the headers, and you can ask questions on the list.
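The two-phase lifecycle (configure properties, then initialize, then render) can be sketched as a tiny state machine. This is an illustrative model invented for the sketch; the real calls are AudioUnitSetProperty, AudioUnitInitialize, and AudioUnitRender.

```c
/* Illustrative sketch of the two-phase audio unit lifecycle. */
typedef enum { kConfiguring, kInitialized } UnitState;

typedef struct {
    UnitState state;
    double    sampleRate;   /* stands in for properties set pre-init */
} UnitSketch;

/* Models AudioUnitInitialize: the unit checks its configured state
   and does its allocations, after which rendering is legal. */
static int unit_initialize(UnitSketch *u) {
    if (u->sampleRate <= 0)
        return -1;          /* configuration not valid yet */
    u->state = kInitialized;
    return 0;
}

/* Models AudioUnitRender: only legal once initialized. */
static int unit_render(const UnitSketch *u) {
    return (u->state == kInitialized) ? 0 : -1;
}
```

The split is what lets a host interrogate a unit's capabilities cheaply before committing to the allocations that initialization implies.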
Now we're ready to go. We're going to call AudioUnitRender, but where are we going to get input data from? You can get inputs for an audio unit from two different places. With audio units, we wanted the idea that we can connect them up into processing graphs, or use them independently, either just one off or maybe two or three. And I may want to provide data to a unit without having to be an audio unit myself. So you can have either a callback function or a connection.
So there are two ways that you can provide data to an audio unit — and at this point, we're talking about audio data. When you call AudioUnitRender on an audio unit, if it wants input — and some audio units may not want input; we'll have a look at those in a minute — it will call its input proc or its connection for its input data. When that returns, it's got input data to process. It processes that data, you get the result back in the buffer list that you provide to AudioUnitRender, and you're done.
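The pull model just described can be sketched in a few lines of portable C: the render call pulls from an input callback, processes in place, and hands the result back in the caller's buffer. All the names here are invented; the real mechanism is AudioUnitRender plus a render callback or a connection:

```c
/* The host supplies input through a callback (or, in a graph, a connection). */
typedef int (*InputProc)(void *refCon, float *buffer, unsigned numFrames);

typedef struct {
    InputProc inputProc;
    void     *inputRefCon;
    float     gain;       /* stand-in for this unit's actual DSP */
} PullUnit;

/* A sample input proc that just produces a constant signal of 1.0. */
int OnesInput(void *refCon, float *buffer, unsigned numFrames) {
    unsigned i;
    (void)refCon;
    for (i = 0; i < numFrames; i++) buffer[i] = 1.0f;
    return 0;
}

int PullRender(PullUnit *au, float *ioBuffer, unsigned numFrames) {
    unsigned i;
    /* 1. Go and get input data from the callback. */
    int err = au->inputProc(au->inputRefCon, ioBuffer, numFrames);
    if (err) return err;
    /* 2. Process it in place. */
    for (i = 0; i < numFrames; i++) ioBuffer[i] *= au->gain;
    /* 3. The caller gets the result back in the buffer it provided. */
    return 0;
}
```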
Well, are we done? No, not quite. Because one of the things you want to do when you're processing audio is tweak it. You want to be able to set delay times differently, you may want to set frequencies, and if you're talking about volumes, you want to change volumes, and so forth. All of these are considered real-time operations on the audio unit — things that you can do to it while you're in the process of rendering. We abstract that into what we call parameters. An audio unit, using the property mechanism, publishes a list of the parameters that it allows the user to manipulate. It publishes things like the range of the parameter and what kind of values it has — maybe dB or hertz, or maybe just a generic zero-to-one sort of parameter. There's a whole bunch of different types of parameters that an audio unit may publish. We've seen some third-party units that have a couple of hundred parameters; a lot of our units are fairly simple and may have just two or three. It really depends on what the unit is doing and what the developer of the unit wants you to be able to manipulate.
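A parameter, then, is basically a published name, range, unit, and current value. This little sketch (all names invented, not the real parameter-info structures) shows one reason publishing the range matters — a set call can clamp to it, so a stray value from a host or control surface can't push the DSP somewhere the unit never advertised:

```c
enum { kUnit_Generic, kUnit_Hertz, kUnit_Decibels };

typedef struct {
    const char *name;
    float min, max, defaultValue;
    int   unit;          /* generic 0..1, Hz, dB, ...            */
    float current;       /* the value being changed in real time */
} ParameterSketch;

/* Clamp the incoming value to the published range. */
void SetParam(ParameterSketch *p, float value) {
    if (value < p->min) value = p->min;
    if (value > p->max) value = p->max;
    p->current = value;
}
```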
So we have effects units, and that's really the meat of where we think third parties will be developing units. In Jaguar, we ship various filters: high-pass, low-pass, band-pass, high-shelf and low-shelf filters. We ship a reverb unit, and the reverb unit has quality settings that you can adjust. The quality determines how much CPU the unit's going to take at runtime, and it actually affects the quality of the rendering.
And for things like games, they may be less concerned about very high quality and more concerned about the load that the unit's going to take. We have a digital delay unit, we have a peak limiter. In Panther, we've added a multiband compressor unit — it's a four-band compressor, and it's pretty nice, actually. And, you know, we're not going to publish a couple of hundred audio units as a company ourselves. For us, the big part of audio units is what developers are going to do with this. This is where it can get very interesting, and very bizarre sometimes. So, to summarise: we create one, we've got the state management, we've got the resource allocation, we've got rendering, and we've got the control of the rendering. Well, is that all we need?
No. Typically, if you've used Logic or Digital Performer or any of these hosting-environment type applications, you want to present some kind of view onto the audio unit so the user can interact with it. We've published a generic view which will just query the parameters that the unit publishes and assemble a bunch of sliders. And that's, you know, not bad, but it's probably not as good as what you can do if you really understand your DSP. So a developer of an audio unit can publish a custom view — and some of the views, if you've seen these, are pretty creative and pretty interesting. In Jaguar, we only had the ability to publish Carbon views.
In Panther, we're adding support for Cocoa. So you can publish a class that implements a Cocoa protocol, and it can be discovered by asking the audio unit for its Cocoa view. A Carbon app can put a Cocoa UI up in a separate window, and a Cocoa app can put a Carbon UI up in a separate window, so there's not a lot of extra work on the hosting side to deal with both of these. We think that probably a lot of developers will be very interested in the Cocoa UI, so it's there now.
And the other side of this is communication and control mechanisms. In Jaguar we have a parameter listener API. It's in the AudioToolbox framework, in AudioUnitUtilities.h, and it's called the AUParameterListener. (I always get told off when I'm wandering around, but I just can't stand still, so I'm going to keep wandering. Excuse me.) When we were looking at designing this, there were two ways we could do it. You can have a UI or an app that, 30 times a second or whatever, polls for the values of the parameters to see if anything's changed — and that seemed to us a less than elegant way of dealing with this. So we decided to do a notification service. The notification service is aimed at allowing anybody who wants to know about a parameter changing on an audio unit to listen for that parameter, see that it has changed, and then react appropriately. When they want to set the value of a parameter, just using the standard AudioUnitSetParameter call is not going to invoke the notification; you need to use the AUParameterSet call, which is in this header file. That basically tells the notification mechanism, hey, if you've got anyone listening for this, you'd better tell them about it. We've decided to extend this a little in Panther to include the ability to notify state changes from the AU — that is, property changes. So you can do both parameter changes and property changes using this notification mechanism. Now, one of the important things about audio units with all of this is that we're never going to know everything we need to know about every possible audio unit that every developer's ever going to write. So there are mechanisms in place to share private data, with private property IDs and so on, between your audio units and your views.
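The listen-and-notify idea, as opposed to polling, can be sketched like this — a set call that both stores the value and fires every registered listener, which is roughly the role AUParameterSet plays relative to plain AudioUnitSetParameter. Everything here is an invented stand-in, with none of the real API's thread-safety machinery:

```c
#define MAX_LISTENERS 8

typedef void (*ParamListener)(void *refCon, int paramID, float newValue);

typedef struct {
    ParamListener listeners[MAX_LISTENERS];
    void         *refCons[MAX_LISTENERS];
    int           count;
    float         value;
} NotifyingParam;

int AddListener(NotifyingParam *p, ParamListener fn, void *refCon) {
    if (p->count >= MAX_LISTENERS) return -1;
    p->listeners[p->count] = fn;
    p->refCons[p->count] = refCon;
    p->count++;
    return 0;
}

/* Set the value AND tell everyone who's listening — instead of each
   UI polling 30 times a second to see whether anything changed. */
void ParamSetAndNotify(NotifyingParam *p, int paramID, float value) {
    int i;
    p->value = value;
    for (i = 0; i < p->count; i++)
        p->listeners[i](p->refCons[i], paramID, value);
}

/* A sample listener that just counts how often it was notified. */
void CountingListener(void *refCon, int paramID, float newValue) {
    (void)paramID; (void)newValue;
    (*(int *)refCon)++;
}
```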
And so this mechanism can be used to communicate that there's a need to go and get some state that may be in the audio unit if you're the view. And it can be done in a thread-safe manner. So you can call this sort of API from your render thread as well as from a UI thread.
The other weakness I think we had in Jaguar, with the Carbon UI, is around automation: you need to know start and end state as you're doing automation. We sort of put that into the view, and that really wasn't the right place for it — it restricted some of its use, because it may be the audio unit that knows about a start and end state, not the view. So we've added to the Panther services this idea of a begin and end of a parameter ramp state. You can imagine if you've got a control surface — some of these control surfaces are sensitive to actual touch — you could touch a control, and that would be a signal to say, hey, I'm about to start ramping this parameter, and the UI could reflect that the user has touched that control by changing the button in the UI. By putting this into the controller, we can also support this with Cocoa UIs as well as Carbon UIs, without having to add additional logic to the Cocoa UI. So I think this is a very nice addition. And as I said before, this is real-time thread-safe, and we'll continue to work very hard to ensure that that remains true.
Okay. You all asleep yet? No? All right, still with me? Okay. So we've got effects, and that's really where a lot of the work is done by third parties. But another area that a lot of third parties work in is instruments and software synthesizers. The idea with these, of course, is that you're not necessarily processing audio — although some synths will let you have audio input into their synthesis algorithms. The main part here is that there will be MIDI data coming in, either from a host app or from an external keyboard. So we have an extension to the basic audio unit API for musical instruments — and I'll get my clicker to work. We call this a music device; it's in the MusicDevice.h file. It just adds semantics for starting and stopping notes and for the control of notes. We've got a couple of APIs there that really just talk MIDI: one where you pass the two or three bytes of a MIDI channel message, and then another API for dealing with SysEx and the other sorts of extended MIDI messages. We also added an additional API for music devices because we didn't want to just talk MIDI. As a software architecture, we don't really have to abide by the same limitations as hardware does with MIDI. So we have an extended API where a software synth can have more than just 16 channels — we call them groups in our parlance, and you can have an arbitrary number of groups. You can actually group notes from different instruments playing on the same group: I could have three notes that are playing on different sounds, all playing on the same group, and then I can just send control messages to that group. The extended protocol works in a similar kind of way to MIDI — if you break MIDI down in a certain way, you can express all of the control semantics of MIDI in the extended protocol — but it gives you some flexibility.
You can specify pitch using a fractional floating-point number — 60.3, or 60.5, which gives you the pitch halfway between C and C sharp — and away you go. We're seeing a lot of third parties using the music device stuff; there are some very interesting third-party synths out there already.
One of the things we're working on — this is to address some concerns that were raised by Waves, who are a very large developer of audio processing plugins — is offline rendering. They wanted to have a way to do offline rendering. Offline rendering is typically done when you want to process a file of data and you want to look at the whole contents of the file, not just operate in real time.
All of the audio unit development that we've done up until this particular unit is really aimed at working in real time, and so it has real-time constraints: you need to report latency, you need to report how long it takes you to process sound in terms of the time delay between input and output. With an offline unit, you need to be able to look at all of the data that you're going to be asked to process before you process it.
So there are two render phases with an offline unit: an analyze phase and then a render phase. If you think of reversing the sample order in a file, you need to start at the end and work your way back. If you think of an offline unit that normalizes, you need to look at the whole of the audio data before you start to process it, so you can do the normalization.
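The normalize example makes the two-phase split easy to show: an analyze pass over all of the data first, then a render pass using what was learned. A portable sketch (function names invented — this is the idea, not the offline-unit API itself):

```c
#include <math.h>

/* Phase 1 (analyze): scan ALL the audio to find the peak.
   A real-time unit never gets this whole-file view; an offline unit does. */
float AnalyzePeak(const float *samples, unsigned count) {
    float peak = 0.0f;
    unsigned i;
    for (i = 0; i < count; i++) {
        float a = fabsf(samples[i]);
        if (a > peak) peak = a;
    }
    return peak;
}

/* Phase 2 (render): scale so the peak hits the target level. */
void RenderNormalized(float *samples, unsigned count, float peak, float target) {
    float g;
    unsigned i;
    if (peak <= 0.0f) return;   /* silence: nothing to scale */
    g = target / peak;
    for (i = 0; i < count; i++) samples[i] *= g;
}
```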
There aren't really any additional API changes for this. There are a couple of different flags, and a couple of properties that are specific to an offline unit. There's a new audio unit type, 'auol', for audio unit offline. It's not in your Panther headers yet because we're still revising and discussing this with Waves and some other companies, but it will be published in Panther, and we're getting pretty close. If people are interested, they can contact me and I'll send them the spec and the code that we've got at the moment. We will ship an audio unit that does reversal in the SDK at some point as an example, and there will be code there as well showing how you host these offline units.
Okay, so when I said generalized audio unit — we're still sort of in the general field at the moment, about normal types of audio units. Now let's look at some less typical types of audio units. One of those types is mixer units. In Jaguar we had two mixers: a stereo mixer that takes mono or stereo inputs and gives you a single mixed stereo output.
And in Jaguar as well, we had a 3D mixer. The 3D mixer will take multiple inputs and has a single output, which can be two-, four-, or five-channel. The four-channel is a quad setup — pretty much what we've got in the room today — and 5.0 is where we don't actually render the 3D stuff into an LFE channel; we just do the five channels. What you can do with this 3D mixer is quite a lot: you can pan and localise within a 3D space, and you have a number of on/off options. I'll get Chris Rogers to come up, and we're just going to give you a fairly quick demo of the 3D mixer.
Well, thanks, Bill, for setting me up there. Last year, I gave a more complete demonstration of the 3D mixer, but some things have changed since last year because developers have made some requests of the mixer. So we put some new features into the 3D mixer, and we can have a look right here.
What I have right here is a simple little app — a simple little user interface onto the 3D mixer. It has a number of different controls for choosing the type of rendering for a source. In this demo there can be up to three sources. You can choose equal-power panning, a simple spherical head model, or an HRTF model — these first three are for rendering to stereo output. The last two, sound field and vector-based, can be used for stereo, quad, or 5.0. Then over here we have some checkboxes that let certain features of the mixer be enabled or disabled for individual sources.
Down here we have a master volume control. And here, this is kind of an obscure slider that controls distance attenuation — that is, when sound sources get farther away, they get quieter. But how much quieter do they get, say, if they're 10 meters away or 100 meters away? This slider controls what that falloff curve is. And that's a feature that developers have been asking for.
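One common shape for that falloff curve is the inverse-distance law, where a rolloff factor steers how fast the gain drops past a reference distance. This is a sketch of that standard law (the function name is invented, and the 3D mixer's exact curve isn't specified here):

```c
/* Full gain at the reference distance, falling off beyond it.
   rolloffFactor is the "how much quieter at 10 or 100 meters" knob. */
float DistanceGain(float distance, float refDistance, float rolloffFactor) {
    if (distance < refDistance) distance = refDistance;
    return refDistance /
           (refDistance + rolloffFactor * (distance - refDistance));
}
```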
Down here in this part of the display, we have some meters — the 3D mixer now supports live metering. You can meter the input levels and output levels, both RMS power and peak levels. Metering is something that we've put into a couple of our audio units; later on today we'll see that the new audio unit, the Matrix Mixer, also supports this, which is kind of interesting. So maybe I should just bring a source in and see.
So, maybe I should simplify this a little bit. Maybe I've turned all my sources off. OK, here we go. So I'm using vector-based panning right here. And I've turned these sources off, so we're not listening to those. We're just listening to this blue one here. But if I put this dot right here, then we're essentially just in the center channel.
Hold on, we have the helicopter coming in. I'm sorry. There he is. Let me turn him off. OK. I think now maybe our meters will show this a little better. So I'm about straight ahead here; I should be coming out of the center speaker. And the channel ordering down here is left, right, surround left, surround right, center. Okay, now we have left and surround left... surround left, surround right, OK. Now let me change this sound.
I'll use this sound, this sound right here. Okay. Turn it off. It looks like we're running a little bit short on time, so I'll try to wrap it up here. The most important new features are the ability to turn individual features here on or off, which affects the performance that you're going to get. This distance filter is a low-pass filter that makes sounds more muted as they get further away. Some developers had comments about that — they didn't necessarily want their sounds to get more muted — so there's a way that can be turned on or off. And any of these characteristics can be turned on or off separately. As far as performance goes, we've made some optimizations to the mixer. To give you an idea of the kind of performance you can get: on a pretty modest machine like an 800 megahertz G4, a single source using equal-power panning to stereo takes, I think, 0.18% of the CPU.
And HRTF, which is our high-quality stereo panning mode, is at about 0.55% of the CPU per source — a little more than half a percent. Now — we'll just move on, because we're running out of time. Okay, go back to the slides. Thank you, Chris.
The other demo that we were going to show — but we're running a bit short — is a Varispeed unit, which is also new in Panther. You're able to have your sound come into the Varispeed, and it can go faster or slower — it can do the, what is it, chipmunk effect. So we'll show you that later on. We'll be in a lab tomorrow from 1:00 to 6:00, the QuickTime lab, and we can give you a demo there if you like.
Okay. So in Panther as well, we have a Matrix Mixer. The Matrix Mixer is a very powerful beast, and we'll be going into some detail about it in the next session. All mixers have metering capabilities in Panther, so you'll see some of that in the next session's demo. The other type of audio unit that we have is a converter unit. We'll be talking about the audio converter in the next session as well; this brings some of the functionality of the audio converter into the audio unit world — essentially all of the conversion operations to do with PCM audio: sample rates, bit depths, channel mapping, and so forth. All of this is configured with properties, and actually there's very little configuration for you to do, because just describing to the audio unit what your input and output formats are is enough to tell the converter what work it should do.
And the converter unit's functionality is included as part of the output units — the units that interface to a device. There's no additional latency, and no extra runtime cost to you for dealing with the output units. And you can manage how much work the output unit does for you by the difference between the format of the device, which is on the output unit, and the format that you provide it.
In Jaguar, our output units only did output; in Panther, they do input as well — so they're really I/O units now. The output units will do output as they do in Jaguar on element 0, bus 0, and on element 1, bus 1, you can do input. And so what does this look like? It looks like this.
So I'm going to use the slide — you know, they hate us using these things, but I love it. So here's your output unit. On the output side, here is your device output, and this is on bus zero. And then this is either a callback or a connection that you make to the audio unit — if you're using this unit today, this is what you're doing. In Panther, you can also see if there is input available; there's a property to query that. And then the device's input is actually on the input side of the output unit. Confused yet? It took me a while.
And then the application actually calls AudioUnitRender for bus one, and that's where it gets the input from the device. It can do the conversion for you as well: if your device has 20 channels and you only want four, then it will take those 20 channels and just give you the four — and you can tell the output unit which four channels you want from the device — including rate conversion, all this sort of thing. And if your device just has input and you need to know when to get it, you can get a callback that tells you, hey, the input's ready, go and get it. So that's new in Panther.
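The channel-selection part of that can be sketched as a plain interleaved channel map: a list of which source channels you want, in order. To keep the example small, this maps a 4-channel source down to 2 (the case above is 20 down to 4); names and sizes are illustrative only:

```c
/* Copy selected channels out of an interleaved source buffer.
   channelMap lists the source channel index for each destination channel. */
void MapChannels(const float *src, unsigned srcChannels,
                 float *dst, const unsigned *channelMap, unsigned dstChannels,
                 unsigned numFrames) {
    unsigned f, c;
    for (f = 0; f < numFrames; f++)
        for (c = 0; c < dstChannels; c++)
            dst[f * dstChannels + c] = src[f * srcChannels + channelMap[c]];
}
```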
Okay, so with audio units, we talked about connections — wanting to be able to connect all these up — and we have a structure in the AudioToolbox APIs called AUGraph. The AUGraph manages the connections between these units. It manages the state, and it has a very abstract representation of the graph. The graph itself has a couple of different states: you open the graph, and that will open the audio units that you've described as being your graph; then you can initialize them; and then you can actually start the graph running. You can update the state of the graph while it's running — you can make or break connections. You can have a bunch of audio units sitting off to the side — maybe one chain into a mixer or something — and then you can just connect that chain in, play your sound, and disconnect it. The graph will do all of this. It will manage the interaction between the different threads — the thread you're doing the rendering in and the thread you're changing the state from — and it just makes this a lot simpler. You can write your own code, but if you're going to do this, it's a good API to look at. Another API in the toolbox is the music sequence API. A music sequence is a collection of events, and there are really two different types of tracks.
Both types are still just basically tracks, but one of them gives the music sequence its concept of tempo. Music sequences talk about their timing in terms of beats, and the tempo describes how many beats per minute. If you want to deal in seconds, you can just make a sequence with a tempo event of 60 beats per minute, and then you can deal in seconds. And then you've got any number of event tracks, and the event tracks can take any number of different types of events: you can have MIDI events, you can have user events where you put your own data in, and you can have parameter events, so you can talk directly to particular parameters on different audio units. And of course the connection is that you have a sequence and you connect that up to a graph, and we'll show you that in a minute.
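The beats-to-seconds relationship is just arithmetic on the tempo, which is why a 60 BPM tempo track makes beats and seconds coincide. A sketch (function names invented):

```c
/* Sequence timing is in beats; the tempo maps beats to wall time. */
double BeatsToSeconds(double beats, double bpm) {
    return beats * 60.0 / bpm;
}

double SecondsToBeats(double seconds, double bpm) {
    return seconds * bpm / 60.0;
}
```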
You can create a sequence from a MIDI file, or you can create a sequence programmatically. Once you've created a sequence, you can save it off to a MIDI file — we'll only save the MIDI-type events in that sequence to the MIDI file, obviously; we won't do the other ones. You can iterate over the events. You can loop, mute, and solo tracks when you're playing them. And you can copy, paste, insert, and edit them — and these can be edited while the sequence is actually being played in real time.
And you can target the tracks that are in a sequence at two different destinations: a specific audio unit in a graph that you've attached to the sequence, or a MIDI endpoint — so you could be sending from a sequence that you're playing directly out to a MIDI device.
And if you want to play it, you've got to have some object to play it, and that's the music player. The music player has very simple start and stop semantics. It has pre-rolling — particularly with software synths this is very important — so you can pre-roll and chase your events up to whatever time you've set. One of the things we added in Panther is the ability to scale the playback rate, so if you want to do some work to synchronize this to an external time source, or you just want to play the events back faster or slower, you can do that with this additional API. So if we go back to what we started with — connecting it all together — this was kind of the mess of the APIs. And if we put some blocks around it, this is how we see it all fitting in. I'm going to go back to my laser pointer here. So here are the audio units, here's the HAL output device, here's the MIDI device. The audio units are connected with these orange bars. You don't really see this connection here — that's really part of the output unit and the state that it manages. In this case here, which is very similar to the 3D mixer — if you remove this filter and think of the demo that Chris did, then really what he was doing is this part of the whole kit and caboodle. We're using the audio file API to read the file data from disk, the converter to convert that into a format we can use for the 3D mixer, and then that goes in as an input to the 3D mixer — and in the demo Chris was running, we had three inputs into that mixer. One thing we didn't show is this side of it: you can have a music sequence that's a collection of tracks, you can address the track events to different nodes in the graph, and you can make the association between the sequence and the graph. And then the player actually owns the sequence.
The player plays the sequence, which starts the graph — or if it's just talking to MIDI, then it starts a MIDI scheduler — and away you go. And of course you don't need to use all of this. You can just use this bit, or you can just talk to the lower levels of the system, or you can just use audio units without using the graph. So it's a very modular architecture; it's lots of components.
There are lots of complexities to get lost in. We are working on documentation — we know it's needed. And we are going to do some more work on sample code; in the next session we'll see one of the things Doug's been working on, which will be in the SDK. And that's pretty much it — we provide those sorts of services. I'm not going to read that; you can read that yourself.
And yeah, don't panic. You don't have to know all of these things. You really need to understand what it is that your app needs to do, and then what the appropriate API is for doing that. You can just ignore the stuff that you don't need.
Okay, so those are the four frameworks: Core Audio for drivers, Core MIDI for MIDI. The audio unit framework really just publishes the API for the extensible audio units and codecs, and then the toolbox is for formats and files. We'll be talking a lot about the format stuff in the next session.
Wrap-up. Road map: there are some of the sessions. We have the Audio and QuickTime session tomorrow, which I think is going to be a very interesting session for you to go to. The feedback forum for us is after that, and Nick and Craig and myself and others will be there. Who to contact. And for more information, we've got the Audio Technologies for Mac OS X website: developer.apple.com/audio.