Configure player


WWDC Index does not host video files

If you have access to video files, you can configure a URL pattern to be used in a video player.

URL pattern


Use any of these variables in your URL pattern; the pattern is stored in your browser's local storage.

$id
ID of session: wwdc2004-223
$eventId
ID of event: wwdc2004
$eventContentId
ID of session without event part: 223
$eventShortId
Shortened ID of event: wwdc04
$year
Year of session: 2004
$extension
Extension of original filename: mov
$filenameAlmostEvery
Filename from "(Almost) Every..." gist: ...

WWDC04 • Session 223

Audio Unit Development and Hosting

Graphics • 57:57

Mac OS X features a robust plug-in architecture for DSP effects and virtual instruments, called Audio Units. This session takes an in-depth look at developing robust Audio Units and provides best practices that all host applications should follow. Also learn about the Audio Unit logo licensing program, which is designed to let your customers know that your product includes or supports Audio Units.

Speaker: Bill Stewart

Unlisted on Apple Developer site

Transcript

This transcript was generated using Whisper and may contain transcription errors.

What is an audio unit? How does it work? How is it hosted? And then at the end of the session, we'll look at some new audio unit types that we're doing in Tiger. So first of all, what is an audio unit? Audio units are hosted by digital audio workstations, like Emagic's Logic and Mark of the Unicorn's Digital Performer. Apple ships a number of Apple-branded applications, and as you can see, these applications cover a fairly broad range. There's also any number of third party apps I'm not mentioning, so if your app isn't here, I apologise — it's just too many. But you have applications like Final Cut Pro, which is a video app. You have GarageBand, which is a consumer-oriented application for making music. Then you've got Logic and Digital Performer, which are oriented towards the pro end of the market.

So that's a pretty broad range of contexts that your audio unit can work in, and this is the problem space we're addressing. An effect takes audio as input and does some processing to provide output, and those are the types of audio units we're going to concentrate on for this talk. Now, just to give you a quick review of the API set: an audio unit is a component, so there are Component Manager calls. You use the find-component calls to search for what components are there, and then once you find the component that you want to use, you open it. That opens the component.

It's essentially like calling new on a class object in Java or C++: you create an instance. That's called a component instance, and when we talk about an audio unit, we're talking about the audio unit as a component instance. And then once you've opened the component, there are things that you want to do to it, and so there's a collection of APIs that are defined. There's an uninitialize call, where you can do a lot of deallocation, and an initialize call that follows. Then the main work, of course, is the rendering.

In rendering, the audio unit consumes some amount of audio data. In the process of rendering, there are often parameters — things about the rendering process that change their values to affect how the unit is doing its work. You can set a discrete value for a parameter, or there's a ramped parameter event: it's the same parameter, but instead of just setting a value at one point in time, you ramp it over a duration. There's also a reset call, and it has a relationship to render: if there's some state that's changed in the host, you can use reset to basically tell the audio unit, any saved audio data that you have at this time, I want you to throw it away. I want you to go back as if you'd never seen audio before.

There are also audio unit APIs for dealing with direct MIDI messages that can be passed to audio units. These are typically implemented by AU instruments — there's a specialization of the general effect audio unit that can do things in particular with MIDI — and so they implement these APIs. And because instruments are primarily based around notes and MIDI channels, 16 of them, there's quite a flexible API in the AU instrument.

An instrument can be mono-timbral or multi-timbral, and that can be determined, as we'll see. Now, in order to facilitate the development of audio units — we obviously spend a lot of time on mailing lists, we have a collection of SDK code, and it's often confusing to really understand how things should work — we developed a command line tool called the AU validation tool. This tool really just tests your audio unit by putting it through its paces: it tries to break it in as many ways as it can, to make sure that it behaves itself. It's also a tool that can be used to debug. We've had a lot of developers using this since we released it at the start of the year, so they can validate a unit and then actually try it in a host application. The validation tool is also part of the logo licensing program. And whilst the validation tool is really good for some things, it really can't show you the context of how an audio unit is used. So the host app AU Lab is provided to do this, and it provides a fairly typical hosting environment.

AU Lab is really aimed at live usage; it has very low latency. What I'm going to do first is walk through the validation tool. I'm going to take our multiband compressor, and we'll look through the different stages the tool goes through; as we go through some of these, we'll then have a look at how a host application uses the same information. The first thing the validation tool does is it just sees if it can find your audio unit — what we're doing here is just finding the component.

So, it starts by extracting some information from the component, and then we do some open tests on the audio unit that we found. We profile the time it takes to open the audio unit. The first time is what we call a cold time, and this is the time it's going to take to load any dependencies that your audio unit has. This matters to the user: if you have 200 audio units on your system and each audio unit takes a second to be opened, that's a very long wait. There are also two places that you can get component version numbers from the Component Manager, so we do a check to make sure they both match.

Then once we've opened the audio unit, we have a look at what its default configurations are. What is its initial state in terms of its I/O capabilities? So in this case we've got one input, one output. The default format is 44.1 kilohertz, two channels, in the generic 32-bit float, native-endian format.

And we'll come back to this in a couple of slides. Then we look through the required properties — these are properties that all audio units must implement. The last one, instrument count, is a property that a host can use to determine if your instrument is multi-timbral or mono-timbral. If it's multi-timbral, the host may decide to do some different actions in terms of the way it passes MIDI, and so we require this for instruments. And then the validation tool looks at some recommended properties.

So this is actually latency, not tail time. The first property we're looking at is latency. This is a property that tells the host how long it takes for a given sample to appear in the output from some input. And latency can be caused by several different factors.

It can come from how long a sample received by the audio unit takes to appear in the output because of the nature of the algorithms being used, but it may also come about because of buffering. You've got audio units that wrap code that actually executes on external boxes — like TC Works' PowerCore or Universal Audio's acceleration systems — and so they have to do buffering on the audio unit side. That's the way to understand latency: going from input, how long does it take for the output to appear. And tail time — which is now the correct title — is a similar property, but this time we're looking at how long it takes for a sample to disappear from the output. So latency was how long it takes to appear; the classic example of tail time is a reverb tail. When you have a sample that goes into a reverb, it will take some period of time to die away.

Some units have ramping operations in their algorithms, so the input sample is going to be present in the output for some time. Both of these properties are used by applications doing synchronization, and tail time used in conjunction with latency can then be used to say: OK, if I play a song from the start through to the finish and listen to the audio that comes out at one minute, it's going to differ from what I hear if I start halfway through, because of the accumulated tails. And so this is an important property.

The simplest way to support bypassing is just passing audio from input to output and not processing it, but some audio units can do a better job of bypassing based on what they were doing before they were put into bypass. There are a couple of properties where we really just verify that they're there and not crashing. One of these is host callbacks. Host callbacks are used to get information from the host about its time: what beat is it, what is the tempo, what's the key signature, the time signature. The supported number of channels is optional, although it's obviously essential information. What the property reports is any global capabilities that an audio unit has, and dependencies between the number of input channels and the number of output channels. And, of course, if you're looking at a synth, it may not have any input channels, so the input count is zero. If the property is not implemented, then we make some assumptions about what an effect can do. This is AU Lab; it's in the Tiger Applications Audio folder.

There are a bunch of tracks here — I've got a bass, and I've got an Apple instrument. We'll have a look through these. Now, if you remember, at the start we looked at the channel capabilities in the validation tool. In this case here, I've got a stereo track going to stereo.

I'm able to create sub-menus — for stereo to stereo, we decided not to put the channels here — and in each sub-menu I can put the name of the audio unit, and that's where we're getting that information from. This track over here is a mono input and it's going to stereo, so now I've got a mono to mono, or a mono to stereo. In the mono to mono menu, you'll see that I don't have an entry for Elemental Audio's Inspector, because it's told me it can't do that configuration. Where we're using the component names to get the names of the audio units, we're using the supported number of channels to decide where they appear. What a host will often do is run once and profile your audio units to see what they can do, and then when the host launches again, it looks at the component resource to see what the version of the audio unit is. If that hasn't changed since the last time, it can reuse the profiled information. Okay, so now we're getting into some more of the special properties; we've seen just the basic kind of setup stuff.

So, we have a few special properties. The first one we want to look at is custom UIs. This is a property where an audio unit can say: well, I've actually got some custom UI that I've written for myself — here it is. We support two flavors of UI, either Carbon UIs or Cocoa UIs. This is a great feature of audio units, because an audio unit can just publish its UI once and have it distributed to every host — and even just looking at the users of those different apps, you're probably looking at different people with different capabilities and different understanding.

And so we really encourage custom UIs. The second property that's also pretty interesting and provides a good user experience is factory presets. Factory presets are a collection of saved states of an audio unit. A factory preset is the audio unit developer saying: well, I know some really cool things that I can do with my audio unit, and they're like this. You can think of keyboards — if you look at the market for keyboards over the last 20 years with MIDI, most of the sounds that people play out of keyboards are the factory presets. So this is very important. We also have user presets; this works with a property called class info. If the user wants to go in and tweak some of the values — whether they're using the custom UI or a slider UI, they can tweak that value — then the host should provide the ability to save the preset and restore it. There's a location, or a couple of locations, that can be used to put presets in, and by doing this, it means that when I'm using one application I can save my presets, and when I go to another application I can have the same presets available to me. They can go across the system regardless of which application I'm using. Let's have a look at some of the different ways people use this. So back to the demo machine — here's the multiband that we looked at before.

So we talked about parameters before — we talked about how audio units can use parameters to set values. The preset is really saving the parameter state. But a preset may contain more than just parameter state, so as a host, you can't just go through and save parameters — you've actually got to save the preset. There's also useful information that an audio unit can publish about a parameter: this one is an indexed parameter with discrete values, not a continuous parameter. You can set arbitrary ranges and so on.

One of the uses that people make of the published parameter information is to draw a generic UI. Now, the UI that you're seeing here is a new UI. So there's a couple of things. We talked about the pre-gain parameter here. All that the generic UI is doing is getting the parameter information and drawing this parameter from it. It's found out that it's dB from the parameter's unit tag; it's got the range, so it's given me a slider that I can drag. You'll also notice that there's a kind of visual clumping. This allows us to create quite a useful and reasonable view for a lot of audio units just by doing sliders. So the first clump has some global settings for the multiband — pre and post gain and so forth. It's a four band multiband, so it has three crossovers, and you can set the crossover points in the next clump of three. Then you've got four clumps, one for each of the bands of the multiband compressor. And down the bottom you've got a compression amount. The compression amount is a parameter as well, and it's a read-only parameter.

You can see that that's still just a parameter — it's read-only, and it's showing me how much I'm actually compressing in each of the bands at each point in time. We have a few factory presets over here for different types of settings for the multiband. The Matrix Reverb makes particularly good use of presets with different room sizes. Reverb is very hard to program, so having a good collection of factory presets here can be very useful. And take the presets — let's say I choose a cathedral.

You're hearing the reverb that's coming out of the synth there, and I can give this different sizes. Let's say that, well, actually, you know, I like the cathedral, but it's a little too wet, so I'm going to tweak it and do a save-as-preset here. What we're doing in the dialog is saying, well, we've got a user location: local user means the preset's only going to be visible for one user; local means anyone on the machine can see the preset. And I'm going to save it as a local user preset.

It saves the preset, and then it's going to create a directory. I can show this in the Finder: in my presets directory, I've now got an Apple AUMatrixReverb directory with my preset file. And then any application can look for preset files in those directories and present the presets however it wants to.

Moving on to custom and generic views: this is a unit from Destroy FX, and you can see that this is obviously not a generic view. It's got some sliders, it's got some nice cool things going on here. I can have a look at this as a generic view. So now I've got parameters with different types of controls — I've got menus, I've got check boxes — and all of this is being generated just from the generic parameter information. As I play through, you can see that he's actually lighting up the parameters as I hit keys on the keyboard.

He's able to do this because he can just send a notification to me. He's not doing this on the render thread; he's just saying, this parameter has changed, and then the listener system for parameters that hosts use will hear the parameter change, come in and set the value, and when it goes off it sends another parameter change. So, now we're getting into some really serious preset management problems for you guys.

These are just the factory presets, and going through them is fine — you could present them in a menu, and that would work. But the thing is that Urs ships a couple of hundred user presets, and furthermore, he links them into different categories; he's had a couple of other guys contribute presets as well. And so we've got keyboard navigation going on here — I can just cycle through the presets. By doing a nice sort of view for this, it gets exposed to the user to use. In the case of Urs, as you can see, all of the presets are here, and all that we're doing in AU Lab is going through the Zebra directory, and for every directory that we see, we're creating an entry.

Stay on the demo machine — that's fine. Anyone want to keep count, see how many times I mess this up? One of the things I wanted to show you: I've got one sound coming from Zebra and I've got the piano mixed in here. You can hear the Zebra sound is pitch bending but the piano isn't. We're using a MIDI thru facility, and it's just controlling how the application passes MIDI to the different synths around the system. I can split the keyboard. I have turned the pitch bend off for this one: I'm not hearing any pitch bend here, but down here I am. That's all split. The MIDI thru API is in CoreMIDI.

So then the next thing that the validation tool does is it looks at all the information you've told us about the formats that you can handle, and then it says: okay, let's see if you actually can handle them. It also checks the negative case: you said that you didn't do one channel in, four channels out, so I'm going to make sure you really can't. Then the validation tool does a collection of render tests. Now, the render tests are done at different numbers of frames.

In each of those steps, when it's rendering at a different number of frames, it does the following pattern: it uninitializes the audio unit, it sets max frames to that number of frames, and if it's got to reset the format, it'll do that. Then it initializes the audio unit, and then it renders. This is the sort of pattern of usage that you'll see in host applications. If we can go to slides just for a moment.

To show you in an application: you might have a preference for what I/O buffer size you want to use here, and that's part of this pattern of uninitialising and reinitialising. If the sample rate of the device changes, that's also an important part of this pattern. The validation tool also looks at the semantics of connections.

We can provide input either through connections or through setting input callbacks and so forth, and we validate to make sure that you don't require any reset and don't behave in a mysterious manner. It also checks to see how you're handling parameters: it's going to look at those parameters, caching a few of them away. And then there's a best practice embedded in the validation tool: before initializing, hosts should be setting max frames, and we expect and recommend that the host, when at all possible, if it sets max frames to 512, always asks the audio unit to render that amount. We haven't made this a mandate, but we're very strongly recommending it. So we'll have a look at some of the kinds of situations we want to avoid in the future, and that involves using buffer offsets for scheduling parameters.

Let's say we've got a host app, and the user has said that they want to ramp from one parameter value to the next over some period of time. The host has translated that into a ramp that's going to start at one parameter value and end at another. If the audio unit says that the parameter can be ramped, then the host can just turn around and set the start offset and duration, and the starting and ending values for the ramp, and that's all it needs to do for that slice. We'd like to see ramping supported because we think that audio units know what their algorithms are doing with parameter value changes; the host doesn't. If the parameter is not ramped, the ramp has to be simulated by the host. So once the host has done either of these two actions, the idea is to render for 512 frames.

So the host will see that this ramp has to take place, and it'll decide: well, I'm going to break this 512 frames up into a bunch of little renders. I'll do the first render, schedule a parameter change — which will be the midpoint of the ramp — do a second render, schedule the next change, and so on. This sort of practice is legal and it's fine, but it has caused some audio units not to publish parameters, because they really can't handle being bundled up into these little slices when they're asked to render, just because of the nature of the way they do their processing. And so we really would like to see this discouraged practice fade into the distance.

This isn't something that we do anything with in AU Lab, because we don't have a sequencer; it applies to mixing consoles and sequencing applications. Now, one of the things about AU Lab that we mentioned previously is that it really deals in a very hypersensitive situation — it's very, very tweaky.

And that's actually, for this manifestation of it, quite deliberate. What will happen is that when there's a problem — when you take too long in your render proc — you'll get an audio device overload from the HAL. You'll miss samples, and you'll see this in AU Lab; you should then diagnose it, and there are some tools that can help. In this case we have a render-quality property. You may want to provide that to users so they can scale back your processing if they're on a slower machine, or they've just got 3,000 plugins to run instead of the 50 that you run at home — they want your thing, but they don't need it to be high quality. So that's one way that you can provide UI for that; we provide this in the DLS music device. A second common problem — I think it's important to assume as a developer that when you're being asked to render... well, we talked a little bit about reinitialization and reset.

Reset: when the timeline changes, reset will be called. And so that means, you know, if I had audio that was playing ten minutes ago, I still don't want to hear that when I unmute the track — so reset, please. That's a good way to test that you're behaving correctly there. Reinitialization we talked about previously. Okay, so there are a couple of advanced configuration things that I want to go through — things that we've supported right from the first day, but we've not seen hosts providing good solutions for. So we've done some work in AU Lab to show some ways that this can be presented. There are two examples we're going to look at today. One is Apple's DLS music device. It has two modes: an internal reverb mode where it's only got one active output, or a send mode where it has two outputs, one of which is the dry mix. The second one we're going to look at is Native Instruments' Reaktor.

Reaktor has a dynamically assignable capability for where its channels go, and so what we'll have a look at in AU Lab is how a host can configure that. Okay, so first of all, here's a collection of tracks. You'll see that it highlights both tracks. This is the DLS music device. I've got this one turned off from the mix, so I'm not actually having this output going into the mix that's coming from the track mixer at all. In fact, what I'm doing is sending it over to a bus, and on that bus I've got a reverb. As you saw previously, I was playing around with the reverb there.

We can see that we've basically got an audio overload here, so there's a problem we should look at. So, we have this ability to add channels — it doesn't really matter much in Reaktor's case, because all Reaktor does is process 16 channels. And so the way we can configure this from the host side is however I want to use this audio unit. Now, if I bring up Reaktor's UI — because once you get into an audio unit like this, you're really going to need a custom UI of your own for this sort of thing — you can see in the diagram there what I've got. And you'll also notice that Reaktor knows that it's only using four channels. So it knows I'm using four channels, and I can connect these up. I've actually got a layering now of the two instruments — there's that instrument, there's that instrument. One of the things I can do here is send this guy over to a send bus, and I'm going to have this coming out of the rears, and I'm turning this off. So now this should be coming out of the front two speakers, and that should be coming out the rear. So AU Lab lets you mix and match things and see — because I think this is an important thing that we forget as audio unit developers — that audio units are going to work with other audio units, and so it's nice to have a host where you can play around with these things. How are they going to mix with other audio units? Maybe there are even some nice patches to use with Apple's reverb or a piano sound. Reaktor itself does some internal management of its views — you can see the difference; it has its own scroll bars. This is probably a pretty reasonable thing for Reaktor to do.

AU Lab is really focused on real-time usage, and we really only look at effects and instruments. By real-time, what we mean is that when I ask you for 512 samples, if you need input, you're only going to ask for 512. So we call this a real-time unit. Converters are a little bit different: converter units can be used in a real-time context, but they're also able to ask for more or less input data whenever they're asked for output. So you can use them in real time, but there may be some constraints, as we'll see. Then there's the offline audio unit. This is an audio unit type we defined some time ago.

Offline AUs can in some cases also be converters, in some cases not, but an offline AU can seek arbitrarily amongst its input. It can say: well, I'm going to process 3 million samples. So it knows the range of samples that it's going to process, and it can jump arbitrarily around when it needs input as you're asking serially for its output. There are examples in the CoreAudio SDK about how to use an offline audio unit, and I refer you to those. If you saw the session this morning, Doug talked about audio unit generators being a new audio unit type in Tiger. Audio unit generators only generate audio; they don't have audio input. The same applies to converters — it's potentially an audio unit that you might want to consider hosting in your application. If it doesn't publish an interface, like the scheduled sound player or the file player that we provide, then there's really nothing you can do with it from a UI, so you would skip that one.

One of the things you can provide as a host for a generator is an answer to: well, what do I do with this guy? You'd probably record the audio, or provide an option to record the audio from it — and obviously you'd route it to the mix. The reason I bring this point up is that a lot of hosts will just skip these units. In Doug's session, he gave a demo where he scanned through his disk and played files randomly — musique concrète. We've also built a UI for the file player. With this UI I can define any arbitrary region that I want on the file.

So there's that. Now, because this is a generator audio unit, we're going to bend the rules a bit. We've said that AU Lab is this kind of real-time thing, but actually, when I've got a generator unit, I'm going to assume I can be more flexible. And so in this context we're also going to allow the presence of converters — for instance, one that stretches time without changing the pitch, or changes the pitch without changing the time — and we'll show that to you. The real-time side of this is coming from below the converter: it's coming from the mixer, which is asking for 512 samples. The converter can ask for different amounts of input data in order to produce those 512 samples, and because I've got a generator there that's just generating audio, I'm assuming I can ask it for these different numbers of samples and it'll be okay. So, let's see how that works. I'm going to slow this right down.

So, this is the AU time pitch. We're pretty pleased with the quality we're getting so far; we've still got some work to do on optimizing it. I can make this speed up a little bit. As you can see, I can interact with this in quite a nice way. I can define a whole bunch of regions on the file — the way you would for transcribing music or learning something.

I can go faster as well — I think I've really got that bit down — and I can make it go slower. One of the things you'll notice here is that we have a standard and an expert tab. Audio units can also publish expert parameters. They're saying: well, I've got a whole bunch of parameters here; some of them are really kind of easy, and some of them only an expert should really look at. And so we actually support that in the Cocoa UI.

So that was a look through the validation tool. We looked at AU Lab as a way that you can use to develop your audio units — and you can have some fun with it; it shows you how your AU is presented. We used Urs Heckmann's, Native Instruments', and Destroy FX's audio units here today, along with the host applications that support audio units. For more information, go to developer.apple.com/audio; there are links to all of these from that site. We're also pretty active on the mailing lists.