Configure player


WWDC Index does not host video files

If you have access to video files, you can configure a URL pattern to be used in a video player.

URL pattern


Use any of these variables in your URL pattern; the pattern is stored in your browser's local storage.

$id
ID of session: wwdc2004-223
$eventId
ID of event: wwdc2004
$eventContentId
ID of session without event part: 223
$eventShortId
Shortened ID of event: wwdc04
$year
Year of session: 2004
$extension
Extension of original filename: mov
$filenameAlmostEvery
Filename from "(Almost) Every..." gist: ...

WWDC04 • Session 223

Audio Unit Development and Hosting

Graphics • 57:57

Mac OS X features a robust plug-in architecture for DSP effects and virtual instruments, called Audio Units. This session takes an in-depth look at developing robust Audio Units and provides best practices that all host applications should follow. Also learn about the Audio Unit logo licensing program, which is designed to let your customers know that your product includes or supports Audio Units.

Speaker: Bill Stewart

Unlisted on Apple Developer site

Transcript

This transcript was generated using Whisper and may contain transcription errors.

What is an audio unit? How does it work? How is it hosted? And then at the end of the session, we'll look at some new audio unit types that we're doing in Tiger. So first of all, what is an audio unit? Audio units are hosted by digital audio workstations, like Emagic's Logic and Mark of the Unicorn's Digital Performer. Apple ships a number of Apple-branded applications, and as you can see, these applications cover a fairly broad range. There's also any number of third party apps I'm not mentioning, so if your app isn't here, I apologise — it's just too many. But you have applications like Final Cut Pro, which is a video app. You have GarageBand, which is a consumer-oriented application for making music. Then you've got Logic and Digital Performer, which are oriented towards the pro end of the market.

So that's a pretty broad range of contexts that your audio unit can work in, and this is the problem space we're addressing. An effect takes audio as input and does some processing to provide output, and those are the types of audio units we're going to concentrate on for this talk. Now, just to give you a quick review of the API set: an audio unit is a component, so there are Component Manager calls. You use the find-component calls to search for what components are there, and then once you find the component that you want to use, you open it. That opens the component.

It's essentially like calling new on a class object in Java or C++: you create an instance. That's called a component instance, and when we talk about an audio unit, we're talking about the audio unit as a component instance. And then once you've opened the component, there are things that you want to do to it, and so there's a collection of APIs that are defined. There's an uninitialize call, where you can do a lot of deallocation, and an initialize call that follows. Then the main work, of course, is the rendering.

In rendering, the audio unit consumes some amount of audio data. In the process of rendering, there are often parameters — things about the rendering process that change their values to affect how the unit is doing its work. You can set a discrete value for a parameter, or there's a ramped parameter event: it's the same parameter, but instead of just setting a value at one point in time, you ramp it over a duration. There's also a reset call, and it has a relationship to render: if there's some state that's changed in the host, you can use reset to basically tell the audio unit, any saved audio data that you have at this time, I want you to throw it away. I want you to go back as if you'd never seen audio before.

There are also audio unit APIs for dealing with direct MIDI messages that can be passed to audio units. These are typically implemented by AU instruments — there's a specialization of the general effect audio unit that can do things in particular with MIDI — and so they implement these APIs. And because instruments are primarily based around notes and MIDI channels, 16 of them, there's quite a flexible API in the AU instrument.

An instrument can be mono-timbral or multi-timbral, and that can be determined, as we'll see. Now, in order to facilitate the development of audio units — we obviously spend a lot of time on mailing lists, we have a collection of SDK code, and it's often confusing to really understand how things should work — we developed a command line tool called the AU validation tool. This tool really just tests your audio unit by putting it through its paces: it tries to break it in as many ways as it can, to make sure that it behaves itself. It's also a tool that can be used to debug. We've had a lot of developers using this since we released it at the start of the year, so they can validate a unit and then actually try it in a host application. The validation tool is also part of the logo licensing program. And whilst the validation tool is really good for some things, it really can't show you the context of how an audio unit is used. So the host app AU Lab is provided to do this, and it provides a fairly typical hosting environment.

AU Lab is really aimed at live usage; it has very low latency. What I'm going to do first is walk through the validation tool. I'm going to take our multiband compressor, and we'll look through the different stages the tool goes through; as we go through some of these, we'll then have a look at how a host application uses the same information. The first thing the validation tool does is it just sees if it can find your audio unit — what we're doing here is just finding the component.

So, it starts by extracting some information from the component, and then we do some open tests on the audio unit that we found. We profile the time it takes to open the audio unit. The first time is what we call a cold time, and this is the time it's going to take to load any dependencies that your audio unit has. This matters to the user: if you have 200 audio units on your system and each audio unit takes a second to be opened, that's a very long wait. There are also two places that you can get component version numbers from the Component Manager, so we do a check to make sure they both match.

Then once we've opened the audio unit, we have a look at what its default configurations are. What is its initial state in terms of its I/O capabilities? So in this case we've got one input, one output. The default format is 44.1 kilohertz, two channels, in the generic 32-bit float, native-endian format.

And we'll come back to this in a couple of slides. Then we look through the required properties — these are properties that all audio units must implement. The last one, instrument count, is a property that a host can use to determine if your instrument is multi-timbral or mono-timbral. If it's multi-timbral, the host may decide to do some different actions in terms of the way it passes MIDI, and so we require this for instruments. And then the validation tool looks at some recommended properties.

So this is actually latency, not tail time. The first property we're looking at is latency. This is a property that tells the host how long it takes for a given sample to appear in the output from some input. And latency can be caused by several different factors.

It can come from how long a sample received by the audio unit takes to appear in the output because of the nature of the algorithms being used, but it may also come about because of buffering. You've got audio units that wrap code that actually executes on external boxes — like TC Works' PowerCore or Universal Audio's acceleration systems — and so they have to do buffering on the audio unit side. That's the way to understand latency: going from input, how long does it take for the output to appear. And tail time — which is now the correct title — is a similar property, but this time we're looking at how long it takes for a sample to disappear from the output. So latency was how long it takes to appear; the classic example of tail time is a reverb tail. When you have a sample that goes into a reverb, it will take some period of time to die away.

Some units have ramping operations in their algorithms, so the input sample is going to be present in the output for some time. Both of these properties are used by applications doing synchronization, and tail time used in conjunction with latency can then be used to say: OK, if I play a song from the start through to the finish and listen to the audio that comes out at one minute, it's going to differ from what I hear if I start halfway through, because of the accumulated tails. And so this is an important property.

The simplest way to support bypassing is just passing audio from input to output and not processing it, but some audio units can do a better job of bypassing based on what they were doing before they were put into bypass. There are a couple of properties where we really just verify that they're there and not crashing. One of these is host callbacks. Host callbacks are used to get information from the host about its time: what beat is it, what is the tempo, what's the key signature, the time signature. The supported number of channels is optional, although it's obviously essential information. What the property reports is any global capabilities that an audio unit has, and dependencies between the number of input channels and the number of output channels. And, of course, if you're looking at a synth, it may not have any input channels, so the input count is zero. If the property is not implemented, then we make some assumptions about what an effect can do. This is AU Lab; it's in the Tiger Applications Audio folder.

There are a bunch of tracks here — I've got a bass, and I've got an Apple instrument. We'll have a look through these. Now, if you remember, at the start we looked at the channel capabilities in the validation tool. In this case here, I've got a stereo track going to stereo.

I'm able to create sub-menus — for stereo to stereo, we decided not to put the channels here — and in each sub-menu I can put the name of the audio unit, and that's where we're getting that information from. This track over here is a mono input and it's going to stereo, so now I've got a mono to mono, or a mono to stereo. In the mono to mono menu, you'll see that I don't have an entry for Elemental Audio's Inspector, because it's told me it can't do that configuration. Where we're using the component names to get the names of the audio units, we're using the supported number of channels to decide where they appear. What a host will often do is run once and profile your audio units to see what they can do, and then when the host launches again, it looks at the component resource to see what the version of the audio unit is. If that hasn't changed since the last time, it can reuse the profiled information. Okay, so now we're getting into some more of the special properties; we've seen just the basic kind of setup stuff.

So, we have a few special properties. The first one we want to look at is custom UIs. This is a property where an audio unit can say: well, I've actually got some custom UI that I've written for myself — here it is. We support two flavors of UI, either Carbon UIs or Cocoa UIs. This is a great feature of audio units, because an audio unit can just publish its UI once and have it distributed to every host — and even just looking at the users of those different apps, you're probably looking at different people with different capabilities and different understanding.

And so we really encourage custom UIs. The second property that's also pretty interesting and provides a good user experience is factory presets. Factory presets are a collection of saved states of an audio unit. A factory preset is the audio unit developer saying: well, I know some really cool things that I can do with my audio unit, and they're like this. You can think of keyboards — if you look at the market for keyboards over the last 20 years with MIDI, most of the sounds that people play out of keyboards are the factory presets. So this is very important. We also have user presets; this works with a property called class info. If the user wants to go in and tweak some of the values — whether they're using the custom UI or a slider UI, they can tweak that value — then the host should provide the ability to save the preset and restore it. There's a location, or a couple of locations, that can be used to put presets in, and by doing this, it means that when I'm using one application I can save my presets, and when I go to another application I can have the same presets available to me. They can go across the system regardless of which application I'm using. Let's have a look at some of the different ways people use this. So back to the demo machine — here's the multiband that we looked at before.

So we talked about parameters before — we talked about how audio units can use parameters to set values. The preset is really saving the parameter state. But a preset may contain more than just parameter state, so as a host, you can't just go through and save parameters — you've actually got to save the preset. There's also useful information that an audio unit can publish about a parameter: this one is an indexed parameter with discrete values, not a continuous parameter. You can set arbitrary ranges and so on.

One of the uses that people make of the published parameter information is to draw a generic UI. Now, the UI that you're seeing here is a new UI. So there's a couple of things. We talked about the pre-gain parameter here. All that the generic UI is doing is getting the parameter information and drawing this parameter from it. It's found out that it's dB from the parameter's unit tag; it's got the range, so it's given me a slider that I can drag. You'll also notice that there's a kind of visual clumping. This allows us to create quite a useful and reasonable view for a lot of audio units just by doing sliders. So the first clump has some global settings for the multiband — pre and post gain and so forth. It's a four band multiband, so it has three crossovers, and you can set the crossover points in the next clump of three. Then you've got four clumps, one for each of the bands of the multiband compressor. And down the bottom you've got a compression amount. The compression amount is a parameter as well, and it's a read-only parameter.

You can see that that's still just a parameter — it's read-only, and it's showing me how much I'm actually compressing in each of the bands at each point in time. We have a few factory presets over here for different types of settings for the multiband. The Matrix Reverb makes particularly good use of presets with different room sizes. Reverb is very hard to program, so having a good collection of factory presets here can be very useful. And take the presets — let's say I choose a cathedral.

You're hearing the reverb that's coming out of the synth there, and I can give this different sizes. Let's say that, well, actually, you know, I like the cathedral, but it's a little too wet, so I'm going to tweak it and do a save-as-preset here. What we're doing in the dialog is saying, well, we've got a user location: local user means the preset's only going to be visible for one user; local means anyone on the machine can see the preset. And I'm going to save it as a local user preset.

It saves the preset, and then it's going to create a directory. I can show this in the Finder: in my presets directory, I've now got an Apple AUMatrixReverb directory with my preset file. And then any application can look for preset files in those directories and present the presets however it wants to.

Moving on to custom and generic views: this is a unit from Destroy FX, and you can see that this is obviously not a generic view. It's got some sliders, it's got some nice cool things going on here. I can have a look at this as a generic view. So now I've got parameters with different types of controls — I've got menus, I've got check boxes — and all of this is being generated just from the generic parameter information. As I play through, you can see that he's actually lighting up the parameters as I hit keys on the keyboard.

He's able to do this because he can just send a notification to me. He's not doing this on the render thread; he's just saying, this parameter has changed, and then the listener system for parameters that hosts use will hear the parameter change, come in and set the value, and when it goes off it sends another parameter change. So, now we're getting into some really serious preset management problems for you guys.

These are just the factory presets, and going through them is fine — you could present them in a menu, and that would work. But the thing is that Urs ships a couple of hundred user presets, and furthermore, he links them into different categories; he's had a couple of other guys contribute presets as well. And so we've got keyboard navigation going on here — I can just cycle through the presets. By doing a nice sort of view for this, it gets exposed to the user to use. In the case of Urs, as you can see, all of the presets are here, and all that we're doing in AU Lab is going through the Zebra directory, and for every directory that we see, we're creating an entry.

Stay on the demo machine — that's fine. Anyone want to keep count, see how many times I mess this up? One of the things I wanted to show you: I've got one sound coming from Zebra and I've got the piano mixed in here. You can hear the Zebra sound is pitch bending but the piano isn't. We're using a MIDI thru facility, and it's just controlling how the application passes MIDI to the different synths around the system. I can split the keyboard. I have turned the pitch bend off for this one: I'm not hearing any pitch bend here, but down here I am. That's all split. The MIDI thru API is in CoreMIDI.

So then the next thing that the validation tool does is it looks at all the information you've told us about the formats that you can handle, and then it says: okay, let's see if you actually can handle them. It also checks the negative case: you said that you didn't do one channel in, four channels out, so I'm going to make sure you really can't. Then the validation tool does a collection of render tests. Now, the render tests are done at different numbers of frames.

In each of those steps, when it's rendering at a different number of frames, it does the following pattern: it uninitializes the audio unit, it sets max frames to that number of frames, and if it's got to reset the format, it'll do that. Then it initializes the audio unit, and then it renders. This is the sort of pattern of usage that you'll see in host applications. If we can go to slides just for a moment.

To show you in an application: you might have a preference for what I/O buffer size you want to use here, and that's part of this pattern of uninitialising and reinitialising. If the sample rate of the device changes, that's also an important part of this pattern. The validation tool also looks at the semantics of connections.

We can provide input either through connections or through setting input callbacks and so forth, and we validate to make sure that you don't require any reset and don't behave in a mysterious manner. It also checks to see how you're handling parameters: it's going to look at those parameters, caching a few of them away. And then there's a best practice embedded in the validation tool: before initializing, hosts should be setting max frames, and we expect and recommend that the host, when at all possible, if it sets max frames to 512, always asks the audio unit to render that amount. We haven't made this a mandate, but we're very strongly recommending it. So we'll have a look at some of the kinds of situations we want to avoid in the future, and that involves using buffer offsets for scheduling parameters.

Let's say we've got a host app, and the user has said that they want to ramp from one parameter value to the next over some period of time. The host has translated that into a ramp that's going to start at one parameter value and end at another. If the audio unit says that the parameter can be ramped, then the host can just turn around and set the start offset and duration, and the starting and ending values for the ramp, and that's all it needs to do for that slice. We'd like to see ramping supported because we think that audio units know what their algorithms are doing with parameter value changes; the host doesn't. If the parameter is not ramped, the ramp has to be simulated by the host. So once the host has done either of these two actions, the idea is to render for 512 frames.

So the host will see that this ramp has to take place, and it'll decide: well, I'm going to break this 512 frames up into a bunch of little renders. I'll do the first render, schedule a parameter change — which will be the midpoint of the ramp — do a second render, schedule the next change, and so on. This sort of practice is legal and it's fine, but it has caused some audio units not to publish parameters, because they really can't handle being bundled up into these little slices when they're asked to render, just because of the nature of the way they do their processing. And so we really would like to see this discouraged practice fade into the distance.

This isn't something that we do anything with in AU Lab, because we don't have a sequencer; it applies to mixing consoles and sequencing applications. Now, one of the things about AU Lab that we mentioned previously is that it really deals in a very hypersensitive situation — it's very, very tweaky.

And that's actually, for this manifestation of it, quite deliberate. What will happen is that when there's a problem — when you take too long in your render proc — you'll get an audio device overload from the HAL. You'll miss samples, and you'll see this in AU Lab; you should then diagnose it, and there are some tools that can help. In this case we have a render-quality property. You may want to provide that to users so they can scale back your processing if they're on a slower machine, or they've just got 3,000 plugins to run instead of the 50 that you run at home — they want your thing, but they don't need it to be high quality. So that's one way that you can provide UI for that; we provide this in the DLS music device. A second common problem — I think it's important to assume as a developer that when you're being asked to render... well, we talked a little bit about reinitialization and reset.

Reset: when the timeline changes, reset will be called. And so that means, you know, if I had audio that was playing ten minutes ago, I still don't want to hear that when I unmute the track — so reset, please. That's a good way to test that you're behaving correctly there. Reinitialization we talked about previously. Okay, so there are a couple of advanced configuration things that I want to go through — things that we've supported right from the first day, but we've not seen hosts providing good solutions for. So we've done some work in AU Lab to show some ways that this can be presented. There are two examples we're going to look at today. One is Apple's DLS music device. It has two modes: an internal reverb mode where it's only got one active output, or a send mode where it has two outputs, one of which is the dry mix. The second one we're going to look at is Native Instruments' Reaktor.

Reaktor has a dynamically assignable capability for where its channels go, and so what we'll have a look at in AU Lab is how a host can configure that. Okay, so first of all, here's a collection of tracks. You'll see that it highlights both tracks. This is the DLS music device. I've got this one turned off from the mix, so I'm not actually having this output going into the mix that's coming from the track mixer at all. In fact, what I'm doing is sending it over to a bus, and on that bus I've got a reverb. As you saw previously, I was playing around with the reverb there.

We can see that we've basically got an audio overload here, so there's a problem we should look at. So, we have this ability to add channels — it doesn't really matter much in Reaktor's case, because all Reaktor does is process 16 channels. And so the way we can configure this from the host side is however I want to use this audio unit. Now, if I bring up Reaktor's UI — because once you get into an audio unit like this, you're really going to need a custom UI of your own for this sort of thing — you can see in the diagram there what I've got. And you'll also notice that Reaktor knows that it's only using four channels. So it knows I'm using four channels, and I can connect these up. I've actually got a layering now of the two instruments — there's that instrument, there's that instrument. One of the things I can do here is send this guy over to a send bus, and I'm going to have this coming out of the rears, and I'm turning this off. So now this should be coming out of the front two speakers, and that should be coming out the rear. So AU Lab lets you mix and match things and see — because I think this is an important thing that we forget as audio unit developers — that audio units are going to work with other audio units, and so it's nice to have a host where you can play around with these things. How are they going to mix with other audio units? Maybe there are even some nice patches to use with Apple's reverb or a piano sound. Reaktor itself does some internal management of its views — you can see the difference; it has its own scroll bars. This is probably a pretty reasonable thing for Reaktor to do.

AU Lab is really focused on real-time usage, and we really only look at effects and instruments. By real-time, what we mean is that when I ask you for 512 samples, if you need input, you're only going to ask for 512. So we call this a real-time unit. Converters are a little bit different: converter units can be used in a real-time context, but they're also able to ask for more or less input data whenever they're asked for output. So you can use them in real time, but there may be some constraints, as we'll see. Then there's the offline audio unit. This is an audio unit type we defined some time ago.

Offline AUs can in some cases also be converters, in some cases not, but an offline AU can seek arbitrarily amongst its input. It can say: well, I'm going to process 3 million samples. So it knows the range of samples that it's going to process, and it can jump arbitrarily around when it needs input as you're asking serially for its output. There are examples in the CoreAudio SDK about how to use an offline audio unit, and I refer you to those. If you saw the session this morning, Doug talked about audio unit generators being a new audio unit type in Tiger. Audio unit generators only generate audio; they don't have audio input. The same applies to converters — it's potentially an audio unit that you might want to consider hosting in your application. If it doesn't publish an interface, like the scheduled sound player or the file player that we provide, then there's really nothing you can do with it from a UI, so you would skip that one.

One of the things you can provide as a host for a generator is an answer to: well, what do I do with this guy? You'd probably record the audio, or provide an option to record the audio from it — and obviously you'd route it to the mix. The reason I bring this point up is that a lot of hosts will just skip these units. In Doug's session, he gave a demo where he scanned through his disk and played files randomly — musique concrète. We've also built a UI for the file player. With this UI I can define any arbitrary region that I want on the file.

So there's that. Now, because this is a generator audio unit, we're going to bend the rules a bit. We've said that AU Lab is this kind of real-time thing, but actually, when I've got a generator unit, I'm going to assume I can be more flexible. And so in this context we're also going to allow the presence of converters — for instance, one that stretches time without changing the pitch, or changes the pitch without changing the time — and we'll show that to you. The real-time side of this is coming from below the converter: it's coming from the mixer, which is asking for 512 samples. The converter can ask for different amounts of input data in order to produce those 512 samples, and because I've got a generator there that's just generating audio, I'm assuming I can ask it for these different numbers of samples and it'll be okay. So, let's see how that works. I'm going to slow this right down.

So, this is the AU time pitch. We're pretty pleased with the quality we're getting so far; we've still got some work to do on optimizing it. I can make this speed up a little bit. As you can see, I can interact with this in quite a nice way. I can define a whole bunch of regions on the file — the way you would for transcribing music or learning something.

I can go faster as well — I think I've really got that bit down — and I can make it go slower. One of the things you'll notice here is that we have a standard and an expert tab. Audio units can also publish expert parameters. They're saying: well, I've got a whole bunch of parameters here; some of them are really kind of easy, and some of them only an expert should really look at. And so we actually support that in the Cocoa UI.

So that was a look through the validation tool. We looked at AU Lab as a way that you can use to develop your audio units — and you can have some fun with it; it shows you how your AU is presented. We used Urs Heckmann's, Native Instruments', and Destroy FX's audio units here today, along with the host applications that support audio units. For more information, go to developer.apple.com/audio; there are links to all of these from that site. We're also pretty active on the mailing lists.