Digital Media • 1:02:33
This session covers the fundamentals of the audio and MIDI architecture, where the important actions of getting the data in and out of the system take place. Threading priorities are detailed along with more complex systemic interactions to ensure the sound you create is the sound your customer hears.
Speakers: Jeff Moore, Doug Wyatt, Bill Stewart
Unlisted on Apple Developer site
Transcript
This transcript was generated using Whisper and has known transcription errors. We are working on an improved version.
Good afternoon and welcome to session 507, which is about audio and MIDI. I'm Craig Keithley. I'm Apple's USB and FireWire evangelist. As I mentioned in the session yesterday, audio is one of my favorite topics, and so I'm happy to help with this session today. Some of the topics we're going to talk about concern the Core Audio HAL, and I'm going to bring Jeff Moore up here in a second.
Before I do, I just want to say that some of today's equipment is kind of special. We work closely with Harman Kardon in a lot of areas. As you can see, we've got JBL speakers all around here. We've got a high-end Harman Kardon AVR-8000 that's doing AC3 over SPDIF to multi-channel decode. We like Harman. They do a lot of good things for us. Thank you very much. Jeff.
So as Craig said, my name is Jeff Moore. I'm an engineer with the Core Audio Group. I'm going to tell you a little bit about the HAL today. I hope to impart the anatomy of how things work and at least get you kind of familiarized with how the API is laid out. So as most of you know, the HAL provides the lowest level access to audio devices in Mac OS X.
The HAL supports pretty much any kind of audio device you can imagine, as long as there's a driver for it, from simple consumer gear to complex pro gear. The key things that we do with the HAL are multi-channel, high sample rates, and, new with Jaguar this year, support for non-linear PCM formats to be sent to hardware.
To get you started with the HAL API, you should know that it's a loosely object-oriented API, but it's a C API. Understanding the object hierarchy is important for understanding how the HAL works. One of the common things you'll find with all the Core Audio APIs is that we use what we call properties to provide access to and manipulation of the state and behavior of all different kinds of objects. What I say about properties here is generally true for properties just about anywhere in the audio system.
So, properties are addressed as key-value pairs. The keys will vary from object to object and API to API, but they will always include an integer ID. And the value is always an untyped block of memory whose contents are predetermined by the ID. In the HAL, all the get-property-info routines for the various objects provide you with the size of the property as well as whether or not you can actually set that property. A lot of properties don't necessarily require you to call get property info because their size is always predetermined. A good example is the buffer size of the HAL, which is always going to be a 32-bit integer.
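For reference, here's a minimal sketch of that get-property pattern in C, using the Jaguar-era HAL calls (this isn't code from the session):

```c
#include <CoreAudio/AudioHardware.h>

// Ask how big the property is and whether it's writable, then fetch it.
// For a property whose size is fixed, like the buffer frame size, you could
// skip the GetPropertyInfo call and just pass sizeof(UInt32).
static OSStatus GetBufferFrameSize(AudioDeviceID device, UInt32 *outFrames)
{
    UInt32 size = 0;
    Boolean writable = false;
    OSStatus err = AudioDeviceGetPropertyInfo(device,
                                              0,        // master channel
                                              false,    // output section
                                              kAudioDevicePropertyBufferFrameSize,
                                              &size, &writable);
    if (err != noErr) return err;

    // size will be sizeof(UInt32) here; the value is the I/O size in frames.
    return AudioDeviceGetProperty(device, 0, false,
                                  kAudioDevicePropertyBufferFrameSize,
                                  &size, outFrames);
}
```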
So the HAL also provides notifications for when a property value changes. The client can sign up for notifications for all properties that the HAL provides; they just provide a property listener proc to the HAL through the appropriate add-property-listener calls. The other thing that makes this a little easier is that the HAL supports wildcards for the various different aspects of what you have to set up to get a listener.
An example is you might want to listen to all the properties on a given channel, or you just might want to listen to all the volume notifications for this device. The permutations can get pretty rich. Your listener proc is going to get called on the HAL's CF run loop.
The HAL provides a global property in each process to allow you to actually manage what this CF run loop is and what thread it runs on. For most Carbon and Cocoa-based apps, you're going to want to actually set this run loop to your app's main run loop to allow you to do things like draw when the property listener fires off.
Another thing to remember about properties in the HAL is that changing them is an asynchronous operation. You definitely want to wait for a notification to come back from the system before you assume that the value has actually changed. A common place where people have been falling into this trap is when setting the formats of a device. You definitely don't want to assume the format is set just because you called set property. You want to wait until you get the appropriate notifications back. So it's very important that you sign up for all the notifications that you need.
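Here's a small sketch of that flow (illustrative only, not session code): sign up for the listener first, then request the change and treat it as pending until the listener fires. The listener proc and flag here are made up for the example.

```c
#include <CoreAudio/AudioHardware.h>

static volatile Boolean gFormatChanged = false;

static OSStatus MyListenerProc(AudioDeviceID device, UInt32 channel,
                               Boolean isInput, AudioDevicePropertyID property,
                               void *clientData)
{
    if (property == kAudioDevicePropertyStreamFormat)
        gFormatChanged = true;   // the new value is now safe to read back
    return noErr;
}

static OSStatus RequestFormat(AudioDeviceID device,
                              const AudioStreamBasicDescription *format)
{
    OSStatus err = AudioDeviceAddPropertyListener(device, 0, false,
                                                  kAudioDevicePropertyStreamFormat,
                                                  MyListenerProc, NULL);
    if (err != noErr) return err;

    // Asking for the change is asynchronous; don't assume it took effect here.
    return AudioDeviceSetProperty(device, NULL, 0, false,
                                  kAudioDevicePropertyStreamFormat,
                                  sizeof(*format), format);
}
```

In a real app you would signal a condition variable or post to your run loop from the listener rather than poll a flag.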
The first object that I'd like to talk about in the HAL is the HAL system object. It's addressed in the API by the routines that are prefixed with AudioHardware. Its job is to manage the process-global state for the HAL in that individual process. Its properties are only addressed by their four-char IDs.
Some of the interesting properties on the global object are going to be the device list property, kAudioHardwarePropertyDevices, and the run loop property, which I mentioned previously, as well as a new property that we've added, which is something you should pay attention to if you're using audio sporadically. New in Jaguar, the HAL will unload itself after a given timeout, and there's a property that allows you to turn that behavior on and off.
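A sketch of what talking to the global object can look like (again illustrative, assuming the Jaguar-era calls): fetch the device list, then point the HAL's notification run loop at the current run loop.

```c
#include <CoreAudio/AudioHardware.h>
#include <CoreFoundation/CoreFoundation.h>
#include <stdlib.h>

static OSStatus ListDevicesAndSetRunLoop(void)
{
    UInt32 size = 0;
    Boolean writable = false;
    OSStatus err = AudioHardwareGetPropertyInfo(kAudioHardwarePropertyDevices,
                                                &size, &writable);
    if (err != noErr) return err;

    UInt32 deviceCount = size / sizeof(AudioDeviceID);
    AudioDeviceID *devices = (AudioDeviceID *)malloc(size);
    err = AudioHardwareGetProperty(kAudioHardwarePropertyDevices, &size, devices);
    if (err == noErr) {
        for (UInt32 i = 0; i < deviceCount; ++i) {
            // ... interrogate devices[i] here ...
        }
    }
    free(devices);
    if (err != noErr) return err;

    // Have property listeners fire on this run loop so we can draw from them.
    CFRunLoopRef runLoop = CFRunLoopGetCurrent();
    return AudioHardwareSetProperty(kAudioHardwarePropertyRunLoop,
                                    sizeof(CFRunLoopRef), &runLoop);
}
```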
The next object in the HAL is the audio device object. The audio device object is the basic unit for doing I/O and also for doing timing. The properties of the device object are addressed by the ID, the direction for the section that the property applies to, which is going to be input or output, and the individual channel that you're speaking about.
Some important properties that you have for the device object are the buffer frame size, which allows you to change the size of I/O that you do in each I/O proc, as well as the stream configuration, which lets you get the buffer layout of the device prior to your I/O proc being called so you know what to expect when your I/O proc is called. Then you can change the device's format using the stream format property.
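As a sketch of those two device properties (assumed constant names from the headers of that era, not the session's code):

```c
#include <CoreAudio/AudioHardware.h>
#include <stdlib.h>

static OSStatus InspectAndConfigure(AudioDeviceID device)
{
    // The stream configuration comes back as an AudioBufferList describing
    // how many streams there are and how many channels each one holds.
    UInt32 size = 0;
    Boolean writable = false;
    OSStatus err = AudioDeviceGetPropertyInfo(device, 0, false,
                                              kAudioDevicePropertyStreamConfiguration,
                                              &size, &writable);
    if (err != noErr) return err;

    AudioBufferList *layout = (AudioBufferList *)malloc(size);
    err = AudioDeviceGetProperty(device, 0, false,
                                 kAudioDevicePropertyStreamConfiguration,
                                 &size, layout);
    if (err == noErr) {
        for (UInt32 i = 0; i < layout->mNumberBuffers; ++i) {
            // layout->mBuffers[i].mNumberChannels tells you what to expect
            // in each buffer handed to your I/O proc.
        }
    }
    free(layout);
    if (err != noErr) return err;

    // Ask for 512-frame I/O buffers; wait for the notification before
    // assuming the change stuck (see the earlier discussion of async sets).
    UInt32 frames = 512;
    return AudioDeviceSetProperty(device, NULL, 0, false,
                                  kAudioDevicePropertyBufferFrameSize,
                                  sizeof(frames), &frames);
}
```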
The devices are broken down into stream objects as well. Stream objects represent a single buffer of I/O on the device. Their properties are addressed just by the ID and the channel. The direction is implicit in the stream itself, because streams can only be either input or output. One of the stream properties that you're going to be interested in getting hold of is the starting channel, which tells you what channel number that stream's first channel corresponds to. If you have a multi-channel device with multiple streams and you're looking at just one stream, you need to know that this is channel 5 out of 8 or whatever.
Then you can also set the format for the stream, the physical format, which is a little different from the virtual format, which is what the rest of the audio and MIDI system uses. The physical format is the format that the device is actually doing I/O with the hardware in. That would be, for instance, 16-bit integer versus 24-bit integer.
So that pretty much wraps up what the HAL looks like. I'm going to show you some of these concepts in a moment with some of our new stuff, but I'm also going to show you a lot about playing an AC3 file with the HAL, sending it straight to hardware for decoding.
Before we get started, I'll tell you a little bit about AC3 bit streams. Each packet is 1536 frames in size. That's roughly 30 milliseconds of time at 48K. Each packet does vary in byte size. AC3 is implicitly a variable bit rate format. Each packet, much like MPEG, starts with a 16-bit sync word. To get AC3 into a format that you can put on a digital interface like SPDIF, you have to add an additional 8-byte header. I'm going to go in to show you a little bit about that as well.
Could we bring up demo four, please? Thank you. So the first thing I want to show you is a new application that we have in Jaguar called the Audio MIDI Setup application. It provides a richer UI for the individual audio devices as well as for configuring MIDI devices. I'm going to tell you a little bit about its audio support, and we'll hear about the MIDI support a little bit later.
As you can see, it gives you a dashboard interface on pretty much what a HAL device is. Here you see the built-in controller, and it's showing exactly what it can do. In this case, the built-in audio controller doesn't support a global input and output volume, but it does support a global output mute button as well as a global play-through option. Each of the individual channels has volume, as you can see down here.
So when we go to other devices, like the USB AudioSport MD-1, which I'll be using to demonstrate our AC3 playback, you can see that this device is a USB device and... Where is it? I'll show it to you. Here it is. As you can see, it's a very small digital output. It connects over the USB bus, and it provides an optical SPDIF connection.
So you can change the default device and you can get and set the device's format as well. As you can see right now, the device's format is set up for AC3. You can move it around. This device happens to support all sorts of different sample rates and a couple of different bit depths.
I'm going to be playing an AC3 file, so I'm just going to set it up for AC3. So to play an AC3 file, I've written a small command line tool that parses an AC3 file and doles out the packets to the hardware. The I/O path for dealing with encoded audio is slightly different from dealing with normal linear PCM audio.
The key difference is, in your I/O proc, when you provide data back to the output buffer, you also need to change the data size to say how many bytes you've provided. For constant bitrate formats like the AC3 on the digital interface, the number of bytes is always going to be the same. But if you're talking about a variable bitrate AC3 format, that number will change with every frame.
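As a rough sketch of that shape (this is not the demo's code; GetNextAC3Burst is a hypothetical helper standing in for the file parser), an I/O proc for an encoded, single-stream output might look like this:

```c
#include <CoreAudio/AudioHardware.h>

extern UInt32 GetNextAC3Burst(void *dest, UInt32 maxBytes);   // hypothetical helper

static OSStatus MyEncodedIOProc(AudioDeviceID device,
                                const AudioTimeStamp *now,
                                const AudioBufferList *inputData,
                                const AudioTimeStamp *inputTime,
                                AudioBufferList *outputData,
                                const AudioTimeStamp *outputTime,
                                void *clientData)
{
    // Assume a single output stream, as the demo does.
    AudioBuffer *buffer = &outputData->mBuffers[0];
    UInt32 written = GetNextAC3Burst(buffer->mData, buffer->mDataByteSize);

    // For linear PCM you'd leave this alone; for encoded data you must say
    // how many bytes of the buffer you actually filled.
    buffer->mDataByteSize = written;
    return noErr;
}
```

Registering and starting it would then just be AudioDeviceAddIOProc(device, MyEncodedIOProc, NULL) followed by AudioDeviceStart(device, MyEncodedIOProc), which is the step Jeff describes next.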
The structure of my little command line tool is that I've abstracted out each device into a little C++ object. Here's the initialize routine. One of the other interesting things that we've added in Jaguar to the HAL is the ability to turn individual streams on and off for each individual I/O proc. If you're not using the input section, you should go in and turn it off. I'll show you some code here in a second that shows how to do that.
Here's the C++ method in my device superclass that allows me to call the HAL API to tell it, hey, turn off this set of input streams. It takes an array of bools in to say which streams to leave on and which streams to turn off. The HAL API actually uses UInt32s to carry those boolean values. It has to construct the list, which must match the stream configuration of the device. You can't specify fewer streams, and you can't specify more streams. It must be precise. It gets the number of streams, and it allocates a buffer. Then it stuffs in all those boolean values.
Then all you have to do to tell the HAL to turn off the streams for that I/O proc is to just set the property on the device. Here you see an example. I'm calling AudioDeviceSetProperty, passing in the device ID that we're working with and the I/O proc stream usage property for the output section. The channel number on this isn't really important, so you can put anything. In this case, I put one, which is a fine idea when in doubt. Then you just pass in the size and the pointer to the struct containing the stream usage.
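As a sketch of what that looks like in the plain C API (assuming the Jaguar-era names for the stream usage struct and property; this is not the demo's C++ code), turning off the input streams for one I/O proc goes roughly like this:

```c
#include <CoreAudio/AudioHardware.h>
#include <stddef.h>
#include <stdlib.h>

static OSStatus DisableInputStreams(AudioDeviceID device, AudioDeviceIOProc ioProc)
{
    // How many input streams does the device have?
    UInt32 size = 0;
    Boolean writable = false;
    OSStatus err = AudioDeviceGetPropertyInfo(device, 0, true /* input */,
                                              kAudioDevicePropertyStreams,
                                              &size, &writable);
    if (err != noErr) return err;
    UInt32 numStreams = size / sizeof(AudioStreamID);

    // Build the variable-length usage struct: one UInt32 flag per stream,
    // and the list must cover every stream in that section.
    UInt32 usageSize = offsetof(AudioHardwareIOProcStreamUsage, mStreamIsOn)
                       + numStreams * sizeof(UInt32);
    AudioHardwareIOProcStreamUsage *usage = calloc(1, usageSize);
    usage->mIOProc = (void *)ioProc;
    usage->mNumberStreams = numStreams;
    // mStreamIsOn[i] is already 0 (off) for every stream thanks to calloc.

    err = AudioDeviceSetProperty(device, NULL, 1 /* channel doesn't matter */,
                                 true /* input section */,
                                 kAudioDevicePropertyIOProcStreamUsage,
                                 usageSize, usage);
    free(usage);
    return err;
}
```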
So once you've turned off all the streams, the next thing to do is to open the file. You parse the file. You'll get the packetization that I described earlier for the file. And then you register your I/O proc. And once you've done that, you've just got to feed the data back to the output buffers. And the way you do that is a little different. Here's the I/O proc for this. No, that's not it.
Here's the I/O proc for this device, or for this application. It comes in. Now, this code is a little simplistic, so it makes the assumption that the device only really ever has one stream, and that's the stream it's going to be doing I/O on. In reality, you'll probably want to change that based on the number of streams the device really has.
So, for here, we get the device, we cast it to a pointer to bytes, so that later on, when we're going to copy the AC3 data in, we're
[Transcript missing]
The first thing it does is reinterpret the buffer as a 16-bit buffer, because AC3 is inherently a two-byte format.
Then it just starts stuffing in the header. For the digital interface for AC3, there are some bits that you have to fill out that are based on the actual values in the stream. You fill out the sync word, and then you fill out all the appropriate stuff for the header.
And then you just copy the next buffer into the output buffer. Now, a key thing to remember about the constant bitrate AC3 is that quite often the data needs to be little-endian. On disk, AC3 is almost always big-endian. So once you're done, you have to go through and byte swap everything so it gets put out on the wire correctly. And here's some code for doing that. And then once it's done, it just updates the file position and writes back out the data size that it just wrote.
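The transcript doesn't capture the demo's actual header code, so here's a rough sketch of the idea in C, based on the IEC 61937 framing that SPDIF pass-through generally uses. The constants and field layout are from the spec as commonly documented, not from the session, so treat this as an illustration of the 8-byte preamble plus byte swap rather than a reference implementation:

```c
#include <stdint.h>
#include <stddef.h>

// Store one 16-bit word in little-endian byte order.
#define PUT_LE16(p, v)  ((p)[0] = (uint8_t)((v) & 0xFF), (p)[1] = (uint8_t)((v) >> 8))

// Wrap one AC3 packet in the 8-byte SPDIF burst preamble and swap the
// big-endian file data into the little-endian order the interface wants.
// The rest of the 1536-frame burst is assumed to be zero padding supplied
// by the caller.
static size_t FrameAC3Burst(const uint8_t *ac3, size_t ac3Bytes, uint8_t *out)
{
    PUT_LE16(out + 0, 0xF872);                   // Pa: first sync word
    PUT_LE16(out + 2, 0x4E1F);                   // Pb: second sync word
    PUT_LE16(out + 4, 0x0001);                   // Pc: burst info, data type 1 = AC3
    PUT_LE16(out + 6, (uint16_t)(ac3Bytes * 8)); // Pd: payload length in bits

    for (size_t i = 0; i + 1 < ac3Bytes; i += 2) {
        out[8 + i]     = ac3[i + 1];             // byte swap each 16-bit word
        out[8 + i + 1] = ac3[i];
    }
    return 8 + ac3Bytes;                         // bytes actually written
}
```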
Now I'm going to play some AC3 files to give you a feel for what's going on and show that it actually all works. So the first file is kind of a noisy file, but it'll show that there's sound coming out of each channel. That's a test file that just puts a different sound source in each channel. Here's another test file that's some more musical content.
To close this off, I wish I could turn this around. You can see that we've got this big Harman Kardon amp that's doing the decoding for us. You can see all the channel lights turning on and off. It's just coming straight over the optical SPDIF connection. To close out, I'd like to just ask all the hardware developers to please make every effort you can to conform to the relevant spec governing your hardware.
One of the key issues that we've had bringing up this support is that some devices will say they can do digital I/O and some devices won't, even though they have a digital connection on them. It's very hard for us to do the right thing if you're not conforming to the spec. With that, I'd like to bring up Doug Wyatt, and he's going to discuss the MIDI support.
Thanks, Jeff. Good afternoon. I'm Doug Wyatt. I'm an engineer in the Core Audio Group. I work on various pieces of the system, but I'm primarily responsible for the Core MIDI framework. So today, I recognize there are some of you who haven't worked with core MIDI before, but there are probably a lot of you who have seen me give the same talk two years in a row, going through all the basics of the API. So as I figured out how I was going to organize this, I thought, okay, we'll just go through the API concept by concept, and we'll touch on the areas that are causing a lot of questions for developers and introduce some new features for Jaguar.
So architecturally, Core MIDI sits on top of I/O Kit and the kernel, and we have higher level services in the Audio Toolbox that also use Core MIDI, such as the sequencing services and the AU MIDI controller for controlling music devices and audio units. And your client application, of course, can also talk to Core MIDI directly for high performance access to MIDI devices.
So looking at the Core MIDI implementation: down in the kernel we have I/O Kit, where we do all our I/O. We have a MIDI server process, which is in user space. The MIDI server loads MIDI drivers, which are CFPlugIns. And on top of that, we have the Core MIDI framework, which your client application links against. The Core MIDI framework uses Mach interprocess communication to talk to the MIDI server and implement the API that your application uses.
So the first concept I'd like to go over in the API is the MIDI device, entity, and endpoint. These structures are hierarchical. Devices contain entities, which are logically distinct subcomponents. One example is a multiport MIDI interface like this Roland device we have on the screen here, or Edirol, I should say. This is a graphic from our Audio MIDI Setup application, which Jeff showed you, and I'll be showing you the MIDI pane in that application in a moment. So the device itself has an icon.
These little nubs underneath are the entities, and an entity is simply a grouping of source and destination endpoints. And so we see this device publishes itself as having nine entities, each with one source and one destination endpoint. So these are driver-owned devices which are distinct from external devices, which I'll bring up later in the talk.
Okay, contrary to what Jeff was just saying about most of our property APIs using integer keys and untyped values, the MIDI API is a bit different. Our properties in the MIDI API use strings for their keys, for some implementation reasons. But in any case, the concept is the same. Devices and entities and endpoints all have these properties.
And furthermore, since devices contain entities which contain endpoints, we have an inheritance mechanism in place for these properties. So, for instance, the device has a property saying that its manufacturer name is XCorp. But if you were to ask the endpoint for the manufacturer property, it would be inheriting that property from the device. By contrast, the name property is set separately on each of these three objects. The device has a name, the entity has a name, and the endpoint has a name. So properties can either be inherited or set individually on the objects in the API.
So on these driver-created objects, the driver will set the properties. And as far as the standard properties go, your application shouldn't be touching those properties on the driver-owned devices. You can look at them, and that's what they're there for. They're there for you to interrogate. But you shouldn't touch the standard properties. But if you have some reason to store private information attached to a device or entity or endpoint, you can attach a private property. And the header file tells you how to do that.
One thing that has come up as people use the MIDI APIs is a bit of confusion about how to display the names of the source and destination endpoints. Since there are names at all three levels of the hierarchy, the simplest way to do this, which seems to be most reliable and provide the best user experience, is to first ask for the endpoint name. In the case of that Roland interface, which I showed you a moment ago, you might just see port 1, port 2, port 3, and so on.
In a lot of situations, if there's only one MIDI interface in the system, that will be a unique name, and that's all you really need to show the user. If it's not unique, there might be two of those MIDI interfaces, then you can prepend the device name to the endpoint name, and that should give you a unique name.
Another thing that has come up as people use the APIs is a bit of confusion about how to actually go and locate MIDI sources and destinations in the system. There are two completely different ways to find these endpoints, and they each have their uses. When you're doing MIDI I/O, the recommended way, and this is true for most circumstances, is the first method here, which is to just iterate through the source and destination endpoints directly.
There are the functions MIDIGetNumberOfSources and MIDIGetNumberOfDestinations, and MIDI get source by index. That's not the name of the function; it's just called MIDIGetSource, but you're passing it one of the indexes. And that's the best way to find endpoints. And you have to use that set of APIs if you want to see virtual sources and destinations.
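A sketch of that recommended approach (not Apple sample code), pulling each source's name for display:

```c
#include <CoreMIDI/CoreMIDI.h>
#include <stdio.h>

// Index straight through the sources, and use the endpoint's own name for
// display, prepending the device name only when you need to disambiguate.
static void ListSources(void)
{
    ItemCount count = MIDIGetNumberOfSources();
    for (ItemCount i = 0; i < count; ++i) {
        MIDIEndpointRef source = MIDIGetSource(i);
        CFStringRef name = NULL;
        if (MIDIObjectGetStringProperty(source, kMIDIPropertyName, &name) == noErr) {
            char buf[256];
            if (CFStringGetCString(name, buf, sizeof(buf), kCFStringEncodingUTF8))
                printf("source %lu: %s\n", (unsigned long)i, buf);
            CFRelease(name);
        }
    }
}
```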
There are some specialized circumstances where you want to walk through the device tree and see absolutely everything that's there. You can index through the devices in the system, and then you can walk through each device's entities and each entity's endpoints. That way you'll see even more endpoints in some situations. One reason you might want to do that is if you're doing some sort of studio setup view, where you might want to actually show the objects which aren't present right now.
But of course the danger in doing this hierarchical walk is that you will see some special objects. You'll see some offline endpoints. Offline meaning an endpoint for a piece of hardware that isn't present right now. So there's an offline property that you would have to look at if you're walking through the studio in this manner.
You may also see some objects that are private, which is another property that drivers can attach to their endpoints. And they're not doing that to try to hide things from you. There are situations where drivers may wish to create endpoints for their own applications to have private communication channels to the device for configuring it. So you'll probably want to hide those private endpoints; they're really not there to make sense to the user. And also, if you do this hierarchical walk, you won't see any virtual sources or destinations, as you will if you ask directly for the sources and destinations.
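A sketch of the hierarchical walk with those two checks (illustrative only):

```c
#include <CoreMIDI/CoreMIDI.h>

// Visit every driver-owned source endpoint, skipping the ones marked private
// and noting the ones that are offline.
static void WalkStudio(void)
{
    ItemCount numDevices = MIDIGetNumberOfDevices();
    for (ItemCount d = 0; d < numDevices; ++d) {
        MIDIDeviceRef device = MIDIGetDevice(d);
        ItemCount numEntities = MIDIDeviceGetNumberOfEntities(device);
        for (ItemCount e = 0; e < numEntities; ++e) {
            MIDIEntityRef entity = MIDIDeviceGetEntity(device, e);
            ItemCount numSources = MIDIEntityGetNumberOfSources(entity);
            for (ItemCount s = 0; s < numSources; ++s) {
                MIDIEndpointRef source = MIDIEntityGetSource(entity, s);

                SInt32 isPrivate = 0, isOffline = 0;
                MIDIObjectGetIntegerProperty(source, kMIDIPropertyPrivate, &isPrivate);
                MIDIObjectGetIntegerProperty(source, kMIDIPropertyOffline, &isOffline);
                if (isPrivate)
                    continue;          // driver's own configuration channel; hide it
                if (isOffline) {
                    // hardware isn't present right now; show it greyed out, say
                }
                // ... add the source to the studio view here ...
            }
            // MIDIEntityGetNumberOfDestinations / MIDIEntityGetDestination
            // work the same way for the output side.
        }
    }
}
```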
So everything I've spoken about so far is about the driver-created objects in the system. In 10.1, we introduced the concept of external devices, but it wasn't a fully fleshed-out set of APIs. There wasn't a studio setup editor where you could really manipulate them. But with our new audio and MIDI setup application in Jaguar, the user will have a way to create these external devices, and we fleshed out how those are going to look to the user and what the APIs to manipulate them are.
They use the same data structures as driver-owned devices, namely MIDI device, MIDI entity, and MIDI endpoint. These objects have the same properties. I'll be describing what those properties are in more detail later. But the user adds them to the system typically with our Audio MIDI Setup application. Although it's using public APIs, so if you want to create your own, there's nothing we can do to discourage you from that.
And actually, we encourage you to if you have some reason you'd like to do that. You'll be manipulating the same global database because you're using the same APIs, and the two applications should present -- they should stay in sync with each other because they're both manipulating the same data.
Another thing I'd like to emphasize about these external devices is that they're completely optional. They're just there for the user to tell us what's there beyond the external MIDI connector. So instead of displaying port one to the user, you can display the actual name of the MIDI device.
So we have these driver-owned devices and we have the external devices. And the way that the connections between them are represented is with a property called kMIDIPropertyConnectionUniqueID. Every object in the system, the devices, entities, and endpoints, has a 32-bit unique identifier. So when we want to signify that a driver-owned endpoint has a connection to an external endpoint, what we do is we set a property on the driver endpoint, which is the unique ID of the external device's endpoint. As of Jaguar, this property can also be an array of unique IDs, so we can signify a fan-out connection. One MIDI port may be connected to, say, half a dozen MIDI devices.
So from the application's point of view, a bit more work is involved to look at these external devices and entities and endpoints and obtain their properties. If you like, you can skip doing this and just present the generic view of the world that you had in 10.1, with just whatever information your driver gave you, but we have this application so the user can tell you what's beyond the MIDI cable now. And so here's how you do it.
After you've followed the connection unique ID property to go find the external object, then you can use the external object's properties to override those of the driver-owned object. So if you want to know the name of that endpoint, you could say, okay, so external endpoint, what's your name? And you might see something like DX7 instead of the driver-owned object's name, which might be port 1.
But there are a lot of possibilities here: that port one might be connected to, as I said, five different MIDI devices, each with different names. You might think that it might be simpler to just display port one if there are a lot of devices there. You might want to concatenate all the names. You might be able to say, oh, I know that the DX7 is only on channel one and the D50 is only on channel two. So it's a user interface question, and we make you perform that lookup yourself right now.
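A sketch of performing that lookup yourself for the simple single-connection case (illustrative; the fan-out case stores an array in the same property and needs more unpacking):

```c
#include <CoreMIDI/CoreMIDI.h>

// See whether a driver-owned endpoint has a connection to an external device,
// and if so use the external object's name instead of "Port 1".
static CFStringRef CopyDisplayName(MIDIEndpointRef driverEndpoint)
{
    CFStringRef name = NULL;
    SInt32 connectedUID = 0;

    if (MIDIObjectGetIntegerProperty(driverEndpoint,
                                     kMIDIPropertyConnectionUniqueID,
                                     &connectedUID) == noErr && connectedUID != 0) {
        MIDIObjectRef external = 0;
        MIDIObjectType type;
        if (MIDIObjectFindByUniqueID(connectedUID, &external, &type) == noErr) {
            // e.g. "DX7" rather than the driver endpoint's "Port 1"
            if (MIDIObjectGetStringProperty(external, kMIDIPropertyName, &name) == noErr)
                return name;
        }
    }
    // Fall back to the driver-owned endpoint's own name.
    MIDIObjectGetStringProperty(driverEndpoint, kMIDIPropertyName, &name);
    return name;
}
```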
So just to go over the properties that you're going to find on the MIDI devices. Some of these have been around since 10.0, but we have a few new ones for Jaguar. Looking at a device or its endpoints, you can find out what MIDI channels it transmits or receives on. You can find out what MIDI event types it sends and receives. You can ask it whether it supports general MIDI or MIDI machine control. And there are a few more esoteric properties, and they're all described in the header file.
Some other things that we've done with properties that are new for Jaguar. Since 10.0, we've had a property that driver-owned endpoints can define to support scheduling in advance, which is good for those companies who have hardware that do hardware scheduling of outgoing MIDI. But for virtual destinations, such as software synths, we haven't had that schedule-ahead capability. So a soft synth would only be receiving its MIDI just at the moment it was supposed to be rendering it.
But now in Jaguar, the creator of the virtual endpoint can set this property saying, "I want to be scheduled ahead." and then the MIDI server will bypass its normal scheduling process in delivering the data to that endpoint. The soft synth will receive its MIDI as soon as the sender schedules it, which will hopefully be a little bit of time in advance, so that the soft synth can render in a sample-accurate manner.
Another new property, as might be obvious from our setup application, is that devices have a property on them for defining an icon or image. And the creator of a virtual endpoint can set its own unique ID. This was actually possible in 10.1, but we weren't doing any checks to make sure that it was actually unique, which could have been quite problematic. We now defend against that.
And for Jaguar, we've also defined some new properties that allow you to attach XML documents that describe the device's patch names. We had some properties along these lines in 10.1, and as we've worked on the spec, it became apparent that the property set needs to be a little more elaborate. And so the old properties have been deprecated, and there's a new one in place to do this now. And I'll be describing that in a little more detail later.
On our mailing list, some developers have been asking for a few new APIs, and you asked for them, we got them. One is MIDIObjectFindByUniqueID. So given the unique ID of an object, you don't have to go trundling through the whole system to find out what it refers to. You can quickly just get back a reference to the object and a constant to tell you what kind of object it is.
Previously, you could only walk down through the device-entity-endpoint hierarchy from the top down. There wasn't any way to go up that hierarchy. So we now have APIs to fetch the device given an entity, and to fetch the entity that owns an endpoint. One other thing that developers ask about from time to time is how to save references to the objects in the API, such as MIDI endpoints. What we recommend is that you save both the object's name and its unique ID.
That way, if the user renames the object, you'll still have the unique ID, which won't have changed, and you'll still get a reference to the same object. On the other hand, if that fails, meaning the object has been deleted, probably, you'll still have the object's name to show the user what it is that he used to be working with that isn't there now. Maybe he has to get his DX7 out of his garage.
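A sketch of that save-and-resolve approach (illustrative; how you persist the two values is up to your document format):

```c
#include <CoreMIDI/CoreMIDI.h>

typedef struct {
    MIDIUniqueID uniqueID;
    CFStringRef  name;       // kept only so we can tell the user what's missing
} SavedEndpoint;

static MIDIEndpointRef ResolveSavedEndpoint(const SavedEndpoint *saved)
{
    MIDIObjectRef object = 0;
    MIDIObjectType type;
    if (MIDIObjectFindByUniqueID(saved->uniqueID, &object, &type) == noErr &&
        (type == kMIDIObjectType_Source || type == kMIDIObjectType_Destination)) {
        return (MIDIEndpointRef)object;
    }
    // Not found: the device was probably deleted. Put saved->name in the
    // "missing device" UI and let the user pick a replacement.
    return 0;
}
```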
In any case, you can show him the name, and he can use that to decide what he's going to do about it, choose a new device or say, "Oh, I actually don't need that sound anyways." Okay, continuing through the MIDI API, we have the MIDI client object, whose main purpose, aside from just bootstrapping yourself into the system, is to receive notifications of changes to the state of the MIDI system.
You can have more than one per process if you have a modular application. And that's just to answer a frequently asked question. And for Jaguar, we've changed the way the notifications work. In 10.1, we just had a single notification which said something in the world's changed, and you had to go figure out what it was that changed.
You pretty much had to interrogate the whole state. And in Jaguar, we have these new fine-grained notifications that say an object was added, an object was removed, or an object's property changed. Now, for backwards compatibility, we still have to send the world change notification, but if you're handling these new ones, you can ignore that.
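A sketch of a notify proc that handles the fine-grained messages and ignores the legacy catch-all (illustrative, with the constant names assumed from the Jaguar headers):

```c
#include <CoreMIDI/CoreMIDI.h>

static void MyNotifyProc(const MIDINotification *message, void *refCon)
{
    switch (message->messageID) {
        case kMIDIMsgObjectAdded:
        case kMIDIMsgObjectRemoved:
            // A device, entity, or endpoint appeared or went away;
            // update just that part of your mirrored state.
            break;
        case kMIDIMsgPropertyChanged:
            // e.g. a name or connection changed
            break;
        case kMIDIMsgSetupChanged:
            // Legacy "something changed" message; safe to ignore if the
            // cases above are handled.
            break;
        default:
            break;
    }
}

static MIDIClientRef CreateClient(void)
{
    MIDIClientRef client = 0;
    // Notifications come back on the run loop this is first called from,
    // typically the main thread, so it's safe to draw in response.
    MIDIClientCreate(CFSTR("My MIDI App"), MyNotifyProc, NULL, &client);
    return client;
}
```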
The next object in the API is the MIDI port object. Just to clear up a common source of confusion here, the MIDI port that you see in the API isn't a five-pin MIDI connector. It's a communication channel to the server like a Mach port or inter-process communication port.
People seem to get confused about this at first every now and then. You only need multiple ports in your application if you are highly modular or if you need MIDI merging of your output. Unlike a five-pin MIDI connector, one of these communication ports can communicate with all of the sources and -- well, one output port can send to all of the destinations in the system, and one input port can receive from all of the sources in the system.
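A sketch of how little port plumbing that implies (illustrative): one input port connected to every source, one output port for all sends.

```c
#include <CoreMIDI/CoreMIDI.h>

static void MyReadProc(const MIDIPacketList *packets, void *readRefCon, void *srcRefCon)
{
    // All read procs, across all of your input ports, get serviced by one
    // high-priority thread; keep this quick and hand the data off.
}

static void SetUpPorts(MIDIClientRef client)
{
    MIDIPortRef inPort = 0, outPort = 0;
    MIDIInputPortCreate(client, CFSTR("input"), MyReadProc, NULL, &inPort);
    MIDIOutputPortCreate(client, CFSTR("output"), &outPort);

    // Connect the one input port to every source in the system.
    ItemCount n = MIDIGetNumberOfSources();
    for (ItemCount i = 0; i < n; ++i)
        MIDIPortConnectSource(inPort, MIDIGetSource(i), NULL);

    // Sending goes through the one output port to whichever destination:
    //   MIDISend(outPort, MIDIGetDestination(0), somePacketList);
}
```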
So another new API that we have in Jaguar is the MIDI thru connection. This allows you to do MIDI thru in the server process, where it can be a lot more efficient than doing an inter-process communication message from the server to your client and then immediately back to the server just to do MIDI thru.
Not only is it convenient, but it's more efficient. And we also provide some fairly extensive MIDI filtering and mapping functionality here. The goal here was to let you do things like change the channels, remove event types, and remap controls, and a few more things like that. I tried to make it as powerful as I could while keeping it able to be described in one data structure.
Earlier I mentioned our new properties for describing a device's patch names as well as its note and control names. We're working on a draft of a spec for this file format, which we're calling MIDI Name Documents. The whole goal here is to integrate into the system what OMS and FreeMIDI had 10 years ago, a set of name services, so users can see things in terms of their names instead of saying, "Program change 42," or "Control 7," for instance.
And this is just a brief little snippet of XML to give you a flavor for what it looks like if you've never seen XML before. It's not trivial to parse XML, but there are a lot of free open source XML parsers out there. Once we have finalized this format, we will be releasing a parser as part of our example code.
Okay, so there's the review of what's new in Core MIDI and what's old. I'd like to go to machine five now. Is it awake? Yeah. And just give you a quick look at our Audio MIDI Setup application. You saw some screenshots from this earlier. This loopback device is a driver which I use for debugging, and I don't need it. And here's a Roland UM-880, or Edirol, that is. We saw that earlier. That's my MIDI interface. And I can define an external device. And I can double-click on it and give it a name.
This is going to get a bit more elaborate with more of the properties I talked about earlier. And then having created this external device, I can show how it's connected to my MIDI interface. It doesn't have to be symmetrical. I can say the MIDI out from port one goes into my synth, and the output of my synth goes into port four.
So that's actually all there is to it. Can I go back to the slides, please? So in a moment, Bill Stewart's going to talk about some larger scheduling and threading issues related to audio and MIDI and how they fit into the rest of the system. But as a lead-in to that part of this session, I'd just like to touch on a few issues relating to the MIDI system and threading.
The Core MIDI framework, which your applications use, is completely thread-safe, meaning that you can call any function from any thread. The Core MIDI server framework, which only driver writers really need to know about, is not thread-safe. Your I/O calls happen on really high-priority I/O threads, but all your other calls have to happen on the main thread.
On the client side, when you create a MIDI input port, your read proc for that input port, no matter how many input ports you create, all of those read procs get serviced by a single high-priority thread.
[Transcript missing]
All of the notification callbacks that your client gets get called from the run loop or thread on which you first called core MIDI, which would typically be your application's main thread.
This is good to know because you can draw from that thread in response to the notifications. You don't have to worry about synchronization issues if you're accessing global data structures where you're mirroring what Core MIDI is telling you is there. Okay, so that's the end of this section of the talk about Core MIDI, and I'd like to bring up Bill Stewart to talk about audio, MIDI, and thread priorities.
[Transcript missing]
So, first off, let's start sort of from the top and work down. Basically, audio devices themselves, and if we're talking about USB, MIDI devices, etc., they're using interrupts to deal with the actual hardware interfacing.
And so, for the HAL part of this, we actually expect that the audio drivers themselves supply very accurate timing information so that the audio system can correctly schedule your I/O. It's probably the single most important thing that a driver needs to do, aside from its actual handling of I/O: supply this timing information.
The process of getting the data to and from the drivers is actually done in the time-constrained threads, including any transformation from a floating-point format to the driver's native format, which we call mix and clip. And so that actually runs in the context where your application is doing the work that it's doing for audio.
And this sort of brings us to time-constrained threads. Time-constrained threads are fixed-priority threads. They'll be seen by the system as a priority of 96. By fixed priority, I mean that the priority won't degrade over time, and the system will continue to treat this as a very high-priority thread. When you create a time-constrained thread, you give the system information about some of the time conditions that you expect to be met by the scheduler when it's running that thread. And one of the most important of these is the timeout period that the scheduler will use before it looks to see if there's another thread to run.
The Core Audio I/O proc thread is a time-constrained thread, as you probably know. The I/O threads of the MIDI server are also time-constrained threads. This is done in order to meet the very tight and very real deadlines that I/O has. Particularly in the case of MIDI, you're talking about not introducing any jitter into the transport of MIDI data. In the case of audio, of course, you don't want to get glitches or missed data.
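For context, a thread asks for that policy with thread_policy_set and the time-constraint flavor; here's a sketch with made-up numbers (real code converts nanoseconds to Mach absolute time units with mach_timebase_info first):

```c
#include <mach/mach.h>
#include <mach/thread_policy.h>
#include <mach/mach_time.h>

static kern_return_t MakeThreadTimeConstrained(void)
{
    thread_time_constraint_policy_data_t policy;
    policy.period      = 1000000;   // how often we expect to run
    policy.computation = 100000;    // how much of that period we need the CPU
    policy.constraint  = 500000;    // the deadline by which the work must finish
    policy.preemptible = TRUE;

    return thread_policy_set(mach_thread_self(),
                             THREAD_TIME_CONSTRAINT_POLICY,
                             (thread_policy_t)&policy,
                             THREAD_TIME_CONSTRAINT_POLICY_COUNT);
}
```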
Kernel tasks, if we're sort of going down now: the kernel does all of its work at fixed priority as well, and that's a fixed priority of 80. And we're still not at your application level at this point. So there is going to be work being done in the kernel, and one of the biggest jobs that we have internally, in talking to the kernel team, is to make sure that they're not going to be introducing problems by doing more work in the kernel. And so we spend some time with them, and they spend a lot of time just going through the system and making sure this works as we would like it to.
If we get to the application level, you have both fixed priority threads available to you and the timeshare threads. Fixed priority threads in Jaguar have got some very important changes in the way that they behave. Basically, they're given some special privileges. These special privileges are to preempt some of the activities that are going on in the kernel.
Otherwise, you tend to get, and I think we've seen this in some cases, a problem of priority inversion. You may have made a call from, say, a priority 33 thread, and that call ends up having to transition to the kernel to do some work.
Then suddenly a 52 thread that needs to do some work cannot do that work, because an 80 thread is running in response to a low priority request. The fixed priority threads are going to have the ability to preempt that work and get to actually do their own work.
That kernel work will then get done after the fixed priority thread is done. In a very meaningful way, in Jaguar, the fixed priority threads will tend to behave very much like the real-time time-constrained threads. They'll have some special abilities to run. The understanding is that they have tasks that really have some time-sensitive elements to them. The highest priority that's available to you in user space, both for fixed and for timeshare threads, is 63, except for the time-constrained threads, of course. Then you can go down from there.
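For context, here's a sketch of how a user-space thread opts out of timesharing and raises its importance with the Mach thread policy calls (the importance value is illustrative, not a recommendation):

```c
#include <mach/mach.h>
#include <mach/thread_policy.h>

static kern_return_t MakeThreadFixedPriority(integer_t importance)
{
    // The extended policy turns timesharing off, so the priority stops
    // degrading as the thread uses CPU.
    thread_extended_policy_data_t extended = { .timeshare = FALSE };
    kern_return_t kr = thread_policy_set(mach_thread_self(),
                                         THREAD_EXTENDED_POLICY,
                                         (thread_policy_t)&extended,
                                         THREAD_EXTENDED_POLICY_COUNT);
    if (kr != KERN_SUCCESS) return kr;

    // The precedence policy then nudges the priority up within the band.
    thread_precedence_policy_data_t precedence = { .importance = importance };
    return thread_policy_set(mach_thread_self(),
                             THREAD_PRECEDENCE_POLICY,
                             (thread_policy_t)&precedence,
                             THREAD_PRECEDENCE_POLICY_COUNT);
}
```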
The window server has been an issue of much debate in terms of where it does its work and what priorities it uses to do its work. In Jaguar, we've been talking to them and giving them feedback from some developers about the fact that some of the work the window server does can actually cause undesirable effects on a 10.1 system.
So there have been some changes to the window server, a lot of changes I'm sure have been covered in other sessions. The one that we most care about is that on a Jaguar system there's going to be very little work done on the 63 thread that the window server runs.
That 63 thread will basically just be an event thread, and then any work that needs to be done from that will be dispatched to a lower priority thread, and that thread is running at 51. You can see multiple 51 threads if you're running a dual processor machine; you see one 51 thread per CPU.
The 51 thread, which in the 10.1 system is being used to do compositing by the window server, will now in Jaguar also be used to field requests from applications. A good example of this is calling FindWindow.
FindWindow is a Carbon call, and on a Mac OS 9 system, that was just a lookup of something in a global record. In Carbon on a Mac OS X system, that ends up being a Mach message to the window server, and it was doing that work at the 63 priority thread, so you'd get this unexpected behavior where it would just take time away from you. And some of the work that the window server is asked to do can involve transitions to the kernel, and so the need for fixed priority threads to be able to preempt that.
Fixed priority threads are used in Carbon. There are three threads that relate to interrupts on a Mac OS 9 system: the Time Manager, the async file, and the deferred task threads, and those are the priorities. They're published as fixed priority threads on a Puma system, and that'll be the same on a Jaguar system.
MP threads, which are another Carbon API, are currently timeshare threads, and there will be some additional APIs in Jaguar to allow you to set the policy of those threads to being fixed priority. So just to recap how this all lines up if we go from top to bottom. We start off with the interrupts from the hardware, then we've got a scheduler running above that.
And then we have the time-constrained threads that are used by both the audio and the MIDI system. Kernel tasks come next. As for the MIDI client thread, we're considering taking that out of the real-time band so that it doesn't compete with the I/O, and taking it to a fixed priority of 63.
And then we look at the event thread of the window server at 63, fixed priority threads for Carbon, and then the Carbon MP threads will be user-definable, including their policy. The main thing, I guess I should have said, is that in a timeshare policy, and I'll show you this in a minute, the priority will degrade, and we'll see why fixed priority becomes a very important thing for you to use. And then you've got, down at the bottom, your main thread, which will typically be somewhere around the 30s. So if we can go to demo machine three.
Okay, so this is an app that we wrote because we were really needing to try and understand how threading was going to affect audio applications and how the system guys really needed to understand the kinds of activities that audio and MIDI apps are doing and the kinds of behaviors that we need from the system in order to get the guarantees that those apps need in order to do their work.
I'm going to launch this from the terminal. This app is called Million Monkeys and it's a reference to Shakespeare, I guess. The idea of this app is not to show you how to write a threaded application for doing audio. In fact, our main purpose of this was to create a very fragile situation so that we could look at the problems of threading, the problems of getting guarantees in order to run, and to try and understand what is causing those problems and get diagnosis from those problems and fix them.
This is actually on your Jaguar CD, so you can run this at home and in the office if you're not working at home. You can use this to actually help us to file bugs and help us to understand where there may be latencies being introduced into the system.
This uses a tool that is covered in some other talks, called latency. Latency lets you see what the latency of the scheduler is. In order to run latency, you need to execute the application as root. That's why I ran it from the command line.
This is probably the most exciting audio content you'll hear all week. It's a sine wave, and I thought I'd turn that off. What we're seeing here is that this feeder thread is being run at a priority of 63 and it's not fixed. All these purple lines mean that that thread is missing its deadline. I'm telling the thread that I want it to consume 80% of the CPU.
The crackling that you can hear in the sine wave is due to the fact that even though I think I've set this to the maximum priority I can, I'm not able to do my work at all and I'm overloading like crazy. If I make this a fixed priority thread, then I'm able to actually maintain my priority and I'm not going to degrade.
When it's not fixed priority, as you take time, and I'm taking a lot of time here, the system degrades that priority. You can see just by making this a fixed priority thread, I'm able to actually get the work done and utilize 80% of the CPU in my second thread.
and what I'm going to do here now is to run this again using an execution trace and I'll show you what that looks like. So I'm going to start like this here and then I'm going to put this up to a 63 thread. Then I'm stopping it and what I can do is click here and this is the trace that latency gives me.
And I can step through backwards and I can see, okay, so I overloaded at this point. Now, was this a thread scheduling problem? It doesn't look like it. And so I probably know at this point that it was because at that point I was running it as a timeshare thread.
And I can go through each of these; when you're doing the trace itself, each entry is a time that a thread was running. We basically look at the span from the time the I/O thread ran to the time that our feeder thread got to run, and we look at what was happening in the system between those two times. We're trying to understand what is holding off my second thread from running. So here's a longer trace, and you can see that this took actually 365 microseconds for it to run.
And I can go through here and I can see what was it that from this time, this is the point here when I was trying to get to run, to this point here when I actually got to run, what was going on in the system? Well, the window manager was doing something.
At this point, we would probably turn around to the kernel guys and say, "There is a kernel task going on here." We'd turn around to the scheduler guys and we'd say, "There really is something going on here that we need to understand," and we'd give them the tool, and away we go from there. We'd really encourage you to use this tool to help us to track down these bugs because this basically will help all of the applications work very well. If we can go back to the slides.
So just to conclude before we do Q&A, there's a whole bunch of new APIs, so look for them in Jaguar. We've had some apps that are shipping already using Core Audio and MIDI. We're seeing some device drivers coming out for both audio and MIDI devices, and we're very enthusiastic about that and very pleased to see that progress, and we hope, particularly with the changes in Jaguar, that we'll see a lot more of this.
And once again, I'd like to reiterate a point that you've probably heard from a lot of Doug's comments, but in particular earlier, and Jeff's, is that we really do listen to your feedback. We really would like to encourage you to continue to tell us things that you're missing, things that you like about the system, things you don't like about the system, and we do try to respond and make this as good a system for you to use as we possibly can.
Here's some of the roadmap. We had a session yesterday where we were doing 3D mixing and stuff. In the session after this one, we'll be talking about writing audio units and some of the audio codecs. There's a short discussion of AAC, which is implemented using the new audio codec component. On Friday afternoon, in this room, we have a feedback forum. If you've got any more complex questions or comments or discussions, that might be a good forum to come to and bring them up.
And Craig will be able to field general comments and so forth. There's his email contact. We have an API list, which is public, at list.apple.com. We have SDKs. All the code that we're doing this week we will endeavor to get out to the developers for you guys to download next week. Just keep an eye on that website; we'll put a link up for that then. And now, let's do Q&A.