Graphics • 1:04:35
This session provides both device and application developers with an overview of Mac OS X audio and MIDI device support. Emphasis is placed on audio transmission over high-speed serial interfaces such as USB and FireWire, along with techniques for extending the functionality provided by the Apple drivers for standardized interfaces.
Speakers: Doug Wyatt, Jeff Moore, Nick Thompson, James Lewis, Yoram Solomon
Unlisted on Apple Developer site
Transcript
This transcript was generated using Whisper; it has known transcription errors. We are working on an improved version.
Good afternoon everyone. Welcome to session 205, Audio Hardware and MIDI. Please welcome Core Audio MIDI Plumber, Doug Wyatt. Thank you. Good afternoon. This session is in three parts. In the first part, I'll be going over a few things about the Core MIDI framework. In the second part, Jeff Moore will cover what's new in Tiger for the Core Audio framework. And in the last portion of the session, Nick Thompson will be covering some audio hardware and driver issues.
So for the Core MIDI portion of this talk, I'll be covering very briefly some basics of Core MIDI. There are some best-practices issues I'd like to cover. And there's a new API for Tiger called the Core Audio Clock. So most of you who are working with MIDI already know what it's about, but for those of you who don't, a good resource is the MIDI Manufacturers Association at mma.org on the net.
And there are some good books out there about MIDI. Our APIs are documented in the headers, and there's some documentation and examples in the developer directory. We've got a very active mailing list. And this is probably about the fifth WWDC at which I've talked about Core MIDI, and there are DVDs of the previous years' sessions.
So in the area of best practices with the existing Core MIDI APIs, there are just two things I want to talk about. One falls in the area of user experience, and the other is a performance issue with timing accuracy. So the first thing that I've seen in some applications is difficulty in giving the user a good experience with seeing the names of their devices. And here we've got a moderate-sized studio with a bunch of devices, as seen in the Audio MIDI Setup application. And there are devices connected to about five of the eight ports there.
And what I've seen is that in some applications, they'll just show me the name of the MIDI port where the user would really like to see the name of the external device. I realized after I made this slide that this is sort of a confusing example, because there are actually two different devices on port 3. So for output, I'd want to see the Repeater, and for input, I'd want to see the Radium: the names of the external devices.
So I just want to quickly show you a little application that illustrates a couple of approaches to displaying MIDI endpoints in your application. So here, in the case of multiple devices on one port (and actually this setup that I have here doesn't have any examples of that), we see the names of the ports where there aren't external devices, and then we see the names of the external devices where there are some.
These menus don't have the names of the external ports in them. This may be a little draconian; it forces your users to go through Audio MIDI Setup. Some applications do that, and it's okay. And this also illustrates that you can go through and obtain pairs of ports. So here we're only seeing the devices to which I have a two-way connection. Notice here that I have the Radium as a source and the Repeater as a destination, but it doesn't appear here. Wow, that's probably a bug. So the idea is to only show here the devices to which there is a two-way connection.
So what this quick and dirty program does (it's using some sample code from our SDK), just to go over the process: you can iterate through the sources and destinations in the system with MIDIGetSource and MIDIGetDestination. Once you've found a source or destination endpoint, you can find out what's connected to it with the connection unique ID property. Then you can go find the object that is connected with MIDIObjectFindByUniqueID, and then you can ask that for its name.
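To make that call sequence concrete, here's a minimal sketch in C. It assumes at most one connected external device per endpoint (kMIDIPropertyConnectionUniqueID can actually hold several IDs, which the SDK code handles for you) and falls back to the port's own name when nothing is connected.

#include <CoreMIDI/CoreMIDI.h>
#include <CoreFoundation/CoreFoundation.h>

// Returns a display name for the destination at `index`: the connected external
// device's name if there is one, otherwise the endpoint's own name.
// The caller releases the returned CFStringRef.
static CFStringRef CopyDisplayNameForDestination(ItemCount index)
{
    MIDIEndpointRef endpoint = MIDIGetDestination(index);
    CFStringRef name = NULL;

    // What external device, if any, is connected to this port?
    SInt32 connectedID = 0;   // assumption: at most one connection
    if (MIDIObjectGetIntegerProperty(endpoint, kMIDIPropertyConnectionUniqueID,
                                     &connectedID) == noErr && connectedID != 0) {
        MIDIObjectRef connectedObject;
        MIDIObjectType objectType;
        if (MIDIObjectFindByUniqueID(connectedID, &connectedObject, &objectType) == noErr)
            MIDIObjectGetStringProperty(connectedObject, kMIDIPropertyName, &name);
    }

    // No external device: fall back to the port's own name.
    if (name == NULL)
        MIDIObjectGetStringProperty(endpoint, kMIDIPropertyName, &name);

    return name;
}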
But it's probably best if you use the C++ class in our SDK, CAMIDIEndpoints. And if there are bugs in it, like in my demo (I don't know if that's in my demo or in the SDK code), in any case, that's a really good place to start. It'll show you the sequence of calls. There are a few strange cases to deal with. And it'll give your users the names that they expect to see.
So the other best practices issue I'd like to cover is MIDI timestamps. I've noticed that there are some applications that send all their outgoing MIDI with a timestamp of now, which means that by definition, it's going to be late by the time it gets to the hardware. Not very late, necessarily.
But there are some good reasons not to do that. Well, one I've got mentioned here on the slide is that there's an Internet Engineering Task Force proposal in the works for doing MIDI over IP. And with networked MIDI, the timestamps are going to become really important, because we're going to see more jitter and latency than we would on a normal local MIDI network.
The other is that we're starting to see applications or contexts where people want to use multiple applications on the same computer and synchronize them together. And you can send MIDI timecode or MIDI beat clock very efficiently between applications using the IAC driver, which was introduced in Panther. And if those events are timestamped, then the applications can achieve really good synchronization between them. If they're not, then application A may not be able to synchronize accurately.
It may be sending with a timestamp of now, that is, with no real timestamp. And there's going to be a little bit of propagation delay until the time it gets to application B, maybe only a couple hundred microseconds or something. But that's not going to provide totally accurate sync. So there's no reason not to be using timestamps when you schedule and paying attention to timestamps when you record.
There are a few applications that might want to do MIDI thru in real time and say, "Well, I just need to send everything as soon as I get it, and that's okay." I would just suggest that you measure your performance, and if you see that you're getting more jitter than you like, you can add a little bit of latency. Say, "Okay, play this two milliseconds from when it came in," for example, and that should smooth out most of the jitter that you'll see.
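As a hedged sketch of that idea, here's what scheduling a couple of milliseconds ahead, instead of stamping everything "now", might look like. It assumes you've already created an output port and picked a destination elsewhere (with MIDIOutputPortCreate and MIDIGetDestination).

#include <CoreMIDI/CoreMIDI.h>
#include <CoreAudio/HostTime.h>

// Send a note-on scheduled two milliseconds from now rather than immediately.
static void SendNoteOnWithLatency(MIDIPortRef outPort, MIDIEndpointRef dest)
{
    const Byte noteOn[3] = { 0x90, 60, 100 };   // note on, middle C, velocity 100

    // "Now" plus 2 ms, expressed in host time, is the event's timestamp.
    MIDITimeStamp when = AudioGetCurrentHostTime()
                       + AudioConvertNanosToHostTime(2 * 1000 * 1000);

    Byte buffer[128];
    MIDIPacketList *packetList = (MIDIPacketList *)buffer;
    MIDIPacket *packet = MIDIPacketListInit(packetList);
    packet = MIDIPacketListAdd(packetList, sizeof(buffer), packet,
                               when, sizeof(noteOn), noteOn);
    if (packet != NULL)
        MIDISend(outPort, dest, packetList);
}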
One major area of new features in Tiger for us is this API set called Core Audio Clock. It's actually in the audio toolbox framework, but it touches on core MIDI in a lot of ways. So it's being discussed in this session. So if your app has any kinds of synchronization needs, especially involving MIDI timecode and MIDI beat clock, whether it's coming from an external source or you want to sync to another application, this API will provide a lot of the grungy code for dealing with those MIDI timing formats.
It also does some other time conversions and interpolations between various formats, as we'll see in a minute. And if your application already uses the music player APIs in the Audio Toolbox, this hasn't happened yet, but there will be a clock object embedded in the music player. So if you're using the music player, you'll get the ability to send and receive MIDI timecode for free, modulo whatever user interface you need to put on top of it.
And so, yeah, what the clock does: it manages synchronizing your application between audio and MIDI timecode or MIDI clock. It's an extensible internal architecture, and at some point we may add other synchronization sources. And it does all the grungy math of dealing with SMPTE time formats, and of synchronizing to an audio device's time base and samples via the HAL. It deals with seconds along your application's media timeline, and if your application has a concept of musical beat time, it'll convert between seconds and beats and those other formats.
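Here's a rough sketch of the basic call sequence, assuming the Tiger CoreAudioClock.h header in the Audio Toolbox. The constant and struct spellings below are from memory, so treat them as approximations and check the header.

#include <AudioToolbox/CoreAudioClock.h>

// Rough sketch: create a clock, vary its play rate, read its position in two formats.
static void RunClockBriefly(void)
{
    CAClockRef clock = NULL;
    if (CAClockNew(0, &clock) != noErr)
        return;

    CAClockSetPlayRate(clock, 2.0);   // play twice as fast, as in the demo
    CAClockStart(clock);              // time begins moving from the start time

    // Read the current position along the media timeline as seconds and as beats.
    CAClockTime seconds, beats;
    CAClockGetCurrentTime(clock, kCAClockTimeFormat_Seconds, &seconds);
    CAClockGetCurrentTime(clock, kCAClockTimeFormat_Beats, &beats);

    CAClockStop(clock);
    CAClockDispose(clock);
}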
So this just illustrates kind of what's going on under the hood in the clock. In the top blue line, we have your application's media timeline, so zero being the beginning of time, for example. And in the green boxes, we have the hardware reference timeline. And so what the clock is doing is maintaining a series of correlation points between those two timelines.
There's this idea of a start time. If your application is starting itself internally, then the user might have said start 40 seconds into the piece. And so you set the start time, and that's where playback begins. If you are in external sync mode, the start time is the timestamp, for example, of the first MIDI time code message that was received. So it's the point at which sync was achieved and time begins moving. And then as time continues to move, the clock object continues to take new anchor points referencing media and hardware times, and then performs all subsequent time correlations and conversions using those anchor points.
So here we see diagrammatically how the different time formats relate to each other. On the left in green, we have the hardware time references: the host time base, which everything's based on in the Core Audio APIs and Core MIDI, and the audio device's sample time.
The HAL does the correlation between the host and audio times. On the right, we have different ways of expressing time along the timeline, the blue boxes. So seconds is the main way of describing these times. It's a floating point number. And from seconds we can convert to beats if you supply a tempo map and we can apply a SMPTE offset to get to a SMPTE time in seconds.
Excuse me. The gray boxes on the right just illustrate that there are some auxiliary APIs in the clock for converting between beats and a textual display representation of beats, and similarly between SMPTE seconds and SMPTE time. Since SMPTE seconds can be floating point, we're actually seeing an example here of a SMPTE time that goes out to 80 bits per frame, or however many bits you want, actually.
And the circle in the middle indicates how the clock is correlating between the hardware and timeline times using variable play rate that you can set. You can say, for example, play twice as fast. Okay, so with those concepts, I'd like to just give you a quick look at an application that uses the Core Audio Clock.
OK, so this document shows pretty much everything that's in a clock object, except what's below this line here is an audio file player, and I'll show you that in a minute. And so if I click go, then time starts moving. I've got a SMPTE time over here on the left and a bar beat time on the right. If I change the tempo, you can see suddenly we're at a different bar beat time and it's moving twice as quickly.
And I can create a second clock object. And I'll have this one send MIDI timecode to the IAC bus. I'll have this one receive MIDI timecode from the IAC bus. And I'll play them, and they're in sync. Let's have a little more fun here and we'll have this one play an audio file.
So I can varispeed this clock, and this one's following along. And so I got all that running and I thought, that's pretty cool. So what happens if I have two of these guys playing audio files together? So now these are both playing the same file. They're synced together with MIDI timecode getting sent over the IAC bus. And how close together do these clocks really stay? Let's turn down the volume here. And I'll go to my favorite portion of this song.
That's a little confusing because there is a very phased sound there. I don't know if you can hear the hi-hat; it's over to the left. In any case, according to my math here, these two players, despite being yanked around at various speeds, are within one or two frames of each other. So that's the Core Audio Clock. Okay, so next Jeff Moore is going to talk about new features in the Core Audio HAL.
So today I'm going to talk, as Doug said, about a couple of new features in the Core Audio HAL. The first new feature that you'll see is there are new I/O data formats supported by the HAL, including variable bitrate I/O, and also non-audio data like sideband data such as timecode and control data and other things that aren't actual audio samples. And the other new feature I'm going to talk a little bit about today is the aggregate device support.
So currently, when you're doing I/O with the HAL, in your I/O proc, you're always going to be moving the same number of bytes. Either you're going to be getting x number of bytes of input, or you're going to be providing y number of bytes of output. And further, the mixability of the streams is always controlled at the device level, meaning that if you want to switch to a non-mixable format, you have to tell the device to switch to a non-mixable format, and all the streams are switched that way.
And currently we support linear PCM data and IEC 60958 compliant streams, and that's SPDIF for the alphabet impaired. That includes data formats like AC3, MPEG-1, MPEG-2, and other things that can be smashed into a format that can be sent over that digital interface. So now in Tiger, we're adding the ability to move a varying number of bytes through your IOProc when it's called.
This is important for formats such as raw AC3, where the number of bytes per packet varies for each packet. And you're going to be told that through the audio buffer list, in the mDataByteSize field. And you'll need to make sure you're always paying attention to that field and aren't just assuming that it's constant anymore.
And on output, you have to make sure that you tell the HAL how much data you're supplying. And you can see in this code example a very simple IOProc that is doing exactly that. It is iterating through all the output audio buffers in the provided audio buffer list, stuffing some VBR data into each one, and then telling the HAL how much data it stuffed in there by assigning back to the mDataByteSize field of the audio buffer.
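Since the slide's code isn't captured in the transcript, here's a hedged reconstruction of that kind of IOProc. GetNextEncodedPacket is a hypothetical stand-in for wherever your application gets its encoded packets from.

#include <CoreAudio/CoreAudio.h>

// Hypothetical packet source: fills dst with up to maxBytes and returns the actual size.
extern UInt32 GetNextEncodedPacket(void *clientData, void *dst, UInt32 maxBytes);

static OSStatus MyVBROutputProc(AudioDeviceID          inDevice,
                                const AudioTimeStamp  *inNow,
                                const AudioBufferList *inInputData,
                                const AudioTimeStamp  *inInputTime,
                                AudioBufferList       *outOutputData,
                                const AudioTimeStamp  *inOutputTime,
                                void                  *inClientData)
{
    for (UInt32 i = 0; i < outOutputData->mNumberBuffers; ++i) {
        AudioBuffer *buffer = &outOutputData->mBuffers[i];

        // Fill the buffer with this cycle's encoded data...
        UInt32 bytesWritten = GetNextEncodedPacket(inClientData,
                                                   buffer->mData,
                                                   buffer->mDataByteSize);

        // ...and tell the HAL how much we actually supplied.
        buffer->mDataByteSize = bytesWritten;
    }
    return noErr;
}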
So now, with the new I/O data formats, these basically revolve around other non-mixable formats. And with these, you have to be able to have a mixable data stream side by side with a non-mixable stream. Consequently, you're going to have to be dealing with mixability now at the stream level as opposed to the device level. And you can find out about the mixability in two ways: either by the mixability property, or you can use the format information that's provided by the HAL. In particular, kAudioFormatFlagIsNonMixable will be set in the mFormatFlags field for non-mixable formats now.
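For example, a minimal check of a stream's mixability via its format might look like the sketch below; it uses the long-standing kAudioDevicePropertyStreamFormat selector on the stream, though the stream's own mixability property would work as well.

#include <CoreAudio/CoreAudio.h>

// Rough sketch: is this stream currently in a non-mixable format?
static Boolean StreamIsNonMixable(AudioStreamID stream)
{
    AudioStreamBasicDescription format;
    UInt32 size = sizeof(format);

    // Ask the stream for its current format (channel 0 is the master).
    if (AudioStreamGetProperty(stream, 0, kAudioDevicePropertyStreamFormat,
                               &size, &format) != noErr)
        return false;

    // In Tiger, the HAL sets this flag on non-mixable formats.
    return (format.mFormatFlags & kAudioFormatFlagIsNonMixable) != 0;
}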
Variable bitrate and encoded formats are also now going to be supported for input as well as output. Previously, they were supported for output only. And this is going to include, as I said, raw AC3 data as well as raw MPEG-1 and 2 and any other data that you have. And this can also be used to transport non-audio sideband data: for example, timecode such as SMPTE coming into the hardware, or word clock time, or other forms of synchronization. And it's also good for real-time controller information for devices that support that.
So before I talk a little bit more about aggregate devices, I want to talk a little bit about the problem that aggregate devices are there to solve. Basically, when you're syncing multiple devices, you have two problems to deal with. You have the different interpretation each device has of what the sample rate really is; that's also known as the clock drift problem. And then each device has its own amount of presentation latency. And you have to solve both of these problems if you want to do I/O synchronized across multiple devices.
So to solve these problems, you can use hardware clock synchronization, and that's where you run an actual physical cable between all the devices and share a clock signal among them. And this can be done using digital audio interfaces like AES or SPDIF or ADAT interfaces. And you can also use things like house sync, black burst, and other more high-end, video-oriented studio sync situations.
Hardware Sync provides the best solution for the clock drift problem since it's actually synchronizing the hardware at the DAC level so that you know the samples are going to be within some very small amount of time of each other. But hardware clock synchronization doesn't solve the latency problem at all.
So in addition to hardware, you can also do the resynchronization in software. Now, doing it in software is an order of magnitude more complicated than it is in hardware because your software needs to be able to judge how fast each device is running with respect to each other and then compensate accordingly. And you use various kinds of resampling techniques to do that. But you still need to compensate for the latency differences even when you're doing software sync.
So aggregate devices are the HAL's attempt to solve all these problems in a way that makes it useful for your application. It gathers together any number of disparate audio devices on the system into a single cohesive unit for audio I/O. And it will perform synchronized I/O to all those sub-devices regardless of what their sync situation is.
They can be hardware synchronized, they can be software synchronized, and the HAL will still be able to deal with that. And to do this, of course, it solves the problems I was just talking about of the different amounts of presentation latency, and it does the clock drift compensation.
So the user can create aggregate devices that are global to the entire system in Audio MIDI Setup. And I'll show you in a few minutes how that works. Applications can also create aggregate devices that are either global to the system or local just to that process, and that's done programmatically through API calls in the HAL. And then the HAL will also, on its own, create a global aggregate device for each IOAudioDevice that has multiple IOAudioEngines in it. For example, USB audio devices that have both input and output will now appear as a single unified whole.
So you can only aggregate devices that are implemented by IOAudio drivers. Some of you may have heard me talk about the need to write an IOAudio driver rather than a user-land driver; this is one of the benefits you get by doing that. You get to play for free in the aggregate device world. Further, all sub-devices in an aggregate device have to be at the same sample rate. And of course, the sub-devices can't be hogged by another process, and all their streams have to be mixable.
So when you're looking at an aggregate device and its sub-devices, the ordering of the sub-devices that you set up, either programmatically or through the AMS UI, is important. It determines the ordering of the streams in the aggregate device that you see in your IOProc. So for instance, if you have two devices, device A and device B, in that order, device A's streams will come before device B's when you look at them in your IOProc.
And further, aggregate devices will retain knowledge about the sub-devices that they aggregate, even if the devices aren't present or have been deactivated because of some format conflict. Missing devices, or devices that are in the wrong format, will automatically just come back into being in the aggregate device when their situation is updated.
Each aggregate device has a master sub-device. The master sub-device defines the overall timeline for the entire aggregate device. This means that all the timestamps you see from the HAL, when you call AudioDeviceTranslateTime or in your IOProc, are going to be defined in the timeline of the master device. Further, all the timing numbers that go with the aggregate device are the ones reported for the master device. For instance, the latency and safety offset figures are those of the master device.
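As a small hedged example of that, translating "now" into the aggregate's sample timeline (which, per the above, is really the master sub-device's timeline) might look something like this:

#include <CoreAudio/CoreAudio.h>
#include <CoreAudio/HostTime.h>

// Rough sketch: what is "now" in the aggregate device's sample timeline?
static Float64 CurrentSampleTime(AudioDeviceID aggregateDevice)
{
    AudioTimeStamp now = { 0 };
    now.mFlags    = kAudioTimeStampHostTimeValid;
    now.mHostTime = AudioGetCurrentHostTime();

    AudioTimeStamp translated = { 0 };
    translated.mFlags = kAudioTimeStampSampleTimeValid;   // ask for sample time back

    if (AudioDeviceTranslateTime(aggregateDevice, &now, &translated) != noErr)
        return -1.0;

    return translated.mSampleTime;   // defined in the master sub-device's timeline
}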
The final job the master device serves is to provide the frame of reference that the HAL uses to judge clock drift in the other sub-devices in the device. Now, when you're picking the master sub-device, obviously if you have a hardware sync situation, you already have in hardware a notion of a master clock, and the device that corresponds to that clock should also be the master device in the aggregate.
Now, barring that, you should pick the device that has the most stable clock. You can kind of guess at that by what transport type the device uses. PCI and FireWire devices, for instance, tend to have much more stable clocks than USB devices.
So now, to deal with the differing amounts of latency in the various sub-devices, the HAL has to go through and look at all the sub-devices and figure out the maximum amount of latency across them. And once it finds out which sub-device has the most latency, it will then pad out the latency of the other devices, by padding their safety offsets, so that they all match.
And here you can see a little diagram showing three audio devices. Device C has the largest combination of latency and safety offset, and you can see how device A and device B are getting padded out so that they all come out equal. Now, it's really important to do this padding because that's what ensures that all the devices start in synchronization with each other. Without that, you'd be skewed all over the place and you'd never be able to achieve sync.
So once you've dealt with the latency, you also have to deal with the clock drift. Now, aggregate devices in the HAL will work regardless of what the clock domain situation is for each device. So whether it's hardware synchronized or whether it needs to be synchronized in software, the HAL's game for doing it.
Now, for each sub-device you have in an aggregate, you can set independently what kind of clock drift compensation to use. Now, there are going to be three basic versions of it in Tiger. First, there's no compensation, and that's the method you're going to use for hardware sync situations because, well, you don't have to do anything.
It's already in sync. And then there's going to be a very low CPU overhead sample dropping and doubling algorithm, which is there to account for that one sample of drift over five hours of time that you're going to see. And that's going to take very little CPU, but with the potential, if it has to run often to make the synchronization happen, of introducing some audio artifacts.
And then the HAL will also use the same high-quality resampling algorithms that are in the audio converter to do the full-bandwidth, quality-really-matters style of resynchronization. And just so you guys know, the software synchronization is not in the seed you have today. That will be coming, we hope, sometime in the future.
So now you know what aggregate devices do. Now, there are a few things that they don't do. They don't provide controls: they don't do volume, mute, data source selection, and all the other sort of little doodads that you get on a regular audio device. And the reason for this is simple: aggregate devices are there to be an I/O abstraction. They're not meant to be kind of the system console, if you will. You should always go back to the original device to do the manipulations of volume, stream format, and other things like that.
Now, aggregate devices also can't be hogged. That kind of plays into the fact that they can't be non-mixable either. And finally, an aggregate device cannot be set as the default device. So if you're going to use aggregate devices, you have to provide the means for your users to select them for your engine. Mostly that's to shield applications that don't want the performance impact of running on an aggregate device from just inadvertently seeing one. So now I'm going to show you a little bit about how aggregate devices work.
So in Tiger, in AMS, there's a brand new dialog that allows you to configure the various aggregate devices. Now, I'm here on a 15-inch PowerBook, and I've also brought with me a bunch of other kinds of devices just to show you kind of how it works. The first device I have here, let me show you this one first, is an Edirol UA-3 USB interface. It has input and output, and it's stereo. Now, the reason this one's interesting is it will show you what happens when the HAL creates an automatic aggregate device, since this device has one IOAudioDevice that has two IOAudioEngines.
So, you can see in AMS, by looking at the icon, what kind of device it is. You can see the UA-3, the normal UA-3, is here, and you can see all its controls and stuff. And now we go to the aggregate UA-3.
Now, AMS kind of shields you from some of the implementation details, but, you know, those of you that have been fighting with this for a while know that when you look at the UA-3 in AMS, you're really looking at two independent audio devices. And so you have to run two separate IOProcs in order to do I/O with that, to just do pass-through I/O, for instance. Now, with the aggregate UA-3 USB device, you can run one IOProc and do pass-through and in-place processing and all the things that would have required more complex management before. Now, I've also brought an Echo Indigo PC card.
Just to kind of show you something a little more exotic. It's a little stereo two-channel device, and it pops up, and it's all good. So now I'm going to make an aggregate device out of it.
So you open the aggregate device editor, and you start out with no aggregate devices that are user-made. The automatically generated aggregate devices don't appear in this dialogue because there's nothing you can really configure about them. So you click on the plus button to make a new one. You can give it an interesting name. Whoops.
And then you go in and you can see down below all the devices that you have that you can add to the aggregate. Now I'm just going to click on all of them. And then once you have them clicked, you can move them around. Because the order is important, so you can drag them around.
And then you can use the clock radio button to select which is going to be the master device. And that's pretty much it. And you can see now that we've created this aggregate device, it now shows up in the pop-up for-- well, it did show-- this is it. The name isn't updated. That's a bug, too.
So, but there you can see it. You can see now this aggregate device has, you know, four input channels and four output channels. And when you do I/O with that, you're going to be doing synchronized I/O across all the devices that were in the aggregate. And that's pretty much all there is to it. Next up, I'm going to be bringing up Nick Thompson, and he's going to talk about driver development in OS X.
So I want to talk about developing audio hardware devices for Mac OS X. I want to look at the kinds of devices you might do for built-in hardware with a standard Macintosh, and I want to look at how you can expand on the built-in capabilities that you find in every Macintosh computer by using high-speed serial interfaces. You'll have noticed that music and audio have become very important to Apple. And what we're talking about in this part of the session is how to get audio in and out of your computer.
So the key thing here when we're thinking about this is when you develop a product, you want to hit as many people as you can. So USB and FireWire, if you're developing an expansion product, are in every Macintosh computer that we ship today. There's also good opportunities for using the built-in analog connections, both for input and output for audio peripherals.
When you look in Darwin at the driver stack, you'll see a bunch of things. The two things you really want to look at are Apple Onboard Audio and Apple USB Audio. Apple USB Audio gives you kind of a basic template for how to write a USB driver. The onboard audio is a little more complex because there's a whole bunch of stuff that we have to do, so when you're looking at the source code, you'll actually see a whole bunch of plug-ins. So I recommend, if you want a template audio driver, that USB, or the PhantomAudioDriver example in the Core Audio SDK, is the place to start.
So the technologies available are, basically, you can use the stuff that's built in, and that's usually analog, but we also supply digital output on Power Mac G5s. But basically you're limited to two channels. If you want to go multi-channel, you're going to need some kind of audio peripheral.
And if you look at the list of things that you can do here, PC cards, PCI, I mean, they're good solutions, but you're only going to hit a subset of the market. And if you're going to develop for the platform, it's a good idea to hit the most people that you can. So we really recommend USB and FireWire for the development of audio peripherals.
So we kind of think about audio devices in this kind of continuum, for want of a better word, of audio devices. And at the consumer level, we kind of see built-in audio. It's cheap to do an analog peripheral. You can do some really attractive things with that. USB is a good approach for hobbyists. You're a little limited in the number of channels that you get, but it's a good connect solution. And we look at FireWire as being a kind of prosumer, pro solution. So let's kind of dive in and take a look at what's available for built-in.
Basically, you've got a few kinds of things that you'd be looking at here: input devices, output devices, or little mixers. They'd be analog-connected. The codecs that we ship in Macintosh computers have great noise specs, and they actually measure better than a lot of USB devices out there. So this is a good way of connecting peripherals if you only need stereo.
The other thing that's important to think about is optical SPDIF, which is available on the Power Macintosh G5. You can develop peripherals that can do things like AC3 if you want to do multi-channel data by encoding the stream. You can also use this for getting audio in and out of your computer digitally at a very low cost. So consider SPDIF if you're looking at development of this type of peripheral. So kind of summing up for built-in, it's basically stereo, and there's also optical support. So there's some good opportunities there for peripheral development. Moving on kind of to USB.
The important thing to emphasize here is if you develop a device that conforms to the USB Implementers Forum specifications for audio devices, it's just going to work with Apple's driver. This is really important because you want to minimize your development costs when you're bringing a new device to market. So always consider trying to make a class-compliant device if that's possible for you.
We put the audio class driver in Darwin. It's open source. But what we've seen in the past is that developers will take a drop of the Darwin source code, start working with it. We go fix a bunch of bugs, improve the parser, add new features, and they're kind of stuck on this two-year-old source base. So if you can develop a class-compliant device, it's going to work with our driver. If it doesn't work with our driver, please let us know, and we'll fix our driver.
The other thing I wanted to call out here is the Audio Device 2.0 spec. The current spec is, I believe, the Audio Device 1.0 spec, and that's several years old now. There's a device working group working on a 2.0 spec, and we're tracking this work. We're studying it. If you're developing a USB 2.0 device, please talk to us because we really want to make sure it works with our driver set.
The class driver is full-featured. One of the things that we want to call out is that there's actually a small API in there for doing DSP plug-ins, and that may seem kind of weird because you've probably heard a lot about Core Audio plug-ins. This is kind of a different thing.
This is if you're, for example, making some speakers and you have a proprietary bass enhancement algorithm for your loudspeakers, and you don't want everybody to get access to that code. You can use the plug-in API to basically match to your device and only your device, so that your code will only run with your device, and that's an important thing.
So summing up on USB, basically we see USB as a very good solution for consumer applications and low-cost applications. But the thing that's problematic about USB is that customization of your device may actually require a custom driver. It's a lot of work. The other thing is that the bandwidth for USB 1.1 devices is relatively limited. You're looking at basically eight channels of input or output. But it's a good solution for a limited channel count, and you can also do MIDI support.
The thing I really wanted to dive into today is FireWire. We're really excited about FireWire. If you're developing a FireWire audio device, you essentially have multiple options. You can develop a custom device, and a lot of developers have been very successful with a custom device. The problem with developing custom devices is you've got to figure out the protocol for getting data to the device.
You've got to write firmware for the device, and you've got to write a device driver for the device. Now, the people who've done this have come first to market, and they've got great products. But it is something to think about when you're considering how to implement your device.
A better way to go is to develop your device according to some of the 1394 TA specs out there. And there are really two specs that you want to look at: the audio subunit and the music subunit. They kind of overlap, and it may actually be necessary in a music subunit device to have an audio subunit as well, so that you can get at some of the controls, such as volume controls.
The other thing we stress is if you're developing a FireWire device, join the 1394 Trade Association. They're the umbrella organization for people developing FireWire devices. They look after some of the specs. And there's a lot of good information there, particularly in terms of getting access to draft specifications.
However, we recognize that there are some challenges in developing a FireWire audio device, and we kind of looked at why it was difficult to do this and why there weren't more FireWire devices out there. Really, it falls into three main areas in terms of the problems. There's a lot of standards. If you go to the 1394 TA standards page, you can spend 10 or 15 minutes trying to figure out how everything fits together, and then you can spend several weeks actually downloading and reading all of those specs.
Also, compared to USB, the cost of silicon has been perceived as really high. We're going to talk about that, and I'm actually going to bring up a couple of vendors that we've been working with who've been looking at much lower-cost solutions. And then the other difficulty is the software development. You've got two kind of areas here, the problems in terms of developing the firmware for the device and also the problems for developing device drivers. So let's try and clear some of this up.
In terms of a roadmap for standards relating to audio devices, this slide shows the kind of things that you're going to be interested in. Really, 1394 defines the base electrical spec and packet format on the bus, and the 61883 specs kind of cover streaming. In a sense, when you're dealing with your device, the stuff that you're going to be sending and receiving from the device falls into two areas, isochronous transfers and asynchronous transfers.
Isochronous transfers are for guaranteed bandwidth: in every 1394 bus cycle, a certain amount of time on the bus is reserved for isochronous data. Asynchronous is data that you kind of want to get to the device and have acknowledged, but it doesn't necessarily have to go right now. It turns out you use isochronous for streaming MIDI and audio, and generally you use async for querying the capabilities of the device.
So this slide-- can you actually read this? This slide kind of covers the specs that you really care about. At the top, audio and music subunit, you kind of need to decide which is most appropriate to your device. Generally, audio subunit devices are simpler devices. They're appropriate for speakers. They're appropriate for simple I/O devices. Music subunits are usually devices where you have a number of audio streams and you also want to embed MIDI data.
So when you're considering developing a FireWire audio solution, there are essentially three main components: the hardware, the firmware, and the device driver. Firmware is basically what's going to run on your embedded system. The device driver is what's going to run on the Mac to communicate with your device. The key point here is, if you develop firmware that's spec compliant, you don't have to do device driver work. And that's a really big issue in terms of the cost of development of your product.
So let's look at some of the resources available for developing audio devices based on FireWire. There's a number of silicon vendors out there who have products. They range from relatively expensive to relatively inexpensive. We recommend that you do some research and have a look at a couple of vendors when you're choosing a solution.
BridgeCo were pretty much the first out of the gate shipping standards-compliant silicon. And they have a solution called the DM1000, which is in a number of the devices that were announced earlier this year at NAMM. The interesting thing about BridgeCo is they have licensable firmware which can be customized for your application.
And we've been working with a number of vendors who are bringing their products to market based on the BridgeCo solution. The first one out of the gate was the Roland FA-101; there's a cool illustration of it up here. It's a 10-in, 10-out device with MIDI support. A number of other vendors have announced support of this platform. And this platform is basically music subunit compliant. To talk about an audio subunit compliant device, I'd like to bring up James Lewis from Oxford Semiconductor to talk about their product.
Thanks Nick. We're here today to introduce some technology for bringing a very low-cost solution to FireWire audio for multi-channel applications. You can also see this on our booth downstairs and at Plugfest later on in the week. Now Oxford Semiconductor has a very strong background in FireWire technology through our mass storage chips, so most of you have probably heard of us.
And we're a very strong adherent to the 1394 standard, and also, through our position on the 1394 Trade Association, we're actively involved in developing new aspects of the 1394 standards. So we're going to give you a bit of a technology introduction to the chip and a demo to finish up. And I'm going to hand over to Andy Parker to do that.
Thanks, James. OK, so we're all developers. Oh, could we go back to the slides, please? Maybe not. Could we have the slides, please? Thank you. So we're all developers. What we're really interested in is what's in the box. And the first thing we turn to in the spec is the block diagram.
The real thing to take home from this picture is that most of what you get on the 970 is actually contained within the device. And if you want to implement an audio subunit which you can connect onto the FireWire or 1394, as we call it, bus, you only need really a handful of components. And in this case, we're talking about an external physical interface for the 1394 and also at the back end, an I2S audio interface.
In terms of the data flow from the bus to the output, we basically have a very short path from the link layer going through a queue selector, which basically filters out isochronous and asynchronous data that Nick talked about earlier, through a FIFO, which is just a small buffer, and then out onto the audio core.
We have an interesting application example which is basically looking at multi-channel audio decode. And in this particular case, we have a compressed stream arriving over FireWire, being transferred by the 970 through a hardware decoder, and then passed out to a multi-channel audio D-to-A converter stage. And this would typically be applied for replaying surround sound on your system.
It's an interesting application because it exploits one of the more interesting features of the 970, which is that it's quite flexible in terms of the content that you pass it. The firmware actually transfers the data from the isochronous side to the I2S port. So whatever that data is, provided it's compliant with the standards, you can transfer the data over and match the two formats.
We provide a developer kit which, as Nick has talked about before, implements an AVC audio subunit. And it uses the standard AVC command set to control and monitor the audio properties, so things like mute settings, volume control. And it also decodes the incoming isochronous data, which is, again, compliant to AM824 specifications.
It implements clock recovery, which is basically just matching the rate of data that comes in to the rate of data going out. Because if you don't do that, you'll get strange distortions. And it works with the existing Mac OS X FireWire audio driver. For the firmware development, we use standard open-source GCC toolchain.
And you can even develop that on the Mac itself. We support most popular host development systems. We can also provide you with the framework to basically customize the firmware to match the specific codec which is sitting on the back end. There are full reference design schematics and an evaluation board available.
In terms of the firmware itself, the important thing to note is that the whole thing comes to less than 64 kilobytes in terms of the operating size of the program. And we've crammed quite a lot in there. So we have... Mac OS X, ooh, in 64K? I think that may be a mistake. We have an operating system layer, which is not Mac OS X, and that just basically provides the sort of low-level startup code for the processor that we have embedded within the 970.
And then we have a standard 1394 (or FireWire, for the rest of us) API, and some queue selector configuration. And again, all the queue selector does is filter out isochronous and asynchronous traffic. And then we have some higher-level handlers, which handle the standards-compliant data and generate the right responses to keep the Mac side happy and compliant with the AVC specs. And I think we now have time for a quick demo. Could we roll the demo, please? So this is generating the surround sound over FireWire, being decoded on the 970 and played through the PA.
Yes, all these FireWire guys do is stand around and play Unreal Tournament all day. That's the Oxford board, the EVM board. We'll talk about this in a second, but we'll have both of the solutions that we're going to talk about today available for you to take a look at in the lab. We'll talk about that at the end of the session.
So one of the difficulties we talked about with USB devices is customizing your device. One of the really cool things about FireWire is it makes customizing device behavior way simpler. And the way that you can do that is to send AVC commands to the device. So I'm going to talk you through a quick example. It's not particularly realistic because, hey, we're sending a volume command, and usually you'd rely on Core Audio to send the volume command via the Apple FireWire Audio device driver. But it's a good example of how you can customize behavior of your device.
So there's actually a user client in the FireWire audio driver that allows you access to various services. And you'll see in this example that these services are preceded by the FWA prefix. So you can go count the devices on the bus, open one of them, check its vendor ID, and get its device name.
Now, this obviously isn't a particularly realistic example of how you would match to your device, so I would suggest that you go look at the FireWire SDK for much more comprehensive examples of device matching. But it kind of illustrates the point. The really cool thing about doing device customization this way is you don't have to write a kernel device driver. Your customization code resides in user space, so you don't have to deal with kernel panics if you screw up, and it's going to help you debug your code.
To send the command to the device, you set up a command block, and this example shows how to set it up for an AVC volume command. And then to send your command, you simply call the execute AVC command, making sure that you check that your device actually accepted the command.
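Since the slide code isn't captured in the transcript, here's a rough illustrative sketch of that flow. The FWA-prefixed names come from the talk, but the header and the exact signatures shown here are guesses and should be checked against the FireWire audio user-client headers in the SDK; the command bytes are placeholders, not a real AVC volume command block.

#include <FWAUserLib/AppleFWAudioUserLib.h>   // assumed header name; see the FireWire SDK

// Illustrative only: count FireWire audio devices, open one, query it, send an AVC command.
static void SendCommandToFirstDevice(void)
{
    UInt32 nodeIDs[8];
    UInt32 deviceCount = 8;
    if (FWACountDevices(nodeIDs, &deviceCount) != noErr || deviceCount == 0)
        return;

    FWARef device;
    if (FWAOpen(nodeIDs[0], &device) != noErr)
        return;

    UInt32 vendorID = 0;
    char deviceName[64] = "";
    FWAGetVendorID(device, &vendorID);      // you'd match on your own vendor ID here
    FWAGetDeviceName(device, deviceName);

    // Placeholder command block; a real AVC volume command would be built here.
    UInt8  command[8]   = { 0 };
    UInt8  response[8]  = { 0 };
    UInt32 responseSize = sizeof(response);
    if (FWAExecuteAVCCommand(device, command, sizeof(command),
                             response, &responseSize) == noErr) {
        // Check the AVC response code (0x09 == ACCEPTED) before assuming success.
    }

    FWAClose(device);
}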
An example of using the FireWire audio user client is actually the mLAN implementation and mLAN support in Mac OS X. The Apple driver creates and manipulates the 61883-6 streams, but Yamaha supplies an application which does device discovery and configures the network. While we're on the subject of mLAN, some of the enhancements coming in a future system update: multiple device support, so you're going to be able to use things like an 01X with a Motif ES, and also external sync, so the devices can sync to an external clock.
One of the things that I really want to talk about today is the reference implementation of AVC music and audio devices. Apple have been working on this, and we're going to release it later in the year as part of the FireWire reference platform. We're going to provide an audio subunit reference and a music subunit reference.
The way that this is all going to fit together is you have all this stuff that runs on the Mac. On the device, you're going to run some firmware. This is based on a real-time operating system, RTOS. And on top of that, we're layering the Apple FireWire Reference Platform. And the Apple Reference Music subunit will sit on top of that.
The other thing that's necessary when you're developing a device is the ability to update the device's firmware. There's really two sides to this. The firmware on the device needs to be able to accept an incoming ROM image, and you also need an application that can send the firmware image down to the device. Most people who have FireWire devices expect the device to be updatable over FireWire.
The other resources when you're developing your device are the tools from the Apple FireWire SDK. Matt, our FireWire developer, said, don't develop a device without it. There are some really cool tools in the FireWire SDK. FireBug is a packet sniffer. An AVC browser will allow you to look at the device descriptor for your AVC device. So I'd like to bring up Yoram Solomon, who's the general manager of the consumer electronics connectivity business unit at Texas Instruments, to talk about some exciting things that we've been working on with TI. Thanks, Yoram.
That's only half of my title. If we used the entire title, I would run out of the five minutes I have. Nick, thank you for having me here in California. Anybody else flew from Texas here today? Yeah, we just left a whole month minus five days' worth of rain and tornadoes and everything. Believe me, you're not missing anything. Anyway, I'm going to spend the next 20 minutes, got you, three minutes, talking about what it is that we're offering.
We've been working with Apple for the past, I guess, several months, more than just several months, developing this platform that you can take off the shelf, and we make it available right now, and it works with the SDK. It has been quite an experience. It's kind of fun taking the devices that we typically put in just those standard, boring type end products and finally putting them on a kind of fun device. Some of the... Am I going backwards or... Yeah, I'm going backwards.
I'll stay on my title again. Okay, I already told you about the cooperation with Apple. Texas Instruments has been focusing its 1394 development, to some extent, on audio products, where there's a lot of sensitivity to timing and all kinds of specifications. You know, three things I can promise you in this presentation. I'm not going to get too technical, that's one. Two is I don't have a demo, and the third thing is we don't have Mac OS in 64K either.
We have USB devices especially for audio applications. You can see them, whether it's audio DACs, ADCs, controllers, codecs. They're, as Nick said, kind of the mid-range, lower-end, cost-sensitive type solutions. FireWire is really where we shine. Nick focused on that; we focus on that. We have a device that you're going to see in the next slide called iceLynx, or as we like calling it, TSB43CB43A. And that's not including the package.
That's really a relatively high-quality device that you should see. It's reasonably priced. The board that Nick talked about is a board that's going to be demonstrated in the lab tomorrow. The device that I'm going to emphasize now is the iceLynx. The right top corner is where it's really at. This device is a one-stop shop for everything from 1394 to a relatively high-quality audio platform. That's all I have. Thank you.
So we're absolutely delighted to be working with TI on this. And it basically gives you a low-cost development platform for adding high-speed serial into an existing device. We see this as applicable to things like synthesizers, digital musical instruments, digital effects units, kind of MI products. The key thing here is, by basing your audio device on Apple's music subunit reference firmware, you're going to really reduce your cost, because there's quite a lot of effort in just doing the FireWire firmware. So by adopting this solution, you're going to be able to very easily work on the parts of your product that differentiate it in the marketplace rather than doing infrastructure work. And we think this is very important.
The other thing is you'll be able to work with our device driver, and this is going to save you development time on the driver side. So summarizing, FireWire Audio, it's a big pipe. There's more bandwidth than USB. There's a number of other advantages. We'll list some of them here.
So FireWire is a good solution where you need a lot of bandwidth and a lot of channels. You've got MIDI support in there, and it's extensible by using AVC vendor-specific commands. So if you're considering producing a new device or adding high-speed serial to an existing device, we really encourage you to look at developing with FireWire. We've got resources available from multiple vendors that are far lower cost than the products that are currently shipping, and we're going to save you money in terms of development time on your driver set.
That kind of covers our continuum of audio devices. So just to summarize, there are a ton of opportunities out there. You can look at low-cost GarageBand peripherals. You can look at very high-end FireWire solutions for the music and audio production environment, and everything in between. Analog solutions are great for built-in audio on Macintosh computers. And for hobbyist and prosumer solutions, we recommend that you look at high-speed serial. The key thing here is, I really urge you, when you're considering developing a new device, look at standards-based devices. You know, you'll be saving yourself driver work, and your customers will be way happier.
So in terms of who to contact about stuff, I definitely recommend, if you're a hardware developer, that you build a relationship with Craig Keithley. He's the I/O Technologies evangelist. His email address is here. There's also a great mailing list with a lot of traffic on it and a lot of very cool people, both inside Apple and outside of Apple, answering questions: the Core Audio mailing list. And there are mailing lists available for FireWire and USB developers.
We also have a fairly considerable reference library. There's a good bulk of information about Core Audio and developing device drivers on Apple's website. Generally, a good jumping-off point is developer.apple.com/audio, and there's a reference to most of these resources there. There's the Core Audio SDK, which you should definitely be looking at from an application development point of view, and there are sample drivers in the Core Audio SDK. And the audio web page is there. If you're developing a FireWire device, you should also look at Apple's FireWire reference platform, because that can save you a lot of time and effort in firmware development.
The other thing is, the thing at the bottom is the important thing on this slide. There are a couple of trade groups, the USB Implementers Forum and the 1394 Trade Association. Check those out if you're developing either a USB or a FireWire device. And then finally, tomorrow at noon, the audio driver team will be available in the Graphics and Media Lab. We'll be showing the TI board. We'll, I think, have Oxford represented there, so you can look at their board. Hopefully, we'll be able to answer any questions that you have about developing firmware or device drivers for audio devices. So thanks a lot.