Mac OS • 1:03:26
Sound and networking are critical elements of almost any successful modern game. This session focuses on the wide range of technologies available to achieve world-class sound and robust networking in your next title.
Speakers: Todd Previte, David Hill
Unlisted on Apple Developer site
Transcript
This transcript was generated using Whisper; it has known transcription errors. We are working on an improved version.
Welcome. We're going to have session 136, Sound and Networking for Games. I'd like to welcome you all. David Hill and Todd Previte will be giving this presentation. We're going to do a tag team event right here. I think David's going to lead it off, talking about some specific sound-in-games issues and covering both Core Audio and the Sound Manager. Then Todd will follow up talking about networking, covering all the various networking services we have across the OS for games. Without further ado, David Hill for sound.
So as Jeff mentioned, Todd and I will be splitting this session. I'm going to start out and give you kind of a brief overview of the APIs that we have available on Mac OS X, cover some of your options, and cover a little bit about how to set up and initialize each one. Then I'll talk briefly about how to play back a sound sample, and that's actually going to be in Core Audio, and talk about MP3 playback. And for that I'll leverage QuickTime to do the one-page MP3 player.
Then Todd's going to come up a little later and talk about a number of different networking issues, BSD sockets, addressing, setup and initialization, and some of the network technologies you might want to look into using for your games on Mac OS X. And finally, finish up with some tips and tricks.
So as far as the technologies we're going to cover today, as I said, I'm going to start out with the Carbon Sound Manager, tell you a little bit about what's changed and what hasn't. I'm going to talk briefly about QuickTime and Core Audio, and then Todd's going to be covering Open Transport, NetSprocket, OpenPlay, and some BSD sockets. So, the Carbon Sound Manager.
We brought the Sound Manager across from Mac OS 8 and 9 to help make your jobs a lot easier. It's callable both from CFM and Mach-O, which makes it really nice since you can create a CFM app that uses the Carbon Sound Manager that, for the most part, with no real changes at all, will run on 8 and 9, and it will also run on Mac OS X.
It's also accessible from Carbon and Cocoa. So, Cocoa has its own NSSound class, but if for some reason you already have Carbon Sound Manager code or you need a little bit more control than NSSound gives you or what have you, you can actually call it from Cocoa. That works just fine.
One of the new things we've added as far as Mac OS X is the Carbon Sound Manager actually now supports variable bitrate decoding. That's a big issue for MP3s and things like that, where they start to try to actually vary the bitrate to improve the compression that you get with your sounds.
However, we have had to remove several APIs for one reason or another. Probably the biggest one that we've heard people complain about is SndPlayDoubleBuffer. A number of people were using that one to do a real easy ping-pong back and forth between two sound buffers. That one is no longer present in the Carbon Sound Manager.
In order to help you, for those of you that were using that API, we've got a very good sample called Carbon SndPlayDoubleBuffer. I encourage you to check that out. It's on the sample code website, developer.apple.com/samplecode, probably under the sound section there. It's a pretty good sample.
It explains how to use the Carbon Sound Manager and some other APIs to do pretty much the same thing that SndPlayDoubleBuffer did before. We've also removed some of the disk-based APIs, the ones that play directly from a file and things like that. And wavetable synthesis is no longer there, for example.
For the really curious among you, there are some other changes under the hood between Mac OS 8 and 9 and Mac OS X that your app probably shouldn't care about. But I thought I'd mention them for those of you that are interested. On Mac OS 8 and 9, the sound manager was this big global thing.
There was actually only one on the whole system. And the hardware model was provided by the Sound Manager itself. It had to know about all the different sound hardware on every single machine and figure out what machine you were running on and do the right thing. On Mac OS X, however, the Sound Manager is per process. And in talking to the audio guys, it shouldn't make any difference to your app.
Pretty much you're either changing, say, the volume on a sound channel, which is local to your app, or you're changing the global volume, which is still global. So you shouldn't really care about that one either. Another interesting point is the hardware model: on Mac OS X, the Carbon Sound Manager is actually a layer on top of Core Audio. So that will introduce a slight performance hit, perhaps. But again, your app shouldn't really care one way or the other.
So I thought I'd step through just real briefly for those that haven't seen the Carbon Sound Manager, want to know what it looks like. We can step through a little bit of code. Really just a few slides here. So one important thing you need to have if you're using the buffer and callback scheme is you need a sound callback. And I've prototyped it up there at the top. It takes the sound channel parameter and the sound command.
And then when you actually want to create a channel, the channel is the thing that you actually send your sound commands to to play your buffers. You pass it a pointer to a sound channel, you pass it, in this case we're telling it we want to send sampled sound, we want a stereo channel, and then we pass in that callback.
When you get ready to play a sound, you need to tell the sound manager what format that sound is in. So in this case, we're filling out a header structure. Then these are just some of the interesting parameters. You have a pointer to the sound data you're playing, how many channels are in that sound data, the sample rate, and we also tell it that this is an extended sound header, which lets you do some additional things. You can set the base frequency. I think number of frames is an extended sound manager thing. Samples per buffer and sample size. So we're playing 16-bit stereo, 44 kilohertz sound in this case.
So once you've got the channel set up and you've got the header set up, you need to actually send those commands to the sound channel. And the first block of code there, we set up what's called a buffer command. So the buffer command simply tells the sound manager, here's a block of sound data that I want you to play. And we pass the header, pointed to the header that we've just created that describes in detail the format of the sound. And we pass that to the sound channel with the sound do command.
The second block of code there, in order to know when that buffer of sound is completed, so that you can queue up the next buffer of sound, you can pass in a callback command. I believe we have a bunch of samples that do this. QuickTime does this kind of sound playing.
You pass in a callback command, the parameters don't really matter, and you pass that to the sound channel much the same as you did before. One interesting thing to note, and this confused me at first when I started looking at this API: you're giving it a callback command, but you're not telling it what to call back.
If you remember, we actually passed that in when we created the channel. One sound callback is set per channel. It's not per callback command. You have to have all your callbacks on a particular channel go through one actual callback function. And then once you're done with the channel, after you've sent a number of buffers, played whatever you want to, don't forget to dispose it.
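Pulling those pieces together, here is a minimal sketch of the flow just described. This is not the presenter's slide code; the variable names are invented, and the sample data is assumed to be interleaved 16-bit, 44.1 kHz stereo.

#include <Carbon/Carbon.h>
#include <string.h>

static Ptr           gSoundData;    /* your sample data; must stay valid while playing */
static unsigned long gFrameCount;   /* number of sample frames in gSoundData */

static pascal void MySoundCallback(SndChannelPtr chan, SndCommand *cmd)
{
    /* Called when the callBackCmd below is reached; typically you set a flag
       here and queue the next buffer from your main loop. */
}

static void PlayOneBuffer(void)
{
    SndChannelPtr         channel = NULL;
    static ExtSoundHeader header;          /* static: must outlive the bufferCmd */
    SndCommand            cmd;

    /* Create a stereo channel for sampled sound and register the callback. */
    SndNewChannel(&channel, sampledSynth, initStereo, NewSndCallBackUPP(MySoundCallback));

    /* Describe the data we are about to play. */
    memset(&header, 0, sizeof(header));
    header.samplePtr   = gSoundData;
    header.numChannels = 2;
    header.sampleRate  = rate44khz;
    header.encode      = extSH;            /* mark this as an extended sound header */
    header.numFrames   = gFrameCount;
    header.sampleSize  = 16;

    /* bufferCmd: play this block of sound. */
    cmd.cmd    = bufferCmd;
    cmd.param1 = 0;
    cmd.param2 = (long)&header;
    SndDoCommand(channel, &cmd, false);

    /* callBackCmd: fire the channel's callback once the buffer above is consumed. */
    cmd.cmd    = callBackCmd;
    cmd.param1 = 0;
    cmd.param2 = 0;
    SndDoCommand(channel, &cmd, false);

    /* ...later, when you are completely done with the channel... */
    SndDisposeChannel(channel, true);       /* true quiets the channel immediately */
}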
So moving on to QuickTime. I'll put in a little plug for QuickTime here since this is a game session. QuickTime is really good for complicated sound data like MP3s and MIDI. It does all the heavy lifting for you. But it's also really good for still image loading. Anybody that saw Jeff's high-performance 2D OpenGL talk, he uses QuickTime to do the image loading.
So he doesn't care what format it's in. He just says, QuickTime, let me load in this image, whatever format: JPEG, BMP, PNG, whatever. And QuickTime does all that work for him, puts it into a buffer. Then he can hand that directly to OpenGL as a texture. So it's really good for that.
It's especially a good way to get some things prototyped and up and running. If you want to do your own graphic format later, you can. But at least to get you started, QuickTime is a great way to get started. And of course, movie playing. Of course, QuickTime is famous for that. But I thought I'd throw it on there anyway.
So as promised, here's a one-page MP3 player. To get QuickTime started, you call EnterMovies. And in this case, we've passed it in a file spec of the file we want to play. And in my demo code that I'll show you in a minute, we actually just use Nav Services to get us a file and pass it in here.
We open the movie file and say create a movie from the file. For that refnum that we get out of OpenMovieFile, it will give us a movie and fill out the movie structure that we can pass to the other APIs. And then we play it. Really simple.
We start the movie. As long as the movie is not done, we call MoviesTask. Then when we're done, we dispose the movie, close the movie file, and exit movies. We're done. That's all it takes to play an MP3. Really simple stuff. I'll prove it. We can go to the Demo 2 machine. This is a real simple app, essentially the basic Carbon template from Project Builder. I threw in pretty much that code you saw there.
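For reference, the one-page player he describes comes out to roughly this; error handling is omitted here, and a real app should check every return value.

#include <Carbon/Carbon.h>
#include <QuickTime/QuickTime.h>

static void PlayMP3(const FSSpec *spec)
{
    short   refNum;
    short   resID = 0;
    Boolean wasChanged;
    Movie   movie = NULL;

    EnterMovies();
    OpenMovieFile(spec, &refNum, fsRdPerm);
    NewMovieFromFile(&movie, refNum, &resID, NULL, newMovieActive, &wasChanged);

    StartMovie(movie);
    while (!IsMovieDone(movie))
        MoviesTask(movie, 0);        /* give QuickTime time to decode and play */

    DisposeMovie(movie);
    CloseMovieFile(refNum);
    ExitMovies();
}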
Now there are some more in-depth samples on our website again for how to play MP3s. If you want some more control over how it's decoded and how it's played, check out our sample code site. I know there's some that more directly use the Carbon Sound Manager and QuickTime together to play some more detailed things. Let's go back to the slides. Thank you.
Let's talk about the new guy, Core Audio. There are a number of Core Audio sessions; I think one has already occurred and two more are coming up. I'll tell you about them in a little bit. But I wanted to go over kind of a high-level look at Core Audio so that you could get a feel for what it is and whether you should be looking at it for your game.
So Core Audio is a pretty low-level interface. It provides a number of capabilities built into the architecture. Some of these are ready to go now. Some of them haven't been exposed yet. But for one, audio device sharing. Of course, you want to make it easy for different processes to share the same device.
If you have, say, an MP3 player playing and your game comes up and wants to be playing audio and then the system wants to beep to tell you something important happens, everything has to be able to share it. In the example code that I'll show you in a bit, we just pretty much ask Core Audio for the default output device. We say, whatever sound goes to by default, that's cool for us. There are other calls that you can use to get more interesting info, actually get a list of the different audio devices available, things like that.
Core Audio also provides some facilities for real-time audio inter-process communication. And that would essentially let multiple processes sort of pipe audio data between them. So if you had one app that was generating some audio and you wanted to send it to, say, an app that was recording that in some format or whatever, you could actually chain them together.
And that's on a process basis. They also have a lower level, what they call the audio unit and component model, where you can actually have audio processing units and string them together within your app. So you might, in your app, create a reverb unit and some other units together, chain them together, and then start sending audio through it.
One of the other interesting things about Core Audio is that it's designed with low latency in mind. It's supposed to be able to handle really low latency and also synchronize output between different sounds. And one interesting note there is you can actually control, for the most part, the latency in your app.
Unlike the sound manager, with core audio you can actually set for a given device what you want the buffer size to be. And on 9 there was always a big problem between VM on and VM off for games and the Carbon sound manager, or actually just the regular sound manager. With VM on, the system would make the buffer much bigger.
And so if you decided to change what you were playing, you'd have to wait for that buffer to complete before the next buffer got going. On core audio, on Mac OS X, that's not a problem. You can set the buffer as big or as small as you want. Now one gotcha with core audio is that core audio uses floating point samples.
Now that does greatly simplify your application code, especially in the demo I'll show you where we're generating tones and things like that. It's really easy to generate, but it does make the drivers work a little bit harder because the existing hardware we have right now sends integer data to the cards.
Also, if you're coming from, say, a conversion where you have WAV sounds and some things like that coming from Windows or some other platform that are 8- or 16-bit integers, you're probably going to have to convert your samples on the fly, or you can pre-convert them and store them on disk.
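One way to do that on-the-fly conversion, as a rough sketch (the function name here is made up), is simply to scale each 16-bit sample into Core Audio's -1.0 to 1.0 Float32 range:

static void ConvertInt16ToFloat(const short *input, float *output, unsigned long sampleCount)
{
    unsigned long i;
    for (i = 0; i < sampleCount; i++)
        output[i] = (float)input[i] / 32768.0f;   /* 16-bit integer to [-1.0, 1.0] */
}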
[Transcript missing]
So, in the Carbon Sound Manager, you got sound samples into the system using the buffer commands and the callbacks. That was your basic mechanism for getting sound into that system. With Core Audio, you deal with these things called I/O procs. And so for a given device, you can install one or more I/O procs.
And your I/O procs, when they get called, get a number of very interesting parameters. They get a number of timestamps. As I mentioned, Core Audio was designed so that you could actually synchronize audio. Well, if you're going to synchronize audio, you need to know a lot of stuff, like what time is it now, as far as the sound system is concerned, and when are you going to play this sound that I'm getting ready to hand you. You've asked me to fill this buffer.
When's that buffer actually going to get played? So you get that information. You'll get some input buffers back. As the name implies, it's both an input and output proc. So you get some input buffers if the device is actually needing to send you data. That's where that will come in. And then there's a chunk of output buffers that you can use to put your samples in and send the data out to the card.
That last bullet there, if you need to know what the stream is like, you need to know how many channels or the sample rate and those kinds of things, you can actually make a call to get the kAudioDevicePropertyStreamConfiguration property. Yet another API with really long names. You can make this call and it will fill out a structure for you that will give you very detailed information about what this I/O proc is going to get and what the stream is like.
Now for the people that are really curious, we can go into how the actual audio gets to the device. For each device on the system, there's one real-time Mach thread that's pretty high priority. And it takes care of actually calling all the I/O procs when the system needs to move stuff back and forth.
And essentially that device starts that thread up with the first I/O proc you install. As soon as you activate the device with the first I/O proc, the device is up and this thread's running and calling your I/O proc. And the thread stays running until you actually remove your last I/O proc, and then it shuts back down again.
So let's look a little bit at what that thread looks like. This is some real high-level pseudo code. Essentially it loops until it's done. It sleeps until it has something to do. It can tell when the next input or output needs to happen. It sleeps until then. Then it looks to see if the device has input coming back into the system. If it does, it will calculate exactly when that input was sampled and then copy that data into the input buffers to get ready to hand to the I/O procs.
Then it says, okay, does the device have output? Has somebody installed an I/O proc to send data out? Okay, well, let's calculate when that output is actually going to be sent. Clear out the output buffers to prepare for calling the I/O procs. And then call all the I/O procs that are installed for this device. As I mentioned, a particular process can install more than one and you can have multiple processes sharing a device, so there might be a number of I/O procs actually installed here.
Once it's called all the I/O procs, you've got, say, a handful of I/O procs that have filled out their buffers. Somebody's got to mix them to send them out to the output stream. That's when Core Audio comes along and actually mixes all those I/O proc buffers together. And then it sends it out to the card and loops back around. So it spends its entire lifetime just gathering the input, clearing the buffers, calling the I/O procs, and then sending the data out.
Let's look at some code. This is the basic idea. If you're a Carbon CFM app, you've got to load Core Audio via CFBundle. Core Audio is not available from CFM. So there's some good sample code on the web at our sample code site, and actually in the CarbonLib SDK as well, about how to call Mach-O routines from CFM.
Once you've got access to the Core Audio system, you can make some calls to gather device information. You can set up the device properties the way you want them, make sure the buffer size is correct and everything. You add in your I/O proc, and then you start the device. And as soon as you start the device, it starts calling your I/O proc asking for data.
So a brief diversion into the bundle APIs. If you have a URL, say, to the Core Audio Bundle, and this applies to any framework bundle, you can get a reference for the bundle using CFBundleCreate. You load the executable for that bundle to make sure it's in memory. Then you can get a function pointer for, in this case, some non-existent CA func name is the name of the function. You get a function pointer for that using CFBundleGetFunctionPointerForName. Hard to miss what that does.
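As a hedged sketch of that sequence, assuming the framework path and the function name being looked up (a Mach-O app would simply link against CoreAudio.framework instead):

#include <CoreFoundation/CoreFoundation.h>

typedef OSStatus (*AudioHardwareGetPropertyProc)(UInt32 inPropertyID,
                                                 UInt32 *ioDataSize,
                                                 void *outData);

static AudioHardwareGetPropertyProc LoadAudioHardwareGetProperty(void)
{
    CFURLRef url = CFURLCreateWithFileSystemPath(kCFAllocatorDefault,
                       CFSTR("/System/Library/Frameworks/CoreAudio.framework"),
                       kCFURLPOSIXPathStyle, true);
    CFBundleRef bundle = CFBundleCreate(kCFAllocatorDefault, url);
    AudioHardwareGetPropertyProc proc = NULL;

    if (bundle != NULL && CFBundleLoadExecutable(bundle))
        proc = (AudioHardwareGetPropertyProc)
            CFBundleGetFunctionPointerForName(bundle, CFSTR("AudioHardwareGetProperty"));

    CFRelease(url);
    /* The bundle reference is intentionally kept alive so the framework stays loaded. */
    return proc;
}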
Once you've got access to that API, then you need to start calling it. So you can call AudioHardwareGetProperty. And in this case, we're asking it for the default output device. The Core Audio guys recommend you actually use the default output device, unless you have a specific reason not to.
If you want to actually look at the APIs and query for the different output devices you can. But this one will be a real easy way to just tell you, "Okay, what's the basic output device that I need to use?" This will give you an output device ID.
From there you can get and set some properties on it. As I mentioned before, you could get the configuration for the stream and things. In this case, all we need to do is, we just want to say we need k samples per buffer and each sample is a float, so we set that buffer size in there. I don't even remember what I used in my code.
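A small sketch of those two calls, getting the default output device and then setting a buffer size on it; the 512-frame figure is just an assumption to illustrate the property:

#include <CoreAudio/CoreAudio.h>

static AudioDeviceID gDevice = kAudioDeviceUnknown;

static OSStatus SetUpDefaultOutputDevice(void)
{
    UInt32   size = sizeof(gDevice);
    UInt32   bufferBytes;
    OSStatus err;

    /* Ask the audio system which device sound goes to by default. */
    err = AudioHardwareGetProperty(kAudioHardwarePropertyDefaultOutputDevice,
                                   &size, &gDevice);
    if (err != noErr)
        return err;

    /* Request a buffer of 512 stereo frames of Float32 samples (size is in bytes). */
    bufferBytes = 512 * 2 * sizeof(Float32);
    return AudioDeviceSetProperty(gDevice, NULL, 0, false,
                                  kAudioDevicePropertyBufferSize,
                                  sizeof(bufferBytes), &bufferBytes);
}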
Then we have to install our I/O proc. And the I/O proc is actually mostly input parameters. The top two-thirds there is the parameter list. And as you can see, you get a device ID. So you can actually have the same proc installed on multiple devices. So this will tell you which device is actually asking for data. You get the timestamp from the audio system to say, here's what time it is now. The input data and the input timestamp when that data was actually sampled.
Then you have the ones that I've highlighted there that are more interesting for our discussion. The output data, the time when that data is actually going to be played, and some client data that you can actually pass in when you install the I/O proc. And then down there at the bottom I put just a real simple, you know, for each sample frame, you know, the frame being left and right stereo, whatever. For each sample frame, for each channel, either the left channel or the right channel, fill in that output data buffer there.
Once you've got your I/O proc installed, all you have to do is start the device. So you say start the device with that I/O proc. A caveat here, as soon as you call this, it's going to start calling your I/O proc. So beware, make sure that you don't do this and then expect to be able to set up a few more structures or whatever for the I/O proc. It's already going to call you, so be ready for it. So let's take a look at a simple demo here. And my colleagues have voted this the most annoying demo at WWDC. So we'll show you the simple tone demo. Let's make that a little softer.
So here we have... So this is a simple Cocoa app that I wired up in... not too long. Big plug for Cocoa here. Really easy stuff to play with, good for prototyping. With this one we can play a little bit more with some other things you learned in music theory class.
Simply generating the waveform on the fly, some simple sine waves, square waves, sawtooth. Really easy stuff. We'll be posting the sample code in the next week, probably, so you can take a good look at how to use core audio, get some real simple stuff going, and annoy your neighbor. Especially for those of you that work in cubes.
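The heart of a tone generator like that might look something like the sketch below: an I/O proc that fills the output buffer with a sine wave, then gets installed and started. The frequency, the sample rate, and the assumption of a single interleaved output buffer are all illustrative, and the device ID is assumed to have been found as in the earlier snippet.

#include <CoreAudio/CoreAudio.h>
#include <math.h>

static double gPhase = 0.0;

static OSStatus MyToneIOProc(AudioDeviceID device,
                             const AudioTimeStamp *now,
                             const AudioBufferList *inputData,
                             const AudioTimeStamp *inputTime,
                             AudioBufferList *outputData,
                             const AudioTimeStamp *outputTime,
                             void *clientData)
{
    Float32 *out      = (Float32 *)outputData->mBuffers[0].mData;
    UInt32   channels = outputData->mBuffers[0].mNumberChannels;
    UInt32   frames   = outputData->mBuffers[0].mDataByteSize / (channels * sizeof(Float32));
    double   step     = 2.0 * M_PI * 440.0 / 44100.0;   /* 440 Hz, assuming a 44.1 kHz stream */
    UInt32   frame, channel;

    for (frame = 0; frame < frames; frame++) {
        Float32 sample = (Float32)sin(gPhase);
        for (channel = 0; channel < channels; channel++)
            out[frame * channels + channel] = sample;    /* same tone on every channel */
        gPhase += step;
    }
    return noErr;
}

static void StartTone(AudioDeviceID device)
{
    AudioDeviceAddIOProc(device, MyToneIOProc, NULL);
    AudioDeviceStart(device, MyToneIOProc);    /* the proc starts getting called right away */
}

static void StopTone(AudioDeviceID device)
{
    AudioDeviceStop(device, MyToneIOProc);
    AudioDeviceRemoveIOProc(device, MyToneIOProc);
}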
I was actually working on that app a little bit during Jeff's OpenGL early bird session. Okay. So to wrap this up real quick, at a high level, we've talked about several different interfaces. Core Audio and the Sound Manager are great if you've got raw sound samples that you just want to throw out, get out to the card, and play.
If you've got something more complicated like MIDI or MP3, QuickTime is a really easy way to get that up and running. Take a look at the API. It's really simple. If you want to do the heavy lifting, you can, but QuickTime already has done an awful lot of work to get that nice and neat. So you might want to take a look at the QuickTime API.
And coming up, actually, luckily, there are still two audio sessions coming up in this very room. At 3:30, we've got the audio processing and sequencing services session. And then for those of you that are more interested in some of the more powerful MIDI services that Mac OS X provides, at 5 o'clock in this room, we've got session 210, MIDI on Mac OS X.
And then since this is a game session, if you have games feedback for us, if we're doing something well, if we're not doing something well, if there's something you'd like to see more of or less of, what have you, please come to the feedback forum in J1, which is next door, and give us your feedback on games technologies. And that's it for my part. Now I'll turn it over to Todd. Where did Todd go? There he is.
Thank you. He wasn't kidding about those demos, was he? Hi, my name's Todd Previte. I'm the 3D graphics and de facto networking DTS engineer. So I'm going to recap one of the things that one of David's slides went over, which is what we're going to learn about networking today.
I'm going to go through BSD sockets in the most detail, including accessing it from both CFM and Mach-O Carbon and from Cocoa. Then I'm going to go through some simple setup and initialization code. I'm also going to discuss some of the networking technologies that are available on 10 that are better suited to gaming, as opposed to things like URL Access, which is more designed for mainstream applications. I'm going to go through some tips and tricks for network gaming. That's kind of a misnomer; it's more like a porting guide, to kind of give you some idea of how things map over to 10.
So what do we have for networking on 10? Well, first of all, there's Carbon Open Transport. As that name would imply, Open Transport has been carbonized for OS X. It's all there, all the functionality that you need. And that is one of the options you can use. NetSprocket and OpenPlay: we've gotten a lot of questions about these very recently, and NetSprocket and OpenPlay are now also available on Mac OS X. And of course, there's sockets, which is available in the BSD layer.
So OpenTransport's available on 10 mainly as an easy path for existing applications to move over to Mac OS X from OS 8 and 9. Again, as I said, all of the functionality of OS 9 is there, so you don't have to worry about, well, what are the differences? The whole thing has been taken and just pulled right over directly. The OpenTransport frameworks are all built on top of sockets now, as opposed to whichever underlying layer they used on 9. I don't know offhand. This has been included for API compatibility only. There's no additional functionality that you get from OpenTransport on Mac OS X.
Open Transport uses threads to emulate any of the asynchronous mode stuff that was available on OS 9. You do incur about a 10% performance hit for using Open Transport as opposed to using sockets directly. And the protocol subset that is supported by Open Transport on 10 is much smaller than it was on 9, supporting only TCP/IP, and DDP, ZIP, and NBP for AppleTalk.
OpenPlay and NetSprocket: NetSprocket is a derivative, well, not a derivative, but it's the evolution of NetSprocket from traditional Mac OS. Apple open sourced it some time back, and since then it's kind of developed into two different areas, those being OpenPlay and NetSprocket itself. They are both open source APIs. They are now cross-platform, available for Mac and Windows. I believe there was a Linux/Unix version that was supposed to be coming, but I couldn't find much information on that. And it also has been carbonized for Mac OS X.
So with NetSprocket, what they did was make it the high-level interface. Instead of being built on top of Open Transport, which I believe is what NetSprocket used to be based on, it now sits on top of their own API, which is OpenPlay, which I'll discuss in just a few minutes.
It does maintain most of the API compatibility with NetSprocket, although as I understand it there are some differences between the two. No new documentation is available for NetSprocket, but you can still use the existing NetSprocket documentation; 1.7.3, I believe, is the current version that is available. That is still the most current.
So what exactly is OpenPlay? OpenPlay is the low-level interface that they've developed for use on both Windows and Macintosh. It's sort of akin to sockets: it's fairly streamlined, fairly simple, and there's just not a lot to it as far as being overly friendly with a user interface. They left most of that functionality up to NetSprocket.
It's been called a network module manager. So what exactly is a network module manager? It's a protocol manager. You tell it which protocols you want to use and what you want to do with them. It sets them up and then you can access them either directly through the open play calls or you can go through the high level NetSprocket interface and access them that way.
It provides three basic services, which are configuration, data transfer, and enumeration. Now, in the documentation it says it also provides human interface and miscellaneous functions, but I couldn't find a whole lot of information on either one of those. I think the human interface they're referring to was NetSprocket, but again, I couldn't find much out on that.
Configuration-wise, just as I said, it'll set up the protocol stacks for you. You can do all your initialization right through Open Play, just as you can through sockets. Data transfer, it's what communicates with the drivers, with all the network drivers, sends it down to the hardware and spits it out on the wire. And enumeration, it will go through all of the available network interfaces, including any sort of serial dial-up or Ethernet interfaces.
It will enumerate them for you so you can select whichever one you want to use. This is kind of a graphical representation of how OpenPlay and NetSprocket function together, with the high-level interface sitting on top of the Network Module Manager, and Network Module Manager calling down into the various protocol stacks and device interfaces that are available.
There's a little more information on OpenPlay. The URL is for Apple's open source website. You can access it right there. There is a complete set of documentation. Well, okay. Not exactly complete, but there is a set of documentation that goes along with OpenPlay. It is freely downloadable right now. I believe the version is 2.0.
There are also two sample apps. MiniPlay and MiniTest, I believe, are the two apps that I saw in there that I was able to look at. Again, it's all been carbonized, so they should all just build and compile right out of the box.
This is going to be the real meat of our discussion here, which is BSD sockets. It is the standard Unix networking API. It's available across every implementation of Unix out there. The interface itself has not changed in many, many years. However, there have been some supersets of it made for Windows, and I thought SGI with IRIX had another implementation of sockets that had more of a higher-level user interface to it than straight sockets did. One of the key things about sockets is that all of the functions are synchronous. They are blocking calls.
Anytime you actually call one of these things, you're going to have to wait until it completes. As you'll see later on, with certain functions such as DNS, when you call gethostbyname or gethostbyaddr, you're going to sit there waiting for a while if you've got some slow DNS servers in your pathway.
This is sort of how a sockets application will look from a high level. The server and client will both call socket to create sockets for themselves. Sockets are just file descriptors, is really all they are, across any operating system. So they're pretty much going to function the same way any time you're accessing sockets. For a server, it's important to call bind.
What bind does is take the local address of the server and associate it with that socket, so that any incoming connections to that server will say, "Oh, here's the server we're looking for. Here's the address I want. You're the one." And that's how a client and server can establish an association. The server will then call listen, as listen is what sets up the socket so that it will accept incoming connections.
On the client side, the counterpart to that would be connect, where the client actually goes out over the network and says, all right, this is the server I'm looking for, where are you? The server that's now in listen mode will hear that incoming connection and then, provided of course that the connection is correct and the server is set up to handle a connection from that particular client, it will call accept.
Accept is the server's way of saying, all right, client, you're good to go, I'll accept your connection. As its return value, accept creates a new socket in a connection-oriented mode; it actually creates a new socket that is now associated with the incoming client's socket. So you've now got an endpoint-to-endpoint connection.
Once you've established that association between those two sockets, you're now free to send and receive data back and forth between client and server. When you're done, once all the data has been transmitted that needs to be transmitted, both sides call close to get rid of the socket. Or excuse me, that will close the connection; it will not destroy the socket. How you do that, I'll show you in a moment.
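Here is a rough, self-contained sketch of that flow for a TCP client and server; the port number, address, and payload are placeholders, and error checking is left out for brevity.

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <string.h>

static void RunServer(void)
{
    int listenFD = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in local, remote;
    socklen_t remoteLen = sizeof(remote);
    char buffer[512];
    int connFD;

    memset(&local, 0, sizeof(local));
    local.sin_family      = AF_INET;
    local.sin_port        = htons(5000);             /* placeholder port */
    local.sin_addr.s_addr = htonl(INADDR_ANY);

    bind(listenFD, (struct sockaddr *)&local, sizeof(local));
    listen(listenFD, 3);                             /* queue up to 3 pending connections */

    connFD = accept(listenFD, (struct sockaddr *)&remote, &remoteLen);
    recv(connFD, buffer, sizeof(buffer), 0);
    send(connFD, "hello", 5, 0);

    close(connFD);       /* the per-client socket that accept returned */
    close(listenFD);     /* the listening socket */
}

static void RunClient(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in server;
    char buffer[512];

    memset(&server, 0, sizeof(server));
    server.sin_family      = AF_INET;
    server.sin_port        = htons(5000);
    server.sin_addr.s_addr = inet_addr("192.168.1.10");   /* placeholder server address */

    connect(fd, (struct sockaddr *)&server, sizeof(server));
    send(fd, "hello", 5, 0);
    recv(fd, buffer, sizeof(buffer), 0);
    close(fd);
}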
So how are we going to initialize the sockets interface? Well, CFM Carbon applications are going to need to load the framework through CFBundle. David showed you how to do that in the sound portion. I've also included a URL up here at the bottom of the page; if you look for the sample called CallMachOFramework, that will tell you how to load a Mach-O framework from a Carbon CFM application.
Mach-O Carbon applications, all you have to do is include the System framework and it's right in there. If you wanted to see what the header files actually look like, they're in the sys subfolder of the System framework, and sockio.h, socket.h, and socketvar.h are the three header files that you're going to want to look at, mainly if you're curious.
Cocoa applications have two options. Naturally, you can use the NSSockets framework, or you can load the system framework and access them directly as I just described. Either way works just as well. The NSSockets framework, I believe, is obviously set up for an Objective-C interface. So it's really six of one, half dozen of the other, depending on which one you want to use.
So creating sockets. As I mentioned before, both servers and clients will need to create sockets in order to communicate over the network. Per connection, you need one socket. Now with multicasting, multihoming, and a couple of things like UDP, for instance, because it's a connectionless protocol, UDP is capable of sending messages to different hosts on the network from a single socket.
For instance, with UDP, if you had four servers out there and one client, and all of them were using UDP, the client would be able to send a message to server one, server two, server three, and server four without establishing a connection in between all of them, because UDP is a connectionless protocol. It uses what's called a datagram, which includes, as a parameter to the sendto function, the address of the server or destination that you want to talk to.
So every time you send a message, you're telling, "Okay, I'm sending this one to server one. Now I'm going to send this one to server four." That's already included right there in the send to function. The socket function returns a socket descriptor, as I mentioned earlier. The socket descriptor is just a file descriptor. If you really look at it, it's just an int. So, very simple data type.
So how are you going to create sockets? You can use this code here directly, and you'll create sockets of various types. The top one is a SOCK_STREAM used with the protocol TCP, which will create a connection-oriented, data-stream-oriented socket. These are useful for things like doing file transfers, if you're going to send contiguous blocks of data.
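The slide itself isn't in the transcript, but the three socket calls he walks through over the next few paragraphs presumably looked something like this:

#include <sys/socket.h>
#include <netinet/in.h>

static void CreateSockets(void)
{
    int tcpSock  = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);  /* stream, connection-oriented */
    int udpSock  = socket(AF_INET, SOCK_DGRAM,  IPPROTO_UDP);  /* datagram, connectionless */
    int icmpSock = socket(AF_INET, SOCK_RAW,    IPPROTO_ICMP); /* raw socket for ICMP; needs root */

    /* Keep the descriptors you need; a return value of -1 means creation failed. */
    (void)tcpSock; (void)udpSock; (void)icmpSock;
}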
So if you wanted to stream audio out, you'd probably want to use TCP. TCP is a reliable protocol, which means it does its error checking and correcting. It will do resends. There is a little caveat to that. TCP is under no obligation whatsoever to let you know that it's doing resends or let you know that it's dropped packets.
So what does that mean? Well, you can be sending your data happily to a TCP socket, and it will be humming along just fine. The other end won't be receiving any of it. You won't know about it, and neither will they. All they see is no more data is coming through. TCP will just stop resending.
That's it. That's all you get. So there's really no way to make TCP/IP, or excuse me, TCP tell you that it's now doing error checking and correcting and trying to reestablish its connection. There's no functionality for that whatsoever, which means it can be kind of dangerous to use if you have real-time critical data.
So what's your option? UDP. UDP is connectionless, but it is unreliable. What you have to do is implement a reliability protocol on top of UDP. Fairly easy to do. All you have to do is, as you're sending messages back and forth, you just make sure that both the client and server have some method of saying, all right, I sent a message. Did you get it? Client responds, I got the message. And then you just kind of loop back and forth on that.
Whenever one of them does not receive a message, all you have to do is say, all right, resend, and you retransmit the same data. It does require a little bit of work, although most of the time, if you're talking about network gaming, where you really need to be sending data back and forth, and it doesn't matter, if you miss message five and you're now on message 11, message five is probably irrelevant out-of-date information anyway, so there's no need to resend.
You drop it, you interpolate between where you were and where you are, and you move on. For instance, that would be for something like movement. The last one is ICMP, which is the Internet Control Message Protocol. You would use this for any sort of, well, Internet control message. Things like pings. Pings are all done through ICMP.
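A bare-bones sketch of the kind of "did you get it?" layer he describes over UDP might look like this; the message format, timeout, retry count, and the assumption that the peer echoes the sequence number back as its ack are all invented for illustration.

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <string.h>
#include <unistd.h>

struct GameMessage {
    unsigned long sequence;        /* incremented for every message sent */
    char          payload[64];
};

static int SendReliably(int fd, const struct sockaddr_in *dest,
                        struct GameMessage *msg, unsigned long seq)
{
    struct timeval timeout = { 0, 250000 };    /* wait up to 250 ms for an ack */
    unsigned long  ack;
    ssize_t        got;
    int            attempt;

    msg->sequence = seq;
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof(timeout));

    for (attempt = 0; attempt < 3; attempt++) {
        sendto(fd, msg, sizeof(*msg), 0,
               (const struct sockaddr *)dest, sizeof(*dest));

        /* Assume the peer echoes the sequence number back as its acknowledgement. */
        got = recvfrom(fd, &ack, sizeof(ack), 0, NULL, NULL);
        if (got == (ssize_t)sizeof(ack) && ack == seq)
            return 1;                          /* acknowledged */
        /* timed out or wrong ack: fall through and resend */
    }
    return 0;   /* give up; stale movement data can simply be dropped */
}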
So establishing connections. Well, the first thing you need is an address. And sockets defines the sockaddr structure, which has a family, a port, and an address. Those are the three things that you need to establish the association that you need for a connection. That applies to both TCP connection-oriented and UDP connectionless associations.
Generally speaking, you're going to need the length too, which is why you do the size of the address, just for the purposes of sending it to some of the initialization and sending functions that Sockets has. You'll see int backlog. That's used down there in the listen function, where when the server calls listen on a particular socket, it needs to know how many connections you want to queue before it starts dropping them.
If you set that to zero, it will process connections as they come in. There's only one problem with that: if you're in the middle of processing a connection, then it's not going to be paying attention to what's going on with that socket.
So a connection can come in and it will just automatically drop. So generally you want to set it to something like two or three. If you have an extremely busy server, you want to set that up higher so it will actually queue the connections up and you don't lose any of them.
Bind, right above it: what bind does is associate the socket address that you've specified with the socket. So for a server, what does that mean? As I said before, it means that now the server has an idea of who it is. So when there's a connection coming in for that specific address, the server knows that it's referring to itself, and it will continue to accept incoming connections because it now has some identity. You can also call bind on the client side.
That's useful for a couple of different things. If you have a UDP connectionless association, on the client side it's kind of strange, because a UDP socket will accept input from anywhere. So some random guy on the net sends a datagram out that happens to have your address in it because they erroneously specified it. Unless you're locked in, your client's going to pick that up and it's going to read that data in.
So how do you prevent that from happening? Well, you call bind. What bind does on the client side is it says, "All right, I'm only going to use this socket to transmit data to and from this address." It will not accept data from anywhere else. It will only accept datagrams from the address that you specify.
This does not mean that you cannot re-associate that socket later if, for instance, your client wants to communicate with a couple of different servers. There's a little more work involved. You'd have to actually shut down the connection and destroy the association, recreate your socket, and then continue to send or continue to communicate with a new server. But it can be done.
So again, onto the Accept function. This is only used on the server side. It can be used when you're talking about a client-server architecture. If you have a client that is also a server, you can use Accept, although it's usually easier if you're doing client-server to just go peer-to-peer as opposed to having the separation there.
If you call accept from a server, it will create a new socket. So now you're going to have a wider array. What you would end up with is a whole bunch of sockets that are associated with the same client. So you'd kind of get this crisscrossed network; you'd end up with peer-to-peer at that point.
So accept takes the socket that you want to accept the incoming connection on. It also takes the address and the address length. It will have that information because when the server receives the incoming connection from the client, the incoming address from the client is actually transmitted to the server. That's where you will get that information from.
Now, when the accept function returns, it does return a new socket. If you want to communicate with the client on the other end of that socket, that is the socket that you use. The listening socket on the server, you shouldn't be transmitting more data on because it's sitting there just waiting for incoming connections.
So here we'll go through the connect function. This is again for connection oriented. For UDP it's slightly different. The socket that you pass in there is the client's local socket. The address that you pass in is the address that you want to connect to, not the address of the client.
I suppose you could connect to yourself if you wanted to do a loopback, but that would be entirely application-specific. You also send it, as I mentioned earlier, the address length. This is one of those functions that requires that. It needs to know how much data you're actually passing into it.
Connect does not return anything but error codes; it will return zero if no error occurs. You can use getLastError, I believe, to give you the last socket error that occurred. You have to be careful with that because it's not always updated; it's only updated when an error occurs. So if you call getLastError when a function returns zero, you get the last error that actually occurred.
One of the little caveats about Connect is that Connect assigns the local address if you don't call bind. Normally that's fine. Unless you wanted to assign your client a different address, there wouldn't be much of a problem with that. The reason that it doesn't do this, I believe, is for when you wanted to do dynamic address updating. That's getting way, way off topic. That's not something I want to go into.
So sending and receiving data. You've got a couple of different functions. For connection-oriented sockets, you'll be using send and recv. There are also two other functions, sendto and recvfrom. Those are mainly used with UDP because one of their parameters is the address that you want to communicate to or from.
With a connection-oriented socket, both functions perform identically. There's no difference. You're simply passing in an address, but you've already got a connection. The connection that's established is already associated with that address, so the extra input is just discarded. You can use both of them synonymously with a connection-oriented socket, TCP, streaming.
That is usually not a bad idea: if you're going to be changing protocols on the fly, you should probably use sendto and recvfrom. There's slightly more overhead involved with them, but it's not significant enough to worry about. So as parameters, send takes the socket that you want to send on.
For a client, that would be the only socket that it has. For a server, that would be the socket for a streaming server, for a TCP server, that would be the socket that is associated with a client you want to communicate with. If you have a wide array of clients and you want to broadcast a message to all of them using TCP, you'd have to iterate through each socket and send the same data on each socket before you could move on.
It also takes a pointer to a buffer, which is just a buffer of bytes that it will read in and splatter out on the network. Bytes is the byte count. That's how many bytes of data are in the buffer at the time. And you just pass that along with any flags that you might have. Flags are not terribly important. Only if you're going to be sending things like out-of-band data, which is beyond the scope of this discussion, is flags really important. Most of the time you can just pass zero.
Receive, on the other hand, is the socket that you want to receive the data on. Again, for a server, you're going to have to iterate through all of the sockets that you've created with Accept in order to get all of the data that's incoming from the clients, which can be kind of slow.
This is why it's recommended that for real-time games, for real-time data like that, you want to use UDP. It gives the server one socket that it has to read data in on. Granted, the buffer is large, but you can read data in a lot faster from a single large buffer than you can from 300 buffers having to loop through each one and call the same data input routines every time.
Receive takes the same number of parameters and the same kinds of parameters as send. The receive buffer, however, is not a pointer to the buffer that the data is in. It is a pointer to the buffer that the data will be transferred into. If you do not have a data buffer that is large enough to accommodate the data, this function will fail. I forget the exact error message that it fails with, but it's something similar to: your data buffer is not big enough, so reallocate it.
To get around that, I think you can pass in MSG_PEEK as one of the flags to receive, which will tell you exactly how much data is in the buffer. So you can do that before you read it in every time and allocate your buffer dynamically like that. So, shutdown and cleanup.
Close just shuts down the socket. That's all it does. And it only takes as a parameter the socket that you want to close. Very simple. Shutdown, on the other hand, is a little more in-depth. With shutdown, you can tell it what you want to do with any residual data that's remaining on the send or receive buffers.
A lot of times you'll want to flush and clear any residual data. You want to get those last few bytes in or out. So you want to use shutdown. A lot of times, the extra control that Shutdown does offer is quite useful. I call Shutdown in my network applications more often than I just simply call Close.
Your port number is going to be assigned based on your address. Both of them are actually going to clear your port; once the socket is closed or shut down, that will free the port up. So, a few little tips and tricks for Mac OS X. For Open Transport applications, what can you do? I know there are a lot of existing Open Transport applications out there. So there are two options. You can use the OT shim that's there.
It does work. But your other option is to port it directly to sockets. As I said, OT on X sits right on top of the sockets layer, and you do incur about a 10% performance hit to use OT as opposed to going directly to sockets. I'm going to go through a couple of slides in a moment that will show you how OT maps to sockets if, in fact, you do want to port your OT app to sockets directly.
For Windows and Unix applications, Unix apps are going to be pretty much a straightforward port. I don't think there's any semantic differences between our implementation of sockets and the standard Unix implementation. Winsock applications, there are some notable differences which I will go through momentarily. And there's also Direct Play applications which aren't going to really translate across to anything, but I can give you some suggestions as to where you might want to look.
So using the OT shim is again the easy way from 8 and 9 onto 10. Porting to sockets would be my recommendation. So how are you going to do that? Well, on the left-hand side of this slide, you'll see some of the Open Transport functions. On the right, you'll see their sort of brother that lives in sockets land.
So OT endpoint, that's just a socket as far as we're concerned here. Bind and connect, map over to bind and connect for sockets for an active OT connection. Now I'm not exactly sure what the difference is. I'm not really an OT guy. These two obviously map pretty well right over to Sockets.
So for sending data, again, there's an extra send, which is sendmsg, which is datagram-oriented. It's a function call that I'm not too familiar with. I tend to stick to send and sendto myself. But I'm assuming that it's going to be fairly simple and probably more similar to sendto than it would be to send. That is, it takes the address parameter and probably the length of the address, I would assume. And receiving is the same way. So, disconnecting and cleaning up.
As I said, you can use shutdown. One of the interesting things about this is the setsockopt SO_LINGER option. What exactly does that do? Setting that socket option, SO_LINGER, means that the socket is going to stick around until you actively close it, and until any remaining data is read or received from that socket. That's all it means. As opposed to if you simply called shutdown or close, where you can actually just kill it right there and not read any of the data in.
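A small sketch of the kind of orderly teardown he's contrasting with a plain close; the five-second linger value is just an example.

#include <sys/socket.h>
#include <unistd.h>

static void CloseNicely(int fd)
{
    struct linger ling;
    char drain[256];

    ling.l_onoff  = 1;      /* keep the socket around on close... */
    ling.l_linger = 5;      /* ...for up to 5 seconds while unsent data drains */
    setsockopt(fd, SOL_SOCKET, SO_LINGER, &ling, sizeof(ling));

    shutdown(fd, SHUT_WR);  /* no more sends; the peer sees end-of-file */

    /* Read until the peer has finished sending, too. */
    while (recv(fd, drain, sizeof(drain), 0) > 0)
        ;

    close(fd);
}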
So I mentioned before a little something about synchronous operations. These two functions, gethostbyname and gethostbyaddr, are both blocking functions, which means that they're going to stall until they actually get a return. There's a way around that, which is by placing these in their own threads and letting that thread go off and wait however long it wants to while you continue with your application. That's the recommended way of emulating async mode on Mac OS X, or on any sockets implementation, for that matter. Another point to note about these is you can't cancel them. So once you go in, there's no going back. You've got to wait until it finishes.
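One way to push that blocking lookup onto its own thread, as he suggests, is a sketch along these lines; the hostname handling and the polled flag are illustrative, and note that gethostbyname uses static storage, so keep it to one lookup thread at a time.

#include <pthread.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>

static struct in_addr gResolvedAddr;
static volatile int   gLookupDone = 0;

static void *LookupThread(void *arg)
{
    const char     *hostname = (const char *)arg;     /* must stay valid until the thread finishes */
    struct hostent *host = gethostbyname(hostname);   /* may block for a long time */

    if (host != NULL && host->h_addr_list[0] != NULL)
        memcpy(&gResolvedAddr, host->h_addr_list[0], sizeof(gResolvedAddr));

    gLookupDone = 1;    /* the game loop polls this flag instead of waiting */
    return NULL;
}

static void StartLookup(const char *hostname)
{
    pthread_t thread;
    pthread_create(&thread, NULL, LookupThread, (void *)hostname);
    pthread_detach(thread);
}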
So, DirectPlay. DirectPlay is obviously specific to Windows. There's no other API like it out there, save for something like NetSprocket, which is kind of the same high-level type of API. You can port DirectPlay to NetSprocket. I'll be perfectly honest with you, I've never looked into doing that, so I don't know how much work would be involved, but I would assume that it's probably not going to be the easiest thing to do. But it is an option if you're willing to go down that path. Winsock is much more interesting. Winsock mostly maps onto sockets directly. By mostly, I mean that there are some functions in Winsock that are specific to Winsock itself.
So what are the differences? Sockets has no asynchronous functionality; I just mentioned how to deal with that little caveat. There are also no predefined startup or shutdown routines, because they're not necessary on Mac OS X or on any sockets implementation aside from Windows, I believe: things like WSAStartup and WSACleanup. WSAAsyncSelect is another one that's used a lot to generate the asynchronous message modes that Winsock uses. None of these things are available.
These are not part of the standard cross-platform Sockets API. They are only available on Windows. Any functions that are prefixed by WSA are all specific to Windows itself. So if you're looking for cross-platform compatibility, those are things that you'll want to try to use less.
The last bullet on this slide is kind of interesting. Windows defines a data type of SOCKET, all uppercase. The standard socket data type is just an int; it's a file descriptor. So why does Windows redefine the socket? That's a good question, because I don't rightly know. I do know that it causes a lot of problems, because when I was first writing network applications, I was using the SOCKET data type very heavily, until I found out that you can just cast an int to the same thing and it still works. Yet I don't have to go through the effort of recoding every time I move my Windows code to another platform. So just one of those little interesting tidbits: I stay away from the SOCKET definition.
So kind of as a summary, I went through the networking technologies that are available on Mac OS X. Those include OpenTransport, which is fully carbonized, OpenPlay and NetSprocket, again, which are carbonized and are also open source. And I also went through Sockets in relative detail. I went through the initialization of it and how to set up your Sockets application and kind of give you a general idea of how a client-server application is going to work.
Went through some of the little tips, tricks, and caveats for networking on Mac OS X and bringing your network applications to OS X. And one of the more important things was illustrating the differences between an API such as Winsock that is specific to the Windows platform and the Sockets API, which is a standard Unix definition.
So for a little more information on networking, there are two series of books that I would highly recommend to anybody who's very interested in networking. The top one is Unix Network Programming by Stevens. Excellent series of books. The first one, Volume 1, goes into a lot of detail about how the networking subsystem operates on Unix. It goes into the whole basis for TCP/IP, and there's a lot of great information in it. Volume 2 is more about inter-process communication, which is using sockets as pipes for talking between applications on the same server or on different servers.
The TCP/IP series, called Internetworking with TCP/IP by Comer, is another great series on networking. The first one is the principles and practices; the second one is... anybody know? I don't remember. Actually, I don't remember what two and three are. But they're really good, I know that, because I own them all. There's also some... mailing lists.
That one escaped me for a moment. There's three mailing lists in particular at Apple that are maintained at Apple that you'll want to look at if you're interested in more networking stuff. The Open Play Developers List. That one obviously relates specifically to Open Play and that open source project. Not terribly active, but it seems to be generating more interest now. The Open Transport Developers List. I'm not sure how active that is. As I said, I'm not really an open transport guy.
From what I understand, it's still pretty busy. Then there's also the Darwin Developers List. Make sure you have a lot of room on your hard drive if you're downloading this list because it is busy. Probably on the order of 100 messages a day. That's where a lot of the kernel level networking stuff gets discussed if you really want to delve that deeply into it.
So my roadmap, well it's not really a roadmap because it refers to everything that happened in the beginning of the week. But I put it up here just for anybody who wants to reference the sessions once the DVDs finally come out. 300 was the networking overview. You might recognize some of the content from one or two of my slides from the 300 session. We kind of had the same things there.
The extensible kernel networking services session, again, is going to be more for people who are interested in the Darwin development stuff and the kernel-level stuff. Networking configuration mobility is more user-level network configuration as opposed to programmatic network configuration. Network services location, I don't remember what that one was about. Check it out on the DVD; somebody let me know. And AFP 3.0 and AppleShare is just that: it's AFP on Mac OS X.