Darwin: The Foundation of Mac OS X - WWDC 2001

Mac OS • 1:05:29

Darwin is the powerful, open source foundation of Mac OS X. Based on BSD UNIX, Darwin is a robust technology engineered for stability, flexibility, and performance. This session introduces each of Darwin's components and the services they provide, and functions as a prelude to the Mac OS X File System, Networking, Kernel, and I/O Kit sessions.

Speakers: Brett Halle, Joe Sokol

Unlisted on Apple Developer site

Transcript

This transcript was generated using Whisper, it has known transcription errors. We are working on an improved version.

It's been a great year for Darwin and open source, starting with Mac OS X shipping on March 24th, and we have had great contributions and sign-ups for Darwin open source, and we'll be sharing a lot of these accolades and the numbers with you as well throughout the week. So without further ado, I'd like to kick it off by introducing the Director of Core OS Engineering, Brett Halle.

So good afternoon. First session after lunch. Hopefully you guys won't fall asleep on me. We have a lot of technical material in this session, so we'll try and get through it. The focus of this session is to basically give an overview of what Darwin is all about. I'm sure you've seen this picture numerous times just this morning.

Our focus again really is on that bottom layer of the system. There's a huge amount of technology that exists there. As we talk about it today, I think it's important to note that we will use some terms quite interchangeably. Darwin is the Core OS of Mac OS X. We talk a lot about Darwin, we talk a lot about Mac OS X, they are the same thing.

They are not two different piles of technology, they are one and the same. From that perspective, it's important to note that Darwin is also open source. As we talk about the open source aspects of Darwin and the technology, it's the same pile of code, the same technology that ships as part of Mac OS X. Again, they are one and the same.

So how in the world do you describe the Core OS or the operating system of Mac OS X? Interestingly enough, it's technology that tends not to be very visible. If we're really doing our job well, you don't really see us. So you might consider that we're certainly kind of the foundation of the system.

We kind of deal with the plumbing and the piping and the wiring, if you will, that holds Mac OS X together. Again, if we're doing our job well, you don't see the Core OS. You as the developers will interact with it at a very high degree of interactivity, but the developers should rarely, if ever, be aware that the Core OS is there.

The architecture of Darwin is very unusual in that we've taken basically best-of-class technologies and brought them together to form the base of Mac OS X. And there's a number of different technologies that we'll talk about in this session. This technology comes from a number of different areas: academia, the open source community, and certainly some significant history at Apple and NeXT. And there's probably thousands of person-years' worth of development and testing and usage in much of this code base.

We've taken all that technology and we've again kind of brought it together in a very unusual combination, but tuned it toward the customer base and developer base that's unique to Mac OS X. So you've certainly heard a lot of focus around things like UNIX or certainly the heritage Mac OS. We've actually brought aspects of these technologies together to support both of these environments and to really focus on our customer base directly.

From that perspective, there's an awful lot of untapped potential that exists in the system. We'll talk a lot about more of that during the week, of course. But in terms of providing the modern operating system base, this is it. This is the base that really enables all those capabilities. And because of the power that exists there, I think we're enabling you to create applications that you couldn't do on previous versions of Mac OS.

From a feature perspective, you could consider that Darwin or the Core OS is responsible for the preemption aspects of the system, the memory protection. Basically, it's responsible for managing and coordinating the multiple address spaces that represent the system. Each of the applications in the system exists in a separate address space. It supports the application environments, provides all that plumbing and wiring and stuff that allows the application environments to exist on the system and to be able to export their particular brand, if you will, of memory management and such up to the application.

The Core OS is inherently device independent. And what I mean by that, and that seems kind of an obvious statement, but that we don't actually have any particular preference to any I/O or device architecture. We realize that technology in the I/O space is evolving at a fairly furious rate, and we've tried to make sure that we have a design that will enable us to move forward, and be able to take advantage of the fact that we make the whole widget, and to be able to enable new hardware and new technology together, new software technology.

It's a very flexible base. As we talk about the pieces of the Core OS, you'll see that they were designed to support a number of different kinds of categories of application spaces, and hardware spaces, and such. And we've really kind of tried to leverage that flexibility of the system to give us a base which we expect will last for the next 10, 20 years.

The Core OS is also designed to be scalable, and you can interpret this in a couple of key ways. It's scalable from the hardware's perspective, in that we certainly support various classes of processor architectures, G3, G4, but are also able to support multiple processors in a system, so there's support for symmetric multiprocessing from that level of scalability.

[Transcript missing]

and being an OS, and things like networking and all that, there are more buzzwords than you can imagine that are a part of the system.

So we're going to spend a little bit and actually do a tour of the Core OS and the various components that are part of it. And first of all, this kind of gives you a bit of an exploded view of that Darwin layer of the system, and there's a number of pieces that make it up.

We're going to start first by spending some time in talking about Mach. Mach is kind of the actual foundation layer of the system. It's designed, in fact, as an OS foundation and abstraction layer, if you will. And it's responsible for managing the processor and memory resources of the system.

It's basically the piece of the system that actually schedules the processor for a given application or thread in the system to be able to run. And it's important to note that at this very, very basic level of the system, threads are considered to be, you know, an essential building block of Mac OS X.

That scheduling and the various services that go with it are actually supported all the way down to the very base of the system. It's not a tacked on the top concept. Mach is also responsible for handling the memory protection, the memory management, basically, if you will, both the processor and memory resources of the system, all the VM capabilities and such.

Because Mach keeps everything abstracted as far as memory is concerned and the various address spaces that exist in the system, it also has to provide some very key and basic services so that there's a way for user-level applications and the kernel to be able to communicate back and forth. And that's done through something called IPC, inter-process communication, and RPC, remote procedure calls, which are, again, a fundamental, key building block of Mach. And it's the way Mach communicates with any other portion of the system.

Mach is inherently policy neutral. Again, it's designed to be very much an abstraction layer to the processor and memory architecture of the system, and it doesn't actually have any concept of things like file systems or I/O or networking or security policies, things like that. That's not Mach's job. It really is designed to be a very low-level abstraction layer.

The Mach we use in Mac OS X, for those of you that follow this technology, is based on Mach 3.0 plus a bunch of stuff that has been added to it by the various players that have been participating in its development. It's derived from the Mach that was developed at CMU. Avie Tevanian, of course, was one of the original team members of the Mach team at CMU.

But it was taken on by the OSF Research Institute for a number of years. In fact, Apple funded a lot of that development work at the Research Institute. And when we took it in and incorporated it as part of Mac OS X, of course, we did a considerable amount of work to enhance the 3.0 base beyond even those starting points.

Very much from its original design point, Mach was meant to be able to support things like SMP. And some of the work that was done in 3.0 and later also has architectural support for various mechanisms and facilities for supporting real time. And we'll talk about that a little bit more, and certainly more in the Kernel session later this week.

The wall is talking to us. Mach also has built into it, because it's responsible for managing the processor resources, the capability of supporting multiple scheduling policies. For example, built into Mach is the ability to schedule, certainly, either with kind of real-time or fixed-priority-based scheduling, or with time-sharing-based scheduling. And there's actually the ability to support other types of scheduling policies within the system.

Mach also tries to abstract the mechanisms for dealing with the other half of the virtual memory system. So certainly, if you look at the management of memory, there's the managing of the actual, you know, parceling out of the address space that's available for a given application. But there's also the other aspect of it, which is what to do when you try and access more memory than you actually have physical RAM installed in your system for.

And Mach actually interacts with a number of external interfaces to be able to handle the backing store for the OS, either on the file system or, certainly, through other kinds of backing storage solutions. Mach is also, again, this is kind of repeating here, but it's very much designed to be very modular.

Mach was intended to have OS personalities put on top of it. Again, it's policy neutral. So, you know, it's an interface layer, an abstraction layer, and it assumes that there's some other OS and other OS technologies that sit on it. And that's very much a key part of its design space.

Now this is all fine and interesting, this is all down at the very base of the system, but if you're writing an application for Mac OS X, what does Mach mean to you? Well, it's important to note that all process models of the system, in other words, the things that actually kind of handle the scheduling and the kind of environment which your application rests within, are all fundamentally built on top of Mach primitives. Those Mach primitives are tasks and threads. And whether you're using Carbon, or whether you're using Classic, or you're using Cocoa, all of those application environments are fundamentally built on these particular abstractions.
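As a rough illustration of the task-and-thread model just described, here is a minimal Python sketch. Python's threading module is not the Mach API, and the names run_workers and worker are invented for this example. The only point is that several threads run inside one process (one task, in Mach terms) and all see the same address space.

```python
import threading

def run_workers(count):
    """Start `count` threads in this process and collect what each one wrote."""
    results = []             # lives in the single address space all threads share
    lock = threading.Lock()  # serialize access to the shared list

    def worker(index):
        # Each thread writes into the same list, with no copying between
        # address spaces, because the threads belong to one process.
        with lock:
            results.append(index * index)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)

if __name__ == "__main__":
    print(run_workers(4))
```

Separate applications, by contrast, live in separate address spaces and would have to pass messages to share this data.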

All memory management is also built on top of Mach primitives. So at the very lowest level again, there's management of, if you will, VM objects and the handling of virtual memory and the paging services. And in your application, you're probably dealing with things more like malloc or other kinds of services, NewPtr, depending on the type of application that you're writing. But fundamentally, when it gets right down to actually managing the memory at the OS level, Mach is responsible.

And again, as I mentioned before, Mach is built on this concept of inter-process communication, so all kernel-to-user, user-to-kernel, and user-to-user communication (and what I mean by user-to-user is user space from a processor architecture perspective), all that communication that exists in the system between applications, and between applications and the OS, is built on top of these Mach primitives. Basically, inter-process communication, RPC, or what you'll end up hearing a lot, and see mentioned in a number of the documents, Mach ports.

Now to get into this a little bit more and give you a sense of exactly how the system actually uses some of these facilities, I want to invite Joe Sokol up on stage. And what we're going to do is show you that as you're running the applications on your system, that there's actually the ability to look at the system much more from a, if you will, a systems perspective to see kind of the impact of your application at the OS level.

Okay, so many of you are probably familiar with top running on UNIX, which is capable of showing you a lot of stuff going on in the system. What we're going to do here today is just kind of take a look at a simple little Carbon app, because that's kind of a runtime environment that isn't quite so used to running on top of UNIX.

And to show you that, in fact, just because it's Carbon, right, it actually is making use of a lot of these Mach services. So you're going to see the fact that there are threads, there are ports, being allocated on your behalf. And we're going to take a little look at some characteristics of performance or lack of performance that we'd like you to avoid.

So the little app is called Wait Next Event Loop. And the first thing we're going to take a look at is what it means to have an app that is basically not only compute bound, but basically running through-- take a slightly different view here-- a lot of system calls.

Okay, so WaitNextEvent in the Carbon runtime is actually implemented with a few Mach calls and a few BSD calls. But when you amplify that, if you're just sitting in the loop doing a WaitNextEvent with a zero timeout, basically you get really, really busy on the system.

Now this has basically three bad aspects to it. The first one is kind of obvious. In a time-sharing system, our little test app here, his priority is diving way down. And it's going to stay down. Because he's never blocking. So you might not like the responsiveness you get in this app over time. Secondly, one that you might not think about so much is you're keeping a number of pages hot all the time because you're constantly executing them.

So those are pages that the VM system might have been able to, you know, steal and actually give to a foreground application that really needs them. So it's continuing to put pressure on and compete for memory. And then the third thing is all of our power management is actually driven and triggered off of our idle loop, which is an actual Mach thread that gets run when there's nothing else to run.

So that is where we make the decisions to put the system into nap or doze mode to reduce the power consumption. Not the deep sleep, but these other modes that allow us to cut back on power consumption. The application itself is basically implementing an idle loop. That does not allow us to enter into those modes. Another thing that we can look at with top is some of the other resource utilization.

So again, you can see that this is a very simple, dumb, hello world style Carbon application. And we've managed to allocate him two threads and 61 ports. So these are things that are happening as part of the Carbon runtime model. So another thing we can show is, let's go and cut down on the amount of, let me flip back here real quick, sorry.

This is a little bit easier to see. In terms of the CPU utilization and the number of system calls, we've now simply made the WaitNextEvent wait for one tick. So instead of it just being strictly polled, even just going to a one tick wait really reduces the load on the system.
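The difference between polling and a one tick wait can be sketched in a few lines of Python. This is only an analogy and not the Carbon API: threading.Event.wait stands in for the event call, count_wakeups is a name invented here, and no event is ever posted, so every wakeup counted is wasted work. A tick is taken to be 1/60 of a second.

```python
import threading
import time

def count_wakeups(timeout, duration=0.2):
    """Count how many times an event loop wakes up in `duration` seconds.

    `timeout` is how long each pass waits for an event (0 means pure polling).
    """
    event = threading.Event()  # never set, so each wait ends by timing out
    wakeups = 0
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        event.wait(timeout)    # stand-in for an event call with a sleep time
        wakeups += 1
    return wakeups

if __name__ == "__main__":
    TICK = 1 / 60  # one tick, roughly 16.7 ms
    print("zero timeout:", count_wakeups(0))
    print("one tick:    ", count_wakeups(TICK))
```

The zero timeout loop spins as fast as the processor allows, while the one tick loop wakes up only about a dozen times in the same interval and leaves the rest of the time to other threads, including the idle thread.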

If we go to something that's a little more approaching infinity, then of course it goes real quiescent. So even if you can't construct the app to be purely an event driven model, if you have an event loop, if you can at least make the times that you're blocking for long, that really helps in terms of relieving pressure in the system. Alright, so now we want to show a memory leak.

So again, top can be used to see things like memory leaks going on in the system. You'll notice under RPRVT, which is the resident private column, that all of a sudden we have a little plus sign sticking out there on the right. And what that's indicating is that that actually is a growing size there, so you can see the size going up slowly. This is basically a 4K leak once a second.
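The same kind of check can be approximated by hand from a series of size readings. The sketch below is an assumption-laden illustration, not how top itself works: leak_rate and is_growing are names invented here, and the readings are made-up numbers that mimic a 4K leak once a second.

```python
def leak_rate(samples):
    """Estimate growth in bytes per second from (seconds, resident_bytes) samples.

    Returns 0.0 when there are fewer than two samples or no time has elapsed.
    """
    if len(samples) < 2:
        return 0.0
    (t0, size0), (t1, size1) = samples[0], samples[-1]
    if t1 == t0:
        return 0.0
    return (size1 - size0) / (t1 - t0)

def is_growing(samples):
    """True when the size never shrinks between samples and ends up larger."""
    sizes = [size for _, size in samples]
    never_shrinks = all(b >= a for a, b in zip(sizes, sizes[1:]))
    return never_shrinks and len(sizes) >= 2 and sizes[-1] > sizes[0]

if __name__ == "__main__":
    # Hypothetical readings: one per second, growing by one 4096-byte page each time.
    readings = [(t, 1_000_000 + 4096 * t) for t in range(5)]
    print(leak_rate(readings), is_growing(readings))
```

A size that only ever grows across many samples, at a steady rate, is the pattern to look for. A size that rises and falls is more likely ordinary allocation and release.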

Now if we were to be leaking every time around that event loop, we don't really enforce any kind of real upper limits on the amount of memory that an app can obtain. So it is probably a good idea on your part to watch your app and make sure that it doesn't have any leaks.

Because over time, maybe the app stays up for a long period of time, of course that would just keep growing larger and larger over that period of time, to the point where you might run out of paging space on the system. So an app can approach 3 gigabytes in size. Thank you, Joe. All right.

So. Back to slides here. So I think it's important to realize that even though you may be thinking of your application in terms of the Carbon API space, or if you're writing a new app in Cocoa, that there actually is a lot going on in the system under the covers, and that there are actually tools that you can use to be able to tell what the actual resource usage and the effects, if you will, of your application are on the rest of the system.

If you're interested in learning more about the Darwin kernel, there's a session later this week on Friday that I encourage you to go to over in the Civic Center. And this particular session will not only cover the Mach aspect of the system, but will also talk quite a bit about the BSD kernel as well.

Moving on in our tour now into the I/O Kit space. I/O Kit is an object-oriented framework that's basically designed to try and ease the process for developing and implementing drivers on Mac OS X. It has a lot of kind of native or inherent capabilities that are part of that framework. Support for things like plug and play and dynamic device management.

And of course, again, those things sound like, you know, mom and apple pie kinds of things, but it's very important to note that, again, from the very basis of the system, that the I/O system is intended to be able to support devices that can come and go as far as the system is concerned.

That the I/O system itself has no inherent policy in terms of the, you know, kind of the arrival or the departure of given devices on the system. That policy is actually managed up much higher, you know, in the system space. For example, whether the file system has a dependency on a, you know, volume that happens to exist on a particular disk.

But I/O Kit itself supports the ability for devices to come and go in the system, and that is a very inherent capability. Another important aspect of I/O Kit that's really there from the core is support for power management. It's expected that all devices and parts of the system participate and are involved in the process of managing power on the system. And we'll talk a little bit more about that in a minute.

And as well, I/O Kit is intended to be very modular and extensible. We don't know what kind of devices may appear five, ten years from now. And we've tried to make sure that the basic architecture for I/O Kit can be very flexible as new I/O and device technology come on the scene.

In addition to that basic capability, of course, I/O Kit does support and provide the abstractions for all the common, you know, class of devices that exist in the system, things like SCSI and ATA and USB and stuff like that. And those are all done through another aspect of the framework called families, and we'll talk about that in a minute as well.

Again, I/O Kit is a framework. It's actually an object-oriented framework, C++ based. And it provides basically the common kernel services and facilities for being able to support I/O in the system. And it also has the facilities to be able to enable the communication from the kernel to the user space, so that the actual I/O can get up to where the applications can take advantage of it.

In many cases, that's done through other layers of abstraction, but the inherent capability for being able to do that is built into I/O Kit. I/O Kit, in effect, models the physical world, if you will. As you think about your computer, there's a motherboard, and there's chips, and there's potentially a PCI bridge, and potentially PCI cards, and a graphics card, or a SCSI card that's in there.

And I/O Kit actually is responsible for being able to develop that model, if you will, of all those devices. And then, once it has that model, being able to actually connect up the appropriate driver to that given hardware, so that the rest of the system can actually talk to it.

The protocol specifics for any particular class of devices is something we call families. A family basically is, if you will, a domain-appropriate abstraction for a particular device. A good example of this is if you compare something like SCSI and audio: they're very separate, very radically different views, if you will, on I/O. You're probably not as interested in setting the volume of your SCSI disk, for example. So the families actually provide the appropriate API set and capabilities for that particular kind of device space.

The actual device specifics themselves exist in drivers. So that's the part of the system that directly talks to the hardware. And they, in and of themselves, then communicate with the families, and then the families with the rest of the system. As I mentioned, power management is a very key part of Mac OS X.

You guys were at the keynote this morning and saw Avie's demo, and it's really quite amazing to see something like the Titanium effectively be awake before you can open it. I've heard a number of refrigerator door kinds of stories of trying to make sure, is it really off or not?

But it's because of the architecture for power management that that actually becomes possible. It's integral to the I/O Kit architecture. And it's based on this concept of power planes. Much like I mentioned that I/O Kit tries to model the physical world, from the power management perspective, there's another model that models kind of the power distribution in the system. And they may not correspond on a one-to-one basis. Depending on how the hardware is designed, the interdependencies at a power level actually may be different. But I/O Kit's responsible for trying to keep that abstraction and that set of relationships together.

It's important to note that for Mac OS X, the way we are able to get the speed and performance out of wakeup is because of the full architecture of both I/O Kit as well as the rest of the system. That the kernel itself is multi-threaded, the system itself is multi-threaded, and that the process of waking up the system is actually done as much in parallel as possible.

And as a result, you're able to actually enable the parts of the system that are necessary to get that application up and going as quickly as possible. As long as it doesn't have any inherent dependency on, you know, the disk or a networking device which may take a little bit longer to come up to speed, you're able to get up and use and interact with your system at a UI level very, very quickly.

If you're familiar, certainly, with Mac OS 9, waking up can sometimes take a pretty substantial period of time. And part of the reason for that is the fact that the system is very serially dependent on making sure that all those components actually are fully awake before you can access any part of it.

In addition, things like AppleTalk, for example, if you happen to have AppleTalk enabled on 9, have to go through the process of renegotiating node numbers and such like that, which ends up having to be done before the system can fully wake up. On 10, again, because these things can be done in parallel, for those parts of the system that aren't dependent on having to use AppleTalk, for example, if you have that enabled, you don't have to wait for that particular renegotiation to occur before you can use the system.
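The arithmetic behind serial versus parallel wake can be written out as a small worked example. The device names and wake times below are hypothetical numbers chosen only for illustration, and both function names are invented here.

```python
def serial_wake_time(wake_times):
    """Everything wakes one after another, so the user waits for the sum."""
    return sum(wake_times.values())

def parallel_ready_time(wake_times, needed):
    """Devices wake concurrently, so the wait is the slowest device actually needed."""
    return max((wake_times[name] for name in needed), default=0.0)

if __name__ == "__main__":
    # Hypothetical wake times in seconds.
    wake_times = {"display": 0.5, "disk": 2.0, "appletalk": 5.0}
    print(serial_wake_time(wake_times))
    print(parallel_ready_time(wake_times, ["display"]))
    print(parallel_ready_time(wake_times, ["display", "disk"]))
```

With these made-up numbers the serial case takes 7.5 seconds before anything is usable, while in the parallel case a task that needs only the display is ready in 0.5 seconds and one that also needs the disk is ready in 2.0 seconds, regardless of how long the AppleTalk renegotiation takes.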

From the application perspective, you probably won't deal with I/O Kit much directly. Most of the devices are actually handled through other layers of abstraction. Certainly, if you're interested in disks, chances are the way you're accessing those disks are through file systems, or networking through either your OT interfaces from Carbon, or from Sockets, or other network services that may exist.

There are a lot of sessions for I/O Kit this week. So if you're interested in various aspects of it, be it FireWire or USB, or the storage drivers, or the basic architecture, I encourage you to go to that. And in fact, immediately after this session, there's another overview, because I couldn't possibly get into enough detail on I/O Kit here. But I encourage you to go to that session if you're interested in working either at the device level or interested in devices from an application perspective.

Moving on now in our tour to the BSD portion of the system. And this is where that power of UNIX comes into play. So there's two aspects that we look at for BSD, and it helps us kind of manage the different views and responsibilities that exist. The first is the BSD from the kernel perspective.

And the BSD in the kernel is one that's based on BSD 4.4, with a lot of intense integration that exists between it and Mach and it and I/O Kit. So if you look at the kernel environment for Mac OS X, it's basically the BSD kernel, I/O Kit, and Mach are the three key pieces that make up the kernel environment.

The BSD kernel is what actually provides the OS personality, if you will, the APIs and services for the system. Much like Mach is completely agnostic as far as policy is concerned, BSD's responsibility is actually to manage the policy of the OS. So it actually provides the application process model, the support for things like signals and tracking of file descriptors and things like that.

When your application dies or gets torn down, it's BSD's job to make sure that all those resources are all reclaimed and handled. It also is responsible for the basic security policy of the system. So, if you will, that's the concept of multiple users on the system, or even when you take it down to the file system level, being able to manage access to individual files or pieces of data.

That basic security policy is a very inherent part of the BSD system. It's a very important part of the BSD environment and is enforced, again, down at the kernel level. It's also important to note that both the file systems architecture and the networking architecture for Mac OS X basically fit inside the BSD kernel space. They are both based on the BSD implementations. And we'll talk more about both file systems and networking in a few minutes.

The system framework is basically how we try to describe or abstract the set of interfaces that BSD provides. And it's basically the system calls and, if you will, POSIX-level interfaces and BSD system APIs for the system. If you're doing things like porting a BSD application, these are actually the APIs that you most likely will write to.

The application environments themselves tend to be written on top of this API set. The system framework is also responsible for other kinds of services as well. Things like Pthreads, the math library, the basic C libraries. Basically, the very low level sort of APIs and services that you expect to be able to support an OS environment.

The user environment is kind of another, if you will, almost a fourth application environment that exists on the system. You've probably seen all the other architectural diagrams, and they refer to Classic, and Carbon, and Cocoa, or potentially even Java. It's important to note that in terms of the actual system architecture, there's actually one more, and it's the BSD user environment. It's where things like the shells and command line scripts and things like that actually run. It's where a lot of the network administration client tools actually execute. They're things that are kind of common command line tools and facilities that come from the BSD community.

It also has a huge number of various file tools for creating directories and copying files and that kind of thing, as well as tools for being able to kill other processes or top, like we showed you a few minutes ago. All these things are available and exist within the BSD user environment. It's also where some very key and important system services or network administrative services exist in the system.

Things like NetInfo, DNS services, BIND, the Network Time Protocol server, the system log, and things like Apache. It may be important to note here that if you, for example, are running the client environment and you happen to turn on personal web sharing, it's actually Apache that ends up getting launched and run on the system. Probably the single most used web server on the planet. But all that stuff exists behind the scenes, and it helps support the system.

From a developer's perspective, it's important to note that you really want to be kind of aware of how BSD influences your life. Certainly, again, it contributes to the process model of the system. The support for things like signals and management of file descriptors and other important OS resources are all managed by the BSD process model. But there's also things like environment variables and things like that that are also available, kind of the standard things you would expect from a UNIX process. It's also responsible for the security policy, again.

And remember, those will impact the way your application behaves. And from that perspective, you should be aware that things that you maybe have been used to doing under Mac OS 9, you know, that the security policy of the system actually plays a significant role here under Mac OS X. Whether or not you can access certain files and places that you can write into the system, things like that.
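In practice this means an application should expect that a write can be refused and handle that case rather than assume it succeeds. A minimal Python sketch follows. The function name try_write and the file names are invented for this example, and the failure shown in the demo is a missing directory, used as a stand-in because a real permission failure depends on the account running the code.

```python
import os
import tempfile

def try_write(directory, name, data):
    """Attempt to create a file; return True on success, False if the OS refuses."""
    path = os.path.join(directory, name)
    try:
        with open(path, "wb") as handle:
            handle.write(data)
    except OSError:
        # Covers PermissionError (a subclass of OSError) as well as
        # other refusals such as a missing directory.
        return False
    return True

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(try_write(scratch, "notes.txt", b"hello"))
    print(try_write("/no/such/directory", "notes.txt", b"hello"))
```

An application that gets False back can fall back to a location the user does own, such as somewhere under the home directory, instead of failing outright.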

It's also important to step back a bit from that, from the application perspective, and look at BSD from the developer perspective. There are a ton of tools that exist in this environment. Now certainly we'd try and discourage the BSD, kind of, UNIX user experience for our typical end users, but from a developer perspective, this is an incredibly powerful part of the system.

As you saw with top, there's ways to actually look at the system kind of from behind the scenes. This particular setup we had was actually two systems; we telneted into the system that was running the application to run top, so we could observe it without impacting the UI in any particular way. Kind of being able to put instrument probes, if you will, on the system. But there's a huge number of other tools that are available there that might help building your applications, or just managing your development resources and the environment that you work in.

In porting UNIX applications to Mac OS X, it's important to note that there's a lot of options that are available here. Later this week, there'll be a session that'll actually show you a couple of ways that you can approach this. If you happen to have your favorite UNIX tools or other things that you might want to use on Mac OS X, porting them over to X is actually fairly straightforward. In fact, you can actually use a kind of a stepping stone approach to getting applications onto Mac OS X.

If you happen to have one that even has a GUI, there's a third-party X Window server, for example, that's available. So if you want to port over an X-based application, you can get that up and running very quickly, and then spend the time to actually wrap it in something like Cocoa to replace it with a more Aqua-friendly, more user-friendly application. That ends up being a very easy thing to do under Mac OS X.

It's also interesting to note that it's also very possible, and something we actually do a lot ourselves, of being able to use UNIX tools from within a GUI application. Many of our control panels, for example, are actually, you know, execute command line tools in the background. They actually, you know, are able to launch and interact with a UNIX command line application. So you can actually use that in a very easy command line environment. Again, that's completely hidden from the user. Apache is an excellent example of that.

If you go out and get an Apache manual of how to do full system administration, there's some pretty thick books out there for administrating the various configuration files and stuff. Your user should never have to deal with that, and it's possible to actually wrap that kind of power in something that's very easy for them to use.
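To make the "wrap a command-line tool" idea concrete, here is a small Python sketch of the pattern: a front end launches a tool in the background, captures its output, and presents only the result. The helper names `run_tool` and `kernel_name` are invented for illustration, and `uname -s` is used only because it is a harmless tool present on BSD-derived systems; this is not how Apple's own panels are implemented.

```python
import subprocess


def run_tool(argv):
    """Run a command-line tool and return (exit_status, stdout_text)."""
    result = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode the output as text
        check=False,              # report failures through the exit status
    )
    return result.returncode, result.stdout


def kernel_name():
    # `uname -s` prints the kernel name (for example "Darwin" on Mac OS X).
    status, output = run_tool(["uname", "-s"])
    if status != 0:
        raise RuntimeError("uname failed")
    return output.strip()


if __name__ == "__main__":
    print("Running on:", kernel_name())
```

A GUI application would call something like `kernel_name()` from a button handler and show the string in a text field, so the user never sees the command line.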

There's a number of sessions also that are related to the BSD aspect of the system. Leveraging BSD Services, in the later part of the week, will actually go through and demo porting some UNIX tools and wrapping them in a Cocoa application. But there's also a couple of other important sessions as well. Threading on Mac OS X will talk about Pthreads and some of the other threading options that you have available on the system. And as well, support for directory services and such on Mac OS X.

Moving on now to File Systems. Whoops, sorry about that. File Systems, of course, is a lot more complex than just this little blob on the diagram. It's actually a lot of very, very broad range of file systems that are supported within the system. And this is all based on basically the BSD file system architecture.

It's important to note that the file system's implementation is part of the BSD kernel environment, and it's based on the VFS design. If you go out and get your favorite BSD UNIX text, it actually is a great reference for being able to learn about the VFS architecture. But it's basically a stackable virtual file system model, which we've extended both from the basic POSIX interfaces, beyond that, to be able to support more rich file systems such as HFS+.

It can support virtually any kind of file system type. In fact, on Mac OS X, we support a huge number of file system types today. And there are third parties that have also been making file system services available. Recently, I read that the Andrew File System, for example, has been ported now to Mac OS X.

We've also extended those file system interfaces to be able to take advantage of Unicode. So you'll find that there are UTF-8 interfaces available so that you can be able to take advantage of localized file names and such. But it's also really important to note that from a way we've designed Mac OS X, much like how we've approached the I/O system, that we have no particular file system affinity at the OS level.
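As a small illustration of UTF-8 file names at the POSIX level, the Python sketch below hands the operating system a file name as raw UTF-8 bytes and reads it back from a directory listing. The function name `roundtrip_filename` is made up for this example. One assumption worth flagging: HFS+ stores names in a decomposed Unicode form, so the listing may not be byte-identical to what was written, which is why the comparison normalizes first.

```python
import os
import tempfile
import unicodedata


def roundtrip_filename(name):
    """Create a file whose name reaches the OS as UTF-8 bytes, then list it back."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_bytes = os.fsencode(tmp)
        raw_name = name.encode("utf-8")      # the bytes the POSIX calls see
        path = os.path.join(tmp_bytes, raw_name)
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
        os.close(fd)
        listed = os.listdir(tmp_bytes)       # bytes in, bytes out
        return [entry.decode("utf-8") for entry in listed]


if __name__ == "__main__":
    wanted = "r\u00e9sum\u00e9.txt"          # "resume" with two accented e's
    names = roundtrip_filename(wanted)
    # Normalize before comparing, since a file system may store a
    # decomposed form of the same characters.
    print([unicodedata.normalize("NFC", n) == wanted for n in names])
```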

In that we don't particularly care whether or not it's HFS+, whether it's UFS. We are intentionally trying to design the architecture of the system so that it can be flexible across a number of different file system architectures. As you move up the application chain, more of those high level services and capabilities are exported through your applications.

The file system is responsible for enforcing the file system security policy of the system. Again, unlike Mac OS 9 in this respect, there's a concept of security that does exist in the system. Your ability to write to a given file, rename, even execute privileges all exist and are enforced by the file system. There's a concept of users and groups that is throughout the system, and the access control is all managed by this aspect of the system.
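The user, group, and permission model described here is visible through the standard POSIX `stat` and `chmod` calls. The following Python sketch (helper names `permission_bits` and `make_private_file` are hypothetical) creates a file, restricts it to owner read/write plus group read, and reports what the file system recorded.

```python
import os
import stat
import tempfile


def permission_bits(path):
    """Return only the permission bits (for example 0o640) of a file."""
    return stat.S_IMODE(os.stat(path).st_mode)


def make_private_file():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    # Owner may read and write; group may read; everyone else gets nothing.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP)
    info = os.stat(path)
    summary = {
        "mode": permission_bits(path),
        "owner_uid": info.st_uid,   # the user the file belongs to
        "group_gid": info.st_gid,   # the group the file belongs to
    }
    os.remove(path)
    return summary


if __name__ == "__main__":
    result = make_private_file()
    print(oct(result["mode"]), result["owner_uid"], result["group_gid"])
```

An application that assumes it can write anywhere, as it could under Mac OS 9, will instead get a permission error from calls like `open` when these bits do not allow the access.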

The application environments themselves tend to provide the abstracted level of interfaces that you'll tend to use. If you're writing a Carbon app, you're using the File Manager, and things like resource forks and things like that are fairly fluidly part of that particular environment. If you're writing in Cocoa, you're going to be using something like NSFile, and where the use of resource forks is not the typical way of writing an application in this space. Or if you're porting a BSD application, you'll end up using the POSIX++ APIs. And what I mean by the POSIX++ is just the POSIX plus the extensions that we've provided for Mac OS X.

Now to talk a little bit more, to give you a better idea of what the implications are of file systems in the system, I'll have Joe come back up again. And again, much like Mach, where there's a lot going on behind the scenes on your application, the same is true for file systems.

Okay, so we're going to talk a little bit about a tool called fs_usage. Let me get top out of the way here. That basically is something that you're probably not familiar with. I don't believe it shows up on other UNIX systems. It's something that we developed here.

But it will give you a comprehensive list of all the files and directories being touched, and the size of the I/Os that are going on, the amount of time that you're waiting for that I/O to complete. There's also on the far left there, you can see a current clock with millisecond precision.

And then the name of the task that's actually causing the I/Os to occur. So it's both comprehensive, both in kind of a global manner if you're looking at the, basically allowing to look at all of the tasks that are running in the system. Or you can focus it in on a particular task.

But I find it more interesting to look at it with more of the global method. Just because there are lots of things that are happening on the system that you trigger when the application does something that you might not be aware of. Because a lot of the services are tied in directly to some of these higher level calls.

It's useful for exposing redundancies of access. Pretty obvious. And as I alluded to earlier, it can show you where your app is blocking or waiting for I/Os. So if the app feels sluggish, you can use this tool to determine whether or not it's due to lots of wait time for I/Os.

And it'll even show you where you're waiting for synchronous page ins to complete. So where the app's trying to page fault some of its code in or its data. And then one other interesting aspect, which you can, I would suggest reading the man page so you can see how to turn this on.

But there's a little back door trap that basically enables the display of the higher level Carbon calls. So for you Carbon developers out there who are curious as to what system calls, what UNIX system calls we turn a lot of these higher level Carbon File Manager calls into, fs_usage will show you that. It will actually show you the calls being encompassed by these various higher level calls. So we do something like open the.

[Transcript missing]

So again, it's worth noting here that for much of the things that are going on in your applications, that down deep underneath that application, there's a lot of OS facilities in play to make those things possible. And there are a lot of very powerful and useful tools on Mac OS X that you can take advantage of to be able to see the system from this kind of perspective. If you're interested in learning more about the file system, there's a whole session on that as well on Wednesday in the morning. I encourage you to go to that. Clark is always quite entertaining. Moving on in our tour now to networking.

For networking on Mac OS X, again, networking is part of the BSD kernel environment. And it's based on, as Avie put it this morning, the reference stack, if you will, for IP in the internet world. We're based on the 4.4BSD TCP/IP stack, which we've actually synced up with FreeBSD 3.2. And this is a Sockets API based environment.
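For readers who have not used the Sockets API, here is a minimal Python sketch of the usual sequence (`socket`, `bind`, `listen`, `accept` on one side; `socket`, `connect` on the other) running a one-shot echo over the loopback interface. The function names `echo_once` and `loopback_echo` are invented for this example; the underlying calls map one-to-one onto the BSD C functions of the same names.

```python
import socket
import threading


def echo_once(server_sock):
    # Accept a single connection and send back whatever arrives.
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)


def loopback_echo(message):
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))   # port 0: let the stack pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    worker = threading.Thread(target=echo_once, args=(server,))
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(("127.0.0.1", port))
    client.sendall(message)
    reply = client.recv(1024)       # fine for a short message like this one
    client.close()

    worker.join()
    server.close()
    return reply


if __name__ == "__main__":
    print(loopback_echo(b"hello darwin"))
```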

It comes with quite a lot of powerful capabilities built into the networking stack. Certainly support for things like multi-homing, and routing, and firewall, and network address translation, or NAT mechanisms. These are all pretty basic services that exist in our networking stack. Another very key one, however, is this concept of auto-configuration. And again, Avi touched on that in the keynote this morning. But it's a very important part of our architecture and a key part of foundation that we're working on as we move forward with Mac OS X.

For the networking environment, the way that you actually go about extending that is through something we call NKEs, or Network Kernel Extensions. There's basically a model that we've developed for being able to extend or enhance the capabilities of our networking stack. Much like you can create drivers for a given piece of hardware, or you can write a file system kernel extension for being able to support new and exotic file systems, there's also ways for you to be able to extend the networking aspect of Mac OS X.

One important part of the architecture for our networking is actually how we support Classic. And a very key part of that architecture of the system is to be able to support so that the virtual environments above us, for example, like Classic, or maybe something like a virtual PC-like environment, can actually share the core networking services and configuration and setup. So it's not necessary to have multiple IP addresses for each virtual environment that exists on the system.

We also support access to networking through an Open Transport layer that exists for porting Carbon applications. Things like Internet Explorer and things like that all use the Open Transport services that we provide in Carbon as a way to get their app over quickly onto Mac OS X. We also support PPP and PPP over Ethernet as key services that exist in Mac OS X. And PPPoE is something that's new for Mac OS: the ability to inherently be able to connect to your DSL or cable modem provider.

Also built into Mac OS X is the support for DHCP, both the client and server. So this is actually used in different situations depending on how Mac OS X is set up. Certainly as a full server environment, you can administrate and manage DHCP for things like Netboot and stuff. But it's also used for being able to do ad-hoc networking.

Mobility is a very key part of the direction of our networking architecture. Networking under Mac OS X is already incredibly rich and powerful through the multi-homing and other aspects, but one area that has not historically been well known in UNIX networking is the ability to deal with mobility. And that's something that we consider to be very critical.

So we've actually spent quite a bit of time developing a very flexible architecture for being able to handle mobility, be able to support things like automatic network configuration, being able to deal with both configuration and reconfiguration based on the link level detection that exists. So if you have a portable and you unplug that cable out of the back, it'll automatically notice the fact that you've removed the cable, and if you have an AirPort, it'll automatically and silently convert over completely to be able to use the AirPort if you're using DHCP configuration. And all of this support is very dynamic.

[Transcript missing]

But another key aspect of it is the ability to support application level notification. Our Mail program is the first example of this use, where an application can actually be aware of the availability of network connectivity and accessibility to the network. So if you actually have had a chance to play with Mail, if you remove the cable out of the back of your system or lose connectivity to the internet, Mail will automatically go offline. It detects the fact that you no longer have a connection and does the right thing.

And when you plug it back in, again, it will automatically be involved in the process of reconfiguring itself and reconnecting it to the appropriate servers. And this is the kind of behavior we'd like to see as we move on and move forward with Mac OS X. There are a number of sessions for networking this week. Networking Overview exists tomorrow, and a number of other networking sessions tomorrow, both the Kernel extensions for networking, as well as the Configuration and Mobility session as well tomorrow. And then there's a feedback forum later this week.

So take a little side trip here and talk about how the Core OS and G4 architecture actually play together. This is certainly kind of outside of the architectural space of thinking of the Core OS, but it's an important part and impacts actually our design space for Mac OS X in terms of how we get some incredible performance on the system.

First and foremost, of course, is support for the Velocity Engine, or the AltiVec unit of the processor. And there's a number of libraries that are available on Mac OS X. There's a vDSP library, which is responsible for basically signal processing and fast Fourier transforms and convolutions and things like that. And that library has existed on Mac OS 9.1, but is also available on Mac OS X. New for Mac OS X, however, is a new library called vBLAS, the Vector Basic Linear Algebra Subroutines, which basically handle support for things like large matrix manipulation.

Basically, this kind of thing is used for things like MP3 coding and decoding, MPEG, speech recognition and image processing. Those are common places where this kind of library would be used. And this particular service is only available on Mac OS X.

It's important to note that from this library on a 500 MHz G4, this is where we get some of the really amazing performance numbers that are possible because of the G4: 2.2 gigaflops of performance. Again, this kind of stuff is used in things like iTunes and stuff like that.

There's also a number of other vector libraries that are available. vMathLib, which is basically a basic math library that's been AltiVec tuned. Vector Operations and BasicOps, which is kind of an extension to the basic instruction set, if you will, for the Velocity Engine. And BigNum, which supports, basically, large-number arithmetic: you know, 1024-bit multiplies, that kind of thing.

The other aspect of the G4 worth noting, and how it plays with Mac OS X and the Core OS, is in multiprocessors. Mac OS X inherently supports true SMP. This is really differentiated from Mac OS 9, whereas if you have an application, and you happen to have it on an MP box, then unless you've specifically written your application to use the MP API, you won't actually benefit to any significant degree by having an MP system. Mac OS X is very different in that the system itself takes advantage of the processors that are available, and scales automatically.

Each thread in the system is capable of running on a separate processor. Again, Mach at the very lowest levels of the system is the part of the system responsible for handling the scheduling and processor management, and will automatically make sure that each thread on the system runs on whatever processor resources are available. This means if you have a single application with multiple threads, each of those threads might actually run on a different processor.

And that's really important to note, because that really tends to be a great way of finding situations where you may not necessarily be doing your multithreading in the best possible way, and doing the locking and things that need to occur. I encourage, if you have a multithreaded application, that you find a way to test and run it on an MP system.
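The locking discipline being recommended can be sketched in a few lines. The Python example below (the function name `count_with_lock` is made up) has several threads update one shared counter, with a lock around each update so that no increment is lost when threads really do run at the same time. Note that this illustrates the locking pattern only: Python's own interpreter lock limits true parallelism, whereas the same structure written with Pthreads in C would have its threads scheduled across processors.

```python
import threading


def count_with_lock(num_threads, increments):
    counter = {"value": 0}
    lock = threading.Lock()

    def worker():
        for _ in range(increments):
            # The read-modify-write below must not interleave with
            # another thread's, so it is done while holding the lock.
            with lock:
                counter["value"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


if __name__ == "__main__":
    print(count_with_lock(4, 10000))
```

Code that skips the lock may appear to work on a single processor and then lose updates on an MP machine, which is the kind of bug testing on an MP system tends to expose.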

If you want to be able to take advantage of that, there's a number of ways to do it through the different application environments. Certainly the MP APIs for Carbon and, interestingly enough, Classic, as well as NSThread for Cocoa, and the Pthreads package for BSD applications. So there's a number of abstractions that you can use to be able to take advantage of multiple threads.

It's important also to note kind of how the system itself deals with MP on the system. Again, Mach is designed very much from the beginning to take advantage of the multiple processors and to be able to deal with things like threads. UNIX is not quite as evolved in the same way.

From its heritage, it's actually designed to run more on uniprocessor-based systems. We've done quite a bit of work to be able to take advantage of MP on Mac OS X, and we'll be doing a lot more as time progresses. But it's important that if you're actually doing work on Mac OS X that you know that the BSD environment and how it manages itself across multiple processors is a key part of how the system is designed.

I encourage you to go to the session later this week where there's more discussion on this particular issue. Particularly if you're doing file systems and networking, you'll find that there's this concept of funnels that are involved for managing which parts of the system actually are involved and run on which processors.

Since we're talking about things in the kernel, there's this concept that I've used a number of times now of kernel extensions. And basically, these are modules of code, if you will, code fragments that allow, if you will, a plug-in model for the kernel. Allows things like drivers and network kernel extensions and file system plug-ins to actually be loaded and managed as part of Mac OS X.

A really important thing to note is that a Mac OS X kernel extension is not the same as a Mac OS 9 extension. We have very explicit usage models for these particular kinds of extensions. If you're doing a driver, or you need to be able to create a new file system, this is the place that you would use a Mac OS X extension. Many things that were extensions in Mac OS 9 are not appropriate to be done as a kernel extension. You really should only do them if you have to.

It basically is the plug-in model for the kernel. And we try and keep the structure of these things much like an application bundle, in the sense that we have bundles that have property lists and the binaries for the actual thing that gets loaded. And there's tools and stuff for loading and unloading these things in the system.

As well as some common APIs and things that are supported for this. The other APIs, however, beyond the basic loading and initialize and finalize capabilities, are all actual APIs that are very domain specific. That's why I said this is really not a general mechanism for extending Mac OS X. It's really only used for things like drivers and file system plugins and networking services. When you're developing inside the kernel, should you be one of these people writing a kernel extension, it's really important to note that the rules change a bit. One, you tend to have to deal with things through a two system debugging environment, two machine debugging.

The language features that you have available are much, much more limited. We don't support Objective-C in the kernel. The C++ that exists there is actually a very strict subset of the full C++ standard, so things like multiple inheritance and exceptions and things like that are not the kind of thing that we really encourage or in some cases even support.

Obviously, when you're down at the kernel level, there is no direct user interaction. Some cases you're at interrupt level, you certainly aren't going to be able to talk to the user at that point. You don't have access to things like Aqua. There are interesting logging tools and other things that you can use so that you can get information to you while you're writing your applications, but it's definitely not a user interaction solution.

Resources in the kernel are considerably more costly here. When you're dealing with an application, if it allocates memory, even if you run out of the available physical memory, the OS itself will, you know, through VM and through backing storage, will do its best to try and support your application. If you're in the kernel and you use lots of memory, you're stealing away actual physical RAM from the rest of the system, and it's very expensive. So if you're writing in this particular space, you should always use the least amount that you possibly can.

Failures in the kernel are fatal. Obviously, one of the reasons we want to discourage development of things like kernel extensions is this is one area where you can introduce instability in the system. If you have a driver that goes wild, you can just as easily crash the system that way as you can under Mac OS 9. So that's why there are very explicit cases where we support extension of the kernel, but we discourage all other types of extensions from this level.

Moving on to the open source aspect of Darwin, I think it's important to note again that virtually all of the components that are available in the Core OS are open source. A couple of very rare exceptions, things like some of the drivers that are licensed from third parties, things like that.

And we are very actively involved in feeding code back to our upstream providers. So it's not that we've just taken code from the community and have integrated it in and are just running with it. We actually work with things like BSD and Apache and other things so that if we find bug fixes or other things, we actually feed those fixes back to the BSD community, the Apache community, or other providers. We are actually very actively trying to make sure that there's a community involvement here in open source.

Excuse me. And that's important that this community is very cooperative. For example, things like security fixes that come in. Some of the patches that were part of some recent software updates were actually fixes that came in from the external community that we rolled into these software updates. Or even platform fixes. We ended up getting from one of the bug reports that was submitted to our support group, actually included a bug fix right in as part of the submission.

So it's really important to note from a perspective of open source, this is something we take very seriously. It's not an experiment. It's very much a part of how we do business in the Core OS. And that our developers are involved in the mailing list in the community to actively make this a two-way process. And that you as developers can actively participate in this process.

It's important to note again that our repositories and things like that, actually many of them are outside Apple's firewall. We actually have people outside of Apple, who are not on our payroll and don't work for us, who actually have commit access to these repositories. People can actually be involved in the process of developing and evolving the Darwin codebase.

There's a couple of sessions later this week that I encourage you to participate in. One is the Open Source at Apple on Wednesday, talking a little bit more about Darwin from an open source perspective and other open source projects, as well as a Birds of a Feather session tomorrow evening.

And finally, I'd like to give you a handful of things to think about as you kind of go from this point, hopefully off to all these other sessions. One is that, you know, it's our intention, even though we have all this power under the hood, that the user experience of UNIX is something that we really want to hide.

The goal is not to make Emacs the ideal user interface. From our perspective, this is really an optional environment for power users and developers. This should not be the typical behavior of Mac OS X. But this is something that you as developers should take advantage of. But always keep in mind, and I'll give you some examples of how we've actually evolved the system in this way, is that the concept of root under Mac OS X is something that we've really tried to move away from.

It's possible to actually set up your system so that you have a root user, but you have to go to some work, and you have to really know what you're doing in advance. But this is part of the model that we've tried to apply with Mac OS X: how to provide as much power as a UNIX-based environment can possibly have, without requiring end users to be knowledgeable UNIX administrators.

Be careful that you don't repeat bad habits. Under Mac OS X, there's a lot of different ways to approach things, and you can probably solve virtually any problem you have with the mechanisms that are there, but think about them before you do it. Don't just jump into the only way you can solve a problem is by using shared memory, because you end up recreating the same set of problems of stability that have existed in the past. There are other ways to be able to communicate between multiple applications that are running on the system. Shared memory, for example, should be avoided.

And it's really important to look at the system, you know, to have a very systems view when you're writing your apps. Tools like what we showed with top and fs_usage and stuff allow you to get a sense of how your application actually impacts the rest of the system. There's a lot of other threads and processes and things that are running on the computer.

And in order to be able to make sure that the system behaves well with your application alongside everything else, with lots of different things occurring, you need to take a few minutes and actually look to see how your application affects the system.

Always use the highest level of abstraction possible. Again, there usually are very appropriate services to use, either at the Carbon level or the Cocoa level or whatever, and you should only get involved in using BSD services or Mach-level APIs or services if that's the only way to solve a given problem. Don't just jump right in and use vm_allocate, for example, for doing memory management. There usually are a lot of other things in the abstraction layers that exist that are helping to manage those resources and make sure that they are handled appropriately.

And remember that open source is an incredibly powerful tool. This is something that we're doing which we think is pretty novel in terms of adopting open source in a very big way as part of the major aspect of the product that we sell. Again, this is not a little interesting side project.

Steve put it very clearly, we're betting Apple very much on the success of Mac OS X, and open source is a very key part of the lower portion of that system. Get involved in it, help participate in it. There's a lot of email lists and ability to actually contribute code and other things. Help make the system better.

And also remember that there's an incredible amount of power that exists here. One of the big changes between Mac OS 9 and X is this introduction of a modern operating system. Well, this is that modern operating system. The ability to do threading, and being able to do lots of things in parallel, and things in the background, porting and leveraging UNIX applications, and multiple application environments. And so this is all an incredibly powerful aspect of Mac OS X, and something that you want to think about, because I think it will enable you to take that next big step.

And lastly, I want to remind people that there is a feedback forum. We'd love to hear your comments on Darwin and things, ways that you think we can make it better. So I encourage you to come to the, if you're here at the last part of the session, 5 o'clock on Friday, we'll be here.