
WWDC00 • Session 130

Carbon: Low Level

Mac OS • 1:10:08

Apple engineers share a real-world understanding of how Carbon works to fine-tune your product. Issues concerning threading, memory management, performance, and file management are discussed.

Speaker: John Iarocci

Unlisted on Apple Developer site

Transcript

This transcript was generated using Whisper; it has known transcription errors. We are working on an improved version.

Ladies and gentlemen, please welcome Carbon Technology Manager David Wright. Thank you very much. Welcome to session 130, "Carbon: The Lower Levels." Oh, OK. I didn't see that title. It's "Carbon: Low Level" in the book. Wow, I'm really excited to see how many of y'all made it. I know that lots of questions that are going to be answered in this session were asked during previous Q&As during the rest of the week. And I told you guys to keep coming here, and you did, so I'm really glad.

And then it's mixed with sadness, because this is the last session of the conference, so it's kind of weird. Ooh, I know. It's like Christmas ending. Well, for me, anyway. I don't know. I don't know. Has it been a good conference for you guys? I sure hope so. Yeah. I'm so glad.

Great. We have so much information to get out, and I'm so glad you were here to receive it. We've certainly received a lot of good feedback. We have a lot to go back to Apple with and work on to make bringing your applications to 10 a lot easier. So anyway, let's nail down the final details of what's involved for you as you bring your applications to 10 by Carbonizing. We're going to have a little bit of a chat with the team.

Good afternoon. Wow. Glad everyone could make it at 5:00 on a Friday afternoon. So I'm going to take you through, as it says, Carbon Lower Levels. It's a pretty generic term, starting with a slide that I'm sure everyone here has seen. Anyone here not seen this slide yet? I hear there's alternative designs up on the web.

Core Services. Okay. The technologies that I'm going to go through today are actually part of this box called Core Services. It's a little confusing, because when we initially had a lot of these technologies up and running, they were part of the Carbon runtime specifically. They were part of the Carbon stack, if you will. And what's happened over time is we've seen that we have some similar services that are available in all stacks, and we decided to factor these lower into the system and actually share them. And that's why you'll see some familiar-looking technologies down in the core services layer.

The other thing that you may not have noticed when you saw this kind of slide in other presentations is... well, the first thing you probably did notice is it's an architectural diagram. It's a layer diagram trying to describe the system. But it's also real, in that it corresponds to the frameworks that we have on Mac OS X.

Just to make sure we're all on the same page here: framework is a term that's being bandied about a lot. It really tends to be used in the same way in which you'd say shared library. Other folks call this a dylib. A framework is really a collection of files and folders grouped around a shared library.

It includes headers. It might include the resources for this. And I'll talk about frameworks quite a bit. So if you are one of those folks who has actually installed DP4 on your PowerBooks, and you look in /System/Library/Frameworks, that's where we keep most of the frameworks on the system.

In particular, you should note there's one framework there called Core Services. Core Services is the lowest framework in the system above the System framework, or the Darwin layer. And it is intended for tools and processes that would use the system in a way that probably doesn't use any graphics services, doesn't use any UI. An example of something that uses Core Services is the Rez tool that we have on the system, which you see running in Project Builder.

Okay, so there's a lot of things that are in core services. I'm not going to go over all of them. I'm just going to go over a subset. You should notice here that most of the things that are in core services are things that don't deal with UI or graphics.

In cases where it seems like that's not the case (for example, the Alias Manager when it brings up a dialog, or some UI associated with the NSL Manager), what we've actually done is factor that out of core services into a framework in an upper layer, typically Application Services or even Carbon or Cocoa.

So what am I actually going to talk about in this session? This subset of technologies. And basically this list here is kind of roughly in chronological order as you would bring an application up from when you double-click in the finder and that gets launched via the process manager.

I will be going through these technologies and kind of giving you an update on what's been happening with them, but I'll also be looking at some of the things that are unique about them on 10. In particular, I'm going to be highlighting some of the core services design goals.

Performance is kind of a constant theme that we've seen throughout most of the core services technologies. Almost all of the slides to follow will have some aspect of performance, some impact of performance. Scalability and extensibility are essential for these services, mostly because they are factored low in the system. It really is essential to have these qualities at that level so that higher levels that build upon them can take advantage of that.

Stability is one of the major points of Mac OS X. That's both from the point of view of the APIs that are in this layer, but also from the point of view of making sure the system itself, the actual underlying implementations of the technology, is stable. Thread safety: this core services layer is the foundation for everything above it. There will be a slide later on exactly what services are thread-safe, and I'll speak about that a little bit when we get to those.

This slide should look a bit familiar from the Carbon overview. I just wanted to stress that there are lots of different application environments that are taking advantage of and using core services, including the classic environment, which runs all in one address space; Carbon apps; Java apps. Java apps are a little bit different in that they may not actually use core services and Carbon things until they actually use a UI, so they're a little bit more dynamic that way. Cocoa apps are somewhat similar, but there are some Foundation technologies in core services that are used across the board. Okay, so now I'm going to get into the technologies proper, starting with code loading.

I don't think it's a secret. You've pretty much heard the terms throughout the conference so far. We do have two code loading mechanisms on Mac OS X. From an architectural point of view, this isn't something that we set out to do, that we designed. This isn't something we put on our list: we need two code loading architectures. On the contrary, it would simplify things to some degree to actually just have one, especially in the area of tools. However, what we were faced with when we started OS X, when we were talking about Carbon in the early days, was a system that was predominantly a Mach-O system.

The terminology there is the dynamic loader, or dyld, and it knows about these Mach-O shared libraries, which we refer to as dylibs. We thought about it for a little while and actually went into a certain amount of analysis on how much cost we would incur to change over to a whole new set of tools.

At that point in time, we had other things that we were planning on. For example, if you'll remember, two years back, we were just then talking about changing the microkernel from the earlier version of Mach. What we basically decided is that we did not need a whole sweeping change like this throughout the entire system.

This was key. For me, I've been at Apple for a very long time. This was one of those decisions that really was instrumental in letting us continue to develop and continue to have a live system. That's why you have DP 1, 2, 3, and now 4. Because we made decisions like this at that time.

It's not ideal from a technical point of view, but it is the right compromise. It's giving us pretty much the best of both worlds. We have a CFM world now that's very good, and I'll get into that a little bit later, and the Mach-O world that we kind of started from.

Okay, just again on the terminology, because I know this can be confusing: the output file format of the shared libraries for CFM, those are PEF shared libraries, and in general, we refer to the Mach-O shared libraries as dylibs. Okay, so to go along with the code loading, there are differences that you'll find if you're developing Mach-O and CFM applications that are interesting to note.

For CFM, as you're already familiar with, you end up using dummy libraries to link your application. This gives you the important information of what libraries you need to get symbols from and what the names of those symbols are. After you've linked your application and you actually run it, the Code Fragment Manager ensures at launch time, right when you start your application up, that all of those libraries and all of those symbols that you need are present right then and there. It's kind of flip-flopped with dyld and Mach-O. Linking a Mach-O application, all of your libraries and all of the libraries that they depend on have to be available at link time.

And that's primarily because Mach-O supports what we refer to as lazy binding at runtime. So let me go through a scenario here. You link a Mach-O application, everything's good, and then you're going along and running on a system, and all of a sudden some symbol that you need is missing, because you inadvertently removed that framework from the system.

That is something that can actually happen. That is by design with the code loader on a...

[Transcript missing]

I will get to what we've done with CFM there, which actually kind of compromises between the two, when I get to that slide. And then the other thing that I want to make sure is clear: as of right now in DP4, we still do not have support for two-level namespace in dyld.

We're working on it. We expect to have it for the public beta. But the basic problem there is that we have a big, flat namespace, and we have some tools, which you may have seen here in performance talks and things like that, that actually rely on that flat namespace. So we're trying to figure out a way that we can preserve that investment in tools and move forward to the two-level namespace that we know you need for your plugins and your shared libraries.

Okay, to go with the two code loading mechanisms, we have two runtimes. They are very similar at a high level. The real difference is what gets passed around when you're looking at the assembly code, particularly in terms of what a function pointer is. In the CFM runtime, sometimes referred to as a TOC runtime, a function pointer is really a pointer to a little bit of memory. It has the function address, the code address, and another word that's used for data offsets. In Mach-O, a function pointer is a pointer to the code itself.
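A rough C sketch of that difference (the struct layout and type names here are illustrative, invented for this sketch, not the actual system definitions):

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of a CFM transition vector: a CFM "function pointer" points
 * at a small record like this rather than at the code itself. */
typedef struct {
    void *code;  /* address of the machine code */
    void *data;  /* the extra word used for data (TOC) offsets */
} CFMTransitionVectorSketch;

/* In the Mach-O runtime, by contrast, a function pointer is just the
 * code address. */
typedef void (*MachOFuncPtr)(void);
```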

[Transcript missing]

And that brings us to Mixed Mode. The differences between the two runtimes are actually dealt with in a slight change to the Mixed Mode that you probably know and use regularly in Mac OS in general. So basically what we did with Mixed Mode is we changed from macros in the headers to actual function calls. And this was because we had these two different runtimes to support. And this is key to having a single CFM application binary that can run on OS 8, 9, and 10.

The differences come in when you do a NewUPP or a DisposeUPP. That's when we actually take into account the differences in how you pass code addresses around. At invoke time, it's essentially the same. Specifically, the way that all gets invoked in your application is you link your CFM application with CarbonLib.

The CarbonLib that lives on OS X has this little bit of glue that gets you from the CFM world to the Mach-O world. The CarbonLib that's on 10 is different in that regard than the CarbonLib that's on 8 or 9. Okay, now on to the memory model that we have in OS X. This slide really is about what's not in the memory model.

You notice right at the top of the list, I'm sure you've heard this before, but there's no low mem. The region corresponding to low mem is actually not even mapped. So if you access it, you'll take a fault, crash, or you'll break into your debugger. That's by design.

That's one of the key differences that helps with stability. It helps with cases where you have code that's unintentionally dereferencing nil or looking at low mem addresses and getting something that sometimes is valid and sometimes isn't, because sometimes it's the last thing you wrote there and sometimes it's something that somebody else wrote there.

This is one whole area of instability on the traditional Mac OS that we just don't have on 10. And that actually brings me to a good point. This is one of the real important reasons why you have to take your applications and try them out on 10. You have to qualify your applications on 10.

We're not saying that something that is perfect on 9 linking to CarbonLib is going to run unchanged on 10. As a matter of fact, I would say the opposite. I would say that if you run your app on 10, you're probably going to have a better app, a more stable app, that is very likely to run unmodified on 9.

Okay, so what else do you see in this memory map, or what else don't you see? Well, there's no system heap. There's no globally allocatable memory whose address you can rely on being around, that exists beyond the lifespan of the application. That again is a stability issue. I'll get to that a little bit more when we talk about shared memory. No Process Manager heap; it's really unnecessary. The TempMem functions that you may be familiar with are just mapped to the same as the non-TempMem versions. A handle, whether it's a temp memory handle or just a NewHandle, ends up being the same thing.

And then, well, you don't have any other kind of heap. You don't have to worry about zones. And this is all mostly motivated by performance reasons and working better in a world where you actually have a very large address space, like we do on 10. And then no direct device access is another thing that kind of comes with the memory map. So what do you have in the memory map on 10? The first thing of note, which helps out a lot with performance, particularly performance of the system, is code and read-only data being shared.

These sections, from a VM point of view, code and read-only data that are shared, are the best kind of sharing that we have on the system because basically they're backed by wherever the original data and code came from. In the case of the system, most of the code and data you're talking about is coming from individual frameworks. And it's basically only in one place on disk, and it's shared and mapped into every address space that needs to use it.

In addition, other things that you'll find in your address space: towards the beginning of the address space, you'll find the code and data of your application. You'll find a stack for your main thread. You'll find more than one thread running on OS X; it's likely you'll find on the order of a couple in DP4 running your application.

Each one of these threads has its own stack surrounded by guard pages. That deals with issues like overflowing or underflowing the stack. Stacks are very large; I believe right now they're set at half a megabyte. That's something we have to look at and tune correctly. In general, we're talking about a very sparse address space. Lots of space, lots of holes in the address space. It's discontinuous.
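Since that half-megabyte figure is explicitly called out as subject to tuning, it's safer to ask the system than to assume. A minimal sketch, using the portable pthreads call rather than any Carbon-specific API (the function name is invented):

```c
#include <pthread.h>
#include <stddef.h>

/* Ask the threading system for the default thread stack size rather
 * than hard-coding a value that may be retuned between releases. */
size_t DefaultThreadStackSize(void) {
    pthread_attr_t attr;
    size_t size = 0;
    if (pthread_attr_init(&attr) != 0)
        return 0;
    pthread_attr_getstacksize(&attr, &size);
    pthread_attr_destroy(&attr);
    return size;
}
```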

The reason I'm going into that is that you shouldn't assume anything about that address space. You will find, if you're sitting there probing in the debugger, you will find a certain amount of determinism. You'll see certain things loaded at the same address. But that's not necessarily the case.

Because of the dynamic nature of the dynamic loader and CFM, on the first run of the app you'll find certain things, code or data, loaded at a certain address. Then run the very same thing again, and because something else got loaded between the two runs, you won't be loaded at the same address. So don't make any assumptions about where you'll be loaded.

Then the other thing I'd like to point out, more on a practical note: if you're looking at DP4, there are two command-line tools that really help you get a good grasp on this. The first one is just size. Basically size, when given the path to a shared library or an application, will tell you the sections that your shared library or application has: things like how big your code is, how big your data sections are, what portion of those are read-only, what portion are read-write.

One of the things that we hit a lot as we brought code over from OS 8 is that we had lots of data sections that were marked as writable, which meant that they really didn't get shared. And fixing that was simply a matter of saying, in the code, that these were constant data sections. At that point, they became read-only and were fully shared. That optimization is something I would look at with your own shared libraries and your own applications.
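The const-data point boils down to this (a hedged sketch; the symbol names are invented): the same table, declared const, moves from a writable data section into a read-only one that the VM system can share across every process mapping the library.

```c
#include <assert.h>

/* Writable: goes into a read-write data section, so each process gets
 * its own copy-on-write copy of the page. */
static int gTableWritable[4] = { 1, 2, 4, 8 };

/* Read-only: goes into a read-only section that can be backed by the
 * file on disk and shared by every process that maps it. */
static const int kTableShared[4] = { 1, 2, 4, 8 };

int LookupShared(int i) {
    (void)gTableWritable;          /* writable twin, kept for contrast */
    return kTableShared[i & 3];    /* mask keeps the index in range */
}
```

Running size over the two variants of a library would show the bytes move from the writable data section to the read-only one.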

The other tool is vmmap, which takes as an argument either the name or the PID of a running process. This gives you a snapshot, if you will, of the running process. And there should be a one-to-one correspondence between size and vmmap. Well, I shouldn't say one-to-one.

Everything that you see in size you should see in the vmmap output. You will also see in the vmmap output the size of other things that are in your address space, other frameworks that you're using. You should be able to see read-only and read-write sections in there fairly clearly.

In talking to the tools folks, there's an internal tool in the works that didn't get ready for DP4, but it's basically a UI that lets you see this, so you don't have to actually use the command-line tools. I thought I'd bring that up in case you're looking at these kinds of things in DP4.

Okay, so now we know about memory maps, but what does this actually mean to you when you're doing memory allocation? Well, the first thing is that the Memory Manager in general is fully supported, for both pointer-based and handle-based objects. And I think what you have to keep in mind here is that this is one of those areas where we have some APIs in the Memory Manager that make sense on 8 and 9 but don't make sense on 10. They're still part of Carbon, so you can write a single binary. The best example of this is the FreeMem call. There isn't really a good answer for that on 10.

We could tell you there's four gigabytes. We could tell you there's half that. It's not really something that makes sense when you have a sparse address space. And that's why we came up with the Gestalt memory-map-sparse selector. That's what you should check when you're wondering whether you need to ask these questions, whether you really have to police your memory allocation. Some other differences that kind of fall into this boat involve purging.

[Transcript missing]

So this is actually a counterpoint to what I said earlier about where you should qualify your app, because this is an area that you could have something that runs fine on 10.

And you write some new code, and you don't lock your handle, and you never SetHandleSize, so you can dereference it without a problem, and you'll never find the bug on 10. But take that same code and move it back to 8 or 9, and you likely will find a problem. And that's because 8 or 9 has a limited, fixed-size application heap.

The other thing I want to address: we spent quite a bit of time trying to understand the performance of memory allocation on 10. And one of the things we found is that the assumptions are almost reversed, in a sense. The Memory Manager on 8 and 9 made a very explicit design decision to favor handles over pointers. That was the right decision at the time, because we did have a fixed-size heap. Basically, the reverse is true on 10. In a VM-based system like 10, with sparse address spaces, the most basic representation for memory is a pointer.

One of the things that kind of sounds counter-intuitive, but I really encourage you to do, is to try to get your code set up so that you basically refrain from doing any kind of sub-allocation that you may be doing. It's very typical in your OS 8 and 9 code.

The reason being that a lot of the tools on OS X depend on you using malloc directly, or NewPtr and NewHandle directly. The tools have been revved to understand those two packages for memory allocation, and the tools I'm talking about are going to help you find things like leaks; they're going to help you find cases where you've overwritten or underwritten a block of memory. This was all gone over in one of the performance tools sessions.

But the point here is that, at least in a debug version of your app, you want to have your memory allocation factored so that it calls one of these routines directly, so you can use the tools. The other thing I would encourage you to do is do some measurements of your application and see what they're like just with the system allocator.

Either malloc directly (and you probably should look at malloc if you don't rely on any of the features of the Memory Manager with regards to pointers and handles, recovering handles, and things like that). If you're going to be interacting with handle-based APIs in the toolbox, of course you're required to use handles there, but for your own internal allocation needs, start off by trying malloc. There's a big advantage to all of us using the same allocator, and it's not something you typically see unless you're looking at overall system performance. The problem we initially had, when we brought over large subsystems early on in the Carbon effort on OS X, was that we had a lot of sub-allocation going on. Every big subsystem had its own memory allocation, and the problem with that is that we have essentially different high-water marks in every subsystem that has allocations, and that meant that we didn't really do a good job, as a whole, of getting rid of memory and freeing it up in a timely sense.
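One way to act on that advice, sketched under the assumption that your app can funnel its internal allocations through a single pair of routines (the names are invented): call malloc and free directly, so the leak and heap tools can see every block, instead of carving blocks out of a private pool.

```c
#include <stdlib.h>

/* Funnel all internal allocations through one routine that calls the
 * system allocator directly, so tools that understand malloc (leak
 * finders, heap analyzers) can account for every block. */
void *MyAppAlloc(size_t size) {
    return malloc(size);   /* no private sub-allocator in between */
}

void MyAppFree(void *block) {
    free(block);
}
```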

So take a look at malloc. If you have issues specifically with malloc and its performance, both size and speed, let us know, because we want malloc to be the system allocator. We want it to be as fast as possible. We want it to be as efficient in terms of speed.

I'm sorry, in terms of size. And that actually brings me to another point, which is the problems we've been dealing with in terms of performance and allocation have largely been size problems, not speed problems. And I think that's going to be the case with your applications and your shared libraries as well.

It's not just a matter of having a big sparse address space and being able to allocate a lot of memory. It's a matter of actually being able to account for your space and understand the relationship between things that get allocated underneath a system framework compared to what you really intended to do above it. So, again, take a look at that. If you have really good reasons or really good performance observations, please share those with us, because we really want to understand why you would think of doing something other than using malloc on this system.

Lastly, there's an issue that came up late in DP4. We didn't get a fix in for it. Basically, one of the changes we were making to the Memory Manager to enable additional debugging support left us with a memory manager that doesn't allocate on a 16-byte boundary. In particular, this is important for AltiVec, or Velocity Engine, data, which ignores some of the lower address bits. So if you run into that situation, there's a release note on DP4 with a workaround, essentially copying the memory out. We'll fix this for the public beta. Okay, shared memory. This one has quite a story to it too.
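(The DP4 release note isn't quoted here, but a workaround of the shape described for the 16-byte alignment issue, over-allocating and rounding the pointer up to a 16-byte boundary, can be sketched like this; the function names are invented.)

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocate size bytes aligned to 16 for AltiVec-style data:
 * over-allocate, round the address up, and stash the original
 * malloc'd pointer just below the aligned block so it can be freed. */
void *AllocAligned16(size_t size) {
    void *raw = malloc(size + 16 + sizeof(void *));
    if (raw == NULL)
        return NULL;
    uintptr_t addr = (uintptr_t)raw + sizeof(void *);
    addr = (addr + 15) & ~(uintptr_t)15;   /* round up to 16 bytes */
    ((void **)addr)[-1] = raw;             /* remember the real pointer */
    return (void *)addr;
}

void FreeAligned16(void *aligned) {
    if (aligned != NULL)
        free(((void **)aligned)[-1]);
}
```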

So, you know, early on in Carbon we basically said: NewPtrSys and the system heap have to go. There are stability problems around them, they're being used way too much, they're used between extensions and applications, and those things don't essentially have the same counterparts on OS X. We really looked at this closely, and we listened to some of the developers that really knew what they wanted out of a shared memory system; they wanted particular features.

And after a lot of consideration, what we ended up with is saying, you know, the features that most developers want in this area are completely covered by the POSIX shared memory APIs, and it really makes no sense to try to come up with a system that works on both 9 and 10 regarding shared memory.

The two systems in this area are quite different. The single-address-space versus multi-address-space nature of the systems makes it really hard to abstract that away. This is something that you have to be aware of if you're dealing with shared memory.

And then, of course, to use shared memory safely, you have to synchronize, and that's what the POSIX semaphores are for. Those are fully documented on the DP4 release. The shared memory documentation didn't get there, but I've provided a URL here at the bottom of the screen. One additional note: if you're a CFM application, since currently there's no way to get to services that are in the System framework, like the POSIX shared memory and semaphore APIs, you're going to need to get to them through CFBundle or a plug-in.
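For reference, the POSIX shared memory calls being described look roughly like this; the segment name and sizes are invented for the sketch, and a real multi-process client would wrap the accesses in a sem_open-based semaphore as described above.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a named shared memory segment, map it, and write into it.
 * Another process would shm_open the same name and mmap it to see the
 * data; a POSIX semaphore (sem_open) would synchronize the accesses. */
int SharedMemoryDemo(const char *name, const char *msg) {
    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, 4096) != 0) { close(fd); shm_unlink(name); return -1; }
    char *mem = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (mem == MAP_FAILED) { shm_unlink(name); return -1; }
    strcpy(mem, msg);                  /* visible to any other mapper */
    int ok = (strcmp(mem, msg) == 0);
    munmap(mem, 4096);
    shm_unlink(name);                  /* remove the name when done */
    return ok ? 0 : -1;
}
```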

Basically, there's two different ways of getting at the System framework from CFM. One, you could use CFBundle to ask for the address of a routine in the System framework, so you could inquire and then call that. The other is you could factor your application to use bundles that are specific to a platform, and we recommend doing this, so that basically you have a Mach-O plug-in on 10 that links with the System framework, uses these APIs, and does the work. You would just invoke the plugin.
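As a sketch of the first approach, resolving a routine by name at runtime and calling it through a pointer, here is the same pattern using the portable POSIX dlopen/dlsym calls. On Mac OS X the CFM side would go through CFBundle rather than dlsym, so treat this as an analogy for the look-up-then-call idea, not the actual Carbon-era API; the function name is invented.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Resolve a function by name at runtime and call it through the
 * returned pointer, the same pattern as asking CFBundle for the
 * address of a routine in a framework. */
typedef size_t (*StrlenFn)(const char *);

size_t LookedUpStrlen(const char *s) {
    void *self = dlopen(NULL, RTLD_LAZY);  /* the global symbol scope */
    if (self == NULL)
        return 0;
    StrlenFn fn = (StrlenFn)dlsym(self, "strlen");
    size_t n = (fn != NULL) ? fn(s) : 0;
    dlclose(self);
    return n;
}
```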

Okay, that gets us past shared memory and into the File Manager. The File Manager is probably one of the largest chunks of work that we had to do over the last year. The biggest disconnect, or the biggest missing feature, of the File Manager on 10 was support for these volume formats that I have up here: NFS, UFS, and other POSIX-style volume formats. Now, why are these file formats important? Basically, this gives us a certain level of interoperability that we've never really had with a Mac before.

It's not just NFS and UFS; it's anything that complies with a POSIX API. It's anything that, if you went to the file system talk, plugs into the VFS layer. And there are a bunch of opportunities here in up-and-coming file systems that we really wanted to be able to take advantage of.

The other point is that this wasn't really an afterthought for us. We were coming from an environment that was heavily using UFS and NFS. We were coming from the Mac OS X server environment. And as a matter of fact, internally, we were using NFS and UFS very heavily. To this date, I keep my source code in my home directory that's on an NFS server.

That's the way that I personally develop. Other folks choose whatever they want to. I do that mostly because I tend to go from machine to machine, and I like to have the same kind of environment. I keep my sources there locally, and they're backed up centrally. That works beautifully on DP4. But to do that, we had to really look at how the File Manager talked to these kinds of volumes. For one, the classic File Manager APIs really kind of assumed a volume format. They were very HFS-centric.

[Transcript missing]

Two particular features of HFS. One was forks, resource forks, the like. Just the general concept that you have more data associated with a file than just that one data fork. And then the other was the idea of persistent file IDs, basically the backbone for aliases on HFS. You have something similar in UFS called an inode, but it doesn't work the same way. It isn't quite the same experience in terms of something that's persistent.

We knew we had to support these volume formats, and what we settled on was basically two different solutions, in a sense. One, we knew that the current APIs were not very good for dealing with this, and we actually had a luxury that is rarely seen in the industry: we actually had some time to make sure that the set of APIs that we now call the HFS+ APIs worked well with this environment. The other thing that we did is standardize.

We knew we had to store the extra data that we're talking about, the fork information and the file ID, essentially the catalog information from HFS. We had to store that somewhere, persistently. So for that, we chose the AppleDouble file format. By dealing with these volume formats using AppleDouble, basically what we get is that the File Manager APIs, both the classic and the new HFS+ APIs, deal with volumes transparently. Even if, underneath, they may not have a catalog node or may not have a resource fork, we store this in AppleDouble format so that the HFS and HFS+ APIs work transparently across all of these different volume formats.

Of course, we have support for the volume formats that we're already used to in Mac OS 8 and 9, and almost all of these are in DP4. The only exception is UDF, and that's coming. In addition to the different volume formats, we had to deal with two essentially foreign concepts. One was the idea of mount points, the idea that a volume can be mounted or contained within another volume. And the other is symbolic links, from Unix.

This work actually made its way into DP3, and we've refined it since then. But basically we came up with a mechanism whereby both mount points and symbolic links show up to you through the File Manager API as aliases, that is, files that have the alias bit set. They're not really alias files. This information is synthesized. This is one of the rare cases in the File Manager where we actually synthesize something. Typically, we're giving you exactly what's on the disk.

The other exception to that would be, of course, the AppleDouble format. We're not actually giving you that information directly. We're providing the extra information. The way that you need to support mount points and symbolic links on OS X is just to respect the alias bit and, if you intend to resolve them, to resolve them via ResolveAliasFile. You need to use that call as opposed to other variants, because it isn't an alias file, really.

Now, for those that are interested, the reason we had to go this route: there's a very pervasive assumption that most people have in dealing with file systems, particularly in iteration. When you're going through a directory full of files, you're not assuming that the vRefNum, which is sprinkled throughout most of our APIs, is going to change. But that is the case with a mount point, and it's potentially the case with a symlink. So that's what got us to a solution like this.

Okay, so the File Manager is an area we put a lot of effort into, and I've told you about the differences in the file system on 10. The one thing that I really want you to come away from this session with is that there's really a lot of win on OS X in going and using the HFS+ APIs. The HFS+ APIs are very well designed for this system.

There's one notion in particular in the HFS+ APIs that really pays off, and that's the ability to be very specific about the information that you want from a file. Using the HFS+ APIs and asking for as little information as possible, or exactly the information that you need, is the best solution on these volume formats that don't support some of the things you may have become used to in HFS.

And then in particular, you really have to try this out for yourself on these volume formats. You can do that in DP4: the first thing is to format a volume as UFS. You can also, if you have NFS servers in the general area, mount them. Over NFS or a CD, the differences will be a lot easier to see, just because of the general performance characteristics of those volume formats.

Okay, so the other thing I wanted to bring up, because people keep hitting this over and over as they carbonize and bring their apps to 10, are some assumptions that are out there. First is that vRefNums start at -1. They typically don't start at -1 in the File Manager on OS X. You really need to treat those vRefNums as cookies.

We've run into a lot of code, quickly written code that works on 8 and 9, that does something like looping between -1 and -10, assuming it's going to find all volumes that way. Use the volume calls to iterate through volumes.
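The index-based iteration being recommended here can be sketched like this. The fake volume table and FakeGetVolumeInfo are stand-ins for illustration; the real call on OS X is FSGetVolumeInfo with a volume index, and the point is that the loop terminates on the "no such volume" error rather than guessing at vRefNum values.

```c
#include <assert.h>

/* Stand-ins for FSGetVolumeInfo-style iteration; the real signatures
 * live in the Carbon File Manager headers.  A tiny fake volume table
 * lets the loop structure run anywhere. */
typedef short OSErr;
enum { noErr = 0, nsvErr = -35 };          /* "no such volume" */
typedef struct { short vRefNum; const char *name; } FakeVolume;

static const FakeVolume gVolumes[] = {     /* vRefNums are opaque cookies */
    { -103, "Macintosh HD" }, { -812, "NFS Home" }, { -57, "UFS Scratch" },
};
enum { kVolumeCount = sizeof gVolumes / sizeof gVolumes[0] };

/* Mimics asking for volume info by 1-based index. */
static OSErr FakeGetVolumeInfo(int volumeIndex, FakeVolume *out)
{
    if (volumeIndex < 1 || volumeIndex > kVolumeCount) return nsvErr;
    *out = gVolumes[volumeIndex - 1];
    return noErr;
}

/* The correct pattern: iterate by index until nsvErr, never by
 * assuming vRefNums run from -1 downward. */
static int CountVolumes(void)
{
    FakeVolume v;
    int index = 1, count = 0;
    while (FakeGetVolumeInfo(index++, &v) == noErr)
        ++count;
    return count;
}
```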

That'll prevent that whole set of problems. The other is the whole notion of file IDs. File IDs, like I said earlier, are not available on some of these volume formats. On those volume formats, what we have to revert to is a path embedded in the alias.

This is essentially like a minimal alias. You can treat them exactly the same way, but on these volume formats, with the current implementation as it works in DP4, they'll have some limitations. They won't behave like a normal alias; you can break aliases more easily on these volume formats. We'll be looking into things we can do to improve that. But in particular, changing an element of the path, say renaming a directory somewhere in the middle of the path, is going to break that alias.

Okay, so I spoke about this one concept earlier that really helps out. This is a tie-in to performance and just working well on OS X, and that's the notion that you can be very specific about the information that you want. When asking for information about a file using FSGetCatalogInfo, the HFS+ API for that, if you just need the name of that file, then use the bitmap argument and say you want only the name. That's a very cheap way of determining if the file is there.

The same kind of approach works in some of these other cases. When you're going through a directory, what is it that you're really after? Are you trying to just enumerate by name, or are you trying to find all files with an extension? Are you trying to find only the directories inside another directory? These are all things that can be set in the bitmap, and doing so will really help. In particular, on a network volume like NFS, where each one of those file system operations ends up being some kind of packet over the wire, there's a certain amount of non-determinism to that on your typical network.
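The cost argument above can be made concrete with a toy model. The constants and the "cost" function here are hypothetical stand-ins (the real whichInfo constants, such as the node-flags and fork-size bits, are in the Carbon File Manager headers); the fake cost models a network volume where every extra field is another round trip or another AppleDouble file to open.

```c
#include <assert.h>

/* Stand-in for the FSGetCatalogInfo whichInfo bitmap.  Names and
 * costs are illustrative only. */
enum {
    kInfoName     = 1 << 0,
    kInfoNodeID   = 1 << 1,   /* may not exist natively on UFS/NFS     */
    kInfoRsrcSize = 1 << 2,   /* may require opening an AppleDouble file */
    kInfoDataSize = 1 << 3,
};

/* Pretend each requested field costs extra "round trips". */
static int FakeGetCatalogInfo(unsigned whichInfo)
{
    int cost = 1;                      /* base stat of the node */
    if (whichInfo & kInfoNodeID)   cost += 2;
    if (whichInfo & kInfoRsrcSize) cost += 3;
    if (whichInfo & kInfoDataSize) cost += 1;
    return cost;
}

/* Cheap existence check: ask only for the name. */
static int ExistsCost(void)     { return FakeGetCatalogInfo(kInfoName); }
static int EverythingCost(void) { return FakeGetCatalogInfo(~0u); }
```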

And for volume iteration, the same thing goes: set the bitmap and ask for what you want. Let's see, this is actually an area where the advice we gave you early on ended up biting us a little bit in terms of performance. Long ago we basically said that the VCBQ is gone from Carbon: you can't just get this low-mem and increment it and start walking through volume control blocks that way. The replacement technology for that was to use the volume iteration APIs. That works fine.

But when the volumes that you're iterating over are, for example, NFS volumes, it doesn't work so well, again from a performance point of view. When you go through using the volume iteration calls, you're asking for a lot of data that often you don't want. If you just wanted the name, or you wanted to get to the next vRefNum, you probably don't want the size of the volume, which can be very expensive to compute, or the size of blocks, things like that.

Some of the other information that comes back in some of these calls has always been cheap on HFS and HFS+; not so on some of these volume formats. Again, this is why you want to use FSGetVolumeInfo. In addition, there's a GetVolParms call, which gives you a subset of the same information.

Okay, I'm going to go through a little bit of code here. This should look familiar to most people: your typical go-through-a-directory-looking-for-files loop. The "..." is just pseudocode standing in for a little bit of setup; this isn't something you could compile and have work correctly as-is. You'll notice the main call in a directory iteration with the old File Manager APIs is PBGetCatInfoSync, the sync variant in this case. Notice that as you go through a directory, you have to make this call for every item in the directory.

Here's the replacement loop. Notice first, it's the same kind of structure, right? A do-while loop, the same kind of call, and you're asking for catalog info. One big difference is that it's a bulk call, so you can ask for information for more than one file at a time. The number of files is up to you; it's actually a parameter to FSGetCatalogInfoBulk.

So this is something that's tunable. In DP4 this works down to a bulk call in the File Manager, but what we're going to do post-DP4, as a further performance enhancement, is move that call all the way down to the kernel. That'll give us the best efficiency.
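The do-while shape of the replacement loop can be sketched as follows. FakeGetCatalogInfoBulk and the constants here are stand-ins for illustration; the real FSGetCatalogInfoBulk takes an iterator, a requested item count, and returns the actual count plus an error once the directory is exhausted.

```c
#include <assert.h>

/* Stand-in for the bulk pattern: instead of one call per item (the
 * PBGetCatInfoSync loop), ask for a batch of items per call and loop
 * until the "no more items" error. */
enum { noErr = 0, errFSNoMoreItems = -1417 };
enum { kDirectorySize = 10, kRequestCount = 4 };

static int FakeGetCatalogInfoBulk(int *iterator, int request, int *actual)
{
    int remaining = kDirectorySize - *iterator;
    if (remaining <= 0) { *actual = 0; return errFSNoMoreItems; }
    *actual = remaining < request ? remaining : request;
    *iterator += *actual;
    return noErr;
}

/* One call now fetches up to kRequestCount items' worth of info. */
static int IterateDirectory(int *callsOut)
{
    int iterator = 0, total = 0, calls = 0, actual, err;
    do {
        err = FakeGetCatalogInfoBulk(&iterator, kRequestCount, &actual);
        total += actual;
        ++calls;
    } while (err == noErr);
    *callsOut = calls;
    return total;
}
```

With a request count of 4 over a 10-item directory, the directory is covered in three data-carrying calls plus one terminating call, instead of ten-plus with a per-item API.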

So now I want to actually, can we get demo one up on stage? I want to show you a quick demo application that we put together. Just as an aside here, this was, in the tradition of WWDC, done in, I guess it was late last week now, in a few hours.

But the surprise was that a lot of folks on my team, the low-level, Advanced Mac Toolbox team, hadn't really had a chance to seriously use Interface Builder. We don't typically do the kinds of things that have to do with interface and bits on the screen. And basically, I was surprised to see Interface Builder generating a Carbon app in almost no time flat, without our having to understand some of the new technologies that are there.

Okay, so what do I have here? Let me go and hide this... Can I hide this? Thank you. Okay. The top progress bar is the classic API, the PBGetCatInfoSync call that we just saw; it's basically that loop. The bottom is the newer bulk call.

Before I actually say go on this, I wanted to bring up one other thing. I wanted to do this in a pretty fair way. I'll get to actually the way I went about doing this a little bit later. Basically, I decided to do both of these tests on an MP thread.

Basically, the only difference between the two executing is the kernel time-slicing them appropriately. Most of what these threads are doing shouldn't take that long in terms of CPU processing. They should go in and either block very quickly, because the OS detects that the data in question isn't there and has to actually get it from the device, or rip right through and just fetch the next one if it's cached. The other thing I want to bring up is that both of these, the new and the old, are iterating through two different directories, so that you don't get any caching effects from one disturbing the other. And with that, I'll just let them go.

It's a fairly small directory, and it's roughly double the time: the results are 65 ticks versus 31 ticks. The difference is extremely exaggerated in the case where you're on a CD, a little bit less so on NFS, and this is UFS on DP4. You can see that doubling is something you can easily get. One thing to note, though, is that it really will depend on the contents of the directory.

And that gets back to, you know, are there resource forks? Are there things that aren't easily represented on the volume format? Okay, can we get the slides back? So we've talked about the File Manager; again, the biggest advice I have is to use the HFS+ APIs. Now we're on to the Folder Manager.

The biggest change to the Folder Manager for OS X has been to support a concept that we call domains. OS X supports a notion where you have, basically, a system, what we ship to you, that's read-only. It's not modified by the user, and it's not typically modified after it's shipped.

That's kind of like the base domain; that's what you start with. Above that is typically either a network or a local domain, where you do install things that are shared by all applications and all users. The user domain is roughly just for the user. Things that exist in all domains include preferences and fonts.

You might want to install your own fonts there; alternatively, fonts could also be installed in one of the other domains. Domains really have a lot to do with sharing. The Folder Manager has been enhanced to support these domains through the first parameter of FindFolder: instead of just supporting vRefNums, it also supports some well-known domain constants.

It's important to note, though, that there are some selectors the Folder Manager supports that don't make sense in domains. The two examples I have here are Temporary Items and the Trash. Temporary Items, for example, is typically used when you want to use PBExchangeFiles: you want to make sure that the file you're going to save atomically is on the same volume, right? The Trash is another case: if that ended up being on some different volume, then you'd end up doing a copy to trash something, and that's not really what you wanted either. To check for support of this, you just have to check this Gestalt bit.

That's the Gestalt bit that says the Folder Manager supports domains. The other thing I wanted to bring up, and you'll see this bullet item on a lot of these slides, is that most of these lower-level technologies have up-and-coming FSRef-based APIs. If you've already started on HFS+ conversion, you may have run into the fact that, yes, you're using HFS+ APIs, but then when you have to ask the Folder Manager for something, it's in terms of an FSSpec. We're slowly bringing the HFS+ APIs through these technologies; my guess is that by the public beta, you'll have variant APIs in a bunch of them that let you use them more directly.
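The domain-aware FindFolder described above can be sketched with stand-in types. The enum names and paths here are illustrative assumptions, not the real Folder Manager constants (those are kSystemDomain, kLocalDomain, kUserDomain, and the folder-type selectors in the Carbon headers); the point is that the same folder type resolves to a different location depending on the domain you pass.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for FindFolder's domain-aware first
 * parameter and folder-type selector. */
typedef enum { kFakeSystemDomain, kFakeLocalDomain, kFakeUserDomain } Domain;
typedef enum { kFakePreferencesFolder, kFakeFontsFolder } FolderType;

static const char *FakeFindFolder(Domain domain, FolderType type)
{
    if (type == kFakePreferencesFolder)
        return domain == kFakeUserDomain ? "~/Library/Preferences"
                                         : "/Library/Preferences";
    /* Fonts exist in every domain; sharing is what domains are about. */
    switch (domain) {
        case kFakeSystemDomain: return "/System/Library/Fonts";
        case kFakeLocalDomain:  return "/Library/Fonts";
        default:                return "~/Library/Fonts";
    }
}
```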

Okay, resources. There have been a lot of questions about resources, in particular: are resource forks supported? Very quickly, we'll go through the chain management APIs. These are things that have been there since CarbonLib 1.0; they allow insertion and removal of resource files in the chain. One thing to be aware of on OS X is that we actually support file mapping of resource files. That is, we file-map things that would be shared.

Things that roughly equate to the system file. And what that means to you is that

[Transcript missing]

This is very similar to a ROM on older Macintoshes, where you might get a handle to ROM data. On OS X, you can silently dereference a handle to this file-mapped data, but you won't be able to write through it. If you need that data, if you want to modify that data, you're going to have to use DetachResource.

And then the Resource Manager is going to continue the trend and have some FSRef-based APIs available. There have been a lot of questions about resource forks and CFBundle.

I guess the biggest thing here is that, because we want to interoperate with other volume formats, we've introduced an app packaging specification that doesn't use resource forks; it uses resources in data forks. Now, why is that a good thing? It's a good thing because it's a lowest-common-denominator solution that transfers easily.

You can copy it off the web or off of FTP, you can copy it up to an NFS server and copy it back, without having to modify tools and without those tools having to understand things like forks. So there are some APIs that enable this in the Resource Manager, and it happens for you for free for your application resources. If your app is packaged, you'll basically get two data-fork resource files opened for you, assuming you've localized your app; one, if you haven't.
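The packaged layout being described can be sketched as a path construction. This is an illustrative helper only: the bundle directory names assumed here (Contents/Resources and a per-language .lproj folder) follow the app-packaging convention, and in real code you'd let CFBundle do this lookup rather than building paths by hand.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper showing where data-fork resources live inside
 * an app package; pass NULL for lproj if the app isn't localized. */
static void BundleResourcePath(char *out, size_t outSize,
                               const char *bundlePath, const char *lproj,
                               const char *name)
{
    if (lproj != NULL)
        snprintf(out, outSize, "%s/Contents/Resources/%s.lproj/%s",
                 bundlePath, lproj, name);
    else
        snprintf(out, outSize, "%s/Contents/Resources/%s",
                 bundlePath, name);
}
```

A localized app would thus get both the language-specific file and the shared one; an unlocalized app only the latter, matching the "two data-fork resource files, or one" behavior above.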

And then, just in case you haven't gotten to any of the CF talks: if you need to get to anything else within your application that's been packaged, just make sure you use the CFBundle APIs. Okay, the Code Fragment Manager. There's been a big change we were actually able to get in.

We hadn't planned on making this available for DP4, but we got far enough along that we did get a preview of this technology into DP4. It's not on by default; you're going to have to go and enable it. In the Documentation folder on DP4, there's a release note on the Code Fragment Manager that explains how to enable it.

One of the key features of the work we've done on this is that it really helps with launch time. Basically, we were able to take a calculation that previously happened at runtime and make it occur at link time, so that CFM launches are a lot faster with this technology enabled.

The current default implementation of the Code Fragment Manager does not support data exports; this new one does. In addition, because of the way the work is done at link time, we support lazy initialization in the same way that dyld does: you don't pay the cost until you actually use the library in question.

There are two particular issues on OS X. Resource-based fragments are not supported; the workaround, or the alternative, is to handle whatever you want to load from a resource yourself, by calling the Resource Manager and then using memory fragments instead. And in general, shared data sections are not supported.

We already talked about the Velocity Engine issue. Okay, the Process Manager. The Process Manager is fully functional in the current release, and it's compatible with app packages. There's been some confusion about this: basically, we preserve the illusion in the Process Manager that you kind of have in the Finder.

If you ask for the process information of a process that is app-packaged, the FSSpec associated with it is the spec of the .app, that is, of the package, not of the executable contained within it. However, we also support traditionally packaged single-file applications, and that works pretty much the same as it always has.

The Thread Manager. Okay. There are a couple of things I want to get through here on the Thread Manager. One is that there's a lot of code out there that goes through great pains to use threads in a memory-savvy way: making sure they're not using too much memory with their threads, managing the stack sizes, things like that. Most of that is unnecessary on 10, and some of it you can't really do very well on 10. You can't really get the stack size of another thread, particularly if that thread happens to be running on another processor.

This is just one of those kinds of things that wasn't anticipated when the original Thread Manager came along. These threads and MP threads are all layered on top of pthreads. In the Thread Manager case, they're still completely cooperative. That is, even though they're pthreads, they all race for the same lock, and only one of them can run at a time.

The other problem that folks have had with these kinds of threads is that generally the way you schedule them is off of null events, which doesn't really work very well from a performance standpoint on 10. As we've talked about in the Carbon event sessions and other sessions, we really want you to be blocked waiting for events to come in. And if you're blocked waiting for events, then you can't be calling YieldToAnyThread all the time. That's part of the problem with Thread Manager threads. And of course, no concurrency.

However, I do want to bring up that the Thread Manager is often the best solution for the UI layer of your application. Why is that? Let me give an example. If you are handling four different windows with four different preemptive threads, and the user hits Command-W, the shortcut for closing the window, on all four windows, what's going to happen if those threads are preemptive is completely random.

There's not going to be any order. You'll do it one time and all the windows will close in one order; you do four Command-Ws again and it'll be different. This is not good for a decent, predictable UI. You do want cause and effect in your UI. The best way to keep that is probably something like cooperative threads, or even a single thread, for your UI, and then have the back end of your application use preemptive threads.

And that brings us to the MP APIs. I already mentioned that they're layered on top of pthreads. These are the APIs that we really want to push you towards; these are the APIs that work preemptively on Carbon, both on 9 and on 10. One difference, and there's been some confusion on this, is that MP constructs are not system-global. That is to say, on 9, where everything runs in one address space, an MP queue can be used to talk between processes. That's not the case on 10. The MP package is a fairly thin layer on top of pthreads; it's a per-process, or per-address-space, package.

Then I just want to refer you to another release note, on how to use MPCreateTask from CFM apps; it's also in the DP4 release. And I mentioned earlier the different thread-safe services. These are the ones we're talking about right now: mostly file I/O and networking, that kind of thing.
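Since the MP APIs are described above as a thin layer over pthreads, the factored-application shape can be sketched in raw pthreads. This is an illustrative sketch, not the MPCreateTask signature: the work function here stands in for the blocking file I/O or networking you'd hand to a preemptive task while the UI thread goes back to blocking on events.

```c
#include <assert.h>
#include <pthread.h>

/* A work item handed to a preemptive worker, MP-task style. */
typedef struct { long input; long result; } WorkItem;

/* Stands in for the thread-safe work (file I/O, networking) that
 * belongs on a preemptive thread, not the UI thread. */
static void *Worker(void *arg)
{
    WorkItem *item = (WorkItem *)arg;
    item->result = item->input * 2;
    return NULL;
}

static long RunFactoredWork(long input)
{
    WorkItem item = { input, 0 };
    pthread_t task;
    pthread_create(&task, NULL, Worker, &item);
    /* ...a real app's UI thread would block waiting for events here,
     * rather than joining immediately... */
    pthread_join(task, NULL);
    return item.result;
}
```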

One more thing about threading is that the models between 9 and 10 are particularly different when you're doing threading and I/O. Something you want to look at is whether you're using cooperative threads and asynchronous I/O. That is not going to perform as well as synchronous I/O on preemptive threads on 10. It'll still work; it's just not going to get you that last little bit of performance.

Okay, Apple Events. Apple Events is basically the only solution that we have for cross-process IPC inside of Carbon. There have been questions in the hallways about this and things like that. So if you need to do cross-process IPC, this is the solution. The other alternatives, available only on 10, are Core Foundation, which allows for some level of cross-process communication, and Mach messaging directly.

[Transcript missing]

Other than that, they're a foundation technology; they're being used throughout all the frameworks: Cocoa, Carbon, Classic. One thing I didn't mention in the Process Manager section is that all the applications within Classic actually have an individual PSN, so they're each uniquely targetable that way. Okay, now I'd like to bring Steve Zellers up, who's going to do an Apple event-related demo.

[Transcript missing]

So here's my window over here with a bunch of predefined searches and things. And as I'm executing searches, you can see that the database over here connects to the server, performs a search, and then displays the results. Now, that's obviously not something you want your database to do all the time.

If this were a commercial database, like one of you would develop, you would return the data without updating the UI. And you could do that in a background application without having to present a UI to the user at all. So over here, you can see the session transcript of what the Perl script actually told the command-line tool to send to the database.

It's over here, and what gets returned are standard AppleScript results, which are in turn interpreted by the Perl script and turned into HTML. Let's do all the searches at once, because that's more interesting to watch over there. So as you can see, you can put together a real application, a real workflow, using Carbon and your applications. Okay, thanks, Steve.

Okay, we're at the More Information section. You've probably seen the Carbon documentation URL, but it's up there again. The two release notes that I referred to are on the DP4 release, in System, Developer Documentation, Release Notes. CarbonCore.html goes over some of the issues that I mentioned with the Memory Manager and the Code Fragment Manager, and the Code Fragment Manager has its own release note on how to enable the new vector libraries.

And now the summary. HFS+ APIs, that's the big one. Most of the applications that we've seen have just been ported; not a lot of them have adopted the HFS+ APIs. Now there's an even bigger reason to adopt these APIs. And in particular, using these APIs together with the multiprocessing APIs in a factored application is a proven combination that's worked well on OS X.

To get back to one of the Carbon overview sessions: Scott Forstall was up there saying that he had some challenges for you. I have the same challenges, and I'd really like to see the development community take us up on this. In particular, with these core services, we need feedback on things that may be impacting you, in terms of performance and in terms of features. We need it earlier rather than later; these layers of the system can't rev at the very last second, just because of the dependencies. So I encourage you to bring your app up on DP4.

Okay, now I'd like to bring up a few of the people on my team for some quick Q&A. Actually, John, we are out of time. I'll wait to do a few questions. I told them that you had a lot of information, and you did. Okay. Actually, it looks like we have... well, I'd better not. I can't even start it, because of the time.

Yeah. Okay. Well, I can take some questions. We can probably go towards the back over there. Yeah. So thank you all so much for coming to this conference. Again, the feedback that you've given us is so great. I know that Apple engineering has been way encouraged by the show, and hopefully you have been too. And we'll see you all next year. Thanks.