Tools • 57:31
DTrace is a powerful software tracing tool introduced in Mac OS X Leopard, providing much of the data collection in the Instruments analysis tool. Learn how to interact directly with DTrace using command line tools. See how DTrace scripts can be used to understand the runtime behavior of an application, the kernel, or the entire system. A valuable session for all Mac OS X developers.
Speakers: Steve Peters, James McIlree
Unlisted on Apple Developer site
Transcript
This transcript was generated using Whisper; it has known transcription errors. We are working on an improved version.
Good afternoon and welcome to session 930, Using DTrace on Mac OS X. I'm Steve Peters. I work in the Mac OS X performance group. My colleague James McIlree will be joining me later for questions. He works in the performance tools group. And today, we'd like to tell you about DTrace, a system observability facility that's new to Leopard.
DTrace joins a rich collection of Mac OS development tools, tools you've been hearing about all this week. And the point of having you here today is so that we can show you how you can use DTrace to gain greater insight into the behavior of your projects. We're pleased, as always, to acknowledge that DTrace is open source software that was developed for Open Solaris by Sun and ported to Mac OS by Apple.
So here's what we hope you'll learn today. Quite a bit about what DTrace is and a little bit about what it's not. How to use DTrace. You should be reasonably comfortable at the command line with some simple DTrace scripts and toolkit invocations. If you're following along in the audience, you'll be practicing it by the time we're done today. We'll talk in some detail about D, the programming language for DTrace.
And we'll talk about the DTrace toolkit, a prepackaged collection of commonly used scripts that can get you up and running really quickly. And finally, we'll have a bit of a rundown on the DTrace providers, the sort of subject area experts underneath the hood in DTrace that provide instrumentation information.
So the high order bit, this is actually a marketing slide, says that DTrace is a comprehensive, dynamic tracing framework. DTrace provides a powerful infrastructure to permit administrators, developers like yourselves, and service personnel to concisely answer arbitrary questions about the behavior of the operating system and user programs. So what are some of the advantages of DTrace as we see them? It's really a unique tracing technology in Leopard. It sort of supplants the existing ktrace and kernel trace mechanisms, available only in the kernel, with a much more general mechanism, one that has zero disabled cost; it costs you nothing if it's not in use.
It's entirely dynamic. There's no starting or stopping of the system or of your program, no recompilation with different debug tracing flags set, or any reboot required. You can just go right to it. It's system-wide. You can see both user programs that you've constructed and stuff in the kernel. Every thread in the system is visible. Always available. It's part of the user install, gone out on every Leopard system we've shipped, and it will go out with Snow Leopard as well.
Comprehensive. You can see bits down in the kernel. You can see methods and structures up in your Cocoa application. And finally, and I think most important, DTrace is programmable. It's probably the first facility of its kind, at least on Mac OS X, for tracing that lets you really sort of arbitrarily collect, compute, and display tracing data.
So the style that these advantages engender is one of following your nose. Pose a question, design a little DTrace script, or use one that you already know about, to collect some data and answer the question. Perhaps it leads you to another question. You'll see that paradigm repeated here today. And DTrace also turns out to be an important enabling technology for the Instruments high-level GUI tool. You can learn more about Instruments in the advanced DTrace and Instruments session, number 935, I believe tomorrow on your schedule.
So just a little bit about what DTrace isn't. It doesn't replace tried and true tools. You want to always use the right tool for the job at hand, the one that gets you to the finish line first. So for instruction-level tracing and timings, careful timings of the hot loops in your program, look to Shark. That's a finely crafted application that's been around and is tested, an ideal tool for that kind of application. For memory leaks, there's the ObjectAlloc template in Instruments.
And if you're just interested in seeing, you know, where something's stuck in a loop, just point Sampler at it.
[Transcript missing]
There's no magic with DTrace. It's a wonderful, powerful tool, but you still have to use your debugging and engineering intelligence to really apply it well. You need to understand your problem in depth. Be able to generate sharp hypotheses about what's going on here.
System call time, am I blocking? Then create or choose an appropriate DTrace script to make a crucial data collection experiment. And finally, be able to interpret the results of that. I mean, these are all things that over time as software engineers, we should all be quite adept with.
So, sort of the second order bit, how does DTrace work? Well, there's this general separation into the pieces in user land and the pieces in the kernel, and close to that border are the core pieces of DTrace. On the kernel side, the DTrace virtual machine, the place where all the magic happens. Above the line, a framework, libdtrace.dylib, and it basically takes care of the communication with the DTrace virtual machine as well as the language processing issues that arise from looking at your scripts.
Underlying the DTrace virtual machine are these subject area experts. These are modules of code that know about a particular aspect of the system and how to expose interesting trace points and instrumentation data. So, for example, if you're interested in knowing where your program is making system calls, Unix system calls, you'll be talking mostly to the syscall provider.
If you're interested in understanding where I/O starts and stops in time, and in the volume of I/O that's going on, you'll appeal to the io provider. FBT is the function boundary tracing provider. It provides most of the probes in DTrace. They are every C function entry point inside the kernel. Kernel hackers love this stuff. Users may find it interesting at times.
You talk to DTrace using scripts, just like shell scripts, just like little C programs, as you'll see. And those are digested by the dtrace command and passed down the pipeline. Instruments also, as I mentioned earlier, has DTrace underlying parts of it, and it eschews the command line and just goes straight to the dylib and does its business.
It's worth pointing out that there's one really special provider, and it's not just because it was James' piece of work that I mentioned this. It's called FastTrap. FastTrap knows how to cross that userland kernel boundary for tracing. So if you're interested in some of the trace points that we've put in lib system, for example, in our Snow Leopard malloc implementation, you'll ask the FastTrap provider to go find those for you.
Similarly, in your application, if you want to either put prepositioned probes in or walk up to the machine and say, I need to look at this function right now, the FastTrap provider is capable of doing that for you. It's a jack of all trades and a really remarkable creation. So let's do a demo. I call this the ABCs of DTrace.
So if we just talk to DTrace, say its name, it tells us how to talk to it. It's a long, long list of options, but as you'll see, you can get an enormous amount done with just two or three. And we'll sort of stay around two or three options in the examples that follow.
Those subject area experts can tell you what probe points are available. We can get a catalog of those, and we get those by dtrace -l, that's a lowercase l down there. And off we go, and there are quite a few. Maybe we should pipe that through more. Ah, that's better.
Let's look down at, say, line four. Yes, lockstat. The listing begins for lockstat. The column of numbers on the far side are just indices. Then there's the name of the provider, the subject matter expert. I didn't put lockstat on the previous slide. I could have. There were lots of little bubbles on the bottom. It could have been one of those. It instruments the Mach kernel. That tells you roughly where its interest is centered.
And then it tells us what function in particular will be providing this piece of tracing location. In this case, it's lck_mtx_lock in the kernel. And there's a particular semantic character to the piece of information that it provides. It's about when an adaptive acquire is undertaken. And notice adaptive-acquire appears a couple of times in that name column.
And those are different function points in the system where an adaptive acquire is done. And we can aggregate those together in a way you'll see. But there's sort of this mix and match quality, right? So there are four interesting pieces of information about a probe: its provider, lockstat in this case, the module, the function, and a name.
So I mentioned the FBT provider, the function boundary tracing provider. Here's the beginning of its catalog. FBT instruments the Mach kernel, and it starts out with aarp_wakeup, the entry point. So if we set that probe to fire, whenever the system called into aarp_wakeup, we'd hear about it. And we could also set one on the next line, the return. And similarly for the rest; I think there are 17,000 of these things. So let's move down.
Yeah, that's the last one, the zmap return. Okay. Let me go on to the syscall provider. System calls are the programming interfaces between user land and the BSD part of our kernel. Things like open, read, write, close, exit, fork, exec. All the things you learn about in Unix 101. And DTrace has a provider that instruments the entry and return points. So you can see when those probes fire, perhaps take a timestamp on the entry, see the return, get another timestamp, notice an elapsed time. Just as an example.
So vminfo, there are a bunch of those in here, gives information about the virtual memory system. When there's a page fault, when stuff's being swapped in and out, and so forth. Here's an interesting one, the last one I'll show. Yes. So at the very top of the screen, I hope everybody can see, there's mention here of the Core Data provider. That's one of our frameworks up in user land, Core Data. And a rather long Objective-C-like method name for the function. And the sort of semantic context is that this probe fires when there's a Core Data data fault.
And the name, the provider name way over on the far side, CoreData152, is interesting. 152 refers to the process ID of a process that's currently running. So here's a case where that FastTrap provider knows how to catalog the fact that it can find a probe up in process 152, in the Core Data framework, on this Objective-C message with the begin-fault name.
Clear the screen. And so what I'd like to do is take a look at how to set a probe, set a trap, a DTrace probe point, and get some output from it. So here we go. Invoke the dtrace command. We'll use the -n option, which says I'm going to give you the DTrace program right here on the command line, right between these two quotes. And pretty clearly we're going to talk to the syscall provider here.
And there's a whole bunch of colons which basically say wildcard to anything. So whenever a system call happens on this system, DTrace will let us know about it. So let's let that go. Let's click on the window. So there were 854 of those things, and we're off to the races, right? Well, let's slow that down a little bit and take a little bit more careful look.
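A minimal sketch of that one-liner (the exact on-screen command isn't captured in the transcript, but it would look roughly like this):

    # enable every syscall probe; the default action just reports each firing
    sudo dtrace -n 'syscall:::'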
Again, 854 probes. All right, so the first one that hit, hit on CPU number seven. Yes, octo machine. It hit a probe whose index number in that list before was 17701. It's the ioctl probe, and it was a return point. Probably the entry occurred before we fired up this script. I kind of suspect that ioctl actually came from DTrace itself, passing the program down to be run or starting it. What's kind of interesting are these sort of repeated patterns in the middle of sigaction, sigprocmask, mmaps. I kind of wonder, what's that about?
And so that's kind of the follow the nose paradigm. What's the next thing we might want to look at here? Well, how about figuring out Which syscall is called how many times? So down here on the command line, we've added a little bit to the description of the probe. We're only going to ask for firings on the entry to that probe.
And then here's some stuff in curly braces. Well, maybe that's sort of like program code. Yeah, it's the action that should take place when that probe fires. And we've got something that looks like storing into an array. Sure enough, it's a special kind of an array. It's first of all an associative array. The stuff between the square brackets can be just about any key value.
and the at sign out in front indicates that it's an aggregate array. And we'll talk a bit more about aggregates, but aggregates basically solve the problem of probes firing on many processors all at once. How do you get a coherent collection of data? Aggregation solves that problem. And on the right hand side we're simply indicating that every time this probe fires we should increment a count in a particular slot of this array.
Which slot? The one that's indexed by the name of the system call that fired. Open, close, read, write, etc. Well, it's best just to see what happened here. While we were talking, that data was being collected for us. And if I click in the window, we can see.
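As a sketch, the script being described is essentially:

    # count each system call by name; results print when the script exits
    sudo dtrace -n 'syscall:::entry { @counts[probefunc] = count(); }'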
On the far side, we see the list of the system calls that were encountered during the time that I was talking, with the count, ordered from smallest to largest, on the side closest to me. And wow, a lot of fcntls, a lot of ioctls, a bunch of sigprocmasks. Hmm, I wonder if we can figure out who's making lots of system calls. Well, that's just sort of a variation on the script we just ran.
So off we go again. Same probe descriptor, pretty much the same action, but this time we'll use a different interpreter built-in variable. Every time the probe fires, DTrace notes what was the exec name, the name of the process that caused the firing, and that's available to the script as a special built-in variable, just like in the shell. So while we were talking, if I click in there.
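That variation, sketched, before we look at the results:

    # count system calls by the process that made them
    sudo dtrace -n 'syscall:::entry { @counts[execname] = count(); }'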
Ah, DTrace made a bunch of calls. ntpd, oh, interesting. WindowServer doing a lot of stuff. Ah. Well, let's dig into the WindowServer. That's sort of always an interesting place to peer around in. And let's just count the system calls made by the WindowServer. And that introduces kind of the last and final piece of D programming. You'll now have seen it all. And it's off and running. Same action as before. Same probes that we'd like to light up.
But there's this bit enclosed in slashes, and that's a predicate, that's a filter. And it says, look at the built-in variable execname, the name of the process, when this fires. If it's identically equal to the string WindowServer, then go ahead and do this action. So we're filtering down on just the system calls that are being fired by the WindowServer.
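Putting all three elements together, a sketch:

    # probe descriptor, predicate, and action in one clause
    sudo dtrace -n 'syscall:::entry /execname == "WindowServer"/ { @counts[probefunc] = count(); }'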
Maybe that's telling us a little bit about the original observation we made that, gee, sigaltstack, sigprocmask, mmap, and munmap are being hit a lot. Well, the WindowServer is certainly doing a lot of that damage. All right, I think at this point I am back to the slides.
So those were DTrace first words for us. And we were writing DTrace programs at the command line. And here's just a quick review. By the time we got to the last example, we had basically introduced all the elements of a DTrace program. So let's look a little bit more carefully at how we express those, at the language D that's used to program DTrace.
It's actually a language that's compiled to byte codes. It's passed to the kernel across that boundary we showed in the first layer cake slide, and that's where the code is interpreted. libdtrace does the language processing, like I mentioned. The language is lightweight, small, simple. Don't expect very powerful features from it. It's got to be interpreted in the kernel when very elementary events are occurring, so you don't want to spend a lot of time working stuff out down there. So, for example, there is no control flow. There are no loops in the D language.
No user-defined functions. There are some convenience features. You don't have to declare variables, much like a shell or Perl script. And for the deep, deep details on the D language and plenty of examples, there's the heavyweight DTrace documentation on the Sun site. And they've also put together something more like a quick reference card that's handy once you've got it.
So a D program is built up of probe clauses, and you've already seen probe clauses. They consist of a probe descriptor, a predicate enclosed in slashes, and some curly braces with some computation within. That's the basic unit of the D language. As you saw in our very first example, we only gave a probe descriptor, and DTrace did something sensible. So it's lazy, and it will infer actions for you.
The probe descriptor has four components, and they map right onto that catalog listing we did at the outset: provider, module, function, and name. They describe what probes we would like to enable. They support some wildcarding, by either omitting an orange-labeled piece entirely, or there's a simple globbing language that you can use. And it's the only required part of a probe clause. Every example had some sort of probe descriptor telling DTrace, who do I consult to get some tracing information? We were able to list those using dtrace -l, that's a lowercase l.
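The general shape of a probe clause, for reference:

    provider:module:function:name
    /predicate/
    {
        action statements;
    }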
[Transcript missing]
Here's a probe descriptor that says light up all the system call probes, entries and returns, that DTrace knows about. This is a little bit more interesting. Here's an appeal to the FastTrap provider to set a probe point on the return of the printf function in libSystem in process 1017.
Since we do a bunch of Objective-C on our systems, Apple did an extension to the pid provider called the objc provider. In this example, the objc provider, which talks to FastTrap down at the bottom, places entry probes on all the instance methods that begin with the word draw in process 1017. So you'll see a few examples of that in a little bit.
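Sketches of the three descriptors just described (the exact on-screen text isn't in the transcript; the objc wildcard syntax is my best reconstruction):

    syscall:::                        all syscall entry and return probes
    pid1017:libSystem:printf:return   return of printf in libSystem, process 1017
    objc1017::-draw*:entry            entry to instance methods beginning with draw, process 1017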
Predicates, enclosed in slashes, are Boolean expressions. A predicate is a condition that's evaluated each time a probe is triggered. We use predicates to focus, to limit the amount of data, to home in on just one particular place in the system that is causing us trouble. Some examples. Suppose we were interested in all the times Safari opened a file. Here's one way to get at that. We set the probe descriptor to be syscall open entry and then filter that down by using execname.
Suppose it was interesting to see the fifth and subsequent open system calls. Well, we could keep a global counter and increment it within the predicate. When it came true, we'd go on and execute the action statements. Unless otherwise specified, global variables are initialized to zero, so this actually makes good sense.
We mentioned before that the interpreter maintains some built-in variables. Well, one of the things it does when a probe fires is collect the arguments to that probe and line them up in arg0, 1, 2, 3, 4, and 5. So here in this predicate, we've called for arg1 to open. And if you remember open, it has two arguments, a path name and then a flags field, an integer of flags. And we did a little Boolean arithmetic with a flag called O_NONBLOCK. So this example looks for all non-blocking opens that go on in the system at any point.
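Sketches of those three predicates (reconstructions, not the exact slides; O_NONBLOCK would need to be defined or pulled in with the preprocessor):

    syscall::open:entry /execname == "Safari"/ { ... }
    syscall::open:entry /counter++ >= 4/ { ... }        fifth and subsequent opens
    syscall::open:entry /arg1 & O_NONBLOCK/ { ... }     non-blocking opens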
Action statements. Expressions in the D language, they look a whole lot like C, the usual collection of operators. You can do assignments to keep some state around. You can make calls to built-in functions. We've already seen a little bit of this, or we will in a moment. D has its own version of printf, so you can produce interesting and nicely formatted trace output. It's able to take the stack, the kernel stack, with the stack operator and return that as a vector of numbers.
It can compute lengths on strings. It can do a very interesting and somewhat mysterious operation called copy-in and so forth. We've already mentioned special variables. We've seen quite a few of these: execname, probefunc, arg0 through arg5, timestamp. Every time a probe fires, a timestamp is collected, and for the duration of that action, the business between the braces, the timestamp is held and you can compute with it.
An assignment can create user variables. Just another convenience item. Just mention the variable, assign to it, and it pops to life. So, some examples of the use of the printf function. When Safari makes an open call, we grab that string argument from open that's held in built-in variable arg0, massage it with the copyinstr built-in, we'll talk about that in a minute, and then printf it.
Here we can take that same string and assign it to a global program variable, perhaps to save that name until the return from the call is made. It might be interesting for some reason. And finally, it might be interesting to collect the stack in user land where Safari made an open call. So let's look at a couple of examples of how that stuff works.
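A sketch of those actions gathered into one clause (hypothetical, for illustration):

    syscall::open:entry
    /execname == "Safari"/
    {
        printf("%s\n", copyinstr(arg0));  /* print the path being opened */
        path = copyinstr(arg0);           /* save it in a global for later */
        ustack();                         /* record the user-land call stack */
    }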
So here's a script whose probe descriptor says, I'd like you to place instrumentation at all the open syscall entry points in the system. There are a couple of variants on open, and I wanted to make sure to catch them all, so I wildcarded them: open, open$UNIX2003 for POSIX support, and so forth.
And when that probe is encountered, do the following action: printf the name of the process that did the open and the name of the file that's opened. And I put another argument here I wanted to introduce, the -q option, to suppress some of the labeling information. We'll just see the exec name and the name of the file that's being opened when we run this. So let's send this off to the races. And we can open Address Book. Oh, yeah, cool.
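A sketch of that invocation:

    # -q suppresses the default CPU/ID/probe column headers
    sudo dtrace -q -n 'syscall::open*:entry { printf("%-17s %s\n", execname, copyinstr(arg0)); }'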
So in that short sequence of time, the system opened a heck of a lot of files. There's Address Book getting itself started, calls into Core Data, interesting, or at least opens the framework to get itself going. So now let's take a look at something that's happening up in user land with DTrace.
The user... Oh, I didn't finish the... There we go. All that for Hello World. All right. So anyway, here's the familiar Hello World. It does the obvious thing. It might be interesting to know whether, in producing that string output, memory management was called. So let's have a look at that.
[Transcript missing]
Library call, a libc library call, is encountered. All right. Well, this program runs very, very quickly, and it would be a heck of a trick to start the program and then get DTrace to connect to it. So DTrace puts those facilities together, glues them together, with the -c argument. The notion here is DTrace will do all the language processing, compilation, and transferring down to the virtual machine it needs to do. Then it will start hello, pause it right at the beginning, set the necessary probes, and then let it run, and we'll see what happens.
And so there we go. Hello WWDC was printed. Sure enough, the process exited, and then DTrace produced the output that it saw. Oh, sure enough, malloc was called, and here's the stack that we collected. We used the ustack action. It might be interesting to know just how much got malloc'd here. That's sort of a simple modification.
A semicolon, very C-like, and we'll trace arg0, the only arg to malloc, which is how much we wanted malloc'd. And off we go. And yep, sure enough, there's our stack trace. And we were allocating a page worth of data, 4,096 bytes. Okay, back to the slides, please.
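For reference, the two one-liners from that demo would look roughly like this (assuming the binary is named hello):

    # where in hello does malloc get called from?
    sudo dtrace -q -c ./hello -n 'pid$target::malloc:entry { ustack(); }'

    # same, but also trace the requested size, malloc's only argument
    sudo dtrace -q -c ./hello -n 'pid$target::malloc:entry { trace(arg0); ustack(); }'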
So a bit more about action statements. For kernel hackers, this is sort of a really interesting and important feature of the language. If you know the name of a kernel variable, precede it with a backquote, and that becomes a variable known to the D language. You can do structure and array access.
If proc0 is known to be the name of a proc structure in the kernel, which it is, we can peer into it using the dot operator in just this fashion. If we hold a pointer to a proc structure, we can cast it and chase that pointer. Here's a more elaborate example of chasing down a chain of pointers and finally doing an array access on the cr_groups field.
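Sketches of those three forms (the field names here, like p_ucred and cr_groups, are assumptions based on Leopard-era kernel structures, not the exact slides):

    `proc0.p_pid                      /* field of a kernel global, via backquote */
    ((struct proc *)myptr)->p_pid     /* cast a held pointer and chase it */
    `proc0.p_ucred->cr_groups[0]      /* chase a chain, then an array access */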
So in all of this, it's evident that DTrace must be symbol savvy, know where the symbols are, and actually understand the types because it knows how to sort of glom through a structure and figure out offsets and know how to dig in and find particular fields. In particular, DTrace has a very deep knowledge of the Mach kernel. It knows every type definition, the form of every struct and union, and every global variable and external symbol.
Perhaps you've seen, on your Leopard systems, down at the root of the file system, /mach_kernel.ctfsys. That's what that's about. That's basically where that information is collated for use by DTrace. Up in user land, the pid provider, courtesy of the magic of the FastTrap business, can deal with unstripped binaries and find external symbols in your code, just like GDB.
Furthermore, if you've got types and structs and unions, DTrace with the uppercase -C argument will invoke the preprocessor and can bring in your regular C .h files. So you can include mytypes.h, for example, and then the D language program will have full access to the types, unions, and structures in mytypes.h. I believe we're on to... Oops.
Ah, yeah. So before doing this demo, I wanted to just set the stage a bit. Here's Hello World one more time, but using a little bit different set of libc interfaces. These are the file stream interfaces. So instead of calling the sort of simple printf, we're calling fprintf using the global variable stderr, which is declared in stdio.h to be a pointer to type FILE, uppercase.
And it turns out that within the FILE structure, there's a little integer field that we'll try to get our hands on and look at. And it'll basically demonstrate how we can peer into .h files with the D language. So here's the script that we'll use to do that. It too includes stdio.h, and sets a probe on the entry to fprintf.
It grabs hold of arg0 when that probe fires. That'll be standard error, the pointer to the FILE structure. We're going to copy that into the DTrace virtual machine, get a pointer to it called pfile, and then trace out that small integer file descriptor that's held in the _file component of the big FILE structure. Now we're ready for the demo.
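A sketch of what that script might look like (the name pfile comes from the narration; the rest is reconstruction):

    /* run with something like: sudo dtrace -C -c ./hello_fprintf -s script.d */
    #include <stdio.h>

    pid$target::fprintf:entry
    {
        /* copy the FILE structure from user land into the D virtual machine */
        this->pfile = (FILE *)copyin(arg0, sizeof (FILE));
        trace(this->pfile->_file);   /* the small integer file descriptor */
    }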
There's the program. There's the D. And here we'll run this, taking advantage of dash C twice, both uppercase and lowercase, to both run the preprocessor, to pull in the definitions in stdio.h, and fire up hello fprintf. See the fprintf entry point encountered. Grab arg0 and trace the _file field. And sure enough, that's two, right? Standard in, standard out, standard error. Zero-based. All right, back to the slides one more time.
So I've sort of been putting off talking about copyin, and actually a good way to do that is to take sort of the broad principles-of-operation picture of DTrace, sort of the structure and interpretation of D programs. It goes something like this. We have a D program, a script, a file, or perhaps something right on the command line, that we feed into the dtrace command line program. It does its language processing and passes that down into the in-kernel DTrace virtual machine.
There the actions are stored off, the probes are analyzed, and the subject matter expert is consulted, in this case FastTrap. We're trying to reach up into user land and set a probe in libc. Everything gets prepared. HelloFP starts to run. The trap has been set by FastTrap. fprintf gets called, passing as arg0 stderr, a pointer to a FILE structure.
At that point, FastTrap takes over, jumps into the interpreter in the virtual machine, and the action begins to be executed. What's the action? Copyin. From the given pointer in user land, that FILE structure is up there in your address space, copy that somewhere into the D virtual machine, 88 bytes worth.
There we go. So now, in the same address space as the action that's running, the same DTrace virtual machine, we have the FILE structure. We now hold a pointer to it, so we can dereference it and trace the value out, spitting it back out. If that made sense to you, you've just cleared the biggest hurdle in understanding how DTrace works. That's the model to keep in your head, and it will keep you out of all sorts of trouble, and I think make your development as a D programmer go much more quickly.
A few more words about language components. Variables. Globals like foo and bar, you just mention them, they come to life, and they have global scope. Here we actually do declare one, just for documentation purposes or whatnot, and for initialization. We initialize firstopen to be zero, and this probe will fire when the open system call is encountered by an execution of Perl and it's the very first open that Perl has done. After that, firstopen gets assigned a timestamp, which is usually a non-zero value, and that probe will no longer pass into its action phase if fired. It'll be filtered out.
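A sketch of that first-open clause:

    syscall::open:entry
    /execname == "perl" && firstopen == 0/
    {
        firstopen = timestamp;   /* global; zero-initialized, so only the first open matches */
    }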
There are thread-local variables. It may be interesting, as threads course through the system, different system calls, different points in your code, to collect information and maintain that state through the lifetime of the execution of a thread. Those are denoted by the self-> prefix. And here's an example. So again using the syscall provider, when Perl calls open on a particular thread, a little piece of state that's forever associated with that thread, called self->numThreadOpens, is incremented.
So there may be other threads that Perl spawns doing other opens. These will be kept sorted out. There are also probe-local variables. These are introduced by the prefix this->. They're basically alive for the extent of the curly braces. After that, they disappear. Useful for intermediate computations. You just want to hold some computation, not have to repeat it. Just introduce a local variable.
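Sketches of both scopes in one clause (hypothetical):

    syscall::open:entry
    /execname == "perl"/
    {
        self->numThreadOpens++;        /* thread-local: one counter per thread */
        this->path = copyinstr(arg0);  /* probe-local: gone after the braces */
    }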
OK, so sort of orthogonal to the scoping are the shapes. You've seen a bunch of scalars: the global foo, the thread-local self->bar, the probe-local this->baz. We talked a little bit about associative arrays. They use the familiar square brackets, but the key value there, in this case fd, can be any type known to the D language.
And then we have aggregations. These look like associative arrays, but are introduced with the @ sign. They have a very special purpose in D. As I hinted at earlier, the problem that needs to be solved is that we've got multiple processors running simultaneously. We could have identical probes fire essentially at the same time on all those processors.
How do we arbitrate access to the state we're trying to collect? How do we keep one coherent copy of the state? Well, we've kind of fudged it a little bit, and we use something that's familiar from big MP scientific programs, use a reduction operator. Basically, we can maintain copies of that state on each processor, not have to worry about arbitrating between it, but we can only do specific kinds of functions, make specific kinds of arithmetic updates to that state that can eventually be combined at the bottom to reduce, well, it's a reduction operator, to produce the values we're looking for.
These are the arithmetic functions that are permitted: count, sum, avg, min, max, and the quantize operators. They all have this property that they can be computed piecewise on each processor and then combined at the end.
So an interesting Unix function is mmap. It asks the kernel to give us some new piece of virtual memory, perhaps backed by a file, perhaps just anonymous zero-filled memory. It's going on all the time on these systems. It would not be unusual to see mmaps firing simultaneously on multiple processors. Suppose we wanted to just sort of sum up all the mmap activity going on on the system. Here's one way to get after that. It uses aggregates, of course.
So the probe descriptor is syscall::mmap:entry. We're going to collect, in an aggregation introduced by the at sign, the sum of the size argument passed to mmap. That's how much we want to map. And we're going to sort of index that by the name of the process that did the mmap. So off we go. Let's see. Let's open up Address Book again and quit that. Come back here and see what we got.
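A sketch of that script (mmap's size is its second argument, arg1):

    sudo dtrace -n 'syscall::mmap:entry { @bytes[execname] = sum(arg1); }'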
So again, on the far side, the name of the process where the probe fired. On the side closer to me, the aggregate sum, across all eight processors on this machine, of the sizes of the mmap calls, aggregated together and summed. Address Book weighing in at a little bit more than 74 megabytes to get itself started. And I wonder if mds also started up something afresh here, since it has just about the same size. I suspect that's mainly the mapping in of the frameworks that we're seeing.
So there's a big number, sort of the big sum. Suppose we wanted to look a little bit more carefully at the distribution of mmap sizes. Are we seeing lots of small ones, a bimodal, a trimodal mixture? Let's take a look. Clear the screen. And here we'll use the quantize aggregating function.
Same probe points, collected the same way, but the function that we're using on each of the eight processors is collect the distribution, and at the end we'll bring those together to present them on the screen. So off we go. Address book again. Workhorse. Quit that. Come back here. Bloop. All right.
If you turn your head sideways, what you get is a little histogram, with bin sizes growing by powers of two, that tells us that, well, Address Book made about 49 mmap calls for 131K chunks. It made 81 for 16K chunks, and so forth. And it's actually sorted this out by all the processes that were running on the system and making mmap calls at the time. It looks like Address Book was making the most. Sure, it was being launched. One might expect that.
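The quantize variant, sketched:

    sudo dtrace -n 'syscall::mmap:entry { @sizes[execname] = quantize(arg1); }'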
[Transcript missing]
The key value to an associative array can be quite complicated. Here it's the vector of the first six addresses that appear in a stack. So it's possible to focus in on where particular mmap calls are being made. Let's look and see how that works. It's probably easier seen than spoken over. One more time. Let's use Safari this time. Maybe we'll see some places we recognize.
So we get this collection of backtraces, six deep, right? We asked for six-deep user stack traces, sized and organized by the amount of mmap activity that took place at that stack site, at that call site. And as a bonus, the stacks have been symbolicated. So here, it looks like the libc malloc has asked for about 16 megabytes. I happen to know that's two zones of a particular kind.
Something called CacheFMMapData, called into mmap from CarbonCore, another guy who wants 16 megabytes of data, and so forth. So this is a very powerful facility, allowing, you know, some really interesting compositing of data across the whole machine, filtered on a process of interest, and displayed in a sort of immediate, attention-grabbing way. You get to see the big numbers at the bottom right away. This is good. Sure.
So the key thing here was we were able to use something quite complicated as the key value to this associative array. Every time that key appeared again, every time that call stack was run through and made the mmap call from the identical place, the sum got added in there. It could have happened on any of the processors, could have happened simultaneously; it's all taken care of by the aggregation. At the end, all that's pulled together and we get these big numbers that order the particular chunks of stack traces. Back to the slides, please.
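A sketch of the kind of script that produces that output (filtering to Safari is an assumption; the transcript doesn't show the exact predicate):

    sudo dtrace -n 'syscall::mmap:entry /execname == "Safari"/ { @bytes[ustack(6)] = sum(arg1); }'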
So a change of gears here. By this point, I expect everybody will be a fruitful DTrace programmer, at least understand the elements of the D language. A lot of work has been done in creating the DTrace toolkit to sort of move you along even more quickly. So the toolkit is freeware. It was made for Solaris, but we found that many of the scripts were interesting, useful, and appropriate for Mac OS X, so we were able to quickly port them for Leopard. These are just handy analysis tools for common investigations, just like the things we've been doing throughout this hour.
They consist of D scripts handed directly to the DTrace interpreter, or D scripts wrapped by shell scripts that do some argument marshaling and pretty printing. They're easily customized. The source is present on the system. You can look at them, copy them, customize them. Browse and learn. They're a great way to learn sort of the deeper techniques in D programming. We've got 46 of these things on the system now. They sit in /usr/bin. They've got man pages and examples. And here's the roster.
You're not expected to remember that. What would be helpful to remember is that among the most useful of these scripts is one called dtruss. There's sort of a Solaris history around the name truss. It's a way to look at system calls. It works just fine on Mac OS X. The name of the command is dtruss.
You give it the -h argument and it describes itself. And it's a way to get after system calls. And you can get information filtered down by process ID, by process name, or actually execute a command directly under dtruss and watch all its system call activity. So you know underneath the covers there's a dtrace -c lurking somewhere.
You can drill down and look at just one system call. We can aggregate, using the little lowercase -c option, to get call counts aggregated by the probe function, the particular system call. And here's a convenient way to catch times going in and out of system calls: the -d, -e, and -o options. And finally, you might want to catch the stack backtraces in your program where the system calls are being made from: -s. So you know there's a stack action somewhere buried underneath this. Okay, the last demo of the afternoon is to take a quick look at dtruss.
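A few sample invocations (the pid and process names here are hypothetical):

    sudo dtruss -p 1017            # filter by process ID
    sudo dtruss -n Dock            # filter by process name
    sudo dtruss ls                 # launch a command under dtruss
    sudo dtruss -t open -n Dock    # drill down to one system call
    sudo dtruss -c -n Dock         # aggregate call counts by syscall
    sudo dtruss -d -e -o -n Dock   # relative, elapsed, and on-CPU times
    sudo dtruss -s -n Dock         # capture user stack backtraces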
Let's see what it's doing. I've got to get that out of the way. Clear this. So here's dtruss. We're going to filter down to just look at what the Dock is doing in terms of system calls. And at the end, we'd like to aggregate and just get a rough summary of how many system calls are being made, and where. So let's make the Dock do something. So tell me about, whoa, just even touching the thing, even touching it, sets off a wonderful collection of system calls.
Can you fan that up for me? Great. All right. So there's a bunch going on here. And if we quit this, sure enough, interesting, we see mmaps and munmaps as the most frequent system calls made in that short interval of time working with the Dock. Cool. All right. Back to the slides for a few more words.
Sort of a taxonomy of the DTrace providers, just to sum up here. The PID and Objective-C providers, we can see function entry points, returns, and
[Transcript missing]
It's possible to place prepositioned probes in your frameworks and your programs. These are called static user probes, much like the Core Data probe I showed at the very outset. There's a bit more to be learned about that; go see the advanced talk in session 935. All those are courtesy of the FastTrap provider.
There's a profile provider that lets you do statistical sampling. Basically it fires off at a fixed rate, and when the probe fires you can collect a stack, for example, and see what's going on in the system, or collect some timestamp data of interest. plockstat gives the locking statistics for the userland pthread library. And finally, in a variety of open source software packages, we're seeing prepositioned probe points to learn interesting things about the behavior of Perl, Python, and Ruby. So you can see how your Perl scripts are behaving and so forth.
Down in the kernel, we've talked a great deal about the syscall provider. There's sort of a complementary facility for the Mach side of our kernel. Mach defines some traps, like Mach message send and Mach message receive, and mach_trap is a provider that can enable and show some information about those probes. And the FBT provider, which we didn't exercise at all today, offers very deep, detailed insight into the behavior of the kernel by instrumenting all, or nearly all, the C entry points and returns.
The proc provider, based in the kernel, gives interesting insight into the life events of a process: process start, process exit, process waiting. io: start and finish times and the volume of I/O being conducted. vminfo: the virtual memory system. lockstat: internal kernel locking. So to summarize, what I hope you've gotten out of this talk is that DTrace engenders an approach to tracing, diagnosis, and observation that's incremental, exploratory, improvisational, and quite a bit of fun. It's just great to dive into your system in interesting and new ways.
Aggregation is an enormously powerful tool. It very, very quickly cuts and dices the information in a way that makes it digestible. Often just looking at the bottom few entries of an aggregate can tell you quite a bit about a question you have about the system. You'll find all these DTrace facilities in Leopard and Snow Leopard. And for more information, consult these sources.