WWDC07 • Session 106

Fundamentals of Kernel Debugging

Mac OS X Essentials • 59:21

Learn about a variety of kernel analysis and debugging techniques including common types and causes of kernel panics, decoding kernel panic logs, and analyzing deadlocks. Find out what tools are available for kernel debugging such as two-machine debugging, kernel debugging macros, and the new single-machine live kernel debugger. Gain insight into kernel tracing and examination tools like trace, latency, and the new spindump and stackshot tools.

Speaker: Derek Kumar

Unlisted on Apple Developer site

Transcript

This transcript has potential transcription errors. We are working on an improved version.

Hello everyone, I'm Derek, and welcome to Fundamentals of Kernel Debugging. And so, if this is not the session you were looking for this is your last chance to escape.

( Laughter )

And many of you just came over from the Kernel Programming session and I'm sure that it was very instructive. And one thing to note is that if you don't keep all of the guidelines you were told about in that session in mind, this is what happens.

( Laughter )

I'm sure you will all agree that it is a very pretty screen and it takes up a lot of kernel wired memory, so I hope that you all appreciate it, but it's not a good user experience. One thing to note is that if you enable kernel debugging you'll get something less pretty but more useful on the screen.

A lot of cryptic numbers, which in this session you will learn how to recognize, you know, all of that stuff on the screen, and understand it completely and, you know, be an expert kernel debugger afterwards, or so I hope.

( Laughter )

And so what we are going to cover today are several new topics. So first is how to get started with kernel debugging on Mac OS X. We do have a considerable amount of kernel debugging infrastructure in our operating system. And also how to go ahead and analyze kernel panics and hangs, so techniques to do all of that, and also some new technologies in Leopard.

Who should be here? So, you know, if you are curious about the kernel, fine, but the people who will find this session most useful are kernel extension developers. So if you're writing drivers, file systems or other things, you know, which operate in the kernel address space for Mac OS X, you will find this pretty useful.

And if you are an application developer interested in exploring how your application interacts with the kernel and if you notice any performance issues and things like that if you want to figure out how to exploit you know the kernel facilities better, this is a good way to get started, just to understand the interaction with the system as a whole.

And also if you are a system administrator and you are experiencing some sort of problem, or you want to figure out how exactly the kernel works to better deploy your resources, this is a good way to learn how to do a first order of triage or that sort of thing.

And anyone who's interested in understanding kernel internals: it's a large and complex system and there are many players in the kernel address space, so if you are interested in getting to understand what's going on in the system, this is a good way to get started. And there is a lot of useful Apple documentation on the developer.apple.com website, the ADC Reference Library and so on, and the WWDC attendee website.

I am not going to go through and talk about all of these but you know you can consult the slides later to figure out you know which of these would be useful for you, but one thing I especially recommend is that we put together a head start document for the session a couple of weeks ago, and that is on the WWDC website, and it's a great way to get started with kernel debugging on this operating system.

So if you haven't yet looked at our document, I strongly encourage you to go through that document; it's sort of a training exercise cum intro, so you can walk through that with your Leopard system and, you know, really get a good head start on getting started with kernel debugging. And there are a number of other titles on the ADC website which let you get started with kernel debugging. And these are some of them.

You don't have to memorize any of these, you can look at them later. And one thing I especially recommend for anyone who is interested in debugging, either user or kernel, is to look at this document called Mac OS X Calling Conventions for both PowerPC and Intel. It will really come in handy, not just for kernel debugging but any sort of debugging you do on the operating system. It's very good to have an understanding of what exactly it is that the processor does when you do a function call or when you are looking at arguments on the stack and so on. And it is a good bit of knowledge to have.

And also there is a pretty good manual for the GNU source level debugger which is what we will be using for most debugging on OS X, and I recommend that you take a look at that. And I also note that some of the technotes do need updates and those updates will be happening shortly I believe. And you know you can consult both this session and the Head Start document which will have the latest bits.

What's new in Leopard? So the first thing that's new, and that's going to affect a lot of you who have been debugging on Tiger, is the DWARF debugging format, and this is the brave new debugging format which is replacing the old and tired stabs debugging format, which was the default for kernel debugging back in Tiger.

And so DWARF, the name is actually a pun on an object format that we don't use at Apple, namely ELF. I don't know what it is about these writers, they come up with names like Mach-O, and ELF and DWARF, but I digress. So there is a new command called add-kext which replaces add-symbol-file, which is what you used to use back in Tiger. And that operates on three files.

The first is the DWARF dSYM bundle that Xcode will generate for you if you tell it to, and then you need to generate the relocated symbol file for your driver using the kextload command, which is what you used to do back in Tiger as well, so there is nothing new there.

And there's a kext bundle with debug symbols, which is again something that you should have had back in Tiger. So all of these need to be in the same directory and you will just pass the path of your kext bundle to this new add-kext command. And it will do all of the magic to figure things out. And there's a detailed description of this in the Head Start document.
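
So, roughly, the flow looks something like this; the bundle identifier, load address, and paths here are just placeholders for illustration, so substitute your own:

    # Generate the relocated symbol file for your kext (bundle ID and address are hypothetical)
    kextload -s /tmp -n -a com.example.MyDriver@0x2ba7000 /tmp/MyDriver.kext

    # Then, in GDB, after connecting to the target, pull in the kext's symbols
    (gdb) add-kext /tmp/MyDriver.kext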

And one bit of breaking news is that there's a Kernel Debug Kit with the debug symbols for the mach_kernel file, which, you know, is the operating system kernel. And there is a critical GDB update in that Kernel Debug Kit which you should apply if you want to do kernel debugging using the Leopard Seed.

There were some last minute changes to the DWARF parts there and so on and so forth which are pretty essential to kernel debugging. So please download that Kernel Debug Kit and apply that GDB update before getting started with kernel debugging on the seed. You don't need to do that if you are using the old 9A410, but obviously you want to use the latest and greatest Leopard.

And so this is a screenshot of the Xcode build settings pane, and this is what you'll need to do to enable DWARF debugging. You need to select the DWARF with dSYM file debug information format. And you also need to ensure that the generate debugging information check box is checked when you are configuring your driver project in Xcode.

And that will generate a separate bundle called <your kext name>.dSYM, and that's what you'll need to use when you're using the add-kext command in GDB. And so, new tools in Leopard: there is a new tool called the Live Kernel Debugger. It lets you look at the running kernel on the same system as it's doing its thing, so it can be pretty useful in some situations, especially if you are interested in nondestructively debugging the kernel, you know, while it's doing useful things, rather than interrupting it with an NMI.

And there is a fabulous new tool called DTrace in Leopard. There are going to be seven sessions describing that, but I'll just give you a brief overview of it. And there are some specialized tools known as stackshot and spindump which are also new in Leopard, and if time permits I will describe those briefly.

And DTrace, I am sure a lot of you have heard of it, but if you haven't it's very useful to take a good look at DTrace and see if it can help you with your driver or other kernel facility. It's ported from OpenSolaris and it's a dynamic tracing facility. That means that you don't have to recompile your kernel or add any instrumentation. All of the DTrace instrumentation can be added on the fly, and that's your customized probes that you insert into kernel functions.

And these probes are dynamically inserted, and there is a special scripting language, called the D scripting language, which lets you customize what exactly happens when those probes fire. And the DTrace model is that it's a hub of several different providers, and there are several different kernel providers including the function boundary tracing provider, vminfo, syscall and io, and a few others. And so these are predefined providers which have predefined probe points within the kernel that you can use to explore VM behavior or system calls and so on and so forth.

And I will note that Kernel Extension support is currently rather limited but you know it is still pretty useful. For example; if you are getting an error from some layer of the kernel and you want to figure out why it is that you are getting this error and which functions ultimately responsible for generating that error you can use DTrace. And if you want to figure out what exactly happens when you call in to the kernel for something, this is a good way to get started with that.
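
Just as a rough sketch of what that looks like, you could run a one-liner like this from the command line; the function name in the probe is only an illustration, so pick whichever kernel function you actually care about:

    # Count the kernel stacks that lead into a given function (function name is hypothetical)
    sudo dtrace -n 'fbt::vnode_open:entry { @[stack()] = count(); }'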

And also, I should note that, you know, these DTrace probes do have an observer effect, so you should keep that in mind. And there's a dedicated DTrace session, Session Number 315, Tracing Software Behavior With DTrace, which I encourage you to go to, and there's also very good documentation on DTrace, so it's a good technology to learn. And there's also a front end to DTrace, called Xray, and there's a separate session on that, and it's very cool looking and very useful.

And there's also something we've had for some time now, Shark, and the CHUD framework. It's great for performance analysis, it lets you do a lot of things, and it's not just limited to what I have here on this slide, it's pretty complex. And I should note that you can statistically sample kernel stacks as well as user stacks, and this is something that not very many people know. So, if you are doing performance analysis, kernel performance analysis, Shark is a good thing to have.

And it also lets you examine processor performance counters, and it is also aware of select kernel events like system calls and page faults, including copy-on-writes, page-ins, page-outs and lots of other things, and it's a great way to do performance analysis on OS X, including for your driver. And there's a dedicated Shark session, Session Number 316, Performance Tuning with Shark. And it's a good session to attend to get a feel of what Shark can do.

So the primary interface to kernel debugging on Mac OS X is GDB, the GNU source level debugger. And we also have a number of kernel debugging macros to automate a lot and to present a lot of useful information without having to grovel through the internals of the system.

So it's a standard source level debugger, and this is what you will use even if you are debugging user space applications with Xcode. And the GDB manual is a good way to get started with understanding this debugger and it contains a lot of useful tips. And in the Kernel Debug Kit that we supply, we supply a lot of GDB macros which, you know, introspect kernel data structures and let you look at the thread stacks and the scheduler states for all of the threads, and kernel wait queues and wait events and all of that.

And examine panic logs and the kernel system log ring buffer for IOLog or printf style messages. And there is a comprehensive list of these macros if you type in help kgm after loading them within GDB. And there is information on how to do all of this in the tech notes that I described earlier.
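
The flow there, once you're attached with GDB, is roughly along these lines; the Kernel Debug Kit path is just an example:

    (gdb) source /Volumes/KernelDebugKit/kgmacros    # load the kernel debugging macros
    (gdb) help kgm                                   # list all of the available macros
    (gdb) paniclog                                   # display the panic log, if there is one
    (gdb) systemlog                                  # display the kernel message buffer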

And to examine the kernel, there are three primary mechanisms we have on OS X. The first is the two-machine or interactive debugging environment. And this is what you'll probably do when you're actually developing your driver; you want to do quick turn-around debugging, you connect to the debugger on the target kernel using either FireWire or Ethernet.

And the kernel is interrupted at this point, either because it's panicked or you pressed the NMI or programmer's button on the system to force it into the debugger. And you can do this either over Ethernet or FireWire. And the second approach is to use kernel crashdumps, which is essentially a Mach core file which encapsulates all of the kernel state.

And this can also be obtained either on Ethernet or FireWire. And the third one is something that is new in Leopard, which is called the Live Kernel Debugger. That is not for post mortem debugging but for examining the kernel as you go along as it's actually doing work. And it's pretty useful in some situations.

A brief overview of two-machine debugging: as I mentioned, it will quickly become a staple of your development process. And it is an interactive debugger; you can set breakpoints and have step control over the kernel, so you can step over instructions and so on. This is the kernel, so you have to remember that there are some critical portions of the kernel which don't take kindly to this sort of thing, but most of your driver code should be immune to it; just something to keep in mind.

And it can also do post-mortem debugging, so if your system has panicked or if it is hung, for instance, you can connect to the system and examine all of the kernel data structures and structures within your driver. And there are some things two-machine debugging can do which aren't available with the other two approaches, such as letting you examine physical memory directly, so you can pass it a physical address and it can read the contents of that physical address.

And you can also set some options to let you look at memory mapped device space, so if you want to look at the registers on your device, for instance, you can do that using the two-machine debugger. And also you can look at user pages if they're paged in, so if you want to look at the stack trace for a user thread that's called into the kernel, for instance, so if you have a user component to your driver, this is a good way to find out what exactly it was doing to cause whatever situation it was that you were interested in looking at. And this can use both Ethernet and FireWire.

And crashdumps: there's a pretty good tech note which describes how to set up kernel crashdumps. So again you can do this over Ethernet or FireWire. And for FireWire, you will need to download a special program from the FireWire SDK, called FireWire CoreDumper. And you will just need to run this on the host system which you are debugging from.

And for Ethernet, there is a special daemon which will receive all of the kernel crashdump packets from the target. It's called kdumpd, and Leopard ships with a preconfigured launchd plist to enable this kdumpd launchd daemon. And on the target system, what you would do is either enable crashdumps permanently via the kernel boot-args, which are stored in NVRAM, or you can enable it interactively after connecting with the two-machine debugger.

So for example, at Apple we don't enable kernel crashdumps for every system on campus, because that would not scale very well. But we do, you know, when people tell us that they have a panic, we do connect to that system with the kernel debugger and then interactively trigger a crashdump so that we can look at the information later. And you can enable this either using the nvram command, or on an Intel-based Mac you can use the Boot.plist file to enable this on a case-by-case basis.
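
As a rough example of what enabling it permanently looks like on the target, assuming your panic server's address is 10.0.1.5 (double-check the exact debug flag values against the crashdump tech note):

    # Enable kernel core dumps over Ethernet to a panic server (address and flags are examples)
    sudo nvram boot-args="debug=0xd44 _panicd_ip=10.0.1.5"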

And so a brief comparison of these two techniques. So crashdumps: if you are receiving information from a customer's site, for instance, you can't easily do interactive two-machine debugging; although it is possible to debug a machine in Japan, it's not very pleasant. And plus they will need to keep it around for you for however long it takes you to look at the system.

So what you would tell them to do is set up a crashdump server and have them generate a crashdump and send it over to you. I will note that these files tend to be large, so you probably don't want to have them email these to you, so set up an FTP server or something like that.

And interactive debugging is primarily useful in the development phase, or if you have an in-house QA team which is reproducing issues for you. It's really good to interactively debug so that you can look at your device registers and device state and that sort of thing. And it also lets you set breakpoints and, you know, step through your code, so that is a pretty useful thing to have.

And comparing the two transports, FireWire does have some advantages over Ethernet, and some disadvantages in that, you know, you have to be physically in close proximity to the system. But some of the advantages are that it is available earlier in the boot process. Ethernet does have to be configured, and the driver for the network card has to be loaded, for Ethernet debugging.

So FireWire, the FireWire debugging facility, is specifically tailored to load early in the boot process. And it also remains available longer during sleep/wake. So if you are debugging sleep/wake issues, FireWire debugging and the FireWire logging facility, FireWire KPrintf, those are good things to familiarize yourself with.

And it's also faster, FireWire is a faster transport, plus it's, you know, close to your machine, so that's sort of a given. And it may be the only option when you're debugging a network device driver and that driver is bound to the primary interface, or if, you know, you don't want to perturb the state of the network stack too much.

And it may be preferable for security reasons, because the Ethernet packets are being transported in the clear, whereas FireWire is a point-to-point transport and it's going over the cable that you have connected to your machines. I will also mention FireWire KPrintf; that's a high speed logging facility that is very useful for tracing events from within your kernel extension, so that you can log things and look at them as they happen.

And there's also a specialized tool called FireWire GDB, available from the FireWire SDK. This has no analog within the Ethernet transport, and that's basically a facility which uses FireWire DMA to look at the state of the system after it has panicked. And this may be your only option in cases where your machine is irretrievably hung and you can't force the debugger using the NMI button, or if it's not able to enter the debugger for some reason.

For instance, if the boot processor, which is where Mac OS X services all external interrupts, is hung with interrupts masked, you can't force it into the debugger, so you might have better luck with FireWire GDB to look at all the memory on the system, because it's FireWire DMA.

One thing to note in Leopard is that you may need to add the high memory mode equals one boot-arg. That's something we are looking at and it will be addressed in time for the release, but it's just something to keep in mind if you have a 64-bit system with lots of memory. And FireWire GDB does disable the rest of the FireWire stack; that's not true of FireWire KDP, which is the regular two-machine debugging transport, and also the FireWire crashdump facility, which doesn't disable other FireWire devices either.

And so the live kernel debugger is something that's new in Leopard. I am looking for better names to describe this facility, so if you have any suggestions feel free to let me know. So it's minimally intrusive. It doesn't require the target kernel to be stopped, and it's obviously not for post-mortem debugging, but it's a good way to explore what's going on with the system right now without having to pause the system and, you know, disrupt whatever it's doing, and it only requires one machine.

So this is the way you would go about configuring this debugger. These instructions will be in a tech note at some point. So you would set the kernel boot-arg kmem=1 with nvram; it does use the /dev/kmem interface to address kernel memory. And any other kernel debugging boot-args you might want to add, that line is a good place to put those in.

And you restart, and then you start up GDB, using the symbols from the Kernel Debug Kit that you will download from the attendee website, and it does need to be started as the superuser. Obviously you're examining kernel memory, so it's a privileged operation. You would say target darwin-kernel, as opposed to target remote-kdp for the two-machine debugging case, and then attach, and then you're connected to the debugger on the system. Then you would load your driver symbols and kernel debugging macros as you would with the regular two-machine or crashdump facility.
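
So, concretely, a session would look roughly like this; the Kernel Debug Kit path is just an example:

    sudo nvram boot-args="kmem=1"     # enable the /dev/kmem interface, then restart
    # after the restart:
    sudo gdb /Volumes/KernelDebugKit/mach_kernel
    (gdb) target darwin-kernel
    (gdb) attach
    (gdb) source /Volumes/KernelDebugKit/kgmacros
    (gdb) showallstacks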

And most kernel debugging macros can be used, except the ones that actively alter state in the kernel. That's something that's not good to do when the kernel's running. Also, something you should keep in mind when using this facility is that the data you see is not necessarily coherent, because it might be changing actively while you're looking at it, especially with kernel stacks.

So if the thread is blocked for some reason, which is typically why you would start up this facility, to explore why something is stuck in kernel space, the thread stacks tend to be stable. One good way to ensure that things are stable is just to look at them two or three times in succession to ensure that it's not changing actively.

But a lot of, you know, facilities will be pretty much the same as two-machine debugging, and you can explore the state of the system using this kind of live kernel debugging. And it's read-only; we decided to make that the case because, you know, you can seriously compromise the stability of the system using this facility. And unfortunately I have to go through these all over again.

And it's a great learning tool so if you want to understand how the kernel reacts to some external event or to some user initiated event that you can do side by side while having this facility running, you can just figure out what exactly the kernel does in response assuming your timing permits that.

And it also lets you analyze blocked threads and analyze kernel data structures, for example, to figure out what it is that's not releasing a certain ref count or not letting you unmount a file system, something like that. And it also lets you do low impact logging, so you just have your driver log a lot of data to a ring buffer, for instance, in the kernel, and you can pull that data out using this facility, and it can be completely non-blocking, and it's a good way to do, you know, non-disruptive logging and tracing.

And it also obviates the need for you to write user interfaces to any statistics and other data you might want to export from your driver. So if you want to keep statistics on the number of packets you've sent or processed, you don't need to write an interface or add another interface to pull this out. You can just look at kernel memory directly.

Obviously if you want to ship that to your users it's not a good idea to tell them to start up the debugger. So, you know, it's mostly for interactive, you know, debugging while you're developing your application. And it lets you introspect, you know, kernel data structures as they're actively changing, non-intrusively.

So moving on to kernel panics. So let's go through and look at the types of panics that could occur on the system and what exactly panics are and what they mean. So what triggers a panic? The first case is a processor generated exception which the kernel was unable to service. So this could include page faults, for instance; it's just like user space where you could get a SIGSEGV or SIGBUS signal when you're touching memory that you haven't allocated, you know, the analogy holds for the kernel.

And if the page is protected or not there, you will get a page fault and the kernel will throw up its hands and say, Oh, you didn't allocate this memory, so this is probably an error, so here's a panic. And if you attempt to execute an illegal instruction, on Intel you can often see panics which are kind of like this, but because of the variable instruction size you may end up at an instruction boundary which actually looks to the processor like a valid instruction, but it's not the instruction that you intended for the processor to execute, so you might get a panic further down the line. That's something to keep in mind.

And so, of interest to driver developers, you might see machine check exceptions, and that's typically an error in the memory subsystem like, for example, an error on the transport bus that you're using to access your device. If your device is not ready and doesn't respond to bus transactions or loads or stores from the processor in time, you might get a machine check. But it also might signal a hardware error. Some exotic possibilities include L2 cache parity errors and things like that.

So, you know, there are documents from the processor manufacturers like Intel and IBM describing exactly what these machine checks are caused by, and it's good to look through those documents. And in Leopard we do log additional data about machine checks. So it's a good way to understand, you know, examining the data to understand what exactly could have triggered that machine check. Typically it happens when you access your device when it's not yet ready or if it's wedged for some reason.

And on Intel-based Macs you do have a few other exceptions that could occur. So you can have division exceptions, divide overflow or divide by zero and things like that. You're probably not doing this in your kernel extension unless you're doing something like audio, for instance, but it's good to know.

And general protection faults, those can be triggered by a variety of conditions. I encourage you to look up the processor manual to figure out what exactly could have caused the general protection fault. And you could also see double faults in some circumstances. Typically that means you have overflowed your kernel stack. That's signaled by a double fault because we keep a guard page at the edge of the kernel stack.

You can also have assertion style panics. Typically this happens when the kernel detects some sort of inconsistency at run time and it decides it can't proceed because of this erroneous condition, and it throws up its hands and says, OK, assertion failed here, we can't continue any further. And it's the responsibility of the driver developer, or whatever it is that triggered that assertion, to fix that issue.

For example, if you decide to acquire a spin lock, for instance, we do use timed spin locks in the kernel, so if the timeout expires on the spin lock acquisition you will get a panic. And there are a variety of different lock manipulation errors which can be signaled by an assertion failure type panic.

And Mac OS X does have a fully preemptible kernel, but under certain circumstances kernel preemption is disabled, such as when you hold a spin lock, and if you attempt to block from that context, or from an interrupt context, for instance, someone asked about that in the earlier session, you might see this sort of panic. And there are out of memory style panics, which are typically signaled by a zone allocator panic.

On Intel-based Macs we must maintain TLB coherence using software generated inter-processor interrupts. And the curious thing about this is that it almost acts as a software watchdog, so if one or more processors are spending way too much time with interrupts disabled, you will see this sort of issue.

So here we come to the infamous panic log. A lot of you have probably seen this on the screen or, you know, from the panic report after you restart your machine that says, Hi, the system had to restart unexpectedly and here is this cryptic looking report. So we'll go through and, you know, analyze what exactly is there in this panic log.

So the first line on this processor generated exception style panic is the type of exception. For instance, this is probably the most common hardware exception that you will see. It's a page fault. And, as I mentioned earlier, that's typically caused by touching unallocated memory or page zero, for instance, which we keep unmapped to catch pointer errors. And another thing to note is this is an Intel panic; there are analogs for all of this on PowerPC.

On Intel the CR2 register contains the address of the page that you were trying to load or store from when you got the page fault. So that can be useful in certain circumstances to figure out if you caused the panic by touching address zero or address something or the other or an offset from a certain structure pointer, which happened to be zero for instance. And the most important thing from this register display is the effective instruction pointer.

So that's the address of the instruction which triggered this fault. So in this case it might be a load or store to the address contained in the CR2 register. And for all these types of panics it's still pretty useful to know what the effective instruction pointer was.

And so you will also notice that in this particular panic, right below the backtrace, there is a little display saying kernel modules in backtrace. And so the names here have been changed to protect the guilty, but in this case com.apple.somedriver triggered, well, could have triggered the panic, because its addresses appear in the backtrace.

And you will notice that two addresses belong to the range described in the com.apple.somedriver line right below the backtrace. It doesn't necessarily mean that this particular driver was the proximate cause of the panic, but it's always good, you know, to verify that your driver wasn't performing some sort of erroneous operation which led in turn to a panic being triggered. This is just a general rule of thumb.

And in cases of memory corruption and things like that, you have to realize that the kernel is a large shared address space, so something else entirely could have triggered this issue, which shows up later as a panic with your driver in the backtrace. So don't panic if you see this panic with, you know, your kext in the backtrace, but do take a look and make sure it's not the case that your driver is performing some bad operation.

So here are a few assertion style panics. These are the panic messages that you will see when these inconsistencies are detected by the kernel. So the first one is something I briefly described earlier. It contains the string zalloc and the name of the zone which was exhausted. The kernel uses a zone allocator, and this is the allocator which backs many of the allocations that you're likely to be doing.

A lot of the allocators are backed by the kernel zone allocator. This is somewhat similar to a slab allocator, which you might have seen on other operating systems. And basically it's an out-of-memory situation: the kernel couldn't find enough pages to satisfy this allocation request and it threw up its hands and said, OK, out of memory and we can't proceed further.

And another common sort of issue that you might see while developing your kernel extension, hopefully not in release drivers, is this particular sort of cryptic looking message: thread invoke preemption level N, and N can be one or two, some low integer. And that particular message means that you or someone else tried to block while kernel preemption was disabled, and that's an illegal operation, because something has previously signaled to the kernel that this is, you know, an operation that is uninterruptible by the kernel scheduler, and yet that particular thread is trying to block.

And one common possibility is that you've taken a spin lock and you're trying to acquire a kernel mutex or, you know, do some sort of blocking operation directly or indirectly while holding that spin lock, or you could be trying to block from an interrupt context.

This could also be signaled by this particular style of panic, and please pay attention to those, because those are pretty serious errors and, you know, you may not always see this sort of issue, especially if you're trying to, say, acquire a mutex while preemption is disabled, because 99 percent of the time that particular mutex might have been free and only one percent of the time did it actually cause the mutex lock routine to block. So it's a good thing to analyze thoroughly.

As I mentioned earlier, kernel spin locks do have deadlines. So if you're acquiring a spin lock either directly or indirectly and for some reason that spin lock cannot be acquired, you might see this particular type of panic, simple lock deadlock detection. And that means that the spin lock acquisition routine saw that the lock was not released for a pretty large interval, 100 plus milliseconds, and that could be either because someone's holding on to the lock for too long or there's some sort of memory corruption which has caused the lock data structure to be corrupted.

And as I mentioned earlier, on Intel we have to maintain TLB coherence using inter-processor interrupts. So, for instance, if one processor is stuck with interrupts masked for too long, eventually another processor which is doing some other operation will say, OK, that processor isn't responding when I try to talk to it, so this is probably something that's not good for the system as a whole, let's panic. And so there's one good tool to, you know, run every so often while you're developing your driver, especially if you have a routine that executes in the primary interrupt filter context.

So if you have supplied a routine, as Dean mentioned earlier, to handle shared interrupts well, for instance, that's a typical use. But if you're actually doing work in the primary interrupt context, it's good to run this tool called latency, which will show you the system's interrupt latency as a whole. So if you notice that this latency tool is showing excessive interrupt latency, typically, you know, more than 13 microseconds is kind of high.

It's good to figure out what exactly is causing that issue, if it is your driver. And latency can cause traces from the kernel tracing facility to be triggered when excessive interrupt latency is detected. Latency can also show you scheduler preemption latency. So it's a good tool to run once in a while.
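
Running it is simple; something along these lines:

    sudo latency    # live display of the system's interrupt and scheduler latency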

So here's a typical analysis workflow to, you know, deploy when you've detected a panic or a hang. So if the system has panicked, you would typically try and connect the debugger, assuming it is an accessible system. But, you know, if you're getting a panic log from the field, from your customer, and your customer hasn't set up kernel crashdumps, all you may have is that little screen we showed earlier. And often that's insufficient, but in many cases it might be sufficient to give you a handle on the problem.

So it's basically the backtrace for the thread that caused the fault. And as I mentioned, it's a large shared address space and there are many threads operating in this address space. But, you know, if it's a page fault you can at least tell which thread caused the page fault and, you know, which line of code caused the page fault, and you can eyeball the code and figure out if there are any conditions under which, you know, this pointer could be bad, for instance. And if that's not sufficient you can always go ahead and reproduce the issue in house.

And one little side note about debugging philosophy, if I may. So, you know, you can approach a debugging issue in one of two ways, or both ideally. One is the top down approach, where you figure out what exactly was going on at the time of the panic, for instance, and then try to instrument those code paths to figure out the data flow at the time of the panic, and, you know, approach the problem from the user space side of things or from the initial point at which this issue could have occurred.

And the other approach is the bottom up approach where you have a panic and you use the debugger to backtrack from the time of the panic to what exactly the root cause of the panic might be. And ideally you would employ both approaches, especially for the harder sort of problem, which isn't readily apparent.

The best way to solve those problems is to get, you know, a complete understanding of the picture and use both the debugger and your knowledge of your own code to understand the panic. There is no silver bullet to debugging. It completely depends on your having an understanding of your code. I'm sure I'm stressing the obvious but it's good to note.

If the system is hung, in other words, you have a, you know, a system that's not responsive to, you know, user events like mouse clicks and keyboard interrupts and so on and so forth, and you can't SSH into the system, it could mean that there's a kernel level hang, and, you know, it's always good to get those fixed as soon as possible.

So you would force the machine into the debugger using the NMI button. And depending on the system model, the NMI button, which is actually a bit of a misnomer, might lie at different spots, but on most recent machines, once you've configured kernel debugging, you would just hit the power switch on the machine and that will tell the kernel to enter the kernel debugger, and then you would attach with the kernel debugger.

One very useful kernel debugging macro, which you will find in the Kernel Debug Kit mentioned earlier, is the showallstacks macro, and this will essentially generate a stack trace for all kernel threads on the system. And so you can get a complete picture of what all kernel threads were doing at the time of the panic or hang.

And one thing to note is that, you know, we are a Mach-descended kernel, so we do employ this technique called kernel continuations. So some threads can discard their kernel stacks in favor of continuations. This helps, you know, save wired memory. And also, you know, if it's at all possible that you can restart an operation when receiving a certain signal without relying on information on the stack, you can tell the scheduler that you're discarding your kernel stack. And typically those threads with continuations are not very interesting from the hang or panic analysis perspective. If you're ever puzzled about what's going on, that's what's happening. I'll be happy to explain that in greater detail later.

And so I hope this is visible to you. So this is what you'll see. This is a fragment of the showallstacks output. So, for instance, the first thing I have circled here is the name of the process to which the kernel stack that's being displayed corresponds. So threads, you know, have both a user space and a kernel component. And if a thread is blocked it will have a kernel space component, the kernel stack. So this is, for instance, the window server process and this is its process ID.

So one important thing to note is that most of the kernel synchronization primitives are backed by condition variables, and in Mach terminology these are wait queues and wait events. So whenever you block on a mutex, for instance, it, you know, signals that it must be woken up by stating that, you know, it's waiting for a certain condition variable. And when the owner of the mutex unlocks that mutex, it will basically send a signal to the waiter on the condition variable using the wakeup mechanism.

And when I say signal, I don't mean the user space concept of a signal, but just a notification to the scheduler to wake up that thread. And this is basically the address of that condition variable. So if your thread is blocked, this is the condition variable it's waiting on. And if you have a deadlock, for instance, there can be numerous other threads with the same wait event.

So it's a good way to do first order analysis of which particular threads are blocked on which particular condition variable. There's also the scheduler state for the thread, which is displayed by the showallstacks command, and in this particular case the W character indicates that the scheduler thinks that this thread is currently in a wait state. It's waiting for a wakeup, basically.

And this is what I mentioned earlier with the kernel continuation. For example, this is a Mach IPC message queue receive continuation. So that particular thread has discarded its kernel stack, and when it's rescheduled it will resume executing using this continuation routine. And, yeah, that's basically the address of that continuation.

So, for instance, one good thing to note: whenever you have a zalloc style panic, or if you just want to figure out, you know, the breakdown of kernel memory in use, there's a pretty good macro called zprint.

This does have a user space analog that you can run without entering the kernel debugger, also called zprint. And it will display the statistics for the internal zone allocator, describing, you know, basically the breakdown of in-use elements and the maximum number of elements for each zone.

And so a lot of kernel allocation requests come from the kalloc allocator, and that basically has several dedicated zones, sized by the allocation size rounded up. So it's a good thing to look at the kalloc zones too, if you think you're using memory backed by the kalloc allocator, especially in situations where you have the zalloc zone exhaustion type panic. This is a good command to run.
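
For instance, the user space version can be run from the command line without entering the debugger; something like:

    zprint             # element sizes, in-use and maximum counts for every zone
    zprint kalloc      # just the kalloc zones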

There are also some I/O Kit specific kernel debugging macros, like showioalloc and showallclasses, which display the I/O Kit resources and the class instance counts that your driver is using, for instance. So it's a good way to determine what exactly you're using in terms of I/O Kit resources backed by the generic kernel mechanisms.

There are user space analogs to some of these. There's an ioclasscount command, which displays the instance count for I/O Kit classes. And going back to the memory allocation issue, there's a very useful user space command called vm_stat, which displays statistics for the kernel virtual memory manager.
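
For example, from the command line (the class name here is only a hypothetical one):

    ioclasscount MyDriverNub    # instance count for a particular I/O Kit class
    vm_stat                     # statistics from the kernel virtual memory manager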

One good thing to note is showallstacks just basically displays a macro-generated backtrace for each thread. If you want to examine the thread's state directly using GDB's backtrace command, you can use this kernel debugging macro called switchtoact. That basically tells the debugger to repoint its context to the state for the other thread that you're pointing it at, and you would pass the thread address, which you'll obtain from showallstacks, to this kernel debugging macro. And then you can use backtrace and commands like frame to switch down to a stack frame and examine locals and things like that using the standard GDB commands.
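
That part of the flow looks roughly like this; the thread address is whatever showallstacks printed for the thread you care about, so the value here is just an example:

    (gdb) showallstacks
    (gdb) switchtoact 0x31e8a00    # thread address copied from the showallstacks output (example value)
    (gdb) bt                       # backtrace now shows that thread's stack
    (gdb) frame 3                  # select a frame of interest
    (gdb) info locals              # examine its local variables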

If the system is hung, for instance, what you would do is examine all the kernel stacks displayed by showallstacks to figure out which threads are basically blocked waiting for some event to happen. And one particular shortcut you can employ to determine which threads are blocked on the same resource is to look at the wait events for the various threads.

And if one of those threads is blocked on a particular wait event and, you know, several other threads are blocked on the same wait event, it's possible that all of those threads will have similar looking stack backtraces, but you can end up waiting for a particular resource in multiple ways.

This is a good shortcut to employ: just look at the wait events to figure out which particular condition variable they're waiting to be woken up on. And you would choose one particular blocked thread to analyze, and then you can, you know, try and figure out which particular resource it's trying to acquire and figure out the owner of that resource.

There are many different possibilities for deadlocks. You can have the classic circular deadlock where two or more threads are involved and basically there's a cycle in the graph of their resource dependencies. And you could be trying to talk to a device that's unresponsive while holding a lock, for instance, and other threads are trying to access that particular resource or that lock, and they're all blocked waiting for the thread that's trying to access the device.

And if you have a user space dependency that a kernel thread is stuck on, which is not always a good idea, you could end up waiting for that user space dependency to be satisfied. You could have a process that's died for some reason in user space and you're waiting for that to get back to you, or you could be stalling other kernel events. And there are many other possibilities. Here's one particular sort of case study. It's a classic two-way deadlock that was detected recently by someone on our networking team, Adi, whom you might have heard speak earlier.

So there's a kernel thread from the networking stack which is trying to acquire a mutex in the UDP layer, and you will notice that there's a frame which says udp_lock. The udp_lock routine is in turn calling one of the kernel mutex primitives, lck_mtx_lock, and then the lock acquisition routine noticed that that mutex is not free and it proceeds to block waiting for a wakeup. So what you would do in this case is employ the switchtoact command to switch to this particular thread.

The thread hasn't panicked, you know, or been interrupted, so you would need to switch the debugger away from whatever thread it was interrupted in and switch to this particular thread. And then you can switch to the lck_mtx_lock frame with the debugger using the frame command and dump out the contents of that mutex lock structure.

We are going to be giving you macros to abstract away the internals of this lock structure, but for now you can just directly display the contents of this mutex lock. And you will notice that there's a locked field (lck_mtxd_locked), and that basically holds the address of the thread which has locked the mutex. So you will notice that there's a number, 0x3912804, in the locked field.

So subsequently you can, you know, display the state of the thread which owns that mutex, and then you can do the same thing again, switchtoact to that thread. And you will notice that that thread also is trying to acquire a mutex lock. So when you switch to that thread you will notice that it's trying to acquire a different mutex lock, which is owned by the first thread, which is trying to acquire the lock held by this thread.
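
As a rough sketch of that chase (the frame number, the mutex variable name, and the first thread address are hypothetical, and the field layout of the lock structure may differ, so treat this as illustrative only):

    (gdb) switchtoact 0x3a71d54            # the blocked thread, from showallstacks (example address)
    (gdb) bt
    (gdb) frame 1                          # the lck_mtx_lock frame
    (gdb) print *(lck_mtx_t *) mutex       # dump the mutex; the locked field names the owning thread
    (gdb) switchtoact 0x3912804            # the owner from the locked field; repeat the same steps
    (gdb) bt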

So this is a classic two way deadlock. Pretty simple to analyze but there can be more complicated variations on this. You can have, you know, ten different threads all trying to sort of chase their tails and trying to acquire mutexes which eventually are owned by something else in the chain.

And so this is how you'll typically get started trying to analyze hangs. There are numerous variations on these. I will note that we do have a kernel debugging lab this evening so if you're dealing with some particularly intractable issue we will be glad to help look at that issue and tell you how to go about debugging this sort of thing.

And one sort of good technique to employ which will ease debugging: whenever you notice an exceptional condition, especially in the development phase, it's good to log that condition. It might end up in the system log, but if you panic before that, at least you will have access to trace messages in the kernel system log ring buffer.

And, you know, obviously it's not good to be very verbose, and a lot of people are going to be looking at this panic log, so if your application or driver is very chatty, that's, you know, kind of annoying. There is a kernel routine called OSBacktrace, which lets you dynamically generate stack traces. So, for example, if you try to analyze a leak or something, for instance, or a refcounting issue, you can log a backtrace for, you know, each refcount increment, for instance, to figure out who is not letting go of that last reference count, or you can employ it in many different ways.

There is a kernel debugging macro in Leopard called systemlog, which dumps out the contents of the kernel's message buffer. So if you IOLog or printf from within your driver, for instance, that's where it'll end up. And depending on how verbose the rest of the system is, it might still be in memory. If not, it might have been pushed out to disk.

The other thing to note is that, you know, you can't always rely on the debugger displaying the right thing. It's an unfortunate but true sort of aphorism. And DWARF is a great step in the right direction; you know, stabs was kind of inadequate for the compiler and debugger to describe the complexities of optimized code.

DWARF does have a lot of support for describing things like live range splitting and register motion, all of that, to figure out, if you're interrupted at a particular point in the function, where exactly your local, which might be in a register, lives on the stack, for instance. So things are improving in that area, but this is why it's almost essential to have a good understanding of the ABI and calling conventions for the platforms that you're interested in.

So, you know, if you notice something that looks weird in a panic backtrace, it's always good to double check that using your knowledge of calling conventions for the platform. And, again, as a general rule of thumb, you know, always program defensively. You know, this might be as simple as checking the return value of malloc and, you know, checking that it's not null before trying to access that thing that you malloc'd.

But, you know, also more comprehensively, you know, try and make sure that you don't have any assumptions that cannot be relied upon. Obviously this takes a backseat to performance, so, you know, you can always have a development version of your driver that is sprinkled with lots of assertions that you can enable at run time during the development phase.

And some miscellaneous points. So this is stuff that, you know, we didn't cover elsewhere and doesn't really fit into any predefined category. You will notice that a lot of kernel behavior is controlled by boot-args. Kernel behavior can also be controlled by sysctls; there's a sysctl interface to a lot of these things.

To set kernel boot-args you can use the nvram command. That does require superuser privileges, and it does have a man page. So that basically writes to an area of flash on the device, and the kernel or the booter will read that on startup and pass it on to the kernel.
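
For example (the exact flags you'd want depend on your situation, so treat this as a sketch):

    sudo nvram boot-args="debug=0x144 -v"    # e.g. enable two-machine kernel debugging plus verbose boot
    sudo nvram -d boot-args                  # remove the boot-args again when you're done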

And on an Intel-based Mac you can also pass these kernel boot arguments using this particular plist. The behavior of the kernel debugger is, you know, mostly controlled by one particular boot-arg called the debug boot-arg. This is a pretty good list of what all the debug flags mean, and it's also in the kernel crashdump tech note, 2118, which I listed earlier.

Some other useful boot-args to know are the maxmem boot-arg, which caps the amount of physical memory available to the kernel. It's a good way to test your application or driver under low memory situations, for instance, without having to actually physically remove the DIMMs from the machine. And -v enables verbose boot, and, you know, a lot of the messages generated by the kernel and drivers end up on the screen during startup, for instance, and it's a good thing to look through while things are going on, especially if you're trying to determine why exactly your kernel isn't quite booting yet. Something might have logged something and you can look at it on screen.

And there's another boot-arg called cpus. It's an alternative to the CHUD preference pane. It basically lets you control the number of CPUs that are visible to the scheduler.

So if you suspect a sort of multithreaded race that happens under, you know, an MP sort of circumstance, you can say cpus=1 and restart the kernel, and that basically makes the kernel UP at that point, and you can figure out if the behavior of the issue that you're looking at varies in that circumstance.

There are a few options that you could use to trigger run time consistency checks for the zone allocator. One such option is the -zc boot-arg, and that basically tells the zone allocator to check that freed elements aren't being used after they've been freed. It's sort of analogous to what you would do with malloc in user space. And we are working to provide better memory analysis facilities in the future.

And other boot-args: there's a pretty useful boot-arg called io, and there are various options that you can set. It's basically a bit vector, and that controls a lot of I/O Kit debugging facilities. You can say IOLog should be synchronous, or log all registry modifications, for instance. And, you know, there are a lot of options here that can catch various, you know, I/O Kit level manipulation errors. So it's good to explore that. And you can look at the slide, or look at the source code when it's available.

That brings me to another point, and a lot of you are going to pounce on me for saying this, but we do release the source for a lot of our kernel, not always immediately, but it does get there eventually. So the available source is your best documentation. As, you know, who said use the force, Luke? Use the source, Luke.

And there's also a special kernel configuration called the debug kernel, and this is something we're going to look into releasing in the Kernel Debug Kit in the future, but for now, with the open source kernel you can, you know, build it in debug mode, and that enables a lot of run time consistency checks and assertions, especially in terms of locking and VM level operations. It does catch, you know, a lot of frequent oversights, like releasing a mutex that you don't own and things like that.

Or if you have a recursive mutex, you know, you are the only one who is allowed to lock it more than once, and if someone else is locking it, that will be signaled. So for more information you can contact Craig Keithley, who is the I/O Technology Evangelist for this track. And there's documentation for all of this, and you can always download my slides later and look at those, and you can always contact DTS, or Developer Technical Support, for help with your driver.

And I do want to stress that there is a kernel debugging lab this evening and all of you are welcome to stop by and, you know, pose questions to us or have us walk through the kernel debugging process or look at a particular panic or, you know, crash that you have and we will do our best to help you.

So to summarize, you know, there are several different approaches you can take to kernel debugging on Mac OS X, and, you know, by developing a good understanding of these facilities you can really shorten your workflow and your driver development process. It's really essential for post-mortem analysis to get a good understanding of what exactly happened after you've had a panic, for instance. And there are some new tools in Leopard, as I mentioned earlier.