WWDC11 • Session 321

Migrating from GDB to LLDB

Developer Tools • iOS, OS X • 55:25

LLDB is the next-generation debugger for Mac OS X and iOS. Discover why you'll want to start using LLDB in your own development, get expert tips from the team that created LLDB, and see how it will help you track down bugs more efficiently than ever before.

Speakers: Sean Callanan, Jim Ingham, Caroline Tice

Unlisted on Apple Developer site

Transcript

This transcript was generated using Whisper; it may contain transcription errors.

Hi, everybody. My name is Jim Ingham. I've been working on GDB for, God, a decade here at Apple. And for the past year and a half or so, we've cooked up this new LLDB, which is our attempt to make a modern version of something that will do all the tasks that GDB has done for you in the past. And so we're here to tell you how you can start moving over to LLDB if you were a person who used GDB.

So here's what we'll talk about. I'm going to talk a little bit about the LLDB command line and just its basic structure, how you can get used to using it. And then a couple of the other members of the debugger team will come up to talk about some power user features. So Sean Callanan will come up and talk about how to make good use of the expression parser for programmatic data introspection. And then Caroline Tice will come up and talk about how to make use of the LLDB Python bindings, which can do many things, such as automating some complex debugging tasks for you.

So first of all, there's this LLDB project. So what is that? That's our modern replacement for GDB. It's actually an open source project. It's part of the LLVM project, thus the LL. It's supposed to stand for low level, but it's not a low level debugger. Anyway, whatever. It's an open source project. Most of the work has so far been done by us, but anybody who is interested in debuggers or wants to poke around at the guts of the system is welcome to take a look and give a hand if they want to. It's hosted at the LLVM site, lldb.llvm.org, as you see. So we do take advantage of being a part of the LLVM project, and we use the Clang type system for our type system and for our expression evaluation, which is nice, 'cause then they get to do work for us. We also designed it from the start to have very efficient handling of debug information. So it has an incremental DWARF parser, whatever that means. Anyway, what it means is that hopefully it will have faster startup times and lower memory usage, particularly as your projects get larger and have more and more debug information.

Threads are a first class citizen, as opposed to GDB, where they're kind of tacked onto the side. So that's becoming important as threads become more ubiquitous in your programs. And finally, we designed it from the start to have a first class, powerful scripting component. In this case, we implemented it with Python, but it actually uses the SWIG interface generator. So if you said, I hate Python, I must have blah, blah, blah, then we'll tell you it's an open source project.

Why don't you go fit it to whatever blah, blah, blah is? And then you would say, no, I'll use Python instead.

So then what is LLDB itself? It's two things. On the one hand, it's a system debugger library. So when you choose LLDB in the little run sheet in Xcode, it's loading the LLDB library and using that to implement the debugging. And also, of course, that same library could be used in other tools, and particularly the Python bindings make it great for you guys if you have a need for some little hand-built debugger app for QA or something like that. It's perfect for that. And then it's a command line debugger. So it's available in Terminal, or when you, again, choose LLDB in the little run sheet in Xcode, then that's what you'll be interacting with in the Xcode console window.

So you might say, I have this powerful GUI debugger in Xcode, why would I want a console? But there are really good reasons to have a console, even in a modern, powerful GUI. One is that if you know you want to get to a particular complicated piece of information, then you type it in, you hit return, you get right there, you see the value, and you can use the history recall and get it again and again as you step and so on. But the other thing which is really useful is that as you're going through a debugging session and you say, oh, the value of this is 5, I wish I could remember 10 minutes from now that the value was 5 now, then you just type it into the console and the value gets printed. Then you debug 10 minutes later and you think, oh, God, I forgot. And you can go back in the console window. So it provides a nice little history trace.

So what does console LLDB look like if you were actually using it? So this is an example using it in Terminal. So I'm going to start it up. I give it the app that I'm debugging. It's going to tell me, oh, I chose this architecture, whatever. And then I might do things like set breakpoints. So I set a breakpoint here. It tells me that it set the breakpoint and where. And then I might run. So I just type run. So now I'm running, and I do some stuff and hit my breakpoint. We made a little more information come out when we stop. So we give you a little more source context. We tell you the reason why the thread stopped and stuff like that when it hit your breakpoint. And then I might do things you're familiar with, like I might po something. So here I'm po-ing self. And then I might next, and so I next. So if you look at that, you think, well, what I'm using is roughly GDB-like. I mean, it's got all these nice little GDB commands, so why do I need to have a little talk about the syntax?

And the problem that we found is that although the GDB-like commands are very concise, and once you know them, they're really easy to type, they're irregular, they're hard to learn, and I've talked to many people over time who have been using GDB for quite a long time, and they'll say, God, I wish I could do thing X, and actually GDB does thing X. It was just impossible to find how it was done. So that's what we wanted with LLDB: we wanted to make an underlying command set which was very regular, well structured, easy to learn, easy to discover new features in, and consistent across all the commands, so that once you knew how to say I want five of something in one command, five of something in another command would be exactly the same. And then there's a powerful alias facility that's added on to the side, which is how we create this set of quick access GDB-like commands. So although the title of the talk is Migrating from GDB to LLDB-- actually, I'm just going to talk about LLDB-- when you're sitting down to use it, that's when you need to know, oh, I remember how to do x in GDB. How do I do that in LLDB? So we have a little tutorial up on the website, which is specifically for that purpose. So what does the basic syntax look like? So all the commands are in the form some object, some action, some options which modify the object-action combination, and then arguments if they're necessary.

So for instance, when you're setting a breakpoint, although I showed you in the slide before that you said b, whatever, the real thing under the hood is you have the breakpoint object, which is what you operate on whenever you're doing anything with breakpoints. The action in this case is set. And then here's the option value pair. I'm setting a breakpoint by name, and main is the name. Or again, breakpoint delete: that is the object and the action. And then in this case, we have arguments, the breakpoint to delete.

So options, this is all standard Unix option parsing stuff, if you're familiar with it. Options have a short and a long form. They can appear anywhere. So here's a case where I have an argument, and then the option value after it, and it didn't freak out. And then sometimes when you have arguments that have dashes in them, you have to tell the parser, hey, that's not an option, that's an argument, and dash dash is used for that. So here I have an option value, and I have arguments, and the dash dash makes it clear that even though dash run has a dash, it's not an option. Words are white space separated, and you protect them with quotes, and you protect the quotes with backslashes, just all the standard stuff. There are a couple of commands where the parsing rules don't apply, but rather you say the command, and then everything after that is unparsed, particularly the expression and script commands, because they have complex data as part of the arguments, and it would drive you crazy if you had to backslash everything.

So in the basic syntax, we're going to favor option value pairs, if we can, over arguments, because they're easier to document, and they reduce the dependency on argument order. But it also means we can start to do some cool stuff when we know exactly what's on the command line. Like if I'm setting a breakpoint, and I say the breakpoint's going in a particular shared library with the dash dash shlib option, then when you tab complete the name, I can look only in that shared library, and I won't bring up spurious hits. So we try to do that throughout whenever we can. And of course, we do shortest unique match. So you would never type breakpoint set unless you're really proud of your typing speed. But you'd type something like br s or whatever.
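The parsing rules described above (options anywhere on the line, a bare dash dash ending option parsing, quoting, and shortest unique match) can be modeled in a few lines. This is only a sketch of the behavior as the talk describes it, not LLDB's parser: the function names are made up for illustration, and it assumes every option takes exactly one value, which is not true of every real LLDB option.

```python
import shlex

TOP_LEVEL = ["breakpoint", "process", "target", "thread"]  # illustrative subset


def resolve_prefix(word, candidates):
    """Shortest unique match: accept any prefix that names exactly one candidate."""
    if word in candidates:
        return word
    matches = [c for c in candidates if c.startswith(word)]
    if len(matches) != 1:
        raise ValueError(f"{word!r} is unknown or ambiguous: {matches}")
    return matches[0]


def split_command(line):
    """Split the text after 'object action' into (options, arguments).

    Assumption for this sketch: every option is followed by one value.
    """
    words = shlex.split(line)  # whitespace separated, quotes and backslashes honored
    options, arguments = {}, []
    literal = False  # becomes True once a bare "--" has been seen
    i = 0
    while i < len(words):
        word = words[i]
        if literal:
            arguments.append(word)  # after "--", dashes no longer mean "option"
        elif word == "--":
            literal = True
        elif word.startswith("-"):
            if i + 1 >= len(words):
                raise ValueError(f"option {word} needs a value")
            options[word] = words[i + 1]
            i += 1  # skip the value we just consumed
        else:
            arguments.append(word)
        i += 1
    return options, arguments


if __name__ == "__main__":
    print(resolve_prefix("br", TOP_LEVEL))
    print(split_command("1 --name main -- -run"))
```

Here an argument placed before an option is still collected as an argument, and `-run` survives as an argument because it follows the bare `--`. A prefix such as `t` is rejected because it matches both `target` and `thread`.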

The command system was designed so that you can't add a command without having good help for it. You can't add options without having good help for it and stuff like that. That was to force ourselves to do it. But that means that there's help on everything. There's help on commands. There's help on command actions.

One other cool thing is in the case of arguments, sometimes you'll see an argument in angle brackets, like in that breakpoint delete instance. Whenever the argument is in angle brackets, that means it's something which you can also get help on. So you can, in this case, type help breakpt-id, and it will tell you what that is. There's also an apropos, which searches through the help text, including the arguments. So apropos delete or whatever. And then tab completion works in help.

So as you're finding your way around, actually one convenient thing is you just say, tab, and it'll give you the list of commands. And then you type the command name, tab, and it'll give you the list of verbs. And then you can keep going from there. So the LLDB command objects, which is the first level of things you kind of need to know about in order to find your way around, are represented in the command syntax by these top level commands like target, thread, breakpoint, whatever. Sometimes they're two words when something's logically contained in another, like target modules here. And then in some cases, of course, you have objects which exist-- we have many of them, like you have many threads in a process or something like that. And so we have a consistent convention that list always lists the things there are when there are multiple of them, and select always selects the one that you want to focus on.

We auto select whenever that makes sense. So if you stop at a breakpoint, we'll set the thread and process and frame to the thread and process and frame that hit the breakpoint. And when you have objects contained in other objects, like a frame in a thread, then selecting the thread gives you the context for selecting frames, just normal stuff you'd expect. But again, the reason for doing this is that it gives you a quick way to find things. So if you're thinking, how do I find out how to do something? How do I do a backtrace? And then you think, well, OK, I know the objects which are available, so which object would hold a backtrace in it? Well, maybe thread holds a backtrace. That's the guy that's responsible. So I go to thread, and I type thread and a space, and then hit tab, and it says, here are the actions. Oh, there's backtrace right there. And then I can use help. So that's the way you can find your way around if you know this consistent structure is maintained throughout.

So the next thing is, obviously, the objects are the things that are important. So I'm going to give you just a brief tour of the most salient objects that you'll need to know about. The first of which is the target. So the target is a particular program that you're going to debug. You create it here by giving it the target create command, and then you give it the program you want to debug and potentially an architecture if it's a universal file. In LLDB, more than one target is allowed, unlike in GDB. So you can make many targets and then use target select to pick the one you want. Note that breakpoints are specific to a target. So if you have two targets, you don't have to worry about breakpoints from one leaking over to the other. And then the target holds the modules. So that's an instance of a two-level thing. And then you can look at them, but that's not so interesting. The next thing that's important is the process.

So the process specifies a running instance of a target. So you would create it with process launch or process attach. And in this case, it ended up being too confusing if you could launch five processes for a target. So there's only one process per target. If you actually do have to run five instances of the same program, you would just create multiple targets. So there's no select or list. And then the process is where you continue, kill, whatever, your process.

The next interesting one is the thread. So the threads are in the process. So once you've selected a process, then you'll be looking at the threads for that process with thread list. As always, thread select selects a particular one. The other thing that the thread does is it controls all the step in, step out behavior. One clever little thing that you might like is that when you issue any of these step commands, you can say whether you want all the other threads to run or only this thread to run. So that's with this run mode option.

And then the thread does backtrace, as we showed. So then the next one is the frames. That gives you the access to the frames, of course, and select and list work. The other thing that the frame is good for is frames have local variables in them, and they have arguments in them. So that's what the frame variable command does. It shows you the local variables. It shows you the arguments if there are any. This is just arguments in this case. And then the selected frame, of course, sets the context for viewing registers and for viewing expressions. I'm going to talk about registers and not expressions because Sean will talk about that. So in the register case, that gives you read and write access to the registers in your selected frame. And we give you access by name. So just like you'd expect from GDB, there's RAX, RBX, whatever the native names of your architecture are.

And then there are some convenience names, so the PC is always PC regardless of whether it's EIP on x86 or whatever. But then there's another set which is convenient sometimes. So there are arg1, arg2, arg3 names. With those, be careful, because they are actually just the registers that arguments get passed in under the calling convention of this particular machine and system. So they're only valid for word-sized things. If you're passing a structure, it's obviously going to stripe across registers or something, and you won't be able to see it. They're only guaranteed to be good at the beginning of a function, and of course, there are only as many of them as your ABI actually passes in registers. So, for example, i386 doesn't pass any in registers.

We tried to make a few little convenience things when we print out the registers, which ends up actually being surprisingly useful. So if you see here's a register dump, but in this case, one of the register values pointed into a string section, so why not look up the string for you, which we do, and one of them pointed into text, so we tell you the function name in the text. That's convenient.

So the main thing I want to tell you about now is this alias facility, because I think that as you get used to the program, you'll find that you can mold it to your working patterns using this facility. So as I say, having this kind of turgid object action, dash, dash, something, something, makes it really easy to find and document things, but it makes it hard to type. So you have to start having accelerator commands, obviously. And by default, we ship with a GDB-like set, which should cover most of the common operations that you'll need to do.

They're all listed in the help, so if you do just help by itself, you'll see the list of built-in commands and then the list of aliases that we've set up for you. But again, as you work, you may do something all the time which is not one of the things that we expected you would do all the time, and so we want to allow you to write a shortcut for it yourself. So there are two kinds of shortcuts. There's a simple one, which is positional aliases. And then there's one for you if you have learned regular expressions and you go hunting the world looking for uses of regular expressions, because you learned that regex thing and you've got to use it for something.

So the positional aliases. These ones are just trivial to write. The command that creates them is this command alias. You say-- oh, I just clicked too much. Anyway, never mind. And then the alias name, which is what you are going to type, and then the substitute command line, which is the command that will get run. So in the simplest case, this is just straight substitution. So you say command alias step is going to be thread step-in. So then when you type step, what's actually going to get run is thread step-in. If you're keeping a sharp eye, you would have noticed that when I set a breakpoint in the first slide, I said b something, and it printed out the command that it actually ran and then did the command. So you can learn how that works for yourself. And if you've made an alias, then any additional arguments typed after the name of your alias will just get appended to the substitute command line. In this case, I'm doing dash dash avoid-no-debug false, and it's just getting appended to the end.

The reason they're called positional is the other little bit of sophistication, which is that you can route arguments from the command that the user types into the substitution string. That is particularly useful if you have two option value pairs and you want to jam stuff into the two values. So the way you do that is you put percent and some number in the substitution command string, and then that will get filled with the argument of that number in the command that actually gets run.

So here, for instance, I've made an alias which has a count in percent one and a start address percent two. And then when I type it, then it gets substituted in, but in the wrong order because I didn't do the slides right. And then, of course, any additional arguments are appended to the end, just like you would expect.
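The substitution rule for positional aliases can be sketched as follows. This models the behavior described in the talk rather than LLDB's own code: `expand_alias` is a made-up name, and the `disassemble --count %1 --start-address %2` template is a guess at the kind of alias on the slide, which the transcript does not show. The sketch also rejects a call that leaves a `%N` slot unfilled, since every referenced argument is required.

```python
import re


def expand_alias(template, typed_args):
    """Expand a positional alias.

    %1, %2, ... in `template` are filled from `typed_args` (1-based).
    Arguments beyond the highest %N are appended to the end.
    """
    slots = [int(n) for n in re.findall(r"%(\d+)", template)]
    needed = max(slots, default=0)
    if len(typed_args) < needed:
        raise ValueError(f"alias needs at least {needed} argument(s)")
    expanded = re.sub(
        r"%(\d+)",
        lambda m: typed_args[int(m.group(1)) - 1],
        template,
    )
    extras = typed_args[needed:]  # leftovers go on the end unchanged
    return " ".join([expanded, *extras])


if __name__ == "__main__":
    # Straight substitution with appended arguments.
    print(expand_alias("thread step-in", ["--avoid-no-debug", "false"]))
    # Routed arguments; the template is hypothetical.
    print(expand_alias("disassemble --count %1 --start-address %2", ["20", "0x1000"]))
```

With no `%N` in the template, every typed argument is simply appended, which is the plain `step` case from the talk.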

In this case, though, all the arguments are required. So if I said there was a percent 1 and a percent 2, there's got to be a percent 1 and a percent 2. So as you start seeing kind of a more flexible behavior pattern that you have all the time and you want to make an alias to capture it, then constraints like that, it has to have two arguments and can only have one substitution form, are going to start to limit you. So that's what the regular expression aliases are for. So for instance, here's a problem that I might want to address.

So the disassemble command has two forms. It can take a start address, and it'll disassemble some number of instructions from that address. Or it can take a symbol and disassemble some number of instructions from that symbol. But in C, particularly if I'm willing to type all my addresses in hex form, it's actually really easy to tell hex addresses from symbol names. Because in C, symbol names always begin with a letter, a through z or capital A through Z. And hex addresses always begin with 0x. So I should be able to make a little alias that says, if there's one argument and it begins with 0x, then I want to disassemble from an address. And otherwise, I want to disassemble from a function name, and if there's no argument, for convenience, let me disassemble around the PC. And in each case, maybe I want to always do 20 lines.

And finally, when you write this kind of alias, this last step is really quite useful. If you don't recognize the string the user typed after the alias name, then you just route it to the full command, whatever the command is, and hope it can deal with it. So the syntax for this, basically, again, you have to know the regular expression language. But if you've learned it, you really want to use it. So that's fine. And it consists of-- and this will make you feel even nicer-- these little awk-looking patterns where you have a match string and a substitution string. So you have a list of those. And the first match string that matches what the user typed after your alias name is the one that wins. And of course, the name of the command gets stripped off first. You don't have to put that in. And that's the one that's going to get executed. Also, in regular expressions, you can have parentheses match substrings. So the matched substrings in numerical order go into the percent numbers in the substitution string, just like the positional alias. And you can provide help and usage.

So the syntax looks like this. You would say command regex. You'd give the name that you want the user to type, and help and syntax if you want, and then the substitution string patterns. Multi-line entry is a thing that we do in LLDB in general. Whenever there's something complicated you have to type as arguments, you can usually hit return, and then you'll get a multi-line entry form for typing them. I'm not going to go through this in detail, actually. If you know regular expressions, you'll be happy, and you'll figure out that there's a bug here. But you can look at it on the slide. So these are the patterns. There's an address match pattern, a name match. I'm running short on time, so you can look at these on the slides. So altogether, right, you figured it out. And you actually found the bug, right? And you're going to come up afterwards and say, hey, for Objective-C names, you didn't account for the space. Whatever. Anyway, so altogether, the command looks like this then. I would say command regex-- I'm going to call it dfancy or whatever. And I gave myself help because I'm going to forget how it works. And then you hit return. It will very nicely tell you what you have to do. So then you'll type them all in, whatever, whatever.

I'm going to go really fast so you can't see my bug. And then this is just to show I didn't lie. We could do it in a demo. But the slides I could have just typed in. So I might be lying. I don't know. But the help works. And then the command works.
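The first-match-wins behavior of a regular expression alias can be sketched like this. It is a model of the idea, not LLDB's implementation and not the patterns from the slides, which the transcript does not reproduce. The rule list and the substitution strings below are assumptions made for illustration and have not been checked against a real LLDB.

```python
import re

# (match pattern, substitution template) pairs, tried in order.
# These rules are hypothetical stand-ins for the ones on the slide.
DFANCY_RULES = [
    (r"^(0x[0-9a-fA-F]+)$", "disassemble --start-address %1 --count 20"),
    (r"^([A-Za-z_][A-Za-z0-9_]*)$", "disassemble --name %1 --count 20"),
    (r"^$", "disassemble --pc --count 20"),
    (r"^(.*)$", "disassemble %1"),  # catch-all: hand the text to the full command
]


def expand_regex_alias(rules, typed):
    """Return the command line produced by the first rule that matches.

    `typed` is what the user entered after the alias name. Parenthesized
    groups in the pattern fill %1, %2, ... in the template.
    """
    typed = typed.strip()
    for pattern, template in rules:
        match = re.search(pattern, typed)
        if match:
            return re.sub(
                r"%(\d+)",
                lambda m: match.group(int(m.group(1))),
                template,
            )
    raise ValueError("no rule matched")


if __name__ == "__main__":
    for text in ["0x1000", "main", "", "-[NSString length]"]:
        print(repr(text), "->", expand_regex_alias(DFANCY_RULES, text))
```

In this sketch an Objective-C method name such as `-[NSString length]` contains a space and does not match the name rule, so it falls through to the catch-all. That is one way a name pattern written for C symbols can miss Objective-C names.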

OK, so summarizing this little subsection of the talk, what I hope you come away with is that to get started, what you have to do is figure out how the top level objects are laid out, and then remember that help will always tell you what you need, and tab will always tell you what you need. And then once you get more familiar with what's going on, you can use the shortcut aliases that we provided for you. Actually, you'll probably use those first, and then start typing the more verbose ones as you get further in. And then finally, as you watch your behavior, you can start to construct these shortcuts for yourself. So with that, I'd like to turn it over to Sean, who's going to tell you about the expression parser. Thank you. Sean Callanan: Thank you, Jim.

I'm Sean Callanan. I worked on the expression parser for LLDB. And I'm very excited to show it to you today. Now there's going to be some excitement. There's going to be some drama. But first, let's start with the basics. You've got a program, and you'd like to use the expression parser to examine it a little bit. Now Jim just sold you on it earlier, telling you you could do all sorts of low-level introspection. Now let's see if that's actually true. Now you're sitting at the LLDB prompt, and the first thing you do is you've got to run your program.

So you set a breakpoint at a particular place where you want to stop, and then you type run. You may be familiar with the syntax if you've used GDB in the past. And just like you would expect, the program will run and hit your breakpoint.

Now you can use the expression command. Now I'm going to type it out here just so you can see the full name of it. But you can also shorten it, and I'm going to be shortening it later in the slides. So here we're going to just type a very simple arithmetic expression, nothing fancy.

And what you have happen in the program is it's like a set of curly braces is inserted into your program, creating a new scope. And your arithmetic expression is put right in there and evaluated. Now, what happens after the evaluation is you get the result back as a new result variable. Now, your result variables are kind of cool. They're stored inside your program's memory, so you can access them like pointers. And they stay around for as long as you're debugging your program.

Then, once you've done this, your code gets taken back out because obviously you don't want to continue and keep running 3 plus 5. And when you continue, your program just runs as normal. All right. Well, that's the simple arithmetic expressions. But I doubt you need LLDB to compute 3 plus 2. So let's do something a little fancier.

So I've inserted a new expression. So notice something, first of all. I've shortened expression down to expr, because I need space on my slides. You still get your set of braces inserted, but you can do cool things, like you can access your program's local variables. Now, that's not too fancy. I mean, it's kind of cool. But let's go and try something even neater. So you just accessed a variable that already existed in the program, right? Well, maybe you need a new variable.

Now we get to see the power of LLDB's expressions starting to peek out. LLDB's expressions are actually using LLVM under the hood to compile your expressions for you. When I said there was a pair of curly braces in there, that's actually because you're getting a scope to execute some code.

So you can declare an integer variable and actually use that variable. Also note that I have a multi-line expression here. I pressed Enter after the expr command, so you can actually enter multiple lines. You get these multiple lines inserted into your new scope. And now you can use your temporary variable.

That expression local variable, though, goes away after your expression completes. Now obviously, that's not all you might want to do. You might want to keep your expression local variables around. Now you know the result variable that I showed you earlier, the little $0. What if you could create your own user variables? Now, the GDB gurus among you may know that you use the set command for that.

Well, we've got the same kind of concept except you do this all from inside the expression command. Inside the expression command now, you can actually create a dollar variable that stores your value for you in the program's memory for as long as you're debugging the program. So I'm going to say $i equals 3 and then compute $i plus 2. The $i variable gets inserted as a new global.

The instructions referencing it get inserted into your program, and they run. Now, I've showed you how to access simple variables, how to create your own ones. In C++, things are going to act like you expect. Your code is going to get inserted into the method you're stopped in. And you can access the C++ member variables. And if you're in an Objective-C object, it gets inserted into the Objective-C method. And you get to access your instance variables. You're not going to get any nasty surprises.

All right, well, I showed you a bunch of stuff. Let's just go through the summary and then get on to the fun. The drama, I promised you drama, right? So you've got in-scope variables that you can access just using the expression command. You can think of it like GDB's print command for this kind of purpose. You can also access your global variables and your functions for which you already have debug information. Now, there's a little bit of a wrinkle here, just as there was with GDB, by the way. If you don't have debug information, for example if it's a function that you got from the standard library, then you have to cast its return value. And if it's a variable you got from the standard library, you have to cast the variable to the type you're expecting it to be.

You can declare your own expression local variables that live as long as the expression does. And you can create user variables, declaring them once using the special dollar sign, and then referring to them in later expressions whenever you need them. All right. Now let's do something a little bit more fun. Let's debug an RPN calculator. Now, RPN is reverse Polish notation. And what you do is you type in numbers. And then you type in operations that operate on the numbers you've already put in. So in this case, we put in 7 and 5. And we said plus, so it pulls in the last two numbers and adds them. Now the way this is implemented is usually as a stack. So when I type seven, seven gets pushed onto the stack. And when I type five, then five gets pushed onto the stack after it. And then when I do the plus operation, I get a 12 at the end, which is kind of what you'd expect. I'm not gonna debug that kind of simple bug for you. All right, now let's do something a little bit more fun. First I'm gonna push seven. And now I'm going to try using the add command. Oh! I'm sure you've seen this error before.

So you're stuck at a segmentation fault. What do you do? Well, if you've used GDB in the past or you use Xcode, you know to open the debugger. So let's open the debugger and let's think about how we're going to deal with this. Well, the first thing you do is you run LLDB on your RPN calculator. Now, you type run and your RPN calculator is running inside LLDB. Then we reproduce the bug and the program crashes. LLDB catches it for you. Now let's try running a backtrace to see what we can do about this.

Now this is the moment where some of you are going to start saying, ah, because we have no debug information. This is all assembly offsets. So you're at add plus 33, which means 33 bytes into the assembly code for add. What the heck are you going to do here?

This is where LLDB is going to come to your rescue. So the first thing we need to do, because we want to be at the beginning of add, where the arguments are kind of in registers where you'd like them to be, so we're going to set a breakpoint on the add function.

Now we run again. LLDB asks you, "Oh, wait, do you want to restart your program?" Fine, it's an RPN calculator. It's not exactly doing a lot of internet interactions and stuff like that. So we're going to say seven and plus and reproduce your bug. So now when you do a backtrace, you notice you're right at the beginning of the add function. This is where you can get your arguments out of registers and that's what we're going to do.

I'm gonna first of all pull out the first argument for the add function, and you're gonna have to believe me that that's a pointer to the stack. So when you do that, you get the hexadecimal output of that. Notice I used the dash dash format option here. The dash dash format option tells LLDB what format you'd like your output to be in from your expressions. So x means hexadecimal. You don't have to remember this. There's help on it.

Now let's try doing something fancy with this. First of all, I'm going to enter a multi-line expression, again, by pressing Enter after I type expr. Now I'm going to redefine the type that was missing because we didn't have debug information. I redefine my type and create a new user variable of that type-- it's actually a double pointer-- and I'm going to assign the $arg1 argument register to that new user variable. This user variable now has the type. I can actually access it.

This is something that wasn't really possible in GDB. You had to use weird hacks, examine memory directly. Why do you want to do that? Then you have to remember all these obscure commands. You really want to just be using types. And because we've got Clang under the hood working for you, you can do that.

So, now let's look at the stack entry that we've got on the top of our stack. Well, the value is seven, which makes sense because we typed seven in at the prompt. But the next pointer is null. And we're in the add function, so it's looking at a null pointer to try to get out the next element. Well, that's, you know, that probably doesn't come as a surprise to all of you, but we've got a stack with one element, and we're trying to add two elements. Well, all right, fine. So let's try to fix this.
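To make the failure concrete, here is a minimal Python model of the situation just described. The real calculator is a C program whose source the talk does not show, so the `Node` class, its `value` and `next` fields, and the `push` and `add` helpers are hypothetical stand-ins, and Python's `None` plays the role of the null pointer. The second operand of 3 is also an assumption.

```python
class Node:
    """One stack entry: a value plus a link to the entry beneath it."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def push(top, value):
    """Return the new top of the stack after pushing value."""
    return Node(value, top)


def add(top):
    """Replace the top two entries with their sum and return the new top."""
    second = top.next
    if second is None:
        # This is the spot where the C program follows a null pointer.
        raise ValueError("add needs two operands, stack has one")
    return Node(top.value + second.value, second.next)


stack = push(None, 7.0)          # the user typed 7
try:
    add(stack)                   # then asked for plus
    outcome = "ok"
except ValueError as err:
    outcome = str(err)
print(outcome)

stack = push(stack, 3.0)         # supply the missing operand by hand
total = add(stack).value
print(total)
```

With only one entry on the stack, `add` has nothing to read for its second operand, which is the condition the session goes on to repair by pushing another element.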

You're going to use the push function from your program to push a new element on the stack. Remember that we don't have debug information, so you have to cast. And then, if you now look at the next element of the top of the stack, you see, aha, there's something there.

So now it's safe to continue and let add do its work. And now we've fixed the problem. We've got 10 on the top of the stack. And hopefully you see how we've done that. Now the next thing I'd like to show you is something a little bit cleverer and something that showcases even more for you just the power that's under the hood here. Now let's stop in add again after putting some numbers onto the stack.

We are going to enter a multi-line expression again. And now we're going to have to enter the struct that we entered before. Now why is this? It's because, remember, we had a pair of curly braces that your expression is being executed inside. That means that there's a scope there. And things like the types that you temporarily defined here are actually going to go out of scope for you. So you've got to remember to redefine your types if you actually create new variables of that type again. So here we're going to create a depth counter, and then we're going to use a for loop. We're going to use this iterator called current to walk across our stack, and at each time we walk across an element, we're going to increment our depth counter.

Finally, we tell it to report depth. We press Enter, leave a blank line, which I didn't show you here, but you have to do that at the end of a multi-line expression. And then you get the result out, which is 2. Now this is kind of cool. This is something you really couldn't do. There's this for loop here, where you're using pointers and iterating across a data structure, which incidentally wasn't even in the debug information. This is really where you're starting to see that Clang is letting LLDB do things that you just couldn't do before. All right.
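The walk just described can be sketched in Python. This is an illustration of the loop's logic only: in the talk the loop is written in C inside an LLDB expression, and the `Node` class with its `next` field is a hypothetical stand-in for the struct that was redefined by hand.

```python
class Node:
    """Hypothetical stand-in for the calculator's stack entry."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def depth(top):
    """Count entries by following next links until the null link."""
    count = 0
    current = top
    while current is not None:
        count += 1
        current = current.next
    return count


stack = Node(5.0, Node(7.0))     # two numbers were pushed before stopping
print(depth(stack))              # prints 2
```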

And the real power comes when you start to use this every day and realize, oh wow, I don't have to worry. So let's summarize. So you can use the expression parser to interact directly with your code. You can use registers, you can use variables, you can use functions just the way you would expect.

You can create your own user variables, remembering the dollar sign, just the way you would declare a normal C variable. And you can reconstruct the program state with the help of these type definitions without debug information. You can use full Objective-C++ in your expressions. Remember that for loop that we wrote?

These kinds of features are what's making LLDB the best debugger you've ever used. So the help expression command will give you more information on the expression command's options. Most of them center around the kind of output formatting I showed you earlier. And in any case, you should just try to explore it. Try it out on your own. See what you can do. Thank you very much for your time. I'm gonna pass you on to Caroline, who's gonna tell you about Python integration.

So hi there. My name is Caroline Tice, and I'm an engineer in the debugger group. And in this part of the talk, we're going to be going over scripting and Python in LLDB. Now, I know that a lot of you have been using GDB for years, and you've never written a script, you've never wanted to write a script, you can't imagine wanting to write a script. And you're probably asking yourself if now wouldn't be a good time to go get a cup of coffee. Well, the short answer is no. If you miss this part of the talk, you're really going to end up kicking yourself. We've made scripting in LLDB extremely easy to use, extremely easy to access. We've fully integrated it with Python and we've added a lot of extensions to make it more powerful and to make it better for accomplishing some of your debugging tasks. In short, scripting in LLDB is actually going to allow you to do a lot of things you've only dreamed of being able to do in GDB. So let's dive right in and see what I'm talking about.

What can -- excuse me. What can you do with scripting in LLDB? Well, to begin with, it's going to allow you to set some really useful conditional breakpoints. So remember in LLDB, you actually have access to the call stack as an object. This means you can query the call stack and get useful information out of it. So therefore, you can set conditions on your breakpoints based on things like the name of the calling function. This is especially useful if you're programming with Apple frameworks, which have lots of little functions that are called everywhere, and you really don't care about 90% of the places where those breakpoints are hit. You just want to see the breakpoints where they're hit in your code. So now you can actually say, I only want to stop on this breakpoint when it is hit and the calling function has a particular name or when the calling function's arguments have particular values. If you're doing multi-threaded debugging, you can say, I want to collect information about which thread hit this breakpoint, and I only want to stop at this breakpoint if it's hit by a particular thread. You can trace the execution of a particular thread. You can even record information about which thread hit it last time and say, I only want to stop there if the same thread hits it the next time or if a different thread hits it next time. Anyway, you have lots and lots of options for setting breakpoint conditions on your breakpoints that you just didn't have before.
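As a rough sketch of the caller-based condition described above, the helper below decides whether a breakpoint hit is interesting by looking one frame up the call stack. The method names `GetThread`, `GetNumFrames`, `GetFrameAtIndex`, `GetFunctionName` and `GetThreadID` follow the LLDB Python API. The `FakeFrame` and `FakeThread` classes and the function name `draw_chart` are hypothetical, included only so the logic can be tried outside a debug session.

```python
class FakeFrame:
    """Hypothetical stand-in for lldb.SBFrame."""
    def __init__(self, name, thread):
        self.name, self.thread = name, thread

    def GetFunctionName(self):
        return self.name

    def GetThread(self):
        return self.thread


class FakeThread:
    """Hypothetical stand-in for lldb.SBThread; frame 0 is the innermost."""
    def __init__(self, thread_id, function_names):
        self.thread_id = thread_id
        self.frames = [FakeFrame(name, self) for name in function_names]

    def GetThreadID(self):
        return self.thread_id

    def GetNumFrames(self):
        return len(self.frames)

    def GetFrameAtIndex(self, index):
        return self.frames[index]


def caller_name(frame):
    """Name of the function one level above the breakpoint frame, or None."""
    thread = frame.GetThread()
    if thread.GetNumFrames() < 2:
        return None
    return thread.GetFrameAtIndex(1).GetFunctionName()


def should_stop(frame, wanted_caller, wanted_thread_id=None):
    """True only when the caller (and optionally the thread) matches."""
    if caller_name(frame) != wanted_caller:
        return False
    if wanted_thread_id is None:
        return True
    return frame.GetThread().GetThreadID() == wanted_thread_id


thread = FakeThread(7, ["CFRelease", "draw_chart", "main"])
print(should_stop(thread.GetFrameAtIndex(0), "draw_chart"))
```

A breakpoint command would call something like `should_stop` with the frame it was handed and resume the process when the answer is no.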

In addition to setting breakpoint commands, you can also use scripting to help you traverse your dynamic data structure. So you've got a huge tree or a huge heap or a linked list. There's a piece of data that's missing or it's wrong or you want to see if it's been inserted multiple times. You want to find the path to your data. And you can actually write useful Python scripts to go through your data structures and help you figure out what's going on in your code now more easily. In addition, you can actually record information at breakpoints about your registers, about your local variables. You can write this to Python variables. You can write it to files using Python system calls.

You can do this every time you hit your breakpoint, so you're starting to build up traces. You can do this across multiple runs of your program. So this helps you collect a lot of useful information when you're debugging that's going to let you get your tasks done more easily. Finally, you can use Python scripts to help you do automated testing in QA. So especially if you have some of these really evil bugs that you only hit once in a blue moon, you can write scripts to help you collect data around where you think the problem is. And then you can write Python scripts to run your program over and over again, collecting data every time you run through it, and eventually it'll hit the problem and stop, and you'll have this great set of data to help you do your debugging.
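One way the trace collection described above might look is sketched below. `FindVariable`, `GetValue`, `GetFunctionName`, `GetThread` and `GetThreadID` are LLDB Python API methods. The `record_hit` and `write_trace` helpers, the fake classes, and the variable name `count` are hypothetical and exist only to make the sketch runnable on its own.

```python
import io
import json


class FakeValue:
    """Hypothetical stand-in for lldb.SBValue."""
    def __init__(self, text):
        self.text = text

    def GetValue(self):
        return self.text


class FakeThread:
    """Hypothetical stand-in for lldb.SBThread."""
    def __init__(self, thread_id):
        self.thread_id = thread_id

    def GetThreadID(self):
        return self.thread_id


class FakeFrame:
    """Hypothetical stand-in for lldb.SBFrame."""
    def __init__(self, function, thread_id, variables):
        self.function = function
        self.thread = FakeThread(thread_id)
        self.variables = variables

    def GetFunctionName(self):
        return self.function

    def GetThread(self):
        return self.thread

    def FindVariable(self, name):
        return FakeValue(self.variables[name])


def record_hit(frame, variable_names, trace):
    """Append one record describing the current breakpoint hit."""
    record = {
        "thread": frame.GetThread().GetThreadID(),
        "function": frame.GetFunctionName(),
    }
    for name in variable_names:
        record[name] = frame.FindVariable(name).GetValue()
    trace.append(record)
    return record


def write_trace(trace, stream):
    """Write one JSON object per line so separate runs can be appended."""
    for record in trace:
        stream.write(json.dumps(record, sort_keys=True) + "\n")


trace = []
record_hit(FakeFrame("insert", 1, {"count": "41"}), ["count"], trace)
record_hit(FakeFrame("insert", 2, {"count": "42"}), ["count"], trace)
out = io.StringIO()
write_trace(trace, out)
print(out.getvalue(), end="")
```

Because the interpreter keeps its state for the whole debug session, a list like `trace` can keep growing across breakpoint hits, and writing it to a real file lets data from several runs accumulate.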

So this is just a little idea of some of the things that you can do with scripting in LLDB. Now you've probably -- I want to talk a little bit about what you actually have where. You've probably already figured out by now that you have Python embedded within LLDB. What you probably have not figured out is that you also can have LLDB embedded in Python. So you can actually run Python from the Unix prompt, not use Xcode, not use the LLDB debugger, load the LLDB Python module, and be off and running LLDB inside Python. So I'm gonna talk about this for just a second 'cause it's useful for some automated QA and testing, and then we'll get back to the main program, which is Python in LLDB.

So if you're going to run LLDB directly from Python, the first thing you have to do is tell Python where to find your LLDB Python module. That's going to be located in the Resources/Python subdirectory below your LLDB framework, which is going to be wherever you installed Xcode, usually in your /Developer directory. Once you've told Python where to find your LLDB module, you start Python at the Unix prompt, you import your LLDB module, and there you go. You're off and running. You can now create a debugger, you can create a target, you can set breakpoints, you can run your process, you can do all of your debugging tasks straight from Python. You've never touched Xcode, you've never touched the LLDB debugger.
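The sequence above might look roughly like this in a standalone script. It is a sketch under assumptions: the directory passed to `add_lldb_to_path` is a placeholder for wherever your installation keeps the module, `run_to_breakpoint` is a hypothetical helper name, and the `lldb` import is deferred so the file can be loaded on a machine without LLDB. The `SBDebugger`, `SBTarget` and `SBProcess` calls are from the LLDB Python API.

```python
import os
import sys


def add_lldb_to_path(python_dir):
    """Tell Python where the lldb module lives (directory is caller-supplied)."""
    if python_dir not in sys.path:
        sys.path.append(python_dir)
    return python_dir


def run_to_breakpoint(exe_path, function_name):
    """Launch exe_path and return the function name where it first stops."""
    import lldb  # importable only after add_lldb_to_path has been called

    debugger = lldb.SBDebugger.Create()
    debugger.SetAsync(False)  # calls return only once the process has stopped
    target = debugger.CreateTargetWithFileAndArch(exe_path, lldb.LLDB_ARCH_DEFAULT)
    target.BreakpointCreateByName(function_name)
    process = target.LaunchSimple(None, None, os.getcwd())
    frame = process.GetSelectedThread().GetFrameAtIndex(0)
    return frame.GetFunctionName()


# Placeholder location; substitute the directory from your own installation.
add_lldb_to_path("/path/to/LLDB.framework/Resources/Python")
```

On a machine with LLDB installed, a call such as `run_to_breakpoint("./rpn", "add")` would launch the program and report where it stopped; the executable name here is only an example.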

For more examples on how to do this, you can look at the LLDB test suite. We do this a lot in the LLDB test suite, so there are some great examples there that you can look at. Oh, these functions that I've highlighted in yellow here, these are part of the API functions that come with your LLDB Python module, and I'll be talking about them a lot more later on in this talk.

So moving back to our main program, which is Python inside LLDB. LLDB contains a full and complete Python interpreter. This means you get all the syntax checking, all the parsing, all the exception handling, all the error handling, the system calls, the modules, the whole nine yards. We have all of Python available to you inside LLDB. We've also made it very easy to access. There are many different ways you can access Python from inside LLDB.

To begin with, you can use the one-line script command. So this is what you do if you have just a little bit of Python. You want to execute it, but you want to stay in your main LLDB prompt environment. You don't want to drop into Python in particular. So in this case, I've said to Python, I have this decimal number. I want you to convert it to hex, give me the answer, and let me just stay here in my LLDB prompt. And that's exactly what it's done.

If you want to do more complicated or interactive Python stuff, you can drop into a Python interactive interpreter from inside LLDB. This is going to be just like typing Python at the Unix prompt, except we've already preloaded the LLDB Python module in there for you, and we've set up some additional extensions to make it easy for you to be off and running doing your debugging tasks. And finally, as I said, you can also attach Python scripts to your breakpoint commands, so this is what it would look like to do that.

Now, we've added some enhancements to the Python in LLDB to make it even more useful. As Jim was saying at the beginning of the talk, LLDB is actually a debugger library. So as with all libraries, it's got an API. You have functions that you can call to create and access and manipulate all of your debugger objects and all of your debugger state. The complete API, all of these functions, are available and callable directly from Python.

Also, whenever you halt your execution in LLDB, you have an execution context, and you've got a target object, a frame object, a thread, a process. We've preloaded these objects for you into Python convenience variables. So whenever you drop into Python, you're going to have lldb.target, lldb.process, lldb.frame, lldb.thread. They're there sitting, waiting for you, ready for you to use and go off and call your API functions on them.

Finally, we've set up the Python interpreter so that you have a single interpreter for your entire debug session. This means that you have a single underlying context for the whole session, so you can create a variable with a one-line script command, define a function that uses the variable in your interactive interpreter, call the function from your breakpoint commands, and they all communicate, they all share the same state, you can access everything from everywhere. So all of these together make scripting in Python and LLDB very easy to use and very useful. You can accomplish a lot of powerful stuff.

Now I've gone over the basics, and I'm going to spend the next part of this talk going over an example of actually trying to debug a problem using scripting in LLDB and seeing what we can do with it. So I've created a simple program. It reads in a text file and it stores the words from the text file in a binary search tree. You can then ask the program about a particular word, is it in the tree or not, and the program will tell you yes or no.

Now this wouldn't be much of a debugger session if we didn't actually have a problem in our program. So of course, there is a bug. I've read in the text for the play Romeo and Juliet, and I'm asking about certain words, and it's finding some of the words you would expect it to find, but for some reason it cannot find the word Romeo. Now we know Romeo has to be in the play, so what is the problem? Well, there are several possible reasons why it might not find the word. One of them is that maybe the word never got inserted into the tree. Another is that maybe the word got inserted into the tree in an unexpected location, so the binary search algorithm just isn't finding it. So our first problem is going to be, how are we going to figure out whether the word is in our tree or not? If it were a small tree, we could traverse it by hand and examine all the nodes, but of course, Romeo and Juliet contains thousands of words. The tree is much too big to search by hand.

So the answer is, of course, we're going to write a script to do it for us. The idea goes something like this. We're going to write a recursive depth-first search function in Python. We're going to put it in this tree_utils file for two reasons. One of them is so we have it to use again later on if we want. And the second reason is because, as with all interactive interpreters, the Python interpreter is not very forgiving of typing mistakes and you don't want to have to type the whole thing over from the beginning if you make a small typo. So we're putting it in a separate file.

Before I go on with the talk, I wanted to mention -- I forgot to mention this earlier. If I go through this example too quickly and there's stuff you didn't catch and you're worried about that, don't worry. The entire example is up on the LLDB website with complete source for everything and more detail than I can possibly go into in this talk. So you can always go to the website and get whatever you missed from the talk.

Anyway, so we're going to write this depth-first search function in tree_utils.py. Our depth-first search function is going to have several interesting parameters. The first one is this root parameter, which is actually going to be a node of the binary search tree in our program. The second one is the word that we're searching for. And the final one is a string representing the path from the root of the tree to whatever our current node is in the tree.

We're going to attach LLDB to our running dictionary program. We're going to use the interactive script interpreter to call our depth-first search function on our binary tree. And then the depth-first search function is going to return the path to the node if it finds the node in the tree, or an empty string if it doesn't. So let's see what this looks like in action. Here we are. We've attached to our running dictionary program. We drop into our interactive script interpreter. The first thing we do is we import our file that contains our depth-first search function. And again, if you want to see the code for this, you would go to the website. And the next thing we do is we take our binary search tree and put it into a Python variable so that we can pass it to the Python depth-first search function. Now, there are a couple of interesting things in this line. The first one is that I'm making use of one of these convenience variables that I mentioned earlier. Whenever you halt your execution, LLDB automatically takes your current execution context and puts it in these convenience variables for you. So I'm going to use the frame convenience variable. I'm going to call the FindVariable API function on the frame convenience variable to say, frame, find me your variable that has the name dictionary. We're then going to take this variable that we find and we're going to put it in the Python variable named root.

Once we do that, we're going to initialize our other parameter, which starts out the current path as an empty string because we're starting at the root of the tree. And then we're going to call our depth-first search function. We have to tell Python that our depth-first search function is in the tree_utils file that we imported, so that's what that's all about. And again, we pass it our binary search tree and our search word and our current path. We print out the path to see whether it found the node, and sure enough, it found the word Romeo in our tree, and the path from the root of the tree to the node containing Romeo is left, left, right, right, left.

At this point, we are halfway there. We have found our word in the binary search tree, so the next question is, why didn't our program find it? And how are we going to figure out what the problem is? And the answer is, we're going to use scripted breakpoint commands. So the idea is something like this. We know that our word is here. We know that a binary search algorithm has two decision points. There's the decision to go right and the decision to go left. We're going to set breakpoints at each of these two decision points, and then we're going to attach a breakpoint script command to each decision point. The script is going to compare the decision with the path. As long as the decision matches what the path says we should be doing, we're going to continue executing. As soon as we come to a decision where our decision differs from the path, we're going to halt execution and say, here is our problem.

So this is what our Python breakpoint command would look like. Before I go over this in great detail, there's something that I need to tell you about. Whenever you write a Python breakpoint command in LLDB, it's going to take the code that you wrote and wrap it up in a Python function, and it's going to pass in two more convenience variables for you. So it's going to pass in the frame where the breakpoint was hit and the breakpoint location object for the breakpoint that was hit. Now there are two important things you have to remember from all of this. The first one is that these two convenience variables are there for you whenever you write a breakpoint script command, so you can just assume that they're there and you can use them. The other thing that you have to remember is if you want to use a Python variable that was defined outside of your breakpoint script command, you have to tell Python it's a global variable.

Otherwise, Python's going to think it's local to the script and you're going to get unexpected behavior. When you actually hit your breakpoint when you're executing, Python is then going to call this function, it's going to pass in the correct frame and the correct breakpoint location object for you.

So going over the code in a little more detail now, the first thing I've done is I've told Python that we're going to use the global path variable that we already got from our depth first search function. So we're using this variable that we got in one point and we're using it in another. We're going to compare our current decision to go right with the path, with the beginning of the path. If the path says yes, you should be going right, then we're going to strip the first character off the path and then we're going to resume execution.
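The comparison just described, together with resuming through the frame's thread and process, can be sketched as follows. This is an illustration, not the code shown on the slide. `GetThread`, `GetProcess` and `Continue` are LLDB Python API methods. The helper `check_decision`, the callback names, and the fake classes are hypothetical, and the starting path is the one the search returned earlier in the talk.

```python
class FakeProcess:
    """Hypothetical stand-in for lldb.SBProcess that counts resumes."""
    def __init__(self):
        self.continues = 0

    def Continue(self):
        self.continues += 1


class FakeThread:
    """Hypothetical stand-in for lldb.SBThread."""
    def __init__(self, process):
        self.process = process

    def GetProcess(self):
        return self.process


class FakeFrame:
    """Hypothetical stand-in for lldb.SBFrame."""
    def __init__(self, thread):
        self.thread = thread

    def GetThread(self):
        return self.thread


path = "LLRRL"   # set once by the search, shared by both breakpoint commands


def check_decision(frame, decision):
    """Resume if decision matches the next step of path, else stay halted."""
    global path   # without this, Python would treat path as a local name
    if path and path[0] == decision:
        path = path[1:]                              # consume one step
        frame.GetThread().GetProcess().Continue()    # keep running
        return True
    print("Here is the problem: about to go %s, remaining path is %s"
          % (decision, path))
    return False


def on_go_left(frame, bp_loc):
    return check_decision(frame, "L")


def on_go_right(frame, bp_loc):
    return check_decision(frame, "R")


process = FakeProcess()
frame = FakeFrame(FakeThread(process))
went_right = on_go_right(frame, None)   # mismatch: the path starts with L
went_left = on_go_left(frame, None)     # match: one step is consumed
remaining = path
print(went_right, went_left, remaining, process.continues)
```

Keeping `path` as a module-level name mirrors the point about global variables: both commands read and shorten the same string, so each hit is checked against what is left of the route.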

We're going to resume execution using this frame variable that we know LLDB is going to have there for us. We're going to call these API functions to first get our thread and then get our process, and once we have our process, we're going to resume execution. So what this looks like for the user is, as long as the decision matches the path, execution just keeps running. The user never even sees this breakpoint being hit. However, if the path differs from what our decision point is, then we're going to stay halted and we're going to print this error message for the user.

So let's see what this looks like in execution. We set our breakpoint command on our breakpoint to go right. We've already seen what that looks like. We add our breakpoint script command to the breakpoint to go left. That looks just like the one to go right except you say left. We continue execution. It runs for a little bit and then it actually halts and prints out one of these error messages. So we did find a problem.

So now let's examine what's going on at this place where we found the problem in our program. We look at the word at the current root and the current word is Dramatis. We say, okay, well, what are we searching for again? We're searching for Romeo. Since the tree is sorted alphabetically, Romeo should come after Dramatis, so going right seems like it ought to be the correct decision. But we were going to go right and our path apparently says we should go left. Let's ask Python one more time, show us the path variable from the current node to where you found Romeo. So we use one of these one-line commands to say print the path variable. It prints the path variable. It's still left, left, right, right, left. So surprisingly, we actually hit the problem in the very first node of our tree. We say, OK, what is left, left, right, right, left from here? And we print out the word, and it says Romeo. But aha, we have an uppercase and a lowercase.

There is a case conversion problem in our program. And this is the bug.
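A small illustration of why a case mismatch sends the insert and the lookup in different directions: in a byte-wise comparison, uppercase ASCII letters sort before lowercase ones. The exact spellings below are assumptions for illustration (a capitalized stored word, a lowercased query, and a lowercase root), and `direction` is a hypothetical helper, not code from the dictionary program.

```python
def direction(word, node_word):
    """Which way a binary search goes when word differs from node_word."""
    return "L" if word < node_word else "R"


root_word = "dramatis"   # assumed spelling of the root entry
stored = "Romeo"         # assumed spelling as inserted into the tree
query = "romeo"          # assumed spelling used by the lookup

print(direction(stored, root_word))   # prints L: 'R' sorts before 'd'
print(direction(query, root_word))    # prints R: 'r' sorts after 'd'
```

Under these assumptions the insert goes left at the root while the lookup goes right, which matches the mismatch the breakpoint command reported at the very first node.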

So this is an example of just some of the stuff that you can do using Python and scripting in LLDB to help you accomplish your tasks. Hopefully, I've shown you that LLDB makes scripting very easy, very useful, and very powerful. I've shown you that you can use the convenience variables and the API functions.

They really help you accomplish a lot of great things. I've demonstrated that if you want to do automated testing in QA, you can actually run LLDB from inside Python at the Unix prompt, and you don't have to touch Xcode. You don't have to touch the LLDB debugger. If you want more examples of that, as I said, you can look in the LLDB test suite. And there's just lots and lots more cool stuff you can do with scripting. I've just shown you the tip of the iceberg.

Going over what we've covered in this entire session, we've shown you the LLDB command line with its object-action syntax. We've shown you that you can use help and apropos and the tab key to find all the commands you need to use in LLDB. We've shown you how to use aliases to create simple shortcuts to do what you want to do easily. We've shown you that you can use the expression parser to execute code in your own program, and you can even use it to help you debug when you don't actually have debug information there, which is going to be very useful. I've shown you some of the great, cool stuff you can do with scripting in Python and LLDB.

For more information, you can go to the LLDB website. The LLDB website has great information about LLDB in general. There's this tutorial for how to do stuff in LLDB that you used to be able to do in GDB. The entire scripting example with all the code for the depth-first search function, all the code for the dictionary program, more explanations than I had time for. It's all there on the website if you want to go look at that. If you want more information about the API functions or about running LLDB directly from Python, I recommend that you actually download the source code from LLDB. Again, it's an open source project. And then you look at the header files in the source tree for the API functions, or you look at the test suite for how to run LLDB from within Python directly. And of course, if you want information about Python, Python has its own website. So thank you very much.