Mac • 1:08:22
Speakers: Ted Kremenek, Yuji Akimoto
Unlisted on Apple Developer site
Transcript
This transcript has potential transcription errors. We are working on an improved version.
So, good afternoon everyone. Welcome to Improving Your Application with the Xcode Static Analyzer. My name is Ted Kremenek. I'm a member of the compiler team at Apple, and I'm very excited to be talking to you about the Xcode Static Analyzer. The Static Analyzer is a new feature in Snow Leopard's developer tools that, quite simply, is going to provide you the means to find a wide variety of software bugs in both your iPhone and Mac applications far more quickly and easily than you were able to before. And all this is based on the use of static source code analysis.
Now static analysis is something that many of you have probably not heard of before. The premise is very simple. We essentially just want to automatically find bugs in your programs by analyzing your source code. Now here are the keywords. Static simply means we're going to do this without running the program.
So static analysis is based essentially on deep compiler analysis. It means the same ideas we use to build compilers to make your code run fast, we retarget to instead reason about what your code is doing to find bugs. And so the usage model for static analysis can be thought of as a form of advanced compiler warnings. We're used to feedback from compilers including errors, things that tell us why our code can't build, but also warnings that tell us about issues, you know, potentially suspect code that when actually run could do something that we did not anticipate.
Static analysis tries to take this idea a lot further by finding very deep kinds of programming mistakes, mistakes that you could typically only find with testing. A basic example of something that we can find frequently with static analysis is a null dereference. Null dereferences are bad, you know, they can cause our programs to crash.
And here is an example code fragment that exhibits a null dereference. And the basic premise of static analysis is that we could feed this to a static source code analysis engine. And then it would spit out a warning telling us that there's a null dereference on a particular line of code in our app.
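The code fragment from the slide is not part of this transcript. As a stand-in, here is a minimal C sketch of the kind of null dereference being described. The `struct user` type and both function names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record type, used only for this illustration. */
struct user {
    const char *name;
};

/* Buggy: the NULL check shows the author knew u could be NULL,
 * but nothing stops execution from reaching the dereference below. */
size_t name_length_buggy(const struct user *u)
{
    if (u == NULL) {
        /* The author meant to return 0 here and never did. */
    }
    return strlen(u->name);   /* null dereference on the path where u == NULL */
}

/* Fixed: bail out before touching the pointer. */
size_t name_length(const struct user *u)
{
    if (u == NULL || u->name == NULL) {
        return 0;
    }
    return strlen(u->name);
}
```

Calling `name_length(NULL)` returns 0, while `name_length_buggy(NULL)` would crash. A path-sensitive analyzer can report the buggy version without running it, because the comparison against NULL tells it that a NULL value is a case worth considering.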
And this is really powerful because it's just very advanced feedback about the intricate behavior of our program and what could be going wrong. Now static analysis is not meant to be a replacement for any of the other ways you find bugs, whether it be testing or using your debugger or using Instruments. Instead it's meant to complement these techniques with three particular strengths. The first is the early detection of bugs. If you think about it, static analysis can be applied very early in the development workflow.
Before you can actually build an entire program, before you can actually run it, as long as pieces of it can be compiled, you can analyze it for bugs. And this means that as you're actively editing your code, you're hacking away, you have the potential to find bugs close to the moment that they were introduced. And bugs that are fixed earlier are often much, much cheaper to fix.
Second, static analysis is very comprehensive. It can find deep bugs without test cases. Now testing is really great, but its strength really relies on how good your test suite is. And so if you think about all the corner cases in your app that you have to actually test, if your test suite does not exercise them, you're not going to find those bugs. Static analysis tries to reason about all the different ways that your code could be exercised, and so it's good at complementing testing in those harder-to-reach cases. Now the third real strength of static analysis is that we can often get very detailed error reports.
Sometimes it's not good enough that we know that an error is present; we want a full diagnosis of how the bug actually occurred. And when you can see the precise lines and expressions involved in an issue, it makes it much easier to fix it. So we've taken the strengths of static analysis and worked very hard to put them right at your fingertips in Xcode. This is a new feature in Snow Leopard that's going to allow you to analyze both iPhone and Mac applications for bugs. And when you're analyzing for the iPhone, it works seamlessly whether you're targeting the simulator or the device.
Now we focused on a few initial key areas for bug finding. The first is basic Cocoa API checks, and I'll talk about that a little bit more in this session. And within that theme, we've done a lot of work on Cocoa memory management rules. Cocoa memory management is key to developing both good iPhone and Mac applications. And third, we've looked at the general sense of logic errors. So things like null dereferences, uses of uninitialized values and so forth.
These are things that benefit any kind of app regardless of whether it uses Cocoa. Now all this is based on the Open Source Clang static analysis engine, which is part of the larger Clang and LLVM projects, and I'll talk about that briefly later. And finally, we have really worked hard to create a fluid workflow experience within Xcode so that you never have to leave your editor to actually go and find and fix bugs. So that said, why should you actually care? We've got a ton of features that you can learn to use; what benefit does learning a new feature bring to you? Well, there are three reasons.
The first is that we at Apple found it extremely valuable. While bringing it up on Snow Leopard and the iPhone 3.0 OS, we found and fixed thousands of bugs. And the key thing here is that some of these were major, others were minor, but once they were spotted using static analysis, they were often trivial to fix.
And this is the real value of a feature like this. The second thing is static analysis will help you be a better Cocoa programmer. If you're just learning Cocoa, it will help you understand just the, you know, the basic rules of Cocoa, how should you go about using memory.
And it will get you over the common mistakes that beginning Cocoa programmers make. And even if you're a veteran, there is still benefit to you. None of us write perfect code. There are always corner cases that we never thought about, and static analysis will help look over your shoulder and essentially do a form of automated code review.
Now the third thing, and this is what I think is really key, is that there is real value in fixing your bugs before users see them, and static analysis gives you a way to more proactively find bugs before you get some kind of bug report saying, "Hey, your app crashed".
And so if you think about the growing ecosystem of apps on the App Store or even the growing set of apps on the Mac, users notice polish. They notice apps that have good UI polish, but they also notice apps that have behavioral polish. Apps that crash or are sluggish because they leak memory, users will notice this, and it translates directly into ratings on the App Store.
So we think this is going to be a tremendous value to you. So that said, let's show you static analysis in Xcode. And driving it is going to be Yuji Akimoto. He is a Senior Xcode UI Engineer, and he is one of the great members of the Xcode team, I have the great pleasure to work with on this feature.
So we're going to show you three examples. We're going to warm up with something very simple, similar to what we showed in the Developer State of the Union, and then go from there. So we've got a basic Xcode project, Hello World. We're going to open up one of the files. And here is this contrived function that all it does is allocate a Cocoa NSString object and then print it out. And it's very simple. So for those of you who aren't familiar with Cocoa, Cocoa traditionally uses a retain count mechanism to manage the lifetime of objects.
And so here sending alloc to NSString returns an object that it's the caller's responsibility to release by sending the release message. Now, you might think, well, there's also garbage collection available for Mac development. I'm going to show later in this talk how static analysis can help you find memory leaks in your garbage collected apps. So, well, this seems like a fairly innocuous example. What's going wrong?
Well, there's actually a memory leak here because we allocated the object, we used it, and then we just forgot about it. And traditionally finding this bug would require you to actually be able to run the app, maybe run it a bunch of times through Instruments, and hopefully identify the leak.
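The demo code is Objective-C and is not reproduced in this transcript. To make the ownership rule concrete, here is a rough C analogue, assuming a hypothetical retain-counted string type (`RCString`) that stands in for NSString. None of these names come from Cocoa or from the session.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical retain-counted string, standing in for NSString. */
typedef struct {
    int   retain_count;
    char *chars;
} RCString;

static int live_objects = 0;   /* lets a caller observe leaks */

/* Like alloc/init: the caller owns the result (retain count 1). */
RCString *rcstring_create(const char *text)
{
    RCString *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->chars = malloc(strlen(text) + 1);
    if (s->chars == NULL) { free(s); return NULL; }
    strcpy(s->chars, text);
    s->retain_count = 1;
    live_objects++;
    return s;
}

/* Like release: frees the object when the count reaches zero. */
void rcstring_release(RCString *s)
{
    if (s != NULL && --s->retain_count == 0) {
        free(s->chars);
        free(s);
        live_objects--;
    }
}

int rc_live_objects(void) { return live_objects; }

/* Leaks: the object is created, used, and then forgotten. */
void say_hello_leaky(void)
{
    RCString *s = rcstring_create("Hello, World");
    if (s != NULL) printf("%s\n", s->chars);
}

/* Fixed: the create is balanced by a release. */
void say_hello(void)
{
    RCString *s = rcstring_create("Hello, World");
    if (s != NULL) printf("%s\n", s->chars);
    rcstring_release(s);
}
```

After `say_hello()` the live object count is unchanged; after `say_hello_leaky()` it has grown by one. The Analyzer reaches the same conclusion statically, by noticing that the only reference to an owned object goes out of scope without a matching release.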
But now with static analysis you can find it with the click of a mouse. So we have this new Build and Analyze option in the Build menu, and the Static Analyzer is integrated into the Build system because the Build system basically knows how your code integrates together, how it's built, what the dependencies are, so it's a natural place to put it. So we're going to run Build and Analyze. And what's happening here is your code is being both compiled and analyzed by the Static Analyzer.
So, the output of the Static Analyzer is initially just like any other compiler error or warning. We get this nice message bubble. Notice it's actually tinted blue to distinguish it from other compiler errors and warnings, and it's got the static analysis icon. Now if this was a compiler warning, this would be the end of the story. This is all the information that you would have.
But the Static Analyzer actually has a rich amount of information to show you a full diagnosis of how this bug occurred, and you can activate that by clicking on the message bubble. So this is very cool. So suddenly a bunch of things change. The first thing that happened is the message bubble that we clicked on disappeared. And if there were other compiler errors and warnings in this window, they would have disappeared as well. And I'll show an example later on.
This is a transient change in your Editor Window. Now couple of things appeared instead. First is we have these two new message bubbles. These are what I call Analyzer events.
They are essentially things that happened, or could have happened, in the function in order for this bug to occur, and there's essentially a temporal ordering between them. There are also these blue arrows which are overlaid over your code. These illustrate the control flow through your code in order for the bug to occur. This is straight-line code so it's not very interesting right now, but the next example will really show how they shine.
And then lastly, there's this new toolbar at the very top of the Editor Window that lets you navigate through the Analyzer events. If we go and click on the event, we can see, you know, it pulls down. You can slide between them, you can jump between them.
There are also these arrows on the right side of the toolbar that let you navigate them forward and backward. So you have both random access and forward and backward flow through the bug. So it quickly lets you go through the bug and see what's going on.
And when you're done actually inspecting the bug, you can just click the Done button and it returns us back to the original editor. So this is a very, you know, simple way to find out more about the bug and then when you're done, leave them. So let's go and actually fix this issue. And as I mentioned before there's a leak and what we need to do to fix this is send a release to the object.
[ Pause ]
And voila the bug disappears.
[ Audience Remark: Can you enlarge your font?]
[ Audience Remark ]
OK.
[ Audience Remark ]
[ Pause ]
[ Applause ]
[ Laughter ]
OK, sorry about that. Thank you for pointing that out. The rest of the code can be a lot more complicated, so it is good to point that out now. So that's a simple bug, a contrived example.
And the point of this was that you were able to, just within the editor, find the bug, fix it, and quickly verify that it was gone. And so this is a really powerful, quick workflow. Let's go ahead and move on to a slightly more complicated example. And, well, this one's an interview quiz. Now let me give you a moment to look at this code and try to figure out what's wrong.
To give you some context, this isn't a contrived example. It was a real bug found in some large anonymous Open Source-- not Open Source-- code base. And I've removed extraneous logic from this bug. So if you haven't seen it yet, let's see what the Analyzer can tell us by running Build and Analyze. And what-- OK, so we don't have a leak in this case. We actually have something far more serious.
And here we actually have an object that's been over released. And so how could this actually have occurred? So we're going to go and click on that message bubble again to disclose the bug. And we've got a lot more information than we did from the previous example. Now notice because this bug involves a for loop and two nested loop statements, the arrows illustrate the control flow through these structures.
And actually some of the arrows are more bold than the other ones. The first set of arrows that are bold are the ones showing essentially the sequence of branches taken from the entrance of the function to the first event, which is the place where the object is allocated.
If we go ahead and click to the second event, you'll notice that the set of arrows that are highlighted actually changes. And what we're doing is we're showing you basically, between what you just looked at and what you're looking at now, what was the flow through the code for that to occur.
And so what we see now is that there's this path from this allocation to the place where the object is released. And here you actually see an example. There are two Static Analyzer Events on the same line. You can see in the navigation tool bar that's indicating the one where the object is released.
So if we click to the next event, we see that what happens is we leave the loop iteration and then we go back to the head of the loop, and then if we again click to the next step, that shows that we entered the loop, you know, the second time, but this time the set of arrows is different.
It shows us the case where the first branch isn't actually taken and then we go to the second one. And because object ID still refers to the object that we released on the previous iteration, the object is over released. And if you think about it, this is actually a fairly complex kind of bug; there's a lot of information here.
But you can easily digest it by just stepping through it in a very consistent and logical way. So let's go and fix this bug too. So the problem is that at the beginning of the first loop iteration, object ID was nil, right? And so it looks like this code basically expects that to be the invariant at the beginning of every loop iteration.
So we can solve this problem just by nuking the value of object ID after we release it. So we're going to set object ID equals nil. Just run Build and Analyze. And the bug disappears. OK, so again, this is a whole very quick workflow of finding and fixing bugs.
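The code from this demo is not in the transcript either. The sketch below is a C reconstruction of the shape described: a loop with two branches, where an object released in one iteration is released again in the next. The `Obj` type, the static pool, and the `process` function are assumptions made for illustration; the pool exists only so that a stale pointer stays safe to inspect here, which real freed memory would not be.

```c
#include <stddef.h>

/* Hypothetical stand-in for a retain-counted Cocoa object. Storage comes
 * from a static pool so that a stale pointer stays safe to inspect. */
typedef struct {
    int retain_count;
} Obj;

#define POOL_SIZE 16
static Obj pool[POOL_SIZE];
static size_t pool_next = 0;
static int over_releases = 0;

static Obj *obj_create(void)
{
    Obj *o = &pool[pool_next++ % POOL_SIZE];
    o->retain_count = 1;          /* caller owns the new object */
    return o;
}

static void obj_release(Obj *o)
{
    if (o->retain_count <= 0) {
        over_releases++;          /* released more often than retained */
        return;
    }
    o->retain_count--;
}

/* Returns the number of over-releases seen while walking the flags.
 * make_new[i] != 0 means iteration i creates a fresh object. */
int process(const int *make_new, size_t n, int clear_after_release)
{
    Obj *object_id = NULL;
    over_releases = 0;

    for (size_t i = 0; i < n; i++) {
        if (make_new[i]) {
            object_id = obj_create();
        }
        if (object_id != NULL) {
            obj_release(object_id);
            if (clear_after_release) {
                object_id = NULL;   /* the fix: restore the loop invariant */
            }
        }
    }
    return over_releases;
}
```

With the flags `{1, 0}`, the first iteration creates and releases the object and the second iteration skips the first branch but still enters the second one. Without the reset, `process` reports one over-release; with `clear_after_release` set, it reports none, which mirrors setting object ID to nil in the demo.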
[ Applause ]
So those are toy examples, right? I mean the real value of this tool is that it can find bugs in real, mature code bases, code bases that are used, code bases that people care about. So, to illustrate that, I want to run the Analyzer on a real Open Source project, and that's Growl. And my choice of Growl as an illustration is not because I'm trying to pick on Growl. It's actually a well written code base. It's mature.
It's well used. There are developers who are actively working on it. They care about the code being correct. And let's see the value of actually running the Analyzer on that. So here's the Xcode project for Growl. We're going to actually pull up the Build Results window to kind of illustrate where the Analyzer falls in with the rest of your build tasks.
So let's go ahead and run Build and Analyze. And you can see the actual steps that the Build system does to both compile and analyze your code. And it actually groups the compiler warnings and errors and the Analyzer issues right under the individual build phases, so you can actually use this to navigate through your issues.
And we're going to talk about this in a lot more detail later in the session. Let's go ahead and look at one of the real bugs here. Go to CFURLAdditions, yes, that file. So you see it's a lot of real code. This is actually a C file. And in this example you can see that there are two Static Analyzer issues and a compiler warning.
So this is a real example of how all those would coexist in the editor. So the first Analyzer issue is something called a dead store; I'm going to talk about that later in the session. The second one is actually a very serious bug, and it says pass-by-value argument in function call is undefined.
And so what that basically means is you're calling a function and one of the arguments you're passing to it is garbage. So this would result in a really weird kind of bug where the callee would get this garbage value and then do something really awful with it. And the real nastiness about this bug is that the failure occurs long after the point where the bug actually is introduced.
So let's go and disclose that information. And so this is, you know, an illustration on real code. We can see the precise branches that are taken. We see the bug occurs because there's this variable that's declared without an initial value. And what I want to point out is that Clang has full range information for all these expressions. So notice for the Analyzer event, we completely highlight the actual declaration where the variable was declared.
So there's no guessing about what we're actually talking about here. And if we click to the second event, what happens is the set of, you know, branches that are taken to reach the actual call are highlighted so you can go and follow them. And we see this actual function call where the argument is passed uninitialized. We see that pathStyle is also highlighted, so you see the actual argument that is uninitialized.
So it's again, trying to give you some very precise information of what exactly went wrong. And we think this is just really tremendously valuable for finding and fixing your bugs. So there you have it. Static analysis in Xcode, find and fix bugs quickly with the click of the mouse.
[ Applause ]
So before we move on, I just want to ask you a question: don't you think Yuji did a great job on these arrows?
[ Applause ]
[ Laughter ]
Excellent. So let's go ahead and switch back to the slides. So that's the feature in a nutshell.
So but what I want to talk about for the rest of the session is how do you go about making the most out of this feature. And I want to cover three things. First is how is it basically working under the hood. This is important so that you can understand its strengths and limitations.
We don't want this feature to seem like it's magic. The second is workflow. How do you go about best using this tool? I mean, you have a way that you go about developing software; where does it best fit into your processes? The third thing I want to talk about is bugs, right?
This is all about bugs.
How is the Analyzer relaying information to you? What kinds of bugs does it actually tell you about? And because the Analyzer actually is not perfect, it's going to have some noise, which we call false positives. I'll talk about different categories of false positives and how you can best go about dealing with them.
So how does the Analyzer actually work? Well as I mentioned before, it's all built on top of Clang. Now Clang is this overloaded term but here I mean the actual Open Source C and Objective-C front end that we're building to build great compiler technologies. Clang and the Analyzer are 100 percent Open Source, right? You can go and check it out, look at the code, use it, and this has been really valuable for us.
People have contributed both to Clang and the Analyzer to make it a better feature. The Analyzer currently handles C and Objective-C code, so you can use it freely on just plain C apps. Now we've really tuned it more for Objective-C code, but it's still very valuable at this point for analyzing straight C. We previewed this technology at last year's WWDC.
And I just want to mention this because the feedback that we've gotten from you, from using the Open Source release of the Analyzer has been invaluable and we really appreciate that feedback and we hope you will continue to give us your thoughts and insights on how we can make this better. You can find out more about the Open Source project by going to clang.llvm.org.
We really encourage people to get involved if they're excited about this. So how is the Static Analyzer actually built? In the Compiler State of the Union we flashed a diagram like this where we talked about how the Clang compiler is built. It consists of this thing called the Clang Front End, which is an LLVM library that handles the parsing and lexing and preprocessing of C and Objective-C code.
It then builds an in-memory representation of your program that is then fed to a compiler back end, which consists of the Optimizer and Code Generator that generate compiled code. The Static Analyzer uses the same exact front end as the Clang compiler, and then instead of the in-memory representation being thrown over the wall to a code generator, it's fed to a source code analysis engine which feeds its results up to the Xcode user interface.
Now the idea that I want you to get out of this diagram is that the compiler and the Analyzer are seeing your code in exactly the same way. So we have a rich representation of your program to serve both of these tasks. And so we're really excited about the kind of tools that we're going to be able to build using Clang.
So when your code is analyzed, what actually happens? Well, each file is handed to the Analyzer engine, and within each file, each function or method is analyzed one at a time. And within a function, the Analyzer reasons about what we call paths. A path is simply the sequence of branches taken through your code. So, you know, you have different if and for statements; what are the different ways in which you could run through that code? And if you think about a test case where you're actually running the program, only one path will be triggered through that code.
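To make the notion of a path concrete, here is a small C sketch. The `classify` function is hypothetical, and `count_paths` simply brute-forces inputs to show how many distinct branch sequences exist. That is not how the Analyzer works (it reasons symbolically and never runs the code), but it shows how quickly the number of paths grows compared with the single path one test run exercises.

```c
/* Hypothetical function under analysis: three independent branches. */
int classify(int a, int b, int c)
{
    int score = 0;
    if (a > 0) score += 1;   /* branch 1 */
    if (b > 0) score += 2;   /* branch 2 */
    if (c > 0) score += 4;   /* branch 3 */
    return score;            /* encodes which branches were taken */
}

/* Drive classify() down every combination of branch outcomes and
 * count how many distinct paths were seen. */
int count_paths(void)
{
    int seen[8] = {0};
    int distinct = 0;

    for (int mask = 0; mask < 8; mask++) {
        int a = (mask & 1) ? 1 : -1;
        int b = (mask & 2) ? 1 : -1;
        int c = (mask & 4) ? 1 : -1;
        int path = classify(a, b, c);
        if (!seen[path]) {
            seen[path] = 1;
            distinct++;
        }
    }
    return distinct;
}
```

A single call such as `classify(1, -1, 1)` follows exactly one of these paths, while `count_paths()` finds all eight. Each additional independent branch doubles the total.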
What the Analyzer tries to do is reason about all the ways that your code could be exercised, and this is how it achieves coverage. Now in order to get this coverage, it's essentially not doing an execution of your program. It abstracts away details in order to basically collapse redundant behavior and just look for the salient features that are needed to identify bugs. And you might think that this is expensive. Well, it is. So essentially using static analysis is about trading CPU time for improving your code.
And that's essentially the tradeoff we're trying to make. It's a different goal than making the compiler fast. And static analysis essentially uses worst-case exponential algorithms, or exponential in the number of paths. We do a lot of clever things, so it typically isn't like that, but at the end of the day, the Analyzer is best effort: it's trying to do a very good job of analyzing your code, but if your code is too complicated, we bound the amount of work so that you get results in a reasonable amount of time. So that's the expectation. It's a best-effort tool. So how is a bug actually found? Now this is a real code fragment. I actually didn't make this up.
And the great thing about this talk is that it's not hard to come up with examples. [Laughter] So this was actually shown to me as a bug found by the Analyzer. Where the code came from doesn't matter. And let's walk through what the Analyzer actually does.
It starts analyzing paths from the beginning of the function. So here we see that the first statement is that result is declared, and it's declared uninitialized, OK. So, alright, we're going to record that fact: result points to garbage. And then it starts tracing the path through this code until we hit the switch statement. And then we can see that the switch dispatches on the value of the field sa_family. Now sa_family is a field of a value that was passed in as a function parameter.
And so the Analyzer, because it's really just looking at your code, you know, one function at a time, goes, "Well, I don't have any assumptions about the value of this field, so I'm going to assume it's what we call unconstrained; it could be any of the possible values in the switch statement." So I'm going to consider each of those cases one by one, and look at each of the paths spawned from this point.
So let's look at the case where we go to case AF_INET6. In this case sa_family would have to have that value. And so at this point along this path we have the constraint that sa_family has that value. If we go inside the case statement, we then see this call to inet_ntop.
Now the Analyzer doesn't have any special knowledge about this function. But it sees that the returned pointer value is actually compared against null. So the Analyzer goes, well, this value is unconstrained, but I know that you're comparing it against null, so I'm going to assume that it can either be null or not null. And we're going to look at those two possibilities from this point on. For the case where it's null, the if statement evaluates to false, we jump to the break, and then we go and hit the return statement, where we return result uninitialized to the caller.
So this is a very systematic reasoning about what your code is doing, exploring each one of these branches. And this is what you would see in Xcode. Xcode does a much better job of drawing the arrows than I do, and it actually tells you the relevant information. It shows you the point where the variable was declared and where it was returned. So it gives you that full diagnosis right there in the editor. So the Analyzer is awesome.
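The slide's code fragment is not in the transcript. The C sketch below reconstructs the shape of the walk-through, using the POSIX `inet_ntop` call and the `sa_family` field mentioned above. The function name `address_to_string` and its `buf` and `len` parameters are assumptions. It shows the repaired form, with comments marking where the original went wrong.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical reconstruction: formats a socket address into buf and
 * returns buf on success, or NULL if the address cannot be formatted. */
const char *address_to_string(const struct sockaddr *sa, char *buf, socklen_t len)
{
    /* The analyzed fragment declared result WITHOUT an initializer.
     * Initializing it is the fix. */
    const char *result = NULL;

    switch (sa->sa_family) {      /* unconstrained: every case is explored */
    case AF_INET: {
        const struct sockaddr_in *in4 = (const struct sockaddr_in *)sa;
        if (inet_ntop(AF_INET, &in4->sin_addr, buf, len) != NULL)
            result = buf;
        break;
    }
    case AF_INET6: {
        const struct sockaddr_in6 *in6 = (const struct sockaddr_in6 *)sa;
        /* The comparison against NULL splits the path in two. On the
         * NULL side, result is never assigned before the break. */
        if (inet_ntop(AF_INET6, &in6->sin6_addr, buf, len) != NULL)
            result = buf;
        break;
    }
    default:
        break;                    /* also leaves result unassigned */
    }

    return result;                /* originally: garbage on the failing paths */
}
```

With the initializer in place, every path that fails to assign `result` returns NULL instead of an indeterminate pointer, so a caller can check for failure. For example, passing a buffer too small to hold the formatted address makes `inet_ntop` fail, and the function returns NULL.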
I'm really proud of what we've done, but there are some limitations. First I want to talk about inherent limitations. Because the Analyzer abstracts away pieces of information about your program in order to reason about all these paths, it sometimes loses precision, and we can have false positives. Sometimes the false positives are due to false paths, so paths through your code that just couldn't actually occur.
And I'll talk a little bit more later about how you could go about fixing them. Second, false positives occur because the Analyzer doesn't know something that you know. You've got some assumption in your head that's not reflected in the code, and those can often be addressed using assertions, and I'll talk about that later as well. The second thing I want to talk about is that the Static Analyzer is not going to find all of your bugs.
I mentioned earlier that it's not a replacement for testing or using a debugger, but more importantly, we have not designed it to be a program verifier. Static analysis tools that try to verify that code is perfect tend to have a high number of false positives, and our goal is to make the tool useful, right? You don't want to go and look at a bunch of garbage results. So we're perfectly willing to miss some bugs as long as we give you a high signal-to-noise ratio from the tool.
And further, the static analysis tool is not magic. It's just a piece of software in itself. It won't find a bug unless it's been specifically engineered to look for certain kinds of issues. If there are things that you would like it to try and find, please just let us know. Now, some important current limitations with the Analyzer and I've mentioned this before.
Because the Analyzer only analyzes one function at a time, it really loses information that spans across function call boundaries. This can lead to some false positives and it can lead to some false negatives. This is something to keep in mind and it's just a current limitation of the Analyzer. Second, many static analysis tools have been engineered to try and go after buffer overflow checking.
It's an interesting problem but just one that we haven't focused on. So let's talk about workflow. Now that you understand a little bit more about what the tool is doing underneath, how do you actually go about finding and fixing the bugs? So there were a few important goals that we had in mind when engineering the Analyzer. The first is that we wanted you to use it very actively.
As you're editing your code, we want you to feel that you can proactively run the Analyzer and have it tell you about issues. And the whole idea is that issues that are introduced just recently can be fixed quickly and much more cheaply. And to do this, we wanted an experience where you never have to actually leave your code. The worst thing that we could have done is that you run the Analyzer, it's looking at your code, and, well, you suddenly get taken away, hijacked into a separate user interface.
This is horrible, right? This is just not an optimal experience because you have to switch back and forth between editing your code and seeing what the Analyzer tells you. Second, we wanted you to treat, looking at Analyzer issues just the same as you look at regular compiler errors and warnings.
If you think about all the things that the Build system does, all the tools that it runs, it's basically giving you a whole bunch of feedback about the current state of your code, about the current state of your program, and it seems very logical to centralize that all in one place so you get this holistic view of what's going on.
Third, we wanted to give you the flexibility of when you run the Analyzer. We want to give you the ability to do both an on-demand analysis of your code like I've illustrated already or just automatic analysis that every time I hit Build I want my code to be analyzed. You now have this option.
And we wanted all these to combine together to create what I call the Analyze-Fix-Analyze workflow: you're looking at your code, you analyze it, you find bugs, you try and fix them, and you analyze it again to see if you actually fixed the issue. And you want a quick turnaround time to see if the issue actually has been addressed, and if it hasn't, you can try and go and fix it again. And it's really key that you can do this very rapidly in order to make the most out of the tool.
And I think we have done a great job at accomplishing this. So let's review on-demand analysis. We have this new Build and Analyze option in the Build menu. It actually has a keyboard shortcut as well. And essentially what happens is that your code is compiled and analyzed. Now the Build system knows your code and knows what's actually been modified since the last time you compiled it and the last time you analyzed it.
So doing Build and Analyze is only going to reanalyze the files that have been modified since the last time you ran the Analyzer. And this means you get incremental analysis. And so it means this enables the whole Analyze-Fix-Analyze workflow that I mentioned that you could just be editing your code and fix it, reanalyze it, and your entire project is not reanalyzed, that would be the worst thing ever. You might have to wait a really long time, it's a big project.
You don't have that kind of latency. Now what about automatic analysis? Well you can now do automatic. You can do automatic analysis using a Build Option. So here's the Project Settings window. And under Build Options you now have this Run Static Analyzer check box. And what this does is it basically makes Build to be synonymous with Build and Analyze.
And so it means every time your files are compiled, they are also analyzed. And it's important to know that this isn't a transient setting, it's persistent in your Xcode project. If you quit Xcode and bring it back up, this setting is still there, and you will still do an analysis every time you run a build. What's cool about this is that you can tailor it to your workflow. You can set this on a per-configuration basis.
And I'll talk a little bit more in a moment about how that can be useful. Now an important side effect of this is that the setting is also picked up by xcodebuild. If you build your Xcode project from the terminal, you will actually see the Analyzer emit diagnostics to the terminal. And so if that's of interest to you, that's something you can do.
If it's not of interest to you, that's just something to be aware of. And the final point that people sometimes forget is that while we're trying to have the Analyzer run in a reasonable amount of time, if your goal is a fast build, just keep in mind that build time can easily double or triple when you run the Analyzer. And really, your mileage can vary depending on your code. So it's something you can try out, see what happens, and see what works for you.
So how can you use that per-configuration setting of Run Static Analyzer? How is that actually going to benefit you? Well, an Xcode project defaults to having a Debug and a Release configuration. A Debug build doesn't really have any compiler optimizations, so the compile is quick and you have a faster turnaround. For Release, you want the compiler to go mad and optimize your code.
What if you want something that's somewhere in between? Well, you have this option by creating a new configuration, which you can easily do just by duplicating the Debug configuration, renaming it to something like Analyze, and then setting the Run Static Analyzer build option for that configuration. Now why this is useful is that you could also enable a whole set of other compiler warnings in this configuration.
So now you can have a configuration where I want to do extra checking, or I want to run extra unit tests and run the Analyzer. So it's kind of like your regression kind of configuration. I think it's an interesting aspect of the workflow that might really benefit you. So let's go and talk a little bit more about how you actually navigate results within Xcode.
And for that, we're going to jump back to the demo. So we're going to look at the Build Results for Growl, so I can go ahead and minimize the editor. And well, I've shown this before. This is actually only part of Growl. This is just one of the targets, and there's a lot of information here.
And this is way too much for me to process. Well, the new Build Results window is really cool because it allows you to look at varying levels of information. So if we go to the top, there is this new filter pull-down. Right now it says All Messages, so that includes the build steps and the warnings and errors they produced. We can select Issues Only. And what you're going to see is only the actual compiler warnings and the Analyzer issues.
Now let's say I just wanted to look at issues found by the Analyzer. I can further reduce this down just by selecting Analyzer Results Only. And so it's a much more concise view of the information that I get from the build system. Now as we're looking at it, issues are grouped by the file that they occurred in.
Let's say I just wanted to look at particular kinds of bugs. We now have this new By-Issue tab, which reorganizes the compiler warnings, errors, and Analyzer issues into logical groups. Analyzer issues are grouped by categories such as memory management, logic errors, and dead stores. There are a lot of other categories.
This Growl example only exhibits these 3. And this is great. But let's say I want to look at only a very specific kind of bug. Well, now you can prune that down using search. So let's go to the search toolbar. Let's go ahead and just type the word "leak". I just want to look at memory leaks.
You'll notice that Xcode does a live textual filtration on each of the rows in your Build Results window. So you can quickly go through the results of your build. This includes searching for files. This includes searching for particular issues. So it's a very quick way to get down to the information that you want. So let's look at one of those bugs.
We're going to take a look at New Parent, so the leak involving New Parent. Let's go ahead and expand this disclosure triangle at the top in the Build Results window. Now what you see is actually the complete set of events within that bug, so you can not only navigate through the different issues within the Build Results, but you can actually navigate a specific bug from the Build Results window.
So let's go ahead and click on one of those. And you see that it immediately takes us into the editor, right into the Analyzer mode. And it takes us to the particular event that we were interested in. We could go ahead and navigate through this bug using those disclosed events at the top, or use the new navigation toolbar within the editor. You can do both.
And what's great about this is you can quickly use just the keyboard to navigate through the individual Analyzer issues and then within an individual bug. So I want to look at this bug just to show you one other feature of the integration of the Analyzer into the editor.
This is a memory leak, and what kind of sucks about this example is that it spans a lot more code than I can view on the screen, right? So we see the actual allocation site, which is the first event, and then way down below, we get this report that the object has been leaked. So this kind of sucks.
But we see that after the allocation site, there's this logic on line 312 where name is compared against null, and then we actually skip this big long else branch. Well, what's cool is that Xcode's code folding feature works completely seamlessly with the Analyzer mode.
So we can just go ahead and collapse that else block. And you see the arrows redraw, live, and we can see within just a few lines of code the actual bug. We can see where the object was allocated and the set of branches that were taken for it to leak. This is really powerful.
It allows you to just drill down to the parts of the bug that matter to you. So let's go ahead and fix this. What's happening here? There's a leak of parent. OK. Well, let's add a call to CFRelease of parent right at the leak site.
OK, somehow I screwed up. Alright, I did something wrong. I didn't actually fix the bug, instead I introduced a new one. And it's a much worse bug. It looks like I've over-released the object. So, well, OK, let's see what I did wrong. Let's go ahead and click on that issue, and we can click through this just as before. As we see, it's the same allocation site as before.
It's involving the same object, but using the navigation toolbar, let's just click through to see what happens. Notice that the block of code that I previously folded automatically expands, because the bug now occurs on that else branch that I previously thought was irrelevant. And we can see at this point that there are now 2 CFRelease calls where we pass parent.
So this is the over-release. The nature of the bug here is that the call to CFRelease was buried too far within the conditional, so the fix is just to remove the original one. We're going to just comment it out so you don't forget that it was there.
We'll Build and Analyze again, and the bug disappears. So once again, you have this fast, tight workflow within Xcode to find and fix bugs. And if you screw up and don't fix it correctly, the Analyzer will give you quick feedback. Now notice that we didn't rebuild all of Growl, right? The only thing that happened was that this file was reanalyzed, so we got a very quick turnaround time, and this is really what makes this feature just shine. And I haven't seen this in any commercially available static analysis tool. And you get it for free.
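The shape of the bug in this demo can be sketched in plain C. This is not the Growl source: the `node` type, its reference count, and the `attach_*` functions are hypothetical stand-ins for the Core Foundation object and the function shown on screen, and the `live_nodes` counter exists only so the leak can be observed.

```c
#include <stdlib.h>

/* Hypothetical reference-counted object standing in for a Core Foundation type. */
struct node { int refs; };

int live_nodes = 0;  /* objects not yet freed; lets a caller observe leaks */

struct node *node_create(void) {
    struct node *n = malloc(sizeof *n);
    if (n != NULL) { n->refs = 1; live_nodes++; }
    return n;
}

void node_release(struct node *n) {
    if (n != NULL && --n->refs == 0) { free(n); live_nodes--; }
}

/* Leaky shape: the release is buried in the else branch, so the
   name == NULL path returns while still owning parent. */
int attach_leaky(const char *name) {
    struct node *parent = node_create();
    if (parent == NULL) return -1;
    if (name == NULL) {
        return -1;               /* parent leaks on this path */
    } else {
        /* ... long block that uses parent ... */
        node_release(parent);
    }
    return 0;
}

/* Fixed shape: a single release after the conditional covers every path.
   Adding a second release here while keeping the buried one would be the
   over-release from the demo. */
int attach_fixed(const char *name) {
    struct node *parent = node_create();
    if (parent == NULL) return -1;
    int status = (name == NULL) ? -1 : 0;
    /* ... use parent when name is present ... */
    node_release(parent);
    return status;
}
```

Calling `attach_leaky(NULL)` leaves `live_nodes` at 1, while `attach_fixed` returns it to 0 on both paths.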
So let's go ahead and switch back to the slides. So let's talk about my favorite part of the talk, and that's bugs. This is all about bugs: what kinds of issues does the Analyzer actually find? Well, we care a lot about Cocoa and Core Foundation. These are the cornerstone APIs for doing iPhone and Mac development.
And so naturally, we're interested in finding bugs related to the use of these APIs. We've engineered a growing set of API checks and I'm going to illustrate one of them on the next slide. And then we've also done a tremendous amount of work on trying to do memory management checking using static analysis.
And you might think that memory management, that's kind of hard. But Cocoa is engineered, it's designed, to have very strong interfaces for how you use these objects. And the Analyzer is designed to check that you're using these objects in conformance with the Cocoa conventions. And the idea is that if you follow the conventions pretty strictly, your code is practically guaranteed to be leak-free. I'm also going to talk about general language-level or logic errors. We showed the use of uninitialized values.
This is also going to include things like null dereferences. And then a real pearl that people often overlook is dead stores. It's a form of dead code checking that can find some truly hideous bugs. And then in this discussion, I want to talk about false positives. We continually work hard to make the Analyzer better, with fewer and fewer false positives, but it's something that you will likely encounter. And we want to talk about the different ways in which you can deal with them.
In some cases, it's more about your code not really documenting its assumptions than the Analyzer doing something wrong. So here is a Core Foundation API check. There is this function called CFNumberCreate which returns CFNumber objects. And CFNumber objects are toll-free bridged, via casting, to Cocoa NSNumber objects. So it's not uncommon for this function to appear in Objective-C code. Now it's interesting the way this function is designed.
Essentially what you do is you pass in a pointer, a void* reference to some integer. And then you tell CFNumberCreate the type of that integer by passing the second parameter, which is this enum value. Now in this case, the function is being called by specifying that it should expect a long. And sure enough, when you're compiling for a 32-bit architecture, both unsigned and long have the same size.
Now if you suddenly decide to move over to 64-bit which we're encouraging you to do, this is not totally awesome. What happens is long is now 8 bytes long. And so CFNumberCreate when it's trying to create the CFNumber object, it's going to read 4 bytes after the end of i, so it's reading essentially garbage and the results can be completely not what you expected.
This is actually a real bug that's occurred several times, partially due to this API, but the Analyzer can now check it for you and try to ensure that your code is more 64-bit friendly. So there are just little things like this where it would be very hard to catch the bug even if you actually ran the code.
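The size mismatch can be sketched without Core Foundation. Everything below is a hypothetical analog, not the CFNumberCreate API: `num_kind` stands in for the type enum, and `make_number` takes an extra size argument so that the check the Analyzer performs statically can be done at run time instead of reading past the caller's variable.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type tags, standing in for the enum passed to the creator. */
enum num_kind { KIND_SINT32, KIND_LONG };

/* How many bytes a creator would read through the untyped pointer. */
size_t bytes_read_for(enum num_kind kind) {
    return kind == KIND_SINT32 ? sizeof(int32_t) : sizeof(long);
}

/* Does the tag agree with the size of the variable whose address is passed? */
int kind_matches(enum num_kind kind, size_t variable_size) {
    return bytes_read_for(kind) == variable_size;
}

/* Creator for this sketch: refuses a mismatched tag instead of reading
   bytes beyond the end of the caller's variable. Returns 1 on success. */
int make_number(enum num_kind kind, const void *value, size_t value_size,
                long *out) {
    if (!kind_matches(kind, value_size)) return 0;
    if (kind == KIND_SINT32) {
        int32_t v;
        memcpy(&v, value, sizeof v);
        *out = v;
    } else {
        long v;
        memcpy(&v, value, sizeof v);
        *out = v;
    }
    return 1;
}
```

On a platform where `long` is 8 bytes, `kind_matches(KIND_LONG, sizeof(int32_t))` is false, which is the 64-bit case described above; where `long` is 4 bytes the same call is true and the bug stays hidden.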
But the Analyzer pinpoints for you exactly where the error occurred. And here is how it appears in Xcode. There's actually this tremendously verbose message telling you exactly what went wrong, and it's basically saying that the size CFNumberCreate expected was 64 bits and you fed it a 32-bit number. So, Cocoa memory management.
This is really where we try to make the Analyzer shine. The Analyzer is basically trying to enforce the Cocoa object ownership conventions, and these are documented in these 2 ADC documents. These are freely available, and if you haven't read them yet, you're strongly encouraged to do so. Now I mentioned garbage collection before. Objective-C 2.0 supports garbage collection when you're targeting Leopard or later, and the conventions for using garbage collection are also documented in the Garbage Collection Programming Guide.
The Analyzer is garbage collection-aware. There are these different flags that get passed to the compiler, basically to tell it whether the code is to be compiled to run with either garbage collection or traditional retain counts. That's the -fobjc-gc option. The second is if you're compiling code that should only run with garbage collection. And the last case is if there's no flag.
This is just the traditional model. If you're doing iPhone development, you're not going to use garbage collection, so you only have the option to use garbage collection on the Mac. Again, the garbage collection flags are specific to Leopard or higher. And what I want to emphasize is that the Analyzer sees these flags and will analyze your code accordingly. If you intended the code to be usable either with or without garbage collection, it will analyze it for both cases. If it's only intended to be used with garbage collection, it will just analyze it for that scenario.
So what about naming conventions? How does the Analyzer actually infer that objects are allocated and released? Well, Cocoa is actually very simple. In the Cocoa memory management programming guide, we talk about these naming conventions for methods. So methods that start with alloc or new, or contain copy, are expected to return an owned object. That's a specific contract on these methods. People don't necessarily understand that these are guidelines not only for our own APIs, but we really encourage you to use them too.
It's basically what programming in Cocoa means. Core Foundation, which is a C API, has similar conventions. If you have the keywords Create or Copy in the Core Foundation function name, it returns an owned object. And the key thing to note here is that create is not a Cocoa convention. And this is something that's been really emphasized to me by the Cocoa frameworks' curators. This is something they don't want to actually enforce, so they've explicitly asked that we not support it in the Analyzer. But you have a way out. You have other options.
I'll talk about that later. So I've already showed you some examples with leaks. The main thing I want to show is just how much information the Analyzer gives you. So this is an example of using NSMutableDictionary and calling dictionaryWithCapacity:, which returns you an object that is not the caller's responsibility to release.
It's going to be autoreleased or whatever, it's not your responsibility. And then we go ahead and mess around with the retain counts by sending retain or release. So this is a contrived bit of code, and there's a leak here, and the Analyzer is going to give you a lot of information.
It's going to tell you each time that you retained that particular object and where it was actually sent a release message. Now if you imagine this spread over a lot more code, this is actually really useful. It's pinpointing exactly where the retains and releases occurred. So what about leaks under garbage collection? Garbage collection is a great technology available on Leopard that will make programming many Objective-C apps much easier, but you need to use it correctly in order to make sure that your code is leak-free.
The main thing to keep in mind is that Core Foundation objects are not automatically garbage collected. And because these can be toll-free bridged over to Cocoa objects, it's very easy to lose track of this.
So here is a real example of code that is leak free when using traditional retain counts but then leaks like a sieve when you use garbage collection.
The problem here is that we create a Core Foundation object which is then returned at the very end cast to an NSString, so it's returned as a Cocoa object. And because we're using garbage collection, the caller is not going to expect to have to manually release it using CFRelease.
And so the Analyzer actually tells you about this, and these messages are very verbose because people tend not to understand how they can leak objects when they're using garbage collection. The first diagnostic tells you you're returning a Core Foundation object, and if you hover your mouse over it, you will get a tooltip saying these are not automatically collected when using garbage collection. Then you get a reminder message that when you're using garbage collection, autorelease does nothing.
That's just in case you forgot, since this was your safety mechanism for freeing your memory. And then finally there is this lengthy diagnostic that tells you what you did wrong. I tried to trim this down, but whenever I did, people didn't really understand what was going on. Basically, you're returning an object that the caller expects is going to be garbage collected, but it still has this +1 retain count.
And so it's never going to get released, and every time this method is called, it's going to leak memory, and these leaks are very hard to catch. But because we have strong interfaces in Cocoa, we can check for them with static analysis. And so the solution here is to use CFMakeCollectable, which registers the object with the garbage collector.
So, Cocoa conventions. The Analyzer finds memory leaks by enforcing Cocoa conventions. What if your own code does not strictly follow Cocoa conventions? This can cause false positives. The Analyzer would tell you that you are leaking in places where you're not. It can also tell you that you are over-releasing an object in places where you're not.
The reason it does this is because the Analyzer is just trying to follow what it thinks is the right way to do Cocoa programming and if there is some other structure in your code that deviates from that which isn't necessarily bad, it just doesn't know what's the right way for things to be done.
Now, what we suggest you do, and we actually really want you to do this, is rename your methods so that they follow Cocoa conventions. This is somewhat draconian, but it's going to make your code much easier for others to read, and you can use Xcode's refactoring feature to rename your methods across your project. Now, if you're really recalcitrant about doing that, you have another option.
We now provide these new ownership annotations, which allow you to specifically document that a method returns an owned object. So, this is a code fragment from Vienna, which is an open source RSS reader. And what you see here is these two methods, perform query and perform query with format, which return essentially this SQL result object that's owned. So, it's the result of a query, but the method name doesn't contain the keywords alloc, or new, or copy.
Well, the Analyzer actually can analyze the implementation of these methods, but in general, just by looking at the names of these methods, it can't tell whether this object is going to be owned or not, and honestly, anybody else looking at this code can't either.
It's just an implementation detail of the code base. Now, you can add the NS_RETURNS_RETAINED annotation at the end of your method declarations to educate the Analyzer that this is the contract for this method. It will then assume that the method returns an owned object. So all the callers will use this convention, and when analyzing the implementation of these methods it will assume that the method is supposed to return an owned object. Now these macros are currently only available when you're targeting Snow Leopard. You get them just by including the standard Foundation and Core Foundation headers.
We also provide a CF_RETURNS_RETAINED macro, which does essentially the same thing except that it indicates that the method returns a Core Foundation object, and this is really important when you're dealing with garbage collection, as I illustrated previously. Now, if you are targeting the iPhone or a previous version of Mac OS X, you're not out of luck. You can actually define these macros yourself, using essentially this #ifdef preprocessor logic. And so essentially, these macros expand to Clang-specific attributes that are added to these method declarations.
Now you don't need to memorize this, because it's actually provided on the clang.llvm.org website. So there you have it, structured annotations on your code that explicitly say whether or not you return owned objects. And for those of you who do not want to rename your methods, you now have an option to clearly document your assumptions, which helps the Analyzer as well as other people understand what your code is doing.
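One common form of that fallback definition, sketched here in plain C, uses Clang's `__has_feature` test so the macro expands to nothing on compilers that lack the attribute. The function `query_result_text` is hypothetical, and a malloc'd buffer stands in for a Core Foundation object so the fragment builds without the Core Foundation headers; the slide's exact preprocessor logic may differ.

```c
#include <stdlib.h>
#include <string.h>

/* Compilers without __has_feature treat every feature test as false. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

/* Define the macro only if the system headers have not already done so. */
#ifndef CF_RETURNS_RETAINED
#if __has_feature(attribute_cf_returns_retained)
#define CF_RETURNS_RETAINED __attribute__((cf_returns_retained))
#else
#define CF_RETURNS_RETAINED
#endif
#endif

/* Hypothetical function whose name contains neither Create nor Copy.
   The annotation documents that the caller owns the result. */
char *query_result_text(void) CF_RETURNS_RETAINED;

char *query_result_text(void) {
    const char text[] = "row";
    char *copy = malloc(sizeof text);
    if (copy != NULL) memcpy(copy, text, sizeof text);
    return copy;  /* caller owns this buffer and must free it */
}
```

The same pattern applies to NS_RETURNS_RETAINED with the `ns_returns_retained` attribute on Objective-C method declarations.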
So what other checks does the Analyzer perform? I'm not going to go into all of these in real detail, but there's a variety of checks that are done simply by reasoning about the semantics of your code: null dereferences, uses of uninitialized values, returning the address of a stack variable.
That's obviously really bad, because when you return the address of a stack variable, you're no longer referencing a valid object. It's also illegal in some cases to send a message to nil, because the returned value will be garbage, and so on and so forth.
There are many little checks that the Analyzer does just by analyzing your code. So let's talk about null dereferences. This is a real bug in Wolfenstein 3D for iPhone, and it happens to deal with using the Ogg Vorbis API. And what I want to illustrate here is that the pointer vi, which is passed in from a caller, is checked against null in the ternary operator, and you see the arrows actually show the control flow within the ternary operator. And you're going to see this also for short-circuit operators like && or ||.
So for logical operations, you're going to see exactly what reasoning the Analyzer performed. And then what happens after this null check is, well, you checked it for null, so I am going to assume that the pointer could be null. Down below, it's dereferenced in the condition of the for loop.
So can this actually occur? The Analyzer is just looking at this function in isolation, alright. This value has been passed in through one of the arguments. The Analyzer does not have any prior assumptions about the value of this pointer, but because the code itself checks it for null, it assumes that it could be null, and therefore it's going to flag an instance later on where you go and dereference it. And so even if none of the callers of this function ever pass in null, this is still going to be flagged. So how should you interpret it? We have 2 responses.
First, the Analyzer is wrong. It is reporting a bug that doesn't actually occur. Well, that's one way to interpret it, but I think there's a better way to look at it. Second, if you're checking for null, why are you checking for null if it never could be null? If it really can't be null, there should probably be an assertion in here, and if it could be null, well, it looks like you have a null dereference. In either case, the code is confused, right? Either it should clearly document its assumption that the pointer can't be null, or it should handle the case where it is.
And so either way, the Analyzer is providing useful feedback, telling you that there's something potentially wrong in your code. And many of the other logic errors that the Analyzer reports are like this. They just follow from looking at the assumptions that the code itself documents, and it's just giving you feedback based on those assumptions. And I want you to keep this in mind as we talk about how you handle false positives later.
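A minimal C sketch of this pattern and the two ways out follows. The `stream_info` type and the function names are hypothetical, not the Ogg Vorbis API; only the shape (a null check followed by an unguarded dereference in a loop condition) mirrors the example.

```c
#include <assert.h>
#include <stddef.h>

struct stream_info { int channels; };

/* Flagged shape: the ternary checks vi against NULL, which tells the
   analyzer vi may be NULL, and then the loop condition dereferences it. */
int total_suspect(const struct stream_info *vi) {
    int total = vi ? 1 : 0;
    for (int i = 0; i < vi->channels; i++)
        total += 1;
    return total;
}

/* Option 1: the pointer can never be NULL, so say so with an assertion
   and drop the misleading check. */
int total_asserted(const struct stream_info *vi) {
    assert(vi != NULL);
    int total = 1;
    for (int i = 0; i < vi->channels; i++)
        total += 1;
    return total;
}

/* Option 2: the pointer really can be NULL, so handle that case. */
int total_checked(const struct stream_info *vi) {
    if (vi == NULL)
        return 0;
    int total = 1;
    for (int i = 0; i < vi->channels; i++)
        total += 1;
    return total;
}
```

Either rewrite removes the contradiction between "this might be null" and "this is safe to dereference" that the original code expresses.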
So, dead stores. What are they? It is a simple kind of dead code check. The basic idea is that you do some kind of computation, you store the result of that computation to a variable, and then you forget about it. So this can easily indicate broken logic in your code, like dead code, or that you're just doing something wrong, or just a missed optimization opportunity. So we'll show you a few examples.
This is something that people tend not to really appreciate. This is a real code example. I've renamed some of the methods here and removed a little bit of the extra logic. But this is a real bug. We see 2 dead store warnings reported by Xcode. And what you see here is that there are these 2 methods that do some work and are basically recording in this bool whether or not something bad happened.
But the problem is that whether or not something bad happened gets completely obliterated on the last line, where we just unconditionally overwrite it to say everything is fine. This is really not what the code intended to do. I mean, it looks like it was written to handle failure and it's actually not. So this is an actual real logic bug. And it's very simple, but it's clearly indicating that the code just isn't doing what it was expected to.
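The overwritten status flag can be sketched as follows. The step functions and their conditions are hypothetical placeholders for the renamed methods on the slide.

```c
#include <stdbool.h>

/* Hypothetical steps that report success or failure. */
bool step_one(int x) { return x >= 0; }
bool step_two(int x) { return x < 100; }

/* Buggy shape: both results are stored and then never read, because
   the last assignment unconditionally overwrites them. */
bool run_steps_buggy(int x) {
    bool ok;
    ok = step_one(x);   /* dead store */
    ok = step_two(x);   /* dead store */
    ok = true;          /* failure information is lost here */
    return ok;
}

/* Fixed shape: each result is combined into the final status. */
bool run_steps_fixed(int x) {
    bool ok = step_one(x);
    ok = step_two(x) && ok;   /* step_two still runs even if step_one failed */
    return ok;
}
```

With these placeholder conditions, `run_steps_buggy(-1)` reports success even though the first step failed, while `run_steps_fixed(-1)` reports failure.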
Here's a real dead store in Growl, and this is a really simple bug. Here we see that the variable second is assigned a string constant. There's a missing break statement, so it just falls through to default, where it then gets overwritten with the value NO.
It's kind of a cosmetic bug, but it's still a real logic issue, and this code is dead. It's not doing anything. And not having a break statement is not a crime in itself. The problem is that the code is just not doing what the developer expected.
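A fall-through of this kind looks roughly like the sketch below. The function names and the string values are hypothetical; the Growl code on the slide differs in detail.

```c
/* Buggy shape: the assignment in case 1 is a dead store, because the
   missing break lets control fall through to default. */
const char *flag_text_buggy(int enabled) {
    const char *text;
    switch (enabled) {
    case 1:
        text = "YES";   /* dead store: overwritten below before any use */
    default:
        text = "NO";
        break;
    }
    return text;
}

/* Fixed shape: adding the break keeps the value assigned in case 1. */
const char *flag_text_fixed(int enabled) {
    const char *text;
    switch (enabled) {
    case 1:
        text = "YES";
        break;
    default:
        text = "NO";
        break;
    }
    return text;
}
```

The buggy version returns "NO" for every input, which is why the first assignment is reported as dead.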
And so this is easily fixed just by adding a break. Here's a dead store in WordPress for iPhone, and this is probably mainly just an optimization opportunity. What you see here is that there's this method called to get the string, and the string is never used. So that means either the string was supposed to be used and it's not being used, so that's badness, or this is not needed anymore. And you might think, well, if it's not needed anymore, the compiler will just optimize this away.
That's actually not the case, because Objective-C is such a dynamic language. The compiler cannot remove these message sends in Objective-C. I mean, this method could do anything. And so the Analyzer is telling you this is either dead code, a logic bug, or a potential optimization opportunity.
And if it's really not doing anything, it's just slowing the app down. Now, the dead store check itself is actually very accurate, but it can produce a lot of warnings for things that you know are dead stores but don't really care about. You can handle that in two ways. You can silence the dead store warning by saying that the variable itself is unused, and for that you can use #pragma unused or you can use the GCC attribute unused, which goes on the variable's declaration.
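As a small sketch of the attribute form, assuming a GCC-compatible compiler: the function and variable names are hypothetical, and the claim that the annotation quiets the dead store report is taken from the talk rather than demonstrated by this fragment.

```c
/* Returns the low byte of raw. The snapshot variable exists only so its
   value can be inspected in a debugger. */
int low_byte(int raw) {
    /* Marking the variable unused tells the tools that stores to it
       are intentional even though nothing reads them. */
    int snapshot __attribute__((unused)) = raw;
    snapshot = raw >> 8;   /* never read afterwards */
    return raw & 0xFF;
}
```

The pragma form, `#pragma unused(snapshot)`, is specific to Apple's toolchain, so the attribute is used here to keep the sketch portable across GCC and Clang.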
Sometimes people introduce dead stores because they're doing some kind of programming for debugging, so that they can run through the debugger and print out values. That's fine. You can tell the Analyzer to be quiet by adding these attributes. And because both of these are standard annotations provided by GCC, you can use them for either iPhone or Mac development without doing anything special. Now, how about suppressing false paths? These are just bogus paths through your code that couldn't really occur.
So consider this example where we are initializing this pointer to null because we're going to go through this loop, and we expect that the loop is going to be executed at least once. Well, the Analyzer doesn't necessarily have that knowledge. So if you run it on this code, it would print out this path where the loop condition evaluated to false, you never went inside the loop, and so you have a null dereference.
The problem here is that the Analyzer doesn't know your assumptions, and the code doesn't reflect them either. You can easily silence this issue by just adding an assertion after the loop saying that I expect that the pointer is not null. Now this has two benefits. One, it shuts the Analyzer up, and two, it documents your assumptions so that other people can look at your code and understand what's going on. And on the off chance that your assumption is wrong, you can now catch it at run time.
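A minimal sketch of that fix in C, with a hypothetical function that expects a non-empty array: the assertion after the loop states the "at least one iteration" assumption that the code otherwise leaves implicit.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the last element of values. The caller is expected to pass
   count >= 1, so the loop body runs at least once. */
int last_value(const int *values, size_t count) {
    const int *last = NULL;
    for (size_t i = 0; i < count; i++)
        last = &values[i];
    /* Documents the assumption and rules out the zero-iteration path. */
    assert(last != NULL && "caller must pass count >= 1");
    return *last;
}
```

Without the assertion, the path where `count` is 0 reaches `*last` with a null pointer, which is the report described above.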
So you're no longer just silently missing an error. Now this is a code fragment that I see over and over again, and it's kind of interesting because it's kind of inefficient and it's also kind of buggy. So if you run the Analyzer over it, it sees all these if/else statements. It doesn't know that tag only returns values between 0 and 2. It actually might not. It doesn't know.
And the code doesn't reflect it either. And so it will actually flag a warning in this case, saying that you use this value uninitialized. This is easily solved by just rewriting the code. We're going to replace the last else if with an else that just has an assertion, and there you're just documenting your assumptions.
You're saying what you expect the code to do. Now, a better way to write this is with a switch statement, which sends the tag message just once. That's much cheaper and much cleaner, and you can just add an assertion in the default case that says this is not possible, and there will be no warning in this case as well. So the moral of the story is that you should always analyze your code with assertions enabled, because assertions document your assumptions and the Analyzer can learn from them.
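The switch rewrite can be sketched in C as below. The `control` struct and `control_tag` function are hypothetical stand-ins for the sender object and its tag message, and the titles are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a UI control with an integer tag. */
struct control { int tag; };

int control_tag(const struct control *c) { return c->tag; }

/* The tag is fetched once, and the default case documents the
   assumption that only tags 0, 1, and 2 are ever used. */
const char *title_for(const struct control *sender) {
    const char *title = NULL;
    switch (control_tag(sender)) {
    case 0: title = "Small";  break;
    case 1: title = "Medium"; break;
    case 2: title = "Large";  break;
    default:
        assert(0 && "tag is expected to be 0, 1, or 2");
        break;
    }
    return title;
}
```

Compared with a chain of `if`/`else if` tests that each call `control_tag` again, this version makes one call and leaves no path on which `title` is read without having been set.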
Assertions are typically disabled when you do a release build. There are these macros that are #defined out when you want to compile your code to run fast. So it means you should analyze your code only in the Debug configuration or something similar to it. Now, if you happen to write your own custom assertion handlers, the Analyzer doesn't necessarily know about these functions. You can easily educate it by using the GCC attribute noreturn or the Clang-specific attribute analyzer_noreturn.
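A custom assertion handler marked with the GCC `noreturn` attribute might look like the sketch below. The names `check_failed`, `CHECK`, and `first_byte` are hypothetical. Clang's `analyzer_noreturn` attribute is placed on the declaration in the same way, for handlers that the Analyzer should treat as not returning.

```c
#include <stdio.h>
#include <stdlib.h>

/* Custom failure handler. The noreturn attribute tells the compiler and
   the analyzer that control never continues past a call to it. */
__attribute__((noreturn))
void check_failed(const char *expr, const char *file, int line) {
    fprintf(stderr, "%s:%d: check failed: %s\n", file, line, expr);
    abort();
}

/* Hypothetical assertion macro built on the handler above. */
#define CHECK(expr) \
    ((expr) ? (void)0 : check_failed(#expr, __FILE__, __LINE__))

/* After CHECK, tools can assume buf is non-NULL on every path. */
int first_byte(const unsigned char *buf) {
    CHECK(buf != NULL);
    return buf[0];
}
```

Without the attribute, a tool has no way to know that the failing branch of `CHECK` stops execution, so it would still explore the path where `buf` is null.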
They're very similar to the attributes I talked about earlier and I'm not going to go into them but they're documented on the Clang website. So the Analyzer is steadily improving. It's brand new in Xcode. We've done a lot of work on it but there's so much more we can do. I want to remind you that it's 100 percent open source. This has just been really great for getting a lot of testing over the Analyzer and feedback from developers like you.
It's really important that if you see the Analyzer doing something dumb, you tell us about it. It's the only way we can fix this. So please tell us about false positives. Now, I look at blogs and I look at Twitter, but the best way to make sure that information gets to us is by filing bug reports.
Also, if you're excited about this feature, please tell us what bugs are important to you. This is the only way we can prioritize to get the maximum yield out of this feature. And you can report bugs using the typical Apple interface, bugreporter.apple.com. And this is also a great place for suggestions on how you want to improve the Xcode workflow and your experience, the Analyzer itself, or just feature suggestions. That's a great way to tell us what you think. You can also file bug reports on the Clang Bugzilla website. This is the open source avenue.
Note that this is out in the open, so you shouldn't mention anything that you would mind being public. But it's also a great way to get that information to the entire open source community working on Clang and the Analyzer. So today, we talked about an exciting new feature in Xcode that's going to allow you to find bugs faster and more easily with the click of a mouse within Xcode, and this is all based on the use of static source code analysis.
You can analyze both iPhone and Mac applications easily with no extra work. We have a tight, integrated workflow within Xcode so that you can analyze your code, fix your bugs, and analyze again to verify that the fixes have actually done what you intended them to do. And lastly, the Analyzer, as I've said, is 100 percent open source.