WWDC09 • Session 504

Assigning Your Application an Identity with Code Signing

Mac • 56:16

Code signing allows Mac OS X to establish and verify your application's identity without user interaction--even after you've updated your application. Find out how digitally signing your application ensures the integrity and security of your code and enables the system to recognize and alert users to unauthorized changes. Learn how signed applications work, how to sign your Mac OS X applications, and how signing improves your customers' experience.

Speaker: Perry "the Cynic" Kiehtreiber

Unlisted on Apple Developer site

Downloads from Apple

SD Video (97.4 MB)

Transcript

This transcript has potential transcription errors. We are working on an improved version.

I'm Perry the Cynic. I am an architect in the OS Security Group. I invented code signing. So I should know. So why do you care? Code signing lets you identify code and recognize code. That's what it really does. A lot of people think it's a security feature and there is security in there, but the primary purpose of code signing is to stamp a reliable identity on a piece of code, Apple's code, your code, anybody else's code, and then have the means to actually recognize whether it's still the same.

And so recognize, filter, and organize groups of code, that's fine. Almost incidentally, by having a way of recognizing code, we solved the software update problem: "Is this really an update for that, or did somebody spoof me? Did somebody screw it up? Is this not really the same?" And the security part is simply there to make sure that the information we get about code identities is reliable, so that somebody can't just fool you easily.

And I don't know any way of saying this really nicely, but you're supposed to sign your code. If you're shipping code onto Mac OS X either way, you're supposed to sign your code. If you don't, the code you're making, the code you're shipping to your customers these days, is not first-class code; you're shipping legacy code whether you know it or not. So you need to sign your code to keep up with it. And I will explain in some detail down the road what will happen to you if you don't.

So identity, this is really about giving an identity to code, to your application. I'm going to use Apple's Mail program because it's such a hapless victim and I like to beat up on it. So Apple's Mail program, that's a code identity. It was made by Apple and it's Apple's Mail, not Apple's iChat, Apple's, you know, Safari. So we define code identities. We secure code identities.

We put a cryptographic seal on them. So if somebody hacks around with the code, we can recognize it and we know that it no longer has the identity that it was supposed to have. We provide a small but useful language, almost like a little programming language, to discuss identity, to form and define identities.

We'll talk about that in a while. And this is really important. We are identifying code as it's running. This isn't just about looking at the hard drive and going, "Oh, look it's an application." It's about a running application and going, "Where did that come from and what is it doing?

Is it still doing what we think it was supposed to do?" Now, just as importantly, what does code signing not do for you? Signing your code, no matter how you sign it, does not get your program privileges it does not already have. You don't get a different user ID.

You can't suddenly read files you couldn't read before, but you can ask system daemons or APIs for stuff, and they may look at your identity and decide whether to give you stuff. Code signing does not protect against bugs, so be very clear about this. It's, again, about identity. If you write a program with a bug that erases the hard drive and you sign it and say, "This is mine. I made it.

I'm proud of this", and you ship it to customers and the customers go, "Yeah, I know Joe Developer. He's a good guy." If they run it and it erases their hard drive, that is not a failure of code signing, that is a failure of your coding skill.

And conversely, from the engineer's point of view, code signing also doesn't automatically protect against placing trust where it shouldn't belong. If instead of downloading your excellent program with the unfortunate bug, they download Hacker Inc.'s hard drive eraser tool and they run it, it will erase the hard drive or do worse things to them, because they trusted somebody's code that they shouldn't have.

Code signing can tell you whether this is really Hacker Inc.'s hard drive destruction tool, but it doesn't tell you inherently whether that's a good thing. So, code signing in a single slide. If you just want to get it over with, I scared you, you know you need to sign your code and you're really sorry you aren't already doing it, what's the minimum you can get away with? Well, you need a digital signing identity, a cryptographic signing identity, in your keychain.

I'll tell you later how to get them. It's really quite straightforward, but you know, I give you the picture of Keychain Access with the-- a signing identity made. Now all you need to do is get that name and stick it into your Xcode project right there where it says code signing identity, who would have thought, and then build.

And that's it, you're done. At least if you have a simple project. I mean, this is really simple. The hardest part is actually setting up the signing identity, which you do exactly once and then you think about it maybe once a year, and other than that, you just run your Xcode builds, Xcode signs the program for you, you ship it, you're happy. If for some reason you don't use Xcode, there is a command line tool called codesign. Same thing: you put that string into the -s argument of codesign, stick that in your makefile or whatever other third-party build tool you've got, you're done.

So, there's really very little excuse for not doing that, except for a couple of reasons we'll talk about that have more to do with how you organize your build process. Code signing is meant to be a pervasive feature of the operating system. It's everywhere. It's meant to be everywhere. We're not just signing Mach-O applications, which is what you'd expect. We're signing libraries. We're signing plug-ins. We're signing bundles containing any of these. We're signing scripts.

And because it's not really all that clear what's a script, the whole thing is extensible, so if you happen to be writing a script interpreter, you have the means to explain to the system that your scripts are actually code and they can be signed. Code signing applies to essentially everything we ship: Leopard, Snow Leopard, and the various versions of the iPhone OS, it's on there. And our long-term goal is to get everything, all code on the system, signed. Now on the iPhone that's already true because we made it so.

We started from scratch so we said, "OK, nothing unsigned runs here." For Mac OS X, it's not true, not as of Leopard or Snow Leopard, but it is a goal. And you'll find that as time passes and we're now going boldly into the future, it's going to become less and less convenient not to be signed. And again because this is really important, primarily, code signing is meant to be a runtime feature.

It's really about the security of running code, your running code, Apple's running code, daemons talking to clients. So the code signing feature is focused on running processes, running scripts, talking to each other, establishing each other's identities, and then making decisions based on that. There is some dynamic state that goes with your running program.

As a matter of fact, one of the more interesting things you can do is say, "I'm not sure I'm valid anymore, just, you know, mark me invalid." You can do that to yourself at any time. This is irrevocable. Once you are marked invalid, you can't ever be valid again as long as you're running.

And there is a means that I'll touch on very briefly called code hosting that allows code to manage other code; like, you know, if you're an interpreter, you want to manage your scripts. And static validation, the thing that looks at the hard drive and decides whether something it sees there is valid, that's just a subfunction.

I mean, we need to do it anyway as part of the implementation. So it's exposed, you can use it. It's useful for things like disk integrity checkers, but it's not the main feature, so keep that in mind. OK, let me just give you a brief show for those who haven't seen it yet. So I have a really simple test program here. It's just a little Cocoa app made straight from the Xcode template. All it does is display a couple of parameters.

Let's just build it the way it is. It displays a couple of code-signing-related parameters, and its one redeeming feature is that it can actually try to access a keychain item. So, yes, we build, thank you. So here's a little CSTest which asks itself, "Gee, I wonder if I'm signed and what my status is," and of course right now it's unsigned. And yeah, well, wake up. Let's make a keychain item for it.

Very creatively called test and you've probably all seen this dialog before, it's the-- do you actually want this program to access this keychain item because we haven't said anything beforehand so the system doesn't know.

So it asks you, you're the user, whether you want the CSTest program to access test, and when you click on Always Allow you get to retrieve the item and the system remembers that you said, "Yes," and so you can get it again, which is all very nice. And one of the things that you've probably figured out by yourself is if you make a change to the program, you know, a corporate lawyer comes along and says, "You really need to put a disclaimer in here. You can't ship that without a disclaimer."

So we rebuild it and we relaunch it and we hit that button again. And here's the dialog again, because the system, in the essence of code signing, has no idea whether this CSTest has anything to do with the previous CSTest. They are 2 different programs; there are different bytes in them. They may have some similarity, like they're both called CSTest and their info.plists sort of look similar, maybe even the same, but any hacker could do this. So we just don't know, we have to ask.

And if you click on the Always Allow button again, then well we'll remember this new one and now we'll have a list of 2 different CSTests that can access the item, and every time you make an update it happens again, and you get dialogs and that's not good, so let's say "No" for a change. So, that's a situation you're in if you haven't started signing your code yet. So let me show you where you go from here.

Keychain Access has this neat little subfunction called the Certificate Assistant, which does certificate-related things. And one of the things it does is it can create an identity for you, a cryptographic identity. It's really quite straightforward, I mean let's name it Demo, why not? And the only thing you really have to set is that you want to use it for code signing instead of, you know, SSL or email or something. And it gives you a nice warning saying that what you're doing here is not the most secure thing in the universe, but for testing purposes, it's fine.

So we've just made a cryptographic identity called Demo, and it's in a keychain. And if you remember the picture from before, all we have to do now is go in here and look at the build parameters for CSTest. And in here, where we have the code signing identity, it actually finds them all for you. So it just looks through all of your keychains and finds all of the eligible signing identities, so let's pick Demo.

That-- that's the same dialog this time because Xcode is trying to sign your program and it's trying to use that keychain item that is your cryptographic identity, and the same logic applies, you know, we haven't done this before. So, if you say "Always Allow," this will be the last time you see that dialog for your signing identity because, you know, Xcode, codesign is trying to sign it, so we build.

And we still have a CSTest here, and if you are wondering how the hell you can find out that this is actually signed, there is codesign -v, the command line command. Like most command line tools, it just doesn't say anything if things are OK; it only yells at you when things look bad. If that bothers you, then you can dial up the verbosity by giving it another v flag, and we're valid, cool. So, what does this mean?

Well let's launch it and now we're valid, of course. But at first nothing much changes, we're trying to get the keychain item, it's a new program, the system goes, "Do you really want this new CSTest to access this keychain item?" And we say "Always Allow" and that's cool. But now let's do that change thing again.

A new lawyer got hired and now we need a strict disclaimer. OK. And we're opening the thing again, and we don't get a dialog. Yay, happy users! I want you to think about what just happened. I mean why is there no dialog. We don't omit the dialog because we set a flag saying don't put up dialogs and nobody cares about security anyway. We did not put up a dialog.

The system did not put up a dialog because we don't need it anymore. Because when we hit the Always Allow button the last time, and it will be the last time for this program if you keep on signing it, when we hit that button this time, the thing remembered in the keychain access control is a code identity, a code-signing-based code identity.

And when you re-signed the program after putting in this strict disclaimer, by signing it you said, "I'm saying this is the same program. I'm the guy who made the previous version, same signing identity, and it's the same program, so I'm telling you it's the same, it's OK, just treat them the same." And the system looks at this and goes, "OK, you're the guy who made the program. If you don't know, who should? So I'll take your word." It's the same program, and so we do not have to put up the dialog.

The new version gets access to the keychain item. So that's basically the idea here. There's a lot more to it, and let me just draw you through a couple of examples. We are not just looking at the main executable. I could show you, you know, opening up a hex editor and futzing around with the program's bytes, and you'll see that it goes invalid. But to get to the point, if you go into the bundle and you try to add some kind of interesting new resource, you see that suddenly the program has lost its validity, because we're actually inventorying all of the resources that belong to the bundle.

And we keep a memory of what they were, which ones were there, and where they belong. If you remove a resource, you add a resource, you hack around with the contents, code signing will figure it out. And as a matter of fact, if you remove it again, it goes back to being valid because we know it's still OK. Alright, enough demo. I don't have time for more demos, sadly.

[ Pause ]

OK, so let's look a little bit in more detail on what happened when Xcode built that program after you said, "Here's my signing identity." You've got your program and typically in addition to the main executable, you've got info.plists and you've got resources. And if you did what we told you about doing privileged operations, you'll probably have a Helper tool in there that does all of the dangerous stuff and it's really small and really tight and you really read every line of code in it. And by the way, if you are writing a tool, then you'll just have the executable, but everything else still works the same.

OK. So we'll take those files and we'll give them an identifier. Every code signature has a string identifier in it. Typically, we take that from the info.plist, so it will be something like com.yourcompany.mysuperprogram. So we set the identifier and then we put a cryptographic seal on it; that's simply there so we can detect if something changed.

And we take the code signing identity from a keychain and we put a digital signature on the seal. So that, together, makes sure that somebody can't change the program, recalculate the seal, and just go "See, nothing changed," because you're the only one who's supposed to have this signing identity.

You don't give it to anybody else. And so anybody who isn't you can't actually recreate that signature. And the identity actually gets embedded in the signature so we can talk about it later. So anytime you see a signed binary, you can figure out what digital identity was used to sign it.
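(A quick aside, not from the session: if you want to read that embedded information back out yourself, the Snow Leopard client API exposes it; a minimal sketch, with a made-up application path:)

    #include <Security/Security.h>
    #include <stdio.h>

    /* Sketch: read the identifier and the signing certificate chain out of a
       signed bundle. The path is a made-up example; error handling is minimal. */
    static void show_signing_info(void)
    {
        CFURLRef url = CFURLCreateWithFileSystemPath(NULL,
            CFSTR("/Applications/MyApp.app"), kCFURLPOSIXPathStyle, true);
        SecStaticCodeRef code = NULL;
        if (SecStaticCodeCreateWithPath(url, kSecCSDefaultFlags, &code) == errSecSuccess) {
            CFDictionaryRef info = NULL;
            if (SecCodeCopySigningInformation(code, kSecCSSigningInformation, &info) == errSecSuccess) {
                CFStringRef identifier = (CFStringRef)CFDictionaryGetValue(info, kSecCodeInfoIdentifier);
                CFArrayRef certs = (CFArrayRef)CFDictionaryGetValue(info, kSecCodeInfoCertificates);
                if (identifier) CFShow(identifier);   /* e.g. com.yourcompany.mysuperprogram */
                if (certs) printf("certificates in chain: %ld\n", (long)CFArrayGetCount(certs));
                CFRelease(info);
            }
            CFRelease(code);
        }
        CFRelease(url);
    }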

So these things we add, the identifier, the seal, the signature with the embedded identity, we call that a code signature. If you're wondering what gets added to your program when you sign, that's it, all of these things. And if you're wondering why we didn't include this Helper code, that's because the logic of code signing says that Helpers get signed separately. This is important.

So if you have Helpers, you need to sign them in a separate signing step. Of course, you know, if you set the build variables in Xcode, it'll just happen for you. OK, so now that we've built it, what do we do with it, how does this work? So you build your program just like you always do, and Xcode, as a last step, signs it because it sees that build variable.

If you're using some other build tool, because you wrote shell scripts or shell commands that do it, you take that signing identity from the keychain, feed it into codesign or Xcode, and what comes out is a modified version of your program with that signature added. Remember the thing in the box, we add all of that to your program. If it's a Mach-O program, it actually goes directly into the Mach-O executable. And that's what you ship. You send that out to the user. You put it on a DVD.

You put it on your website. You have some intricate binary patching protocol, or you just tell users to drag and install from a USB drive. We do not care how you do this. Seriously, the dotted line here doesn't matter to us. What matters to us is that the signed code that ends up on the end-user system is exactly what you started with. It's the thing that got signed. And we mean exactly: same bytes, same files, no files missing, no files added, no special markers or anything.

As long as it's the same files with the same contents, we're happy.

That same code gets run. Remember, this is a runtime feature. And somebody eventually cares about whether that program is actually valid and what its identity is, and so they point at it and feed it into the validation API, which is public API now. And up pops essentially a "Yes, it's fine" or "Here is why it's not fine".

That's the basic validation logic of code signing. Except of course up to here all we've validated is that the program that we fed into the API is well-signed. If somebody replaced your program with somebody else's program that is well-signed, we couldn't tell the difference. So there is one more input to the validation API and that is what we call a Code Requirement.

It's essentially a set of conditions that are placed on the program in addition to being well-signed. If there is something wrong with the signature, if this has been edited, we always fail the validation. It's an AND game: everything must be right for things to proceed. If anything is wrong we stop and yell, "It's no good."
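(Another aside: a rough sketch of what feeding a code requirement into the validation API looks like with the Snow Leopard Security framework, using the "signed by Apple and called com.apple.mail" condition that comes up later in the talk; the path and flags here are assumptions, not something shown in the session:)

    #include <Security/Security.h>

    /* Sketch: validate a bundle on disk AND check it against a requirement.
       Returns errSecSuccess only if the signature is intact and the requirement holds. */
    static OSStatus check_is_apples_mail(void)
    {
        CFURLRef url = CFURLCreateWithFileSystemPath(NULL,
            CFSTR("/Applications/Mail.app"), kCFURLPOSIXPathStyle, true);
        SecStaticCodeRef code = NULL;
        SecRequirementRef req = NULL;
        OSStatus status = SecStaticCodeCreateWithPath(url, kSecCSDefaultFlags, &code);
        if (status == errSecSuccess)
            status = SecRequirementCreateWithString(
                CFSTR("identifier \"com.apple.mail\" and anchor apple"),
                kSecCSDefaultFlags, &req);
        if (status == errSecSuccess)
            status = SecStaticCodeCheckValidity(code, kSecCSDefaultFlags, req);
        if (req) CFRelease(req);
        if (code) CFRelease(code);
        CFRelease(url);
        return status;   /* anything but errSecSuccess means "stop and yell" */
    }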

So, a code requirement, well, where do you get that from? Well typically, you either have a hard coded configuration that you built into your program or you store that in some kind of database, somewhere, where it depends on, you know, who's doing the validation. OK, so where do you get the code requirements from that you store in the database? As a matter of fact, there is an API for getting that directly from the signed code. We call that the Designated Requirement. We'll talk about it in a little bit. But it's essentially a requirement that answers the question, "Who are you?" OK, code requirements.

Now I've tantalized you and titillated you with what that is. So, it's a little programming language of sorts that lets you write conditions about a piece of signed code. You can place conditions on the code itself, typically things like, you know, is there a particular key in the info.plist, or you can place conditions on the actual cryptographic signature that secures it, like, you know, was signed by Apple.

Yeah, that's a popular one, or was signed by me. That's probably a popular one for you. Or you can get a lot more elaborate than that. Code requirements essentially define code identity, because it's the requirement that you're feeding into the validation that determines what code you're looking for. Is it Apple's Mail? Is it your hard disk eraser?

Is it anything written by your company? You can express code requirements either in a text or binary form. There's API for converting between the two. The text form is there so you can show it to a user, not any user, but a user who knows what he's doing, of course. They're also good for editing.

If you need to take an existing code requirement and make a change to it, put it in text form, you just change the text and then you convert it back to binary. The binary form is there so you can store it in a database of your choice. And it's just a binary blob. It's a self-contained binary blob with no pointers or funny business. So, you know, store it in a file, store it in a database, store it in an XML file as binary, we don't care. As long as you get the binary back with all the bytes intact, that's fine with us.
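(A small sketch of that conversion API as it exists in the Snow Leopard Security framework; the requirement text itself is just an example:)

    #include <Security/Security.h>

    /* Sketch: compile a requirement from text, get the storable binary blob,
       and turn a stored blob back into text for display or editing. */
    static void requirement_round_trip(void)
    {
        SecRequirementRef req = NULL;
        CFDataRef blob = NULL;          /* this is what you would store somewhere */
        CFStringRef text = NULL;

        if (SecRequirementCreateWithString(
                CFSTR("anchor apple and identifier \"com.apple.mail\""),
                kSecCSDefaultFlags, &req) != errSecSuccess)
            return;
        SecRequirementCopyData(req, kSecCSDefaultFlags, &blob);       /* binary form */
        CFRelease(req); req = NULL;

        /* ... later, read the blob back and reconstruct the requirement ... */
        if (blob && SecRequirementCreateWithData(blob, kSecCSDefaultFlags, &req) == errSecSuccess) {
            SecRequirementCopyString(req, kSecCSDefaultFlags, &text); /* text form again */
            if (text) { CFShow(text); CFRelease(text); }
            CFRelease(req);
        }
        if (blob) CFRelease(blob);
    }

Whatever comes out of SecRequirementCopyData is the self-contained blob being described here; it can go into any storage you like, as long as it comes back byte for byte.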

And there's this specific kind of code requirement that actually gets embedded in the code signature. We call those internal requirements. They are conditions that a program itself places on other programs it wants to interact with.

Like, for example, these are the rules for which libraries I'm willing to be linked against, or if you happen to be a script, this is the kind of interpreter I'm willing to let run me, 'cause obviously if a hacked interpreter runs your script, it can make your script do anything it wants. And then there are these very special requirements called Designated Requirements. There's an API that you can point at any signed piece of code and essentially ask, "Who are you?" What you get back is a code requirement.

And it's the way for the application or library or anything else, any piece of code, to say, "If you see another piece of code and you're ever wondering whether it's me, this is what you check for." So for example, if you point at Apple's Mail, there we go again, it will send you back through this API a code requirement that says, "I was signed by Apple and my name is com.apple.mail", which, if you think about it, is the best definition of Apple's Mail you can come up with, 'cause it's made by Apple and Apple said it's Mail; it's not Safari, it's, you know, not iChat, it's Mail. Most of the time, that is completely automatic. The API, if there's nothing special embedded in the code, will simply make one up for you.

It'll make up pretty much the right one for you, a combination of who signed it and what it's called. If for some reason you wanna get fancy, you can explicitly put a Designated Requirement into the code when you sign it and make it say anything you want, which is occasionally useful if you know what you're doing.
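(Sketch of that "who are you?" call, which as far as I can tell is SecCodeCopyDesignatedRequirement, pointed here at Mail's usual install path as an assumed example:)

    #include <Security/Security.h>

    /* Sketch: ask a signed bundle "who are you?" and print its designated requirement. */
    static void print_designated_requirement(void)
    {
        CFURLRef url = CFURLCreateWithFileSystemPath(NULL,
            CFSTR("/Applications/Mail.app"), kCFURLPOSIXPathStyle, true);
        SecStaticCodeRef code = NULL;
        SecRequirementRef dr = NULL;
        CFStringRef text = NULL;

        if (SecStaticCodeCreateWithPath(url, kSecCSDefaultFlags, &code) == errSecSuccess &&
            SecCodeCopyDesignatedRequirement(code, kSecCSDefaultFlags, &dr) == errSecSuccess &&
            SecRequirementCopyString(dr, kSecCSDefaultFlags, &text) == errSecSuccess) {
            CFShow(text);   /* something along the lines of: identifier "com.apple.mail" and anchor apple */
        }
        if (text) CFRelease(text);
        if (dr) CFRelease(dr);
        if (code) CFRelease(code);
        CFRelease(url);
    }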

So the Designated Requirement is the definition of what an application's "myself" means. This is what the application says is the meaning of "itself". And of course, since we want software updates to maintain the identity of a piece of code, to be the same piece of code just better, with fewer bugs or at least different ones, we expect that a software update satisfies the designated requirement of the thing it's updating.

So almost incidentally, we're defining what a valid software update means. OK. So let's go back over this, because it's really the most important concept in code signing, and simply do it again. Identity, code identity, is defined by requirements. It's not defined by the code signature itself.

If you find yourself seeing things like "And then I'll compare this code signature to something else", you're thinking wrong. What you do is you are validating the code signature against a requirement. That's what you do with code signatures. The code identity is not usually a single piece of code. It's a class of code.

It's all possible pieces of signed code, well-signed code, that satisfy a particular requirement. All the code that, you know, was made by Apple and is just called Mail, for example. In particular, the designated requirement of a piece of code is typically an entire class of code, because you want to capture not just the program you're looking at right now, you want to capture all the updates in the future, and you probably also want to capture all the older versions of that program, because they're all supposed to be treated the same. So the Designated Requirement is the way for a signed application to identify itself to the system, saying, "This is me. If you want to recognize me, this is what you'll use. This is what you remember about me."

And as a matter of fact, this is all you have to remember about an application. If you're ever in a situation where you're writing a daemon or a server or something else that gates access based on code identity and you have decided that, you know, this particular program here, Apple's Mail, is allowed to do this.

Get a keychain item, you know, open inbound network connections, I hope not. Then you want to remember this, you want to put in some database of yours, some configuration somewhere, you know, the one piece of information that allows you to recognize this program again. This is what you do. You take the designated requirement from the code. You stick it in your database. You feed it to the validation API. You're done.
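(A sketch of that workflow as a daemon might implement it, assuming you can learn the client's pid from your IPC layer; SecCodeCopyGuestWithAttributes with kSecGuestAttributePid is, to the best of my knowledge, the intended way to get from a pid to a SecCodeRef:)

    #include <Security/Security.h>

    /* Get a SecCodeRef for a client process by pid (NULL host = ask the system). */
    static SecCodeRef copy_client_code(pid_t pid)
    {
        CFNumberRef pidNum = CFNumberCreate(NULL, kCFNumberIntType, &pid);
        const void *keys[]   = { kSecGuestAttributePid };
        const void *values[] = { pidNum };
        CFDictionaryRef attrs = CFDictionaryCreate(NULL, keys, values, 1,
            &kCFTypeDictionaryKeyCallBacks, &kCFTypeDictionaryValueCallBacks);
        SecCodeRef client = NULL;
        SecCodeCopyGuestWithAttributes(NULL, attrs, kSecCSDefaultFlags, &client);
        CFRelease(attrs);
        CFRelease(pidNum);
        return client;
    }

    /* First time: take the client's designated requirement and keep the binary blob. */
    static CFDataRef copy_requirement_blob(SecCodeRef client)
    {
        SecStaticCodeRef onDisk = NULL;
        SecRequirementRef dr = NULL;
        CFDataRef blob = NULL;
        if (SecCodeCopyStaticCode(client, kSecCSDefaultFlags, &onDisk) == errSecSuccess &&
            SecCodeCopyDesignatedRequirement(onDisk, kSecCSDefaultFlags, &dr) == errSecSuccess)
            SecRequirementCopyData(dr, kSecCSDefaultFlags, &blob);
        if (dr) CFRelease(dr);
        if (onDisk) CFRelease(onDisk);
        return blob;                        /* stick this in your database */
    }

    /* Every later request: does this client still satisfy the remembered requirement? */
    static Boolean client_matches(SecCodeRef client, CFDataRef storedBlob)
    {
        SecRequirementRef req = NULL;
        Boolean ok = false;
        if (SecRequirementCreateWithData(storedBlob, kSecCSDefaultFlags, &req) == errSecSuccess)
            ok = (SecCodeCheckValidity(client, kSecCSDefaultFlags, req) == errSecSuccess);
        if (req) CFRelease(req);
        return ok;
    }

Note that the requirement blob, not a hash of any particular binary, is the one thing stored; any future version that still satisfies it is recognized automatically.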

No other piece of information is required for the code signing part to work. Now, the converse is also true. Let's think about this for a moment. Any one piece of code, as it's running or as it's sitting on the hard drive, tends to satisfy a number of different identities. I mean, obviously Apple's Mail program satisfies the "is Apple's Mail" identity, but it also satisfies the much looser "was made by Apple" identity. That's another code requirement.

You basically just drop the name part and keep "signed by Apple". And of course you can make up more interesting identities that this particular program happens to also satisfy. So just like a single code identity defines an open-ended set of applications that satisfy it, so any one application can satisfy a number of code identities. I already told you that one. In particular, if you're ever tempted to compare 2 applications by fishing out some part of the signature and just comparing them for equality, you're doing it wrong, don't go there.

And just as a little warning marker, the word "identity" is awfully, awfully overloaded because it's such a popular word out there. Cryptographic signing identities, which are the things that you feed to Xcode in the code signing identity field, those are cryptographic identities. You use them to make signatures, but they're not code identities. They're used to make code identities.

I'm sorry for that, but we ran out of words. Well somebody told me yet again that a picture is worth a thousand words, so let's see if we can cram a thousand words in one slide.

Imagine that this entire screen is all the possible well-signed applications you can imagine. And that's really "that you can imagine," not just the ones that exist but the ones that could exist, the ones you make tomorrow, next year, the ones that Joe Hacker makes next year. Some subset of that is the designated requirement of Apple's Mail.

These are all the applications that satisfy the code requirement made by Apple and it is called Mail. That, of course, includes the Mail.app that we just started from. It would be very bad if a program didn't satisfy its own designated requirement. We all hope that it also includes some update to Mail even one that hasn't been made yet, even one that, you know, Apple won't actually master and ship until next year.

It doesn't include, we hope, some hacked version of Mail that some hacker took and changed and then re-signed with their own identity, because, well, it's not made by Apple. But there is a bigger code identity, everything made by Apple, which is a rather large superset of "made by Apple and called Mail". It includes iChat, for example, which clearly is not Mail.

So the system, as you can sort of see, can keep these things apart by simply forming these subsets of applications, which are not enumerated. There's no list of these applications anywhere. There are simply conditions that are being checked as you need them. Now, you can go in the other direction too.

You can make a code identity for Apple's Mail but only Version 3, which just looks in the info.plist, which is fine because the info.plist is secured by the code seal, which is secured by the code signature, so we know that they're not lying. So that's a subset of Apple's Mail.

And you're not restricted to subsets and supersets either. For example, there is a code identity for anybody who's allowed to access your me.com password, what used to be the mac.com password. That's actually what we call an application group. It's a marker in the info.plist that says it's in the group of programs allowed to access the me.com password. And contemporary Mail is in that certainly. iChat is in that. On the other hand, some really old version of Apple's Mail may not be, so these are overlapping but not hierarchical.
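(To make a few of those identities concrete, here is roughly how they might be spelled in the requirement language; the info-dictionary condition is from my own reading of the requirement grammar, not something shown in the session:)

    /* Sketch: a few code identities written as requirement-language text. Any of
       these could be handed to SecRequirementCreateWithString and then used in a
       validation call. The version condition is illustrative only. */
    static const char *kEverythingByApple = "anchor apple";
    static const char *kApplesMail        = "anchor apple and identifier \"com.apple.mail\"";
    static const char *kApplesMailVer3    = "anchor apple and identifier \"com.apple.mail\" "
                                            "and info [CFBundleShortVersionString] = \"3.0\"";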

And you can imagine how from here you can go to anything you like. If you are stepping outside of the space of what Apple can sign, you can make up code identities that include, you know, hacked Mail, if you happen to be the hacker who makes Mail. And if you want, you can have your own code identities overlap Apple's. You could make one for "Apple's Mail or MyMail"; whether that makes practical sense depends on your situation, but the system can do it.

So that's a thousand words of the previous slide. This is kind of hard, 'cause there's a lot of depth to this feature, but I only have about 2 or 3 slides' worth to talk about it. The system knows how to deal with code that supervises or controls or manages other code. In the trivial case where you're just writing a Mach-O binary, that's just the kernel managing your program; it's not really very interesting. But let's say you still have a PowerPC application sitting around somewhere, you know, one of those old games that they never update.

What really happens when it runs is that there is Rosetta, which we all love, which is really the Unix process that's being run, and Rosetta, in turn, controls the PowerPC binary by translating it incrementally. And so the system, just implicitly, without you having to notice, builds this chain of code being responsible for other code, and the chain can get longer.

If you're calling the client APIs, you don't even have to know. That also applies to scripts, things like, you know, script interpreters like Ruby should be doing this. Sadly, Ruby right now doesn't do it on our system. That's our fault or somebody else's, whoever you think should be doing that work.

But Ruby should be, and Python should be, and many of the other interpreters should be calling API to declare themselves code hosts. And why do you care about this? Well, for one, of course, if your program, the thing that you ship, is an interpreter or a code manager, then this is your program and you should be reading up on the hosting, the code hosting APIs.

It's not that terrifyingly complicated. In the simplest case, if you're a simple script interpreter, then there are really only 2 API calls you need to make. But the more important reason why you care is because this could be your customer script. If you are actually selling interpreters, then your customers to a large extent are the people who write scripts that your interpreter runs.

And these things, being code, need to be code signed, and in order for the system to sort this all out and know that myscript is actually a separate piece of code rather than just the Ruby interpreter, you need to explain it to the system. So that's why you care, if you're an interpreter or if you're a code manager. If you run applets, scriptlets, scripts, or anything else that you honestly think of as code, you need to dig down into code hosting and get up to speed.

That's all I have time to talk about. OK, so what happens if you completely ignore me and just keep shipping unsigned code? Well, you know the answer on the phone. It will politely tell you that, no, sorry, it can't do that. But Mac OS X will still run it, because we're tolerant and besides, we don't like knifing our developers in the back... for a while.

[Laughter] So unsigned code for the most part will still run. The kernel will still execute it. You know, the windows will come up. You know, everything will look more or less OK and unchanged, which, of course, leads to a lot of developers going, "See, I don't need this stuff, you know, nothing changed," yeah, except, of course, when you hit a subsystem that actually cares about code signing.

If you hit the keychain, if you actually try to create or fetch a keychain item, then the system does suddenly care and it will say things like "The authenticity of your program cannot be verified", which is a very politically correct way of saying, "I don't know who this guy is and I don't trust him".

If you are accepting inbound network connections and the user happens to have turned on the application Firewall, then the system cares and you may get a dialogue that you wouldn't otherwise see. If you're running under parental controls, if the user is a managed account, then it definitely cares because we're using code signing to decide whether a particular program is allowed to run.

So depending which subsystem you happen to be touching on, you'll see different kinds of behaviors, anything from, on the one end, "Yeah, we don't care, we'll just fake it", to, on the other end, "I'm not going to deal with you and you can't have it". If you're writing a debugger, I know, who does, but if you were writing a debugger or performance tool or something along those lines, you will find that some of the system calls just won't work for you anymore unless you're code signed in a particular way. And you get everything in between. The keychain actually bends over backwards to still sort of kind of try to work with unsigned programs, which is kind of a cheat, because all we did is we left the Tiger code in there.

So, seriously, if you're unsigned and you work with the keychain, you are running Tiger code. How much testing do you think that's getting these days? So if you want to be first class code, if you want to run with the big boys, sign your code. And, of course, you know, you're shipping updates, the system goes "I don't know who that is", so software updates won't be recognized by any of these subsystems. Parental controls will go, "I don't know what that is". The Firewall will go, "Are you sure you want this?" Keychain will say, "Do you want to allow access?"

Dialogs, dialogs. Everybody hates dialogs. Now, here's a flip side to this. You did everything right. You signed your code. You shipped it. You were really careful with your software updates, but something went wrong, usually in your update mechanism. And now the signature's broken. So what happens then? What happens to your poor defenseless code?

Unsigned code is considered to have no reliable identity. It says it's your program, but who knows. Code that looks signed but has a broken signature is considered to have no identity at all, not even a tentative one. Anytime the system asks who this is, the answer is, "We don't know, mystery, anonymous, don't know." So that means that anytime you're making an API call that has access controls based on who you are, you just won't get it.

And that's considerably more strict than for unsigned code because you're not on the legacy path anymore, you're on a "something went wrong here" path. In particular, you won't be able to get access to keychain items. The system will not even put up a dialog. There is, by default, no dialog that says this application of yours that has a broken signature wants to get a keychain item, do you want to allow this anyway.

You-- the call just fails because clearly there's something wrong with your program and we don't want your program to get anywhere near its secrets if clearly its identity has been messed up. Something that's not so obvious is if your-- if you've lost your identity, you also can't create keychain items, why is that?

Well, there are access control lists attached to each keychain item that say who's allowed to access it, which applications. So when we make a new keychain item, by default we put in an access control list entry that says the creator, the application that made the creation call, is allowed to have access to this item. And what do you think is in there? A code identity. If you don't have an identity, we can't make that ACL entry, and so it fails. Once a code signature is broken, it's broken. The system does not try to fix code signatures. This is really important.

We can't, because we have no idea what went wrong. It could just be that, you know, you got a file wrong; it could be that some overzealous disk cleaning program went over and removed one file too many. It could be that, you know, you've got a virus running around or some worm hacking around in your programs, making your user really unhappy. We can't distinguish these.

All we know is, you know, something is wrong with your program, so the situation will simply stay that way. And if the thing that's wrong is a static problem, something on the hard drive got messed up, you know, the user needs to recover by restoring a backup or by reinstalling your application. You know, do something that makes the situation on disk OK again.

If the problem is that the program got dynamically invalidated, remember I told you, you can always say "make me invalid right now" and it sticks; you do that because you're about to do something that makes you untrustworthy. So, if that happens, then the program will stay invalid until it quits, so you can tell the user to quit it, relaunch it, and not do that thing again, whatever that was that caused the invalidity.

Now that was all really abstract, so let me give you a couple of practical examples. These are all Apple applications, 'cause, you know, we started using this first, but there's nothing Apple-magic about it. Remember the picture I showed you earlier on how this all flows together, starting with you building the application and shipping it?

The part that we're interested in is where the signed code actually gets run and then we feed a code requirement to the validation API, and the code requirement needs to be stored in some database or configuration somewhere. It's really the nature of a code requirement and the way it gets stored that's different in the different applications. The rest of the machinery is very standard. There's very little change there.

You've got a couple of flags you can pass in for making things faster or slower. But it's the code requirement and how it gets stored that distinguishes the interesting cases. Let's start with the application Firewall, which was new in Leopard. The application Firewall is trying to restrict who can accept inbound network connections, and remember, it's off by default these days, so if you want to test this out, you need to turn it on. Its policy is to look at the code identity of the caller, to accept certain system-trusted code signatures as is, without a dialog, and anybody else who asks for inbound network connections gets a dialog, the one I showed here on the right.

And if the user clicks on the Allow button, we remember the Designated Requirement of the caller so next time we can recognize it. The application Firewall stores these code requirements in a system database that's secured just by root privileges, because this is a per-system facility.

Keychains: keychain access control lists, I've already told you, contain a list of applications that are allowed to access a particular keychain item, and this is the dialog that you get when you're not on the list. The policy that the keychain machinery uses is completely explicit. Whatever is in that list, that's what we allow.

And each of the application entries in the access control list is a code requirement; that's how we remember it, that's how the keychain machinery remembers applications. So in this particular case, there is no system database. There is no global file anywhere in the system that says, "You know, Mail is allowed to access your me.com password."

The code requirements are stored right there with the keychain item, in the keychain, and secured by the keychain itself, of course, and the verification happens whenever somebody asks for access to a keychain item. There is a new feature in Snow Leopard called the Service Manager, which is kind of an interesting example of how easily you can get certain things done with code signing that were kind of almost impossible to get right before.

When a program has a privileged Helper, you know, we keep telling you, factor out your privileged operations, put them in the little Helper, and yeah, there's sample code, there's many lines of sample code, but the problem keeps coming up: OK, you factored your program, you have a little Helper now. How do you install the Helper? We tell you to use launchd to launch the Helper, which means you need to install a launchd configuration and you need the privileges for that, so we're back to needing privileges for installation.

Well, the Service Manager is trying to make it a lot easier in Snow Leopard; it doesn't exist in Leopard. What it does is: you've got your application and you've got the Helper that you've factored out. And instead of using things like authorization or UID checks or whatever it is that you'd normally use to authorize the installation of the Helper in launchd and then the launching of the Helper, what happens is the application's info.plist simply contains the code requirement for the Helper, identifying the Helper, and the Helper gets an info.plist that identifies the application, so that the 2 essentially point at each other and say, "This one's mine. I'm happy to work with this guy".

And remember that info.plists are secured by the code signatures. So this is, you know, the classic worm biting its tail. It's secure because they are both referring to each other and securing each other's identities this way. So again, there's no global database anywhere that says which Helpers belong to which application.

It's distributed in the applications and the Helpers, in their info.plists, and what code signing does is secure these relationships. So it's a pretty straightforward application of code signing, but it's an interesting one nonetheless. And of course, it's API now, and you can come up with your own ideas.
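(For reference, the call behind this Snow Leopard feature is SMJobBless in ServiceManagement.framework; the helper label, bundle location, and Info.plist key names in the comments below are from my own reading of that API rather than the session, so treat this as a sketch:)

    #include <ServiceManagement/ServiceManagement.h>
    #include <Security/Authorization.h>

    /* Sketch: install ("bless") a privileged helper that lives at
       Contents/Library/LaunchServices/com.example.helper inside the app bundle.
       The app's Info.plist names the helper and its code requirement under
       SMPrivilegedExecutables; the helper's Info.plist names its allowed callers
       under SMAuthorizedClients. Code signing secures both plists. */
    static Boolean install_helper(void)
    {
        AuthorizationItem item = { kSMRightBlessPrivilegedHelper, 0, NULL, 0 };
        AuthorizationRights rights = { 1, &item };
        AuthorizationRef auth = NULL;
        if (AuthorizationCreate(&rights, kAuthorizationEmptyEnvironment,
                kAuthorizationFlagInteractionAllowed | kAuthorizationFlagExtendRights,
                &auth) != errAuthorizationSuccess)
            return false;

        CFErrorRef error = NULL;
        Boolean ok = SMJobBless(kSMDomainSystemLaunchd, CFSTR("com.example.helper"),
                                auth, &error);
        if (!ok && error) CFRelease(error);
        AuthorizationFree(auth, kAuthorizationFlagDefaults);
        return ok;
    }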

Let me just go through a couple of points that need to be said. The cryptographic signing identities in your keychains have an industry-standard format. It's, you know, RFC 2459 and the dozens of RFCs that amend and modify and make changes to it. You can make your own. I showed you how to do that in the simplest case with the Certificate Assistant.

The Certificate Assistant can also make interesting CAs that can actually build certificate chains for you, so instead of 2 minutes, that takes 10 minutes, but it can be done. If you'd rather, you can buy your signing identities from a commercial CA; that's fine too, we don't care. We don't tell you to do it. We don't tell you not to do it. And of course on the iPhone, you get one as part of the paid iPhone Developer Program.

No matter how you get them, you store them in keychains. You don't get a choice there; there is no "But what if I don't want to use keychains," you do. If you got it from somebody else, say in some other form, it's probably in PKCS #12, which is the industry-standard exchange format for cryptographic identities, so just import that into a keychain. You may want to make a separate keychain and put it just in there if you're paranoid, but keychains it is.

No choice there. There is a particular mode of signing that you may run into called ad-hoc signing. In some situations, a subsystem just can't deal with unsigned code anymore; the application Firewall and parental controls are examples. And what they do when they run into unsigned code like yours is, they simply, right there on the spot, the first time they see it, apply a signature that's called an ad-hoc signature, and no, that's not a good thing. That's actually a pretty bad thing. It's sort of a last-ditch emergency measure, because otherwise we'd just have to kick you out, and we're just too nice for that.

So, an ad-hoc signature doesn't know who signed the stuff. There's no cryptographic identity to secure this, so it does the next best thing and just takes the narrowest possible code identity, the one that only satisfies this very one program that's sitting there on disk that we're staring at right now, and it forms an identity for just that, which is good enough to get, say, the Firewall going, but, well, there's a couple of problems with it. Obviously, we can't track updates, because if you change it, it's different; we have no idea that it's meant to be an update.

But more to the point, this changes your program. We've had some game developers who thought they were being really, really paranoid, applying their own integrity checks on their programs and the first time they hit the application Firewall, they failed their own integrity check because we signed them and that changed it and that failed their integrity check.

So ad-hoc signing is not your friend. It's not even your buddy. Also, ad-hoc signing is architecture specific. It's whichever architecture happens to be running at the time.

If you run on a 32-bit laptop, say, it gets signed there, and then you transfer the program or you migrate to a 64-bit machine, and the ad-hoc signature will not be the same. So: emergency measure. OK, I don't have time to talk very much about the API.

It's in Snow Leopard. You've got your seed. The header files have HeaderDoc in them, and that's pretty much the primary documentation, in addition to the stuff on the developer website. So read around and play around with it. It's, you know, I think it's great fun.

The API is new in Snow Leopard. It wasn't public in Leopard. There's a client API that essentially lets you validate identities, build references to running code and static code, and manipulate code requirements. There is a separate hosting API that you only need to touch if you are writing a code host, an interpreter, a code manager, something running applets of some sort, you know.

There is no API, no public API, for actually signing code yet. So as of Snow Leopard, you still get to run the codesign command if you need to programmatically sign code. The underlying API objects are CoreFoundation objects, and I mean that, they actually are. They don't just look alike. They don't just smell alike.

They actually are CoreFoundation objects, so you can do anything with them that you can do with a CF-anything, you know, CFRetain it, CFRelease it, put it in a CFDictionary, you know, all that stuff. There are SecCodeRefs that refer to running code, and SecStaticCodeRefs referring to code on disk.

When you do that, you want to think about whether that's really what you want. And there are SecRequirementRefs for the internal representation of code requirements. These objects are local; they are in your address space. They refer to things out there in the system. So if you have, like, a SecCodeRef, it refers to a particular piece of code out there, a process running, a script being run by some other program.

But if you have 3 processes doing validation on the same piece of code, they each get their own separate SecCodeRefs. It's an important architectural feature of code signing that validation happens in the client's space. So whoever is interested in the identity of a program and its integrity gets to do the validation. There's no server or daemon out there that keeps that for you and just tells you it's blessed, it's good, trust me. Everybody gets to do their own validation.
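(One last sketch: the dynamic side of that, a process validating itself through its own local SecCodeRef, roughly the kind of self-check the CSTest demo from earlier performs:)

    #include <Security/Security.h>
    #include <stdio.h>

    /* Sketch: ask "am I signed, and am I currently valid?" from inside a process. */
    static void check_myself(void)
    {
        SecCodeRef me = NULL;
        if (SecCodeCopySelf(kSecCSDefaultFlags, &me) != errSecSuccess)
            return;
        OSStatus status = SecCodeCheckValidity(me, kSecCSDefaultFlags, NULL);
        if (status == errSecSuccess)
            printf("signed and valid\n");
        else if (status == errSecCSUnsigned)
            printf("not signed at all\n");
        else
            printf("signed but not valid (error %d)\n", (int)status);
        CFRelease(me);
    }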

This is the top list of things that go wrong in practice. There's a much longer list of what could go wrong, but this is what people actually run into often enough to keep asking about it on the mailing list. So I'll just try to save you the trouble and tell you.

Code does not go into the Resources directory. I know that Xcode makes it really easy to stick stuff into the Resources directory of a bundle; just don't do it. The Resources directory contains resources; code, Helpers, don't go there. Put them in the MacOS directory, or put them in the Contents directory, or put them in a Helpers directory, that's a convention people out there are trying to get started.

Code signing does not care, as long as it's not in the Resources directory. Don't change your code after you sign it. I mean, that seems pretty obvious, but there are many ways you can accidentally do it anyway. If your workflow calls for building the program and then sending it over to some different department to master it, then signing as part of the building is probably not what you want.

You want to build-- you want to sign after mastering because mastering tends to change the files in your bundle. If you are outsourcing localization, keep in mind that you need to have the full bundle when you sign it, all the localizations that you want to be valid. So you can't sign a program, ship it, and then have somebody else provide localizations later that weren't there before. Localizations can be removed after signing. There is an understanding in the machinery by default that it's OK to remove localization files but not add them.

If you're having one of those nice installers with the, you know, special options to say which parts of the bundle you want to install and which ones not, that's probably not going to work because the code signature is an all or nothing thing. Either all the files are here or not. So optional installs.

You're probably better off restructuring to put them into separate bundles. If you absolutely can't, you do have access to what we call resource specification lists, which are essentially lists of regular expressions determining which resources are really resources and which ones should be ignored. That lets you point at particular resource files and say those are optional, just like the localizations.

If you think it's really cool to make up a bundle with symbolic links pointing somewhere else, "Oh look I made a bundle from pieces," don't, don't go there, bad idea. But of course if your code is self-modifying, you know, you're not even getting to the starting block here, self-modifying code is just that. More information.

Craig Keithley, the Omnibus Technology Evangelist, he's over there. He will be on the stage in a moment. He would love to answer more questions. The man pages for code signing are pretty good, probably not complete, but if you don't find something there that you think should be there, file a Radar.

There is a little utility called csreq that lets you manipulate code signing requirements in files, convert them between text and binary and back, and verify them, stuff like that. And there is an increasing set of documentation on the Apple developer website. For the API reference, you still need to log in as a registered developer because it's a Snow Leopard feature. The other ones are already publicly there.