Essentials • 1:11:41
Code signing in Mac OS X allows the Keychain and other operating system features to verify your application's ownership without prompting your users--even after you've updated your application. Find out how digitally signing your application ensures the integrity of your code and enables the system to recognize and alert users to unauthorized changes. Learn how to sign your applications, how signed applications work and how signing improves security and your customers' experience.
Speaker: Perry "the Cynic" Kiehtreiber
Unlisted on Apple Developer site
Transcript
This transcript was generated using Whisper, it has known transcription errors. We are working on an improved version.
Welcome. I am Perry 'the Cynic'. I am the senior architect in the data security group. I designed code signing and implemented it with a lot of help from my friends. So I'm yet again here to tell you about code signing in Mac OS X and why it's good for you.
That's me. So I'll tell you about what it is. I'll tell you enough about how it works so you get an idea. I'll tell you how to work with code signatures, how to make them, how to verify them. I'll tell you about the API, which has just become public, so you too can now call code signing functions in Snow Leopard. I'll tell you what you need to do to get your program signed and I'll tell you what to look out for so you don't get into trouble.
Code signing helps seal a program against modifications. So it's first a way to figure out whether something's changed. It's a way of defining what it means for code to have identity. Like what it means to be Apple's Mail.app versus an imposter Mail.app or, heaven forfend, your Mail.app. It can verify things on the hard drive--it's called static validation--and it can verify things that are running in the system.
We call that dynamic validation, and it's really what code signing is about. And finally, to get all of that working, almost as a side effect, we'll provide a way of formally expressing constraints on code--conditions you can impose on code that it needs to satisfy in order to be allowed to do things.
And what does it not do? That's almost as important. Just like a lot of the other security services like authorization, signing a program does not get you automatically any kind of process privileges that you don't already have. What it does is help other programs in the system establish whether you're really you and whether you're an acceptable candidate for getting services.
Code signing does not protect by itself against bugs. Because it's about identity. It's not about behavior. If you write a program, and you put a bug in that erases a hard drive, and you sign it, and you send it to the user, and the user runs it, it will erase the hard drive. And that's not a problem with code signing, because nothing in there promised that it wouldn't erase your hard drive.
And from the end user's point of view, code signing does not protect the user against trusting the wrong guy. If the user says, hard drive eraser dot app sounds great, and I'll run it, then it will go and erase the hard drive. Because that's what the user said he wants. And despite what you may have heard recently, this is really about what the user wants.
Code signing is meant to be an integral foundational part of Mac OS. And so it's in Mac OS X Leopard and Snow Leopard and it's on the phone. It's really pretty much the same code signing technology. Some of the details are different, some of the emphasis is different, but it's the same technology. It works pretty much the same on the two platforms. It obviously applies to programs, Mac OS processes, but it can also apply to libraries and frameworks and scripts and plugins. It's designed to be able to apply to anything that's runnable code on the system.
And, well, it does. Static and dynamic validation. The goal going down the road is a system where everything is signed. That's already true on the phone because the phone enforces it. Code just won't run on the phone unless it's signed. It's not true in Mac OS X as of Leopard, or for that matter, Snow Leopard, but it's a goal we're working towards. That's where we need your help because, you know, your code wants to run on the system too. So in the end, down the road, it will need to be signed.
So what are we using it for? I mean, obviously now that we have APIs, you can use it for anything you like. But if you just install Leopard or Snow Leopard, what does it use it for? Anytime you use Keychain services, the code signing logic is used to identify programs that ask for Keychain services.
So when an access control list says, "Only Apple's mail program is allowed to get your mail password," Apple's mail program is really a code signing constraint that gets evaluated by the security server at runtime to decide whether the program making the request is allowed to get access to that item.
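To make that concrete, here's a sketch of what such a constraint looks like in the requirement language--the exact text Apple uses may differ, but the shape is this:

    identifier "com.apple.mail" and anchor apple

That is: the program calls itself com.apple.mail, and its signature chains up to Apple's anchor certificate.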
For that matter, any time there's a dialogue that comes up, one of those security dialogues that says the program so-and-so wants to do something--your program, some other program, like for example in an authorization dialogue--the name of the program that's showing up here is determined by code signing. So this is all about program identity, about code identity, and obviously anywhere in the system where we care about code identity, code signing is used.
The extreme case, of course, is parental controls, which cares about which code is allowed to run at all if you're a restricted user. And as of Leopard, parental controls and MCX are using code signing to decide whether something is allowed to run at all. So here's an extreme case of if code signing doesn't work for a program, then a restricted user just won't be able to run it at all.
The application firewall is using code signatures to keep track of the identity of programs. Once you've answered a dialogue and said it's okay for that program to allow inbound network connections, the firewall remembers the identity of that program so the next time you don't get asked again. There's actually a lot of situations where it works like this: you point at a program and you say, "This program from now on." And code signatures are used to make sure that the next time the program comes and asks, it's still the same program.
And then there's a couple of little things like what we call Developer Tools Access. For example, the task_for_pid() system call, which is extremely powerful but kind of useful if you write debuggers or performance tools. Going down the road, you need to be code signed and marked up to be allowed to use that at all.
Well, let's try the thousand words for a picture here. You have an executable, and if it's not a plain tool, it actually has resources and an Info.plist and maybe a couple of helper tools, because you're supposed to factor out your privileged operations. So what do we do to sign this? Well, we take these files, we add a unique identifier--you pick that one if you're doing the signing--and we put a seal on these files to keep them from being changed. But because a seal can be rebuilt at any time, we then use a digital identity to sign this sealed blob of data.
The idea of a digital identity is that only you, the developer, have the private key that can make these. You don't give it to anybody. It's a secret that only you have. And therefore, as long as you don't mess up and, you know, post it on the internet, nobody else can actually make a signature that looks like you made it.
This part is what we call a code signature. So it's the identifier, it's the seal, and it's the signature that's placed on the seal. And usually the signature gets embedded in the program. If it's a Mach-O binary, it just goes into the Mach-O file itself, which of course gets modified during that process.
Now, you may wonder what happened to that poor little helper tool off there to the side that clearly isn't part of the code signature. No, it isn't, because we want you to put a separate signature on your helper tool. That's an important point, and I'll come back to it a couple more times. Helpers are not part of the code signature of the thing they're helpers for. They are separate pieces of code. They need their own signatures, and it's important not to mix them up with the signature of the program in whose bundle they happen to be.
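As a hedged sketch of what that looks like in practice--the paths, identifier, and identity name here are invented for illustration--you sign the helper and the app as two separate operations:

    # Sign the embedded helper first; it gets its own signature.
    codesign -s "My Signing Identity" MyApp.app/Contents/MacOS/MyHelper

    # Then sign the enclosing application, picking its unique identifier.
    codesign -s "My Signing Identity" -i com.example.myapp MyApp.app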
Okay, so how do you get a signed program out there? Well, you start with your code, as we just discussed, and you sign it. Starting with Snow Leopard--actually, starting with the new developer tools that just came out--Xcode has learned how to sign, so you don't have to run the command-line tool anymore. It still works, of course, but now you can actually get Xcode to sign your code. So, you sign your program, and then you ship it.
And we do not care how you ship it. You can have a drag installer, you can have the Apple installer, you can use a third party installer, you can post it on your website and have people download it. We absolutely do not care how you get that code to the end user. What we do care about is that what ends up there is exactly what you said.
So, this is the code that you started with. This signed code on the end user system had better be byte for byte, file for file, the same thing that you made when you signed your program. That usually isn't a big problem for an initial installation. If you do incremental updates, where you're only shipping the files that changed, you have to be a bit careful.
Okay, so then the user runs the program, and that's fine. And there is an API, the Code Signing Validation API, that you can point at a running program, basically say, "Is it okay?" And what you get out is, "Yes, it's okay," or, "Here's the reason why it's not okay." And that by itself validates that the program hasn't been modified since it was signed.
That's useful. I mean, if something corrupts the program, you'll know. If something intentionally corrupts the program, you'll usually know, unless the guy doing the corrupting is smart enough to corrupt it and then re-sign it with his own signature, which, of course, he can do. Anybody can sign a program. I mean, there's no secret to it.
So we need a little bit more to actually notice that somebody changed and then re-signed a program. So this is where these code requirements come in, these standardized ways of expressing constraints on programs. Because another input to the validation API is a requirement object. Now, where do you get that from? Well, usually you store it in some database. For example, the keychain API stores it in the access control list of a keychain item. In some other cases, it's a constant. Like, some program may only care that the program was signed by a particular manufacturer, like you.
But usually there's some kind of database involved somewhere. So how do you get it into the database? Well, there's actually a call that you can apply to a signed program that gets you one of those code requirements. And we'll discuss exactly how you do that a little bit later. But the general idea is that as long as you just care about continuity, about recognizing a program you've seen before, this is the data flow. You start with the code when you see it the first time. You get yourself a requirement. You store it in the database.
And then later on, if you're wondering if it's the same program you're looking at, you pass that to the validation API. And it will basically say, "Yeah, it's the same program," or "No, it's not." So by now we've figured out that this is primarily a runtime feature. So this is not really about scouring around the hard drive and finding things and kind of going, "Yeah, that looks like mail over there in that bundle." This is primarily about--there's this program running this process. And yes, right now, at this particular point in time when you call the validation API, this is a valid instance of Apple's mail or a valid instance of your program.
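If you want to see that recognize-it-again data flow in code, here's a minimal sketch using the static flavor of the calls--the function names are the real API, but the structure and error handling are abbreviated for illustration (for a running process you'd use a SecCodeRef and SecCodeCheckValidity instead):

    #include <Security/Security.h>

    /* First encounter: capture the program's designated requirement in
       binary form, suitable for stashing in your own database. */
    static CFDataRef copyRequirementBlob(CFURLRef bundleURL)
    {
        SecStaticCodeRef code = NULL;
        SecRequirementRef dr = NULL;
        CFDataRef blob = NULL;
        if (SecStaticCodeCreateWithPath(bundleURL, kSecCSDefaultFlags, &code) == errSecSuccess
                && SecCodeCopyDesignatedRequirement(code, kSecCSDefaultFlags, &dr) == errSecSuccess)
            SecRequirementCopyData(dr, kSecCSDefaultFlags, &blob);
        if (dr) CFRelease(dr);
        if (code) CFRelease(code);
        return blob;    /* caller releases */
    }

    /* Later: is the code at this path still "the same program"? */
    static Boolean isSameProgram(CFURLRef bundleURL, CFDataRef storedBlob)
    {
        SecStaticCodeRef code = NULL;
        SecRequirementRef dr = NULL;
        Boolean same = false;
        if (SecStaticCodeCreateWithPath(bundleURL, kSecCSDefaultFlags, &code) == errSecSuccess
                && SecRequirementCreateWithData(storedBlob, kSecCSDefaultFlags, &dr) == errSecSuccess)
            same = (SecStaticCodeCheckValidity(code, kSecCSDefaultFlags, dr) == errSecSuccess);
        if (dr) CFRelease(dr);
        if (code) CFRelease(code);
        return same;
    }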
Sort of as a side effect, we do a lot of static validation. We do look at the files on the hard drive because we need it as part of the implementation. And we give you API for actually looking at the disk and checking the static validity of code on disk because the functionality is already there and it's handy. But the focus of this whole feature is a runtime focus. It's about running code.
One thing that is part of the architecture and we're not doing a lot of yet, but we will do a lot more in the future, is dealing with hierarchies of code. I mean, in the simplest case, you've got a process running and you know the kernel manages it, but, you know, we know that. And so that's all we care about, a process. But then sometimes your code is a script, and the script is being interpreted by another program.
Or managed or supervised. Think of an applet running in Safari. The applet really is code conceptually and Safari is a program. So the two have a relationship and one of the features of code signing is called code hosting where you essentially can express to the system a relationship between a code host, a piece of code that's managing another piece of code which we call the guest.
And lastly, it's kind of important for this to be efficient. There's a lot of code signing machinery out there in other operating systems that works okay as long as you really don't care about performance, because it basically sucks. So there's a lot of emphasis in our implementation on making this work without there being this big momentary impact when you run a program, or when you call a verification API, where everything comes to a screeching halt while we're thinking about whether this is okay. There's a lot of trickery in there, and if you want to know, ask questions later. But the one point that explains most of what we're doing is that when we're signing code, we spread the work out.
There is no one point where a validation happens. There's a lot of stuff that happens incrementally as the program runs, and the real magic--and, well, we did patent it--is how you can string these little incremental verification operations together without having a hole in the middle that the attacker can drive some truck through.
Okay, that's just a little bit of graphic. This is the classic case. You have a program like Mail.app. The kernel runs it. That's the code signing view. Basically, you have the running process. The system knows where it came from on the hard drive. It picks up the signature. Verifies.
If you happen to be running a PowerPC program, you may know that there's this thing called Rosetta that actually runs your code. So it's not the kernel running your PowerPC program, it's something called Translate, which is the name of the program that implements Rosetta. So code signing sees it like this: the kernel actually runs Translate, and Translate is code signed and has a little mark on it saying "I'm a code host," and Translate is managing the PowerPC code. So you can see this chain being built.
Code signing understands this. I mean, it's already implemented in Leopard this way. So this is covered territory. And if you want to see it any deeper, it can go deeper: if you're unfortunate enough to still run Word 2004, that happens to be a CFM application. If you don't know what that is, don't worry about it, but it needs a special interpreter called LaunchCFMApp, which just happens to be a PowerPC binary because it's really a legacy feature--so now we're three deep. And the machinery still works. Code signing can deal with this. If you have an interpreter, there is API that you can call to tell the system, "Hello, I'm managing other code, and I'd like to be a code host," and you'll just fit into the machinery. No problem.
Okay, so code requirements. It turns out, from the two years that I've talked about this now, that this is the sleeper feature of code signing. This is the thing that a lot of people don't even realize is there, and yet in many ways it's the most important single part of the machinery. A code requirement is sort of like a little formula, a little expression that contains conditions that a program needs to satisfy. Things like: this is its name, or this is who signed it, or there's this key in the Info.plist of the program.
And you can match against all kinds of things. The really important part to understand is that a code requirement really defines what it means to have code identity. When you're asking, what is Mail.app--Apple's Mail.app?--code signing answers: it's this requirement right here. It's signed by Apple, and it's called com.apple.mail. That's not just an approximation. That's not just a good idea.
That's the rule. That's what it means to be Mail.app as far as code signing is concerned. If you have your program, your program's identity is probably going to be defined as: it's signed by you, and it's got whatever name you assigned it--com.yourcompany.magicalfoodprogram. So a code requirement is a definition of what it means to have a particular form of identity, and it doesn't have to be an individual program.
You can make a code requirement that just says "signed by me," and it'll cover all the programs you will ever sign in your life, by definition. So it's not necessarily the identity of a particular instance of a program. It could be a group of programs. It could be an open-ended group of programs. It's a very powerful feature.
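A sketch of what such an open-ended requirement can look like--the hash here is a made-up placeholder for the SHA-1 hash of your own anchor certificate:

    anchor = H"0123456789abcdef0123456789abcdef01234567"

Any code whose signature chains to that anchor satisfies it, no matter what the individual program is called or when you signed it.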
Code requirements can be stored pretty much anywhere. There's two forms of it. You can write them as text and you can store them as a binary blob, which is just a variable length binary array of bytes. You can stick it anywhere. You can stick it in a file, you can stick it in a database. We don't care. And there's API for converting between text and binary.
And there's a particular class of requirements that we call internal requirements because they're actually embedded in code signatures. They're requirements that programs have on other programs, that code has on other code. For example, if you happen to be signing an interpreted script, you can probably imagine that it wouldn't be good for security if your script gets run by some unfriendly interpreter.
So if you want to secure your script, you need to say, "I need to be run by an interpreter that I trust." Well, guess what? It's code identity again. So you can embed an internal requirement in the signature of your script that says, "This is the identity of the interpreter that I'm willing to run with." Those are internal requirements. And then there's the most important requirement of them all.
We call it the designated requirement because it's the requirement you get from a program when you ask it, "Who are you? What are you?" You point at signed code, you get back a requirement. The API makes one up if there is no explicit one specified by the signer; or, if you know better when you sign, you can explicitly craft one, stick it in, and you'll get that one back out from the API. And the meaning of the designated requirement is: if I look at the program later on, how can I tell whether it's you? That's what it answers. This is how the program designates its identity.
Oh, and, sort of almost incidentally, designated requirements define what it means to be a software update. Because think about the definition of Apple's mail: signed by Apple and called mail. If Apple makes a new one and ships it, it's still signed by Apple and called mail. It's the same program.
Code signing understands that even though not a single byte might be the same between these two programs, they are meant to be the same program because of the way they're signed. So almost incidentally, sort of as a side effect, we solved the software update problem in terms of tracking what it means to be a continuation, an extension of another program.
[Transcript missing]
Oh, and obviously you can combine these conditions with AND and OR and parentheses in the perfectly obvious way. If you dig down to the elements of the requirement language, you can match for exact string values. And we mean exact, case sensitive and all. You can match for prefixes and postfixes and substrings. There's no regular expression facility in there right now because we tend not to trust the security of regular expression evaluators all that much.
And as of Snow Leopard, you can actually compare for inequality. If you use inequality like greater and less than, what you actually get is string comparison with the numeric option. Look it up in the Core Foundation CFString documentation so that if you're comparing, for example, version strings like 7.4.3, you actually get what you want. You get the numbers sorted right.
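Put together, a requirement using those elements might read like this sketch--the identifier, company name, and version here are invented for illustration:

    identifier "com.example.frop"
        and (anchor apple generic or certificate leaf[subject.CN] = "Example Corp")
        and info[CFBundleShortVersionString] >= "7.4.3"

AND, OR, and parentheses group things the obvious way; the string matches are exact and case sensitive; and the >= comparison is the numeric-aware string comparison just described.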
Okay, if you want to play around with requirements, either because you're just curious or because you're trying to make up just the right one, there's this little Swiss Army knife utility called csreq that lets you convert between the text form and the binary form of requirements. You basically can pass it either, and you can ask for either form as output.
You can actually ask it to convert from text to text, which is useful because it internally compiles it and then uncompiles it back to text. So it reads you back what it thinks you said, which is useful if you're a little bit wondering whether you're saying it the right way.
So that's csreq. And if you're actually trying to evaluate a requirement, to see if it applies properly to a program on disk or to a running program, there is the dash-capital-R option to codesign: you can ask codesign for a verification and pass in any requirement you like, and it will tell you whether the program is intact and whether it satisfies the requirement you passed.
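For example--the requirement text here is just a sample, and the '=' prefix marks literal requirement source text (check the man pages for the exact input conventions):

    # Compile requirement text into a binary blob, then read it back as text.
    csreq -r='=anchor apple and identifier "com.apple.mail"' -b /tmp/mail.csreq
    csreq -r=/tmp/mail.csreq -t

    # Verify a program on disk against an arbitrary requirement.
    codesign -v -R='=anchor apple and identifier "com.apple.mail"' /Applications/Mail.app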
So the whole thing is based on signing your code with digital identities. And, you know, I could do an hour on what digital identities are, but thank God I don't have to because Ken MacLeod over there has an entire session on digital identities and X.509 certificates. So if this is news for you, you probably want to go to his session where he will explain what those are, how you make them in detail, how you manage them. And this isn't code signing specific.
Digital identities are used for all kinds of stuff. Any time you use SSL, you use them. Any time you sign email, you use them. It's the same machinery. As a matter of fact, it really is the same machinery. We're using the same cryptographic libraries for making code signatures as we're using to sign email--CMS.
If digital identities are still a big mystery to you, Ken's session--I think at five--is definitely where you want to go. For the rest of you, who pretty much know what a digital identity is, let me just run through this. Anything that's marked in the extended key usage as "can code sign" is okay for making code signatures. There is a little utility inside of Keychain Access to make these identities from scratch. It's called the Certificate Assistant. Doesn't cost you anything, and in 30 seconds you can have a perfectly good, tested identity for signing.
If you are not planning to turn into a large enterprise, that's pretty much all you need. If you are working for a large enterprise, or you're planning to turn into one, you may want to go a little further down that road and make yourself your own certificate authority, which with the Certificate Assistant takes maybe five minutes--a lot better than the five hours you'd probably need with OpenSSL. But anyway, you can make your own. Or you can go out to various companies out there, commercial certificate authorities, and buy a code signing certificate from them.
That's fine with us too. As long as they're properly structured, they'll work. If you happen to have an Authenticode identity--Authenticode is something Microsoft invented as a trademark a while ago--it'll work for code signing too, because it follows the same standards. It's all bog-standard internet RFC stuff. Okay. In the special case of the iPhone, Apple issues identities as part of the way we're controlling what gets installed where and who trusts whom. So if you are developing for the iPhone, you are going to use Apple-issued identities. You don't get a choice there.
No matter what kind of identity you're using, you end up storing it in a keychain, because that's the way it works on Mac OS X. If you get it from somewhere else, you have to import it into your keychain. If you generate it in your keychain and you want to send it somewhere else, you have to export it from your keychain. Ken will explain all of these things in beautiful detail. If you just need a hint: if somebody sends you something that's supposed to contain an identity, you're looking for something called PKCS12. That's the name of the standard file format that's usually used to transfer these things.
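For instance, a purchased identity typically arrives as a .p12 file; a hedged sketch of importing it from the command line (the file name is invented) looks like this--or just double-click the file and Keychain Access will offer to do it for you:

    security import ~/Downloads/signing-identity.p12 -k ~/Library/Keychains/login.keychain -f pkcs12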
Okay, this is another question that came up a lot in the last two years, so I'm dedicating a slide to it. People talk a lot about code signatures being trusted by a particular system, like, you know, "Does this system trust my signature?" And the question usually is mistaken, but let me tell you why. There's a feature in Mac OS X--it's been there for a while, but it got revamped for Leopard--it's called Trust Settings.
What Trust Settings does is, for any certificate, but particularly for anchor (root) certificates, it attaches a marker saying whether the system trusts that certificate for a particular purpose. In the simplest case, you can just say, "I trust this thing." You know, "It's from Apple. I trust it." Cool. Or you can say, "It's from Adobe. I'll trust it." Okay.
But you can also be more specific. You can say, I trust it for the purposes of SSL server authentication. So you don't have to-- if some company, some bank is giving you a particular certificate and saying this is how you can tell that we're not being spoofed, you can put a trust setting marker on it and say, trust this for SSL, but don't trust it for email or code signing, because that would be excessive.
So anyway, that's a mechanism for putting markers on certificates to say whether the system trusts them for a particular purpose or to say whether a user trusts them. There are user-specific versions of those, too. And that's a great feature, and it does not apply to code signing, usually, at all, even though that has the word trust in it. Because code signing, for the most part, cares about continuity of identity rather than origin of identity.
Now those are big words, but this matters. Usually, you only care whether a program is the same as the one that was there before. Like, is the Mail that's trying to get a keychain password the same as the Mail that stored it? Or is the program that's trying to retrieve the .Mac password one of the group of programs that's allowed to access the .Mac password? This does not, by itself, depend on who signed the program or whether you like them. It's about: is it the same program? And for that, we don't need to consult the system Trust Settings database. We just care whether it's still the same.
So, code signing by itself, at its core level, does not use the Trust Settings database at all. It doesn't mind if a certificate is trusted by the system. It just doesn't care. Now, there is an element in the requirement language that says, "And the certificate in the signature has to be trusted by the system." If you put that in, or if somebody else puts that in, then it matters because the requirement says so. But it's important to understand that this code signing infrastructure does not impose any such requirement. It's a particular code requirement that may or may not say so.
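So if you do want trust settings consulted, you have to ask for it in the requirement itself; a sketch, with an invented identifier:

    identifier "com.example.frop" and anchor trusted

The "anchor trusted" element is what pulls the system's trust database into the evaluation. Leave it out, and Trust Settings simply never enter the picture.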
So most of the time, designated requirements just work. Good. Because those of you who are signing your program probably didn't even think about whether you needed to do anything there.
[Transcript missing]
types of certificate chains, we know what to do. If you're making yourself your own signing identity--just a self-signed certificate--it's easy. That certificate is the only one you've got.
If you make your own CA and issue yourself signing certificates and then sign your code, you'll be okay too, as long as you put your name, unchanged, into all of the Organization fields. So here's a hint: if you build your own CA for code signing purposes, put an Organization field into all the certificates and give them all the same value. It doesn't matter what it is--it's your company name, usually, in some legally acceptable form.
But make them all the same, because that's how the designated requirement generator figures this out. It starts at the leaf and kind of goes: those all seem to come from the same company; I'll just go all the way to the root and use that. If the name changes, it assumes--because it's trying to be conservative--that, oh, it changed here, I guess they bought that from someone else, and it stops there.
And that's also the point where you may have to explicitly write your own designated requirement. If you buy a signing certificate from a company, we have no clue what that company's policies are. Particularly, we have no idea what they do when they reissue your certificates. These things expire and then you get a new one, of course you pay again. Duh. So, since we don't know how they reissue certificates, we don't know what they change when they do that, we can't automatically write a requirement that says what that policy is.
So, in the case where you're buying yourself a certificate from someone else, you need to write one. You can either do that yourself, or you could actually try to call up your vendor and ask them what to use. I have no idea whether they can give you an answer, but they should.
The general process is: look at the certificate chain that you're getting when you sign with one of those certificates issued by a company that has its own policies, and then try to write a code requirement that nails down the parts that you know won't change. Like, if they're always using the same subject common name for you, then you want to say "the subject common name of that certificate is this." Or if there is a particular marker extension that they put into an intermediate, you want to say that. In the demo in a little bit, I'll show you one example of that.
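As a hedged sketch--every name and the OID below are invented; the real values come from inspecting your actual certificate chain--such a hand-written designated requirement might look like:

    identifier "com.example.frop"
        and certificate leaf[subject.CN] = "Example Corp Code Signing"
        and certificate 1[field.1.2.3.4.5] exists

That nails down your identifier, the leaf's subject common name (assuming your vendor keeps it stable across reissues), and a marker extension on the first intermediate certificate.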
Another little incidental thing: when you're signing code, one of the arguments that you can pass into the signing operation is called the "signing flags," which you can ignore for the most part. If you are making a code host, there's a flag that says "I'm a code host" that you need to set.
But other than that, if you don't pass any flags, you'll just get the default behavior. There are two flags that are interesting, though, if you want to achieve a particular effect. There's a "force_kill" flag that if you pass it during signing, sets a flag in the signature that immediately causes the system to kill your program if it ever loses its identity.
Oh yeah, that's harsh. But sometimes it's exactly what you want, because by exclusion, it means as long as your program is running, it's valid. And sometimes that's kind of handy. It means you never have to worry about "have I been hacked?" Well, if you have been hacked, and the system actually managed to detect it, you'd be dead. So you haven't been hacked. Case closed.
There is a slightly milder version of this called the "hard" flag, and you can force that on, too. The meaning of the hard flag is: if I ever have a choice between getting a resource--like loading a resource file, or paging in a page, or loading a library--and thereby becoming invalid, because we can't trust the thing we're about to load, or not getting that thing but staying valid, I'd rather not get that thing.
So if you set the hard flag on a program, it may be told by the system that you can't page in this page, you can't read this resource, you can't load this library, but you'll stay valid. If the flag is off, the default behavior is, "Here, have your library," and you just got invalidated.
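From the command line, a sketch of setting both flags at signing time looks like this--the identity name and app path are invented; in the API these correspond to signing flags like kSecCodeSignatureForceKill and kSecCodeSignatureForceHard:

    codesign -s "My Signing Identity" --options kill,hard MyApp.app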
Okay, API. The API itself is at least a year and a half old, but until a month ago it was private to Apple. We've just managed to publish, I think, about 60% of it. And so I'm going to give you a quick run-through just to give you a guide to what to look for while you're reading the header files and trying out things.
The API objects are Core Foundation objects. They really are Core Foundation objects, the genuine article. So when you get them, you can use them for as long as you want and then CFRelease them. You can stick them into CFArrays and CFDictionaries and CFSets if you absolutely have to.
You can treat them like core foundation objects in all respects because they really, really are. These are the three types: a SecCodeRef is a reference to running code. So that's the ones you usually are going to throw around. A SecStaticCode is a reference to code stored on disk. That's usually a single tool or a bundle. And a SecRequirement is a code requirement.
So in order to do validation, which is the reason for existence for this whole thing, where you're checking that a program can be accessed, that it has integrity, and that it conforms with a requirement, that's the call you're going to make. It's really quite innocent looking. Basically, you pass a code reference and a requirement reference.
All of the APIs have a flags argument, and usually you just pass kSecCSDefaultFlags, which says "I'll take the default, thank you." There are API-specific flags you can pass in to change behavior, and if we ever need to do the backward-compatible thing, that's how we'll do it.
These are a couple of errors that you might find. There's this list of fifty-odd errors that code signing can return telling you what went wrong, and it can also return all kinds of other OSStatus error codes from other subsystems. But these are the ones you're most likely going to see. errSecCSUnsigned means this wasn't signed, so I can't tell you whether it's valid. errSecCSSignatureFailed means it's broken: it's signed, but there's something wrong with the signature, so you don't want to trust it. And errSecCSReqFailed means the signature is fine, the program has integrity, everything's cool--except the requirement you passed in doesn't actually match the program, so you probably don't want to accept it. Okay, so that means you need a code reference and a requirement reference.
Well, how do you get a code reference? There is one API function that pretty much does it for you all the time, and since we're being Core Foundation-like, we give it a long name. Basically, you pass it a CFDictionary of attributes. For example, if you're after a process, you would be passing in a dictionary with one entry that says "the PID is this". There are other possibilities, of course.
You can either specify a particular code host and say, "I'm only looking for guests of this host," but if you pass NULL for the host, you'll just basically start at the kernel and you'll find what you can with the attributes that you passed in. And that's almost always what you do. You just pass NULL for the host, you pass in everything you know about the code, starting with the PID, and you let the system figure it out. You get back a SecCodeRef, or if it doesn't work, you get back an error code.
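Here's a minimal sketch putting the lookup and the validity check together--checking a process by PID against a requirement; error handling is abbreviated:

    #include <Security/Security.h>

    static OSStatus checkProcess(pid_t pid, CFStringRef requirementText)
    {
        /* Find the running code; a NULL host means "start at the kernel". */
        CFNumberRef pidNum = CFNumberCreate(NULL, kCFNumberIntType, &pid);
        const void *keys[]   = { kSecGuestAttributePid };
        const void *values[] = { pidNum };
        CFDictionaryRef attrs = CFDictionaryCreate(NULL, keys, values, 1,
            &kCFTypeDictionaryKeyCallBacks, &kCFTypeDictionaryValueCallBacks);
        SecCodeRef code = NULL;
        OSStatus rc = SecCodeCopyGuestWithAttributes(NULL, attrs,
                                                     kSecCSDefaultFlags, &code);
        CFRelease(attrs);
        CFRelease(pidNum);
        if (rc != errSecSuccess)
            return rc;

        /* Compile the requirement text and validate the running code against it. */
        SecRequirementRef req = NULL;
        rc = SecRequirementCreateWithString(requirementText, kSecCSDefaultFlags, &req);
        if (rc == errSecSuccess) {
            rc = SecCodeCheckValidity(code, kSecCSDefaultFlags, req);
            CFRelease(req);
        }
        CFRelease(code);
        return rc;   /* errSecSuccess, errSecCSUnsigned, errSecCSReqFailed, ... */
    }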
[Transcript missing]
And finally, we have the omnibus information call. You pass a bunch of flags, and we'll give you a CFDictionary with more than you ever wanted to know about a code signature. Some of the things in the information dictionary you're getting back are actually live API objects from other subsystems. Like, one of them is a SecTrustRef.
If you know what that is, or you want to look it up: this is the actual validation object that is used internally to validate the cryptographic part of the signature. So if you want the maximum amount of extra information out of the system, you can in turn apply information calls to that and get even more stuff out. Okay, let's take a break and do some demo.
I'm still demoing with a command line tool. I hope you don't mind. It's just a lot easier. So here's a little program that basically doesn't do anything except it can create a keychain item and retrieve it. So we have something that we can actually apply code signing to. So for those of you who aren't actually signing your programs yet, let's just explore what happens if you don't sign it.
So here is the tool. Let's say we ask it to create a test Keychain item, which makes a Keychain item called "test" in the Keychain. And if we retrieve it again, it comes back out, because when you make a Keychain item, the application that created it is allowed to access it. Everything's fine, right? And then you update your program. And now it's version 1.1.
And you go and you run the program again and you get this dialog which you're all painfully familiar with. And the dialogue really, if you read between the lines, says, "Okay, there's a different thing here, but it kind of looks like it wants to be the previous one, so tell me, is this the same program? Do you want me to treat this like the same program, or what do you want to do?" And it's annoying because, of course, almost always you want to say, "Yes, of course it's the same program.
I just built it." And, of course, then perhaps once in a blue moon, it's not because some hacker sent you a hacked version. And how is the user supposed to know? It's not a good situation. So what do you do? You can hand sign it, but with the awesome power of Xcode, you can do better than that.
You can actually tell Xcode to sign. Ah, but what do you sign with? Right. Let's make an identity. So here's Keychain Access, which you'll find in the Utilities folder inside Applications. And in there is a Certificate Assistant. And you can ask the Certificate Assistant to make you a certificate.
The only trick you need here is to check the "Let me override the defaults" checkbox. And, yes, this just warns you that you probably should know what you're doing. And you can ask for a certificate of type "code signing". And then you can just keep on going--it knows what the other defaults are. And before you know it, you have a signing identity in your Keychain. Here's the identity we just made. A signing identity looks like a private key and a certificate; the Certificate Assistant makes you the public key too, just in case you care. So now we have a signing identity called Demo. So let's go from Keychain Access over to Xcode. If you have the latest version of Xcode, you'll notice there's this new section here called Code Signing. All you really have to do is put in the name of the signing identity--Demo--and then build your program.
Let's--better safe than sorry--clean it first. OK, what happened here? Internally, Xcode is running the codesign command. And the codesign command has no particular access to the identity that you're signing with. So the system gives you this dialogue. And this dialogue is not superfluous. This is a really good dialogue.
It says: do you really want the codesign command to have access to the signing identity? And you can either say "Allow," which means you'll get asked again the next time, or you can say "Always Allow," which is easier in the long run if you're just testing around.
OK, so now we've signed the program, which you can verify by asking for a verification of the frop tool. codesign is a typical Unix command, so by default, if everything's fine, it doesn't say anything. But if you add more verbose flags, it will actually tell you that everything's fine. So let's run it again. So here.
You get this dialogue because what you've done now is you're trying to access a Keychain item that was made by an unsigned program--your old frop tool. Now there's a signed version, and since we don't yet know whether those two are supposed to be the same, we have to ask one more time. So we say yes, and now everything's fine. And this time, when we change the program, we're going to version 2.0.
No dialogue, because now we use code signing. And now the system understands--without having to guess or make any dangerous assumptions--that it's really the same program. Because you said so: you said it's the frop tool and it's signed by the same guy. Let's see. We are, as always, a little bit behind on our schedule. So let me just show you a couple of potentially useful things if you're working with keychains.
If you are looking at the access control list of a Keychain item, you'll see there are two entries for the tool now. One is the old one that applied to the unsigned program, and the other one is the new one. How can you tell the difference? Actually, usually you can't, but there is an undocumented little preference which you can set. It's called "Distinguish Legacy ACLs". Yeah, put the spaces and all in there like that. If you set that to true, then legacy ACLs--the ones for unsigned programs--turn italic.
So you can tell the difference. I don't think that's documented anywhere, so: bonus. Actually, I'll give you a better bonus. If you select a Keychain ACL entry that's for a signed program and you hold down the Option key and hit the plus button, it will actually show you the code requirement that's hidden in there.
So this is what really happens when the system is trying to figure out whether the frop tool is supposed to get access to this Keychain item. This is the requirement. It says: this is the identifier, and the anchor--which, since it's a self-signed certificate, is leaf and root at once--has this hash. We don't store the whole certificate. Certificates are pretty big, like a kilobyte.
Instead, we store the hash of it. And if you actually want to, you can edit this and hit OK inside of Keychain Access and basically change it to whatever you want it to be. And, oh yeah, we were talking about how to make designated requirements. Let me show you how some parts of Apple do that.
Okay, this is a phone application that I just got from one of my colleagues. And if we take a look at the signature of this, this is the designated requirement that's generated by the system for this phone application. It looks a heck of a lot more complicated than anything you've seen so far because the situation is more complicated. Remember, this is a signing identity that Apple issued to a developer. So what we're seeing here is the identifier, which is just like the identifier in the frop tool.
And it needs to be signed by Apple here, except we don't actually care whether it's signed by Apple for itself or for developers, so that's why the generic clause is here. And then we're seeing things about the certificates in the chain. We're seeing that the leaf must have a subject of this string, which just happens to be the signing identity for the guy who signed this phone application. And we're seeing that an intermediate--the first intermediate in the signing chain--has to have this particular certificate extension. Now, we generate this automatically because Apple knows how Apple does this.
We understand the rules that we made for ourselves for issuing certificates to developers. Something like that will need to be written if you buy your certificates from, say, VeriSign, except the actual contents need to reflect VeriSign's policies rather than Apple's. And that's the point where we can't guess for you. Okay.
I promised you some information about gotchas and things to watch out for. So let's talk about what could go wrong. First of all, what happens if you don't sign your code and you keep shipping it? Not that you would, but, you know, what would happen if you did? The system believes that unsigned code has no reliable identity.
Or perhaps you could say no verifiable identity. I mean, it can claim to be a particular program, but there is not very much we can do to actually verify such a claim. So the system proceeds on the assumption that, yeah, it could be that program--but then again, it might not be. And when we have to present the user with a dialogue, that's what we'll tell them.
So if you run an unsigned program and there's a Keychain dialogue coming up, you'll find a little fine print in the dialogue that basically says the identity of this application cannot be verified--which is exactly true. Well, we don't know. Could be. Could be not. But in the end, the user just has to decide whether to trust this thing based on something that he knows and we don't. There are situations where an unsigned program will be automatically signed by the system.
Because you are about to interact with a subsystem that can't deal with unsigned code. Parental Controls is one example; the Application Firewall is another. Now this may sound like a good thing at first, but it turns out not to be, for a number of reasons. I'll talk specifically about ad hoc signatures in a little bit. But generally, this automatic signing trick is trying to tide the user over until you can ship a signed version of your program. It's not a replacement for you signing your program because, well, we don't have your signing identity. We can't make any statements for you. We can only say, "Okay, we've nailed this program down in this spot, and we can track that it hasn't changed--until it changes." And then we just have to assume it got hacked, because we don't know any better. If this had been your code signature and the "hack" actually was a software update, then we could say, "Yeah, okay, it satisfies the designated requirement. Everything's cool here.
We don't need to bother the user." But we can't do that if you don't sign your code because we don't know what your rules are. We don't know what you did to your code. All we know is it changed. So one of the sort of implicit side effects of ad hoc signing is we always take the pessimistic assumption that any time the program changes at all, it's probably a hack because, you know, you need to be conservative here.
If an unsigned program uses the Keychain, you essentially get shunted into a legacy path that tries to behave as much as possible as Tiger did. I don't know if you still remember Tiger, but there were a lot of dialogues in there. And you'll get every last one of them because they're still in there, complete with little notes in them saying that the identity of the application cannot be verified.
And that means, in particular, that if you are issuing software updates, you get those lovely "is this still the same program?" dialogues. And, you know, it's just not a really great user experience. And I don't know about you, but I've gotten sort of addicted to not getting those dialogues.
And some of the new subsystems will just basically give you the raspberry and not give you service, like Developer Tools Access. If you want to call task_for_pid(), you pretty much have to sign your code because, well, that's just the way it is. And as I said, if you don't sign your code, we don't know what a software update for your code looks like. So we'll just assume that it isn't one.
There's one class of programs that absolutely, positively need to get signed, because otherwise they won't work, and that is programs, paradoxically, that self-check with their own self-check trickery. Because they want to defend themselves against evil hackers. If they're unsigned and they self-check and they touch something like the application firewall because they make an inbound connection or because they're running under parental controls, they will get ad hoc signed and the ad hoc signature will change the program. And then the self-check will fail.
That's not good. So paradoxically, if you are writing the kinds of programs that perform self-checks, you absolutely positively need to sign them right now or your users will be--well, your users already are very unhappy. Now, there's a flip side to this. So you go and you sign your program. You are very conscientious about it. You do everything right. But then something happens and the code signature breaks.
A program that is signed but has an invalid signature is considered to have no identity at all. Because clearly whatever claim to identity it has made with the code signature cannot be believed because, you know, it's broken. So obviously that means that any time that program makes a request on the basis of "I get this because I'm me" like requesting a keychain item, requesting developer tools access, requesting to be allowed to run because of parental controls and so on, it just won't--it will be denied. As a matter of fact, the system is currently shipped in a state both in Leopard and Snow Leopard where you don't even get a dialogue. You just--the retrieval call for the keychain item gets denied automatically.
And less obviously, a program like that can't make a new keychain item either. Why is that? Because when you're making a keychain item, you are making an access control list for the new item that says "the creator has free access to this item." But "the creator" is a code signing requirement for the creating program. And if you don't have an identity, we can't make a code signing requirement for your identity, because you don't have one.
So a program with a broken signature cannot make keychain items. And there's a number of other things that it can't do either. Now, it can usually still run on Leopard, because we don't currently, on Mac OS X, simply deny launch to programs with broken signatures. Except in the case of Parental Controls, obviously.
There is no mechanism in the system that will repair a broken signature because we have no idea how we would do this. Because for one, we don't have your code signing identity, which is a good thing, which means that we couldn't fake a repair for you even if we wanted to, which we don't.
So once a program is broken on disk, it'll stay broken, because presumably something painted over it, intentionally or not. Restore a backup or reinstall the program--those are basically your options. Now, there's a second type of breakage, which is dynamic invalidation, where a program does something at runtime that makes it lose its validity bit--its state in the system that says "I am still valid." If that's the only thing that happened to a program, you can actually quit it, relaunch it, and not do that thing again that lost it its validity, because the thing on disk is fine. So here's another example where static and dynamic have slightly different consequences. So those are your options: reinstall, which can mean restore from backup, obviously--that's Time Machine, you know, try it.
And if it's a dynamic breakage, then relaunch. When you are issuing software updates that do more complicated things than just replacing the whole application, you have to be careful that you are really shipping all of the files that have changed. That includes the files that code signing made in your bundle.
Generally, any update mechanism that faithfully replicates your desired state onto the end user system will work fine, because code signing absolutely doesn't even understand what your transport mechanism is. It just cares about the end result. And the end result has to be exactly the same files that were signed on your system when you signed your program. So if you have complicated update scenarios, be careful and just test. It's really quite simple.
So, ad hoc signatures. Remember the picture with the seal and then the signature applied to the seal? An ad hoc signature is a seal without an actual digital signature. We don't use identities at all. We just stop after making the seal, and we take the seal as given.
It's the minimum amount of code signing you can possibly get away with and still notice that a program has changed. Parental Controls and the firewall currently do that. And there are two obvious gotchas with this. The really obvious one is that since we are nailing down exactly this program, software updates can't be tracked.
If the program changes in any way at all, we'll just say it's a different program. The thing that is perhaps not obvious is that because of the way these things are made, they're architecture specific. So if your users end up getting your program ad hoc signed because of something they're doing--you know, network connections or parental controls--and then they switch from PowerPC to Intel, or from Intel to Intel 64-bit--heard about it?--then these ad hoc signatures will no longer verify. And that will probably lead to complaints from your users.
I did mention you should sign your code, right? Because obviously, if you sign your code, you will never get ad hoc signed and all of this problem will just not happen to you. Okay, frequently asked questions before we take your real questions. No, we do not decide what code is good and what code isn't good.
We very, very intentionally do not have anything in our code signing code that makes any judgments as to whether your code is acceptable or good or nice or otherwise qualified. That's what the code requirements are for. All of the policy is in the code requirements. Everything. There's no policy in the core code signing code other than, you know, the seal can't be broken.
Nor does the system as a whole--the system that you're running on, generally--make this judgment. Remember I told you about trust settings? No, the trust settings configuration of the system or any other configuration database in the system does not a priori decide whether a program is good or not.
Again, it's in the code requirements, and that means it's individual. I get the question a lot, "So does the system trust my program?" I don't know which part of the system. Each of the different parts has a different idea of how to make up code requirements. It's perfectly possible for your program to be trusted by parental controls and the firewall, but not the keychain. So, the system is probably the wrong question.
Mac OS X will in general still run your program if it's not signed. And as a matter of fact, because some parts of it don't like unsigned code, it will sign your program for you, and we've just discussed that that's not a good thing. Well, if you don't know my answer to that one yet, I'm not going to repeat it.
And, yeah, we don't get that one much anymore. Yes, you can buy yourself a digital identity from a professional commercial provider. That's fine with us. We support it. But we don't require it. You can do anything with an identity you make yourself with Apple tools that you can do with a professionally sold identity. And in the end, it's up to you.
That question will recede into the distance. Signed code actually runs fine on Tiger, in case you still wonder about that, because Tiger just ignores the code signature. With one exception: you don't want to ship signed frameworks to Tiger, because the linker actually complains that it doesn't understand what those strange bits are.
Top problems. Again, these are things that people have run into often enough that I'll just tell you upfront about them. If you have helper tools, don't put them in Resources. Put them somewhere else. We don't really care where somewhere else. Typical places are the MacOS directory or perhaps the Contents directory. Just don't put them in Resources. I realize Xcode makes it excruciatingly easy to put things into Resources, but helpers don't go there.
If you change anything on your program after you sign it as part of your mastering process, you know, putting them on DVDs or uploading on your website or whatever, be really careful that you don't change the program because a modification is a modification and the signing system can't tell whether it's okay. If you meant to make a change, that's okay. Just make the change and then sign the program.
If your installer allows partial installs--you know, optional modules and optional parts of the program--then it's not a good idea to have these optional parts be resources in your Resources folder, because when you sign, you have to sign the whole thing. There is a facility for saying "these resources over there are optional." It's called a resource specification. But it's usually a much better idea to just make those optional parts their own separately signed pieces of code.
Obviously, it's a bad idea to change the code on the end user system for any reason. What I usually recommend is this: just imagine that your program gets installed on a read-only file system or on a DVD. If something in your program's operation breaks because you can't write to your bundle, you probably need to redesign it.
Localizations. You need to make sure that what you're signing is the whole thing, including all localizations. So if you have optional localizations, pile them all in before you make the code signature. Localizations are automatically optional, so they can be stripped out later. If your localizations are optional installs, that's fine. We know about that.
It'll work fine. But you cannot sign your program, ship it, and then have some third party make changes--make more localizations and just stuff them in--because those are new files, and the system will complain about them. And if you still make self-modifying code, well, never mind.
Okay, and more information. Craig Keithley, who I can't actually see right now, is the omnibus technology evangelist, including data security. So if you need evangelism services, he's your one-stop shop. There are perfectly good man pages for codesign and csreq that are actually, right now, your best documentation for how they work and what they do. There is a conceptual code signing documentation part in the developer library, and there is a tech note that's been held up since December that still hasn't been published, so it's not on this list yet.
As I said, Ken is giving a presentation entirely dedicated to X.509 certificates and digital identities today at 5:00. So if you need any help with that, please go. He loves to answer questions. There is a data security lab right after this session down on the first floor. Since, as expected, we won't have much time for Q&A, I invite all of you to come down with me there. I'll be there and you can spend the entire lab asking me questions if you want to. And if you have phone-specific questions, you're unfortunately too late because there was a phone-specific security session that you've already missed.
They asked me to put a summary in. So code signing is about defining code identity, about what it means for code to have identity and about tracking this identity at runtime. You, as the company, the person who signs the code, define what that identity means. So you actually get to define what it means for your program to have software updates.
And conversely, the user on the end user system makes the policy decisions because he directly or indirectly specifies what code requirements get applied to the code. So neither of these two roles is Apple's. Neither the defining role nor the policy-making verifying role. Unless, of course, one of us happens to be Apple. If Apple makes the code, obviously we're the manufacturer.
Sign your code. If you haven't done it yet, you really want to get going on this. Right now, we're transitioning. Transition is good. It gives you some breathing room. Eventually, the transition will be over and then it'll be an emergency and it'll be your emergency. Please, don't let it be an emergency.