OS Foundations • 56:45
Mac OS X Tiger introduces a number of interesting and useful enhancements at the BSD level. This session outlines two new system services, Apple System Logger and launchd. Apple System Logger provides logging information in a consistent format, enabling administrators to easily analyze system behavior. Its rich API set also allows programmers to better customize their log messages. The new service management system, launchd, introduces a flexible and powerful way of handling StartupItems and daemons. This session is essential for anyone developing a background process or system service.
Speakers: Marc Majka, Dave Zarzycki
Unlisted on Apple Developer site
Transcript
This transcript was generated using Whisper; it may contain transcription errors.
Hi, good afternoon. This is Core OS enhancements for BSD developers. I'm happy to see you all here today. The heckling section is in the front. My name is Marc Majka, and I'll be doing a tag-team session today along with Dave Zarzycki. He'll be up in a few minutes. We're going to tell you about a couple of topics that I hope you'll find interesting and useful, stuff that we've added to Tiger. I'm going to start off talking about the Apple System Logger. So let's just get into it. What is it?
This has come in as a fairly low-profile enhancement to Tiger, but it's the start of something that I think will be a useful direction for you as developers in the future. ASL is a new approach to creating and managing system log messages. It's a replacement for, and an extension to, the old BSD syslog system. There are several new components: a new API that I'll be describing a little bit later, and some behind-the-scenes changes in the syslog API library.
There is also a rewritten syslogd server and a new syslog command-line utility. The goals of this work were to provide structured, flexible, and hopefully more useful system log messages; to reduce the proliferation of log files scattered around the system; and to make it easier to find log messages and read them.
Let's start with our starting point, which is the BSD syslog message system. BSD log messages had a priority level associated with them, one of a fixed set of facilities, and a fixed message format: in the example, a timestamp, a host name, the sending application's name and process ID, and a piece of message text. Pretty basic.
What we've done with ASL is extend messages, or rather, we've made messages extensible key-value dictionaries. Keys and values are just null-terminated strings. There are a bunch of standard keys in every message: things like the priority level, a timestamp, the sending process ID (PID), and the message string. But you can add additional keys that are appropriate to your application. So if your application finds it useful to attach, let's say, a color key or a language key to a message, that's totally up to you, and hopefully makes things more useful to your application. The little example message at the bottom shows a typical ASL message. The content here is not the big thing; the useful feature is the message structure.
Let's take a quick tour of the ASL APIs. But before I get into it with the slides, let me just mention that your conference registration includes some complete example code that has a bunch of this stuff in it and may be useful to look at. I'd also direct you to the online Unix man pages for asl and syslog and a number of associated components, which I think will really help. So here's a little example of just hello world in the ASL API. asl_log is the basic send-a-message routine. There are a couple of null parameters at the front here that I'll talk about in the next slide; we're just using defaults in this one. Basically we're setting a priority level, debug here, and a hello world message.
And off it goes to the syslog server. Once again, you can see that the message this actually produces has a bunch of these extra, standard message keys, like the timestamp, the host name, and so on, all of which are added into the message by the library code.
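The hello-world call described above might look like the following. This is a reconstruction from the talk, not the actual slide code; the message text is an assumption, and the API is the macOS asl(3) interface.

```c
#include <asl.h>

int main(void)
{
    /* NULL client handle and NULL message: use the library defaults.
       The library adds the standard keys (Time, Host, Sender, PID, ...)
       before the message goes off to syslogd. */
    asl_log(NULL, NULL, ASL_LEVEL_DEBUG, "Hello, world!");
    return 0;
}
```

Note that, as discussed later in the session, debug-level messages are subject to client-side filtering, so a message at this level may not reach the data store with default settings.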
Here's a slightly more sophisticated example of sending a message. In this one we start off by creating an ASL client; it's a connection handle, a connection to the server. In the previous example we didn't have one of these, and that's sufficient for code that's single-threaded or that only has one thread that actually logs messages. If your application has multiple threads logging messages, each of them should have its own connection handle to the server to keep things thread-safe. I should mention that the asl_open call that creates the handle takes a couple of parameters. There's a sending process name; we've allowed that to be null here, and it'll just pick up your application's name.
And a facility. One of the nice changes in the ASL system is that facilities are no longer hardwired into a header file somewhere. A facility is just a string. So if you're creating a new suite of utilities or application programs that all work together in some way, you can make up a facility name and use that. You don't have to go edit anybody's header files, and you don't have to recompile anything. After opening a connection handle, this example creates a message, an ASL message structure. That's actually the key-value dictionary; it's an opaque structure here. In order to set keys and values in that dictionary, you use asl_set. Here we're adding a Subsystem key and a Language key with appropriate values. Once again, we call asl_log to log the message and then clean up: asl_free for the message and asl_close to shut down the connection handle. Let's look at the output from this. You see, once again, all the standard keys and values set by the library, but this message also includes the Subsystem and Language keys that were set in the example code. That's about it for that.
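Put together, the sequence described above might look like this sketch. The facility string, key values, and message text are illustrative assumptions; the calls are the macOS asl(3) API.

```c
#include <asl.h>

int main(void)
{
    /* NULL sender name picks up the application's own name.
       The facility is just a string you make up; no header edits needed. */
    aslclient client = asl_open(NULL, "com.example.mysuite", 0);

    /* The message is an opaque key-value dictionary. */
    aslmsg msg = asl_new(ASL_TYPE_MSG);
    asl_set(msg, "Subsystem", "Networking");  /* custom keys... */
    asl_set(msg, "Language", "English");      /* ...are up to you */

    asl_log(client, msg, ASL_LEVEL_NOTICE, "Initialization complete");

    /* Clean up: free the message, close the connection handle. */
    asl_free(msg);
    asl_close(client);
    return 0;
}
```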
Okay, that's basically sending messages; nothing terribly difficult about it. Once you've sent a message, what happens to it? It goes off to the syslogd server. The syslogd server, new for ASL, keeps a unified log message data store. It's a little database, and we've provided a search API to allow you to search and retrieve messages from that database, and a command-line utility called syslog for searching the database and monitoring it. These three pieces together address the second goal that I mentioned, which was to reduce the proliferation of log files all over the system. Over time, we're planning to start reducing the number of log files that we keep in /var/log and start emphasizing the use of this database, this little data store of messages, to allow you to search through things. So you could say, gee, I'd really like to see what happened four hours ago from these two different applications. What messages did they log? And you can go and search the data store. You can do that now; the API is there, and the syslog command-line utility is there. We still don't have great hooks up into the HI layers; we don't have a great application for browsing through the data store, but that will be coming. All the old stuff is still there, though. The old /etc/syslog.conf file and all of the traditional files in /var/log are still there, so you may not even have noticed that we've rewritten all of the underpinnings, because we still have legacy support for all that stuff. But as I say, we plan to start de-emphasizing that in the future and making heavier use of the data store. Here's a search example using the search API. Once again, we open up a connection handle using asl_open, and once again create a message. This message is slightly different: it's actually a query. It looks a lot like a message.
Once again, it's a set of keys and values, but in addition to setting a key and a value, you also supply an operator. Operators are things like equal, not equal, substring, and prefix, as you might imagine; they're all documented in the man pages. In this example, we're looking for all messages that have a key Flavor with exactly the value vanilla. We run through the data store and search for all of those with the asl_search call. That returns a list that we can iterate over with aslresponse_next; each call gives us a message structure. See the example code or the man pages to learn how you tear apart a message structure to get the individual keys and values that are in it. You can print them, compare them, or look at them any way you want. Finally, some cleanup to free the returned iteration list and the query message, and to shut down the connection.
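A sketch of the search flow just described, again reconstructed rather than taken from the slide. The Flavor/vanilla query matches the spoken example; the key-walking loop uses the asl(3) accessors asl_key and asl_get.

```c
#include <asl.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    aslclient client = asl_open(NULL, NULL, 0);

    /* A query looks like a message, but each key also carries an operator. */
    aslmsg query = asl_new(ASL_TYPE_QUERY);
    asl_set_query(query, "Flavor", "vanilla", ASL_QUERY_OP_EQUAL);

    /* Run the search and iterate over the matching messages. */
    aslresponse r = asl_search(client, query);
    aslmsg m;
    while ((m = aslresponse_next(r)) != NULL) {
        /* Tear the message apart: walk its keys and fetch each value. */
        const char *key;
        for (uint32_t i = 0; (key = asl_key(m, i)) != NULL; i++) {
            const char *val = asl_get(m, key);
            printf("%s: %s\n", key, val ? val : "");
        }
        printf("\n");
    }

    /* Free the iteration list, the query, and the connection handle. */
    aslresponse_free(r);
    asl_free(query);
    asl_close(client);
    return 0;
}
```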
It may be a little bit onerous to ask you to write a new application every time you want to go look for log messages. So we've also provided a syslog command-line utility which will do exactly the same thing. It has a bunch of different search options. syslog is a real Swiss army knife of a tool; it does all kinds of stuff, so I highly recommend taking a look at the man page to see all of the options. You can also run syslog -help, and it prints a reasonably useful help message on the terminal. A few usage examples here. syslog, all by itself, just prints every message that's in the data store. syslog -k Flavor vanilla, as in this example, prints out every message that has a key Flavor with a value vanilla, as you might expect. What about -k Facility user -k Time ge -1d? That's actually searching for two different keys and values: all user-facility messages that were logged within the last day. So "ge -1d" means greater than or equal to one day before right now. That's a useful little convention; a day is just a 24-hour period, and it'll take all kinds of different suffixes, like h for hours, m for minutes, s for seconds, and so on. If you want to know everything that happened since a particular time, you can also give it an absolute value in seconds and it'll use that too, but most people don't have a watch that gives them the absolute number of seconds since January 1, 1970. The final example here is syslog -w. The -w option means watch the data store and print messages as they come in; it gives you functionality similar to something like tail -f on a log file. The -F flag means that you are supplying a format, an output format, for the messages.
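The examples just described, as they might look at the prompt. These follow the syslog(1) man page conventions; treat the exact key names and the custom-format syntax as approximations of what was on the slide.

```shell
# Print every message in the data store
syslog

# Messages whose Flavor key is exactly "vanilla"
syslog -k Flavor vanilla

# User-facility messages from the last 24 hours
# (ge -1d: Time greater than or equal to one day ago)
syslog -k Facility user -k Time ge -1d

# Watch the data store like tail -f, printing only selected keys,
# for messages at level 3 (error) or higher priority
syslog -w -F '$(Time) $(PID) $(Message)' -k Level Nle 3
```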
It has a couple of different formats that it can print messages in. One is very much like the kind of format you might see in a system.log file. And if you want, you can say, no, I really only want, let's say, the timestamp and the sending process ID and the message text, as in this example. And once again, in this example I'm only interested in messages that have a level less than or equal to three, so fairly high-priority messages.
Let's take a quick look at the architecture of all the stuff that goes on behind this. syslogd is the ASL server. It provides both the old syslogd functionality and support for the new API and the data store. There are input modules that receive messages on a variety of different channels, and output modules that sort, file, and forward messages as appropriate; we'll take a look at that in a sec. There's an API for sending messages: both the ASL API that we just looked at a moment ago, and the syslog APIs. Both of those are in the System framework, libSystem, and the syslog APIs continue to work as they used to. And, of course, there's the kernel printf, which also gets picked up by syslogd.
The API for searching the data store is in the ASL library. And once again, I mentioned the syslog command-line tool for searching the data store and monitoring messages. Here's the big picture. At the top, a number of different clients using different APIs: the ASL API, the kernel using its printf, the syslog API, or a remote client sending messages via UDP datagram from somewhere else across the network. syslogd listens on all of those various ports, pulls in the messages, and hands them to various output modules. In this case there are a couple: an ASL output module and a BSD output module, both of which do whatever is appropriate with those messages. The ASL module puts things in the data store. It will actually also forward to notifyd; that's a piece of work that's in development, and we haven't really done a lot of documentation on it. If you're really interested, I can tell you a bit more in the Q&A after this, but you can have syslogd post a notification, using the notification APIs, whenever a message comes in that matches a particular search criterion. The BSD output module logs things to all the various log files as it used to, sends messages to a terminal as the old one did, and will also forward messages on to a network log server.
Let me talk a little bit about filtering messages, because lots of processes logging lots of messages can lead to a fairly large volume. So there's some filtering that you can use to control the flow. The client library, both the syslog library and the new ASL library, contains some filtering controls that you can set when you're writing your applications to determine which priority messages actually get sent from your application to the server.
If you've used the syslog API, you're familiar with setlogmask. ASL has a similar filter to determine which priorities actually get through to syslogd. The server, syslogd, also has a filter which controls which priority of messages it saves in the data store. So by default, it doesn't save everything in the data store; you can tell it to if you want. By default, it ignores debug- and info-level messages and stores everything else in the database. And what's interesting is that we've added a remote-control mechanism on top of both of these that allows you to control the flow of log messages at runtime. So you can tell an application: hey, even though I set the log mask to filter out certain types of messages, now I want to see those messages. And you don't have to restart your application or send it any signals; all of this happens dynamically. Let's take a quick look at that. What we've done is modify the client library, both the ASL and the syslog library, so that they listen for notifications from the syslog command-line tool, syslog -c in this case (once again, see the man page). You can say to an application, I want a certain set of log messages to be either filtered out or passed through to the syslogd server. Likewise, there's a control on the syslogd server for which messages it will actually put in the data store, and once again syslog -c controls which messages get stored in the database.
There are actually three different filters at work in the background here. There's what I've called here a local filter, and that's set using either the setlogmask API that you're probably familiar with or, in the new API, asl_set_filter. That works as you would expect: it determines which messages get sent to the server. There's a notion of a master filter, and the master filter is a global filter that applies to every application on the system. The master filter is normally off and has absolutely no effect on anybody. But if you turn it on, it overrides everybody's log mask. So if you want, you can open the floodgates and tell every application to send every log message through to syslogd. If you're trying to track down some weird thing that's going on in your system, you can just open the floodgates and get everything coming through. Or, alternatively, you could use that master filter to tell all the applications on the system to quiet down and not send anything except perhaps emergency messages, because you're watching for a specific thing. There's also an application-specific filter which, when the master filter is set, or actually at any time, overrides both the local and the master filter. So you can point to a particular application and tell it that you want certain messages to be sent through to the server or not. And as I mentioned, you can also use this remote-control mechanism to specify which messages get saved on the back end into the database.
Well, of course, if lots of messages get saved to the database, the database can get large, so we need some sort of pruning mechanism to get rid of old messages. By default, as I mentioned, the data store doesn't save info- or debug-level messages. You can change that as a startup option for syslogd if you wish, or you can use the control mechanism. Just by not saving info and debug messages, the file doesn't grow that fast, but it still grows. So the periodic daily cron script actually prunes the data store. If you look in that file, /etc/periodic/daily/500.daily, you'll see where the data store gets pruned. You can also use the syslog command-line utility with the -p flag, which is all that happens in the daily script, to prune the data store manually any time you want; you just tell it to throw away certain messages. In the cron script, we tail off messages over time. Although we keep all messages at level notice and above at the beginning, after a day we throw away the notice and warning messages; after a couple of days we throw away a little bit more; and finally we throw away everything that's a week old or more. So by default, we only keep messages for a week in that data store.
I should just mention that we're working on some improvements to that data store for Leopard and beyond. Right now it's pretty simple; it's actually mostly a flat file if you go and look at it, and I know a number of people have. We're probably going to turn it into a slightly more sophisticated database in Leopard and beyond, so that we can do a little bit better management of duplicated strings and make pruning a little faster and easier. With that, I'll turn the podium over to Dave Zarzycki, who's going to tell you about service management, and I'll be available in the Q&A session afterwards. Thanks.
Thanks, Marc. So, service management in Mac OS X. Well, what are services? Services are background processes, and we need to manage them somehow on the system. In Tiger, we've introduced launchd, our unification of background process management. It is the daemon to end all daemons. These are the key points we'd like to bring up: it's XML-based instead of shell scripts, and it's less work for you as a software developer, as we'll demonstrate later.
It is easy to use for simple scenarios; there are not very many knobs you need to tweak just to get your daemon up and running. It has more flexible options than any configuration daemon before it, and it has new launch-on-demand criteria if you don't necessarily need your job running all the time. It has a simplified notion of dependencies that we feel can lead to a better system as far as reliability is concerned. And it has support for user-supplied jobs, which some of the previous daemons-that-manage-all-other-daemons didn't support.
So, XML. Why XML? Well, structured data is a good thing. It's very easy with XML to quickly introspect every job, daemon, agent, whatever you want to call it, on the system, and ask, let's say: what user is this job running as? Well, there's a key and a value for that. Is this job niced at all? There's a key and a value for that.
And let's say we want to bulk-modify some daemons: well, we can script that; we can quickly do that in a structured way. And it's consistent above all else; that's why we can do things like this. In the past, we had shell scripts. And a shell script is a language. It's really hard to dive into some file written in a language and just find where a particular variable is tweaked, or there could be a whole multitude of ways those things are accomplished. We brought that all together with XML so it's done in a consistent manner. It's also faster, too, because now you don't need to run through an interpreter of some kind just to do what probably amounts to a few simple things, like: hey, run this process. We can just launch it directly.
Now, on the topic of less work: how can you as a developer find yourself with less work? Well, you don't need to fork and have the parent exit, or what some people call daemonizing. setsid, another common daemonizing task? Nope, you don't need to do that. You don't need to close stray file descriptors. You don't need to reopen standard I/O as /dev/null. And you don't need to change the working directory to slash. There's a whole bunch of things. No, you're pre-daemonized. You can hit main, and if all you want to do is, say, sleep for a million years, fine, that's a daemon. You don't need to do anything special to become one now.
And in fact, when you don't do some of these things, like closing standard I/O, that allows us, before we launch your process, to direct those to interesting places. By default we'll have them go to /dev/null, but let's say an administrator says: no, for this daemon, I want its standard out and standard error to go to this file. Well, now we can do that. Whereas before, the daemon would just say bye-bye, I closed it, I know what's going on, and that would be difficult to do. Now it's easy.
Now, it's easy to use. Here, minus the header and footer, is an XML property list. We have the Label key, which uniquely identifies the job to launchd. We have the OnDemand key, which tells launchd: no, this is not on-demand, we want this job running all the time. And we have what program to invoke, in this case somethingd.
And what launchd will do when this property list gets loaded is say: hey, this is not an on-demand job, we need to have it running all the time. And it'll start somethingd. And if for some reason somethingd deliberately or accidentally exits, launchd will notice, say hey, this job is supposed to be running all the time, and start it back up again.
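The property list described above, with the header and footer restored, might look like the following sketch. The label and program path are hypothetical; the keys follow the launchd.plist(5) man page.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <!-- Uniquely identifies the job to launchd -->
    <key>Label</key>
    <string>com.example.somethingd</string>
    <!-- Not on-demand: keep this job running all the time -->
    <key>OnDemand</key>
    <false/>
    <!-- What to launch -->
    <key>ProgramArguments</key>
    <array>
        <string>/usr/sbin/somethingd</string>
    </array>
</dict>
</plist>
```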
So now, as a daemon writer, you don't have to worry about dying accidentally; the system will restart you. Increased flexibility: well, as we talked about, we now have a standardized XML schema to represent how a daemon is started. We also have a lot more parameters you can tweak than previous daemon managers have supported. As I mentioned earlier, there are the standard-out and standard-error destinations that we can set on a per-job basis. There's the nice level; the working directory, if you want a daemon to have a little sandbox over here; even the root directory, if you really want to chroot the daemon off and have it not affect other parts of the system; and the umask. You can have per-job environment variables that are easy to set up and see in a consistent manner in the plist. There's actually a huge multitude of things you can adjust, and the place to look for these variables is the launchd.plist man page, in section 5. That's where all this stuff is documented: what the keys do and how they behave.
Now, launch on demand. We want to help you help us save system resources, and when you launch on demand, you accomplish that. You're no longer consuming practically any memory; I mean, there's a little trace amount in launchd representing your job, but you're not consuming a process and all the memory associated with it. That might seem all well and good, but where it really helps is when we boot up. Because when we boot up, we're loading stuff off the disk, and the less we load off the disk, the faster we boot. So, yeah, you provide a net win for all our customers every time you make your daemon launch on demand. Well, what criteria do we have to launch on demand? We have sockets, IPv4 and IPv6. We also have support for Unix domain sockets, something that has been lacking in previous super-daemons like inetd. We have the ability to monitor when a file system object changes. So let's say you want to run a script every time /etc/hostconfig changes; well, now you can do that.
We also have the notion of something we call a queue directory. If you look at sendmail, Postfix, or even cron, they have a notion of setting a directory aside and saying: when a file shows up in here, I have work to do, but if this directory is empty, I don't have any work to do. We take advantage of that on Mac OS X for our default mail server. We specify a queue directory, and when mail shows up there, we start Postfix up and Postfix tries to drain the mail queue. And when Postfix decides that it's bored after a minute, it exits, and then we just wait for it to launch on demand again.
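In a job's plist, the queue-directory criterion is a short fragment like this. The spool path is hypothetical; the QueueDirectories key is documented in launchd.plist(5).

```xml
<!-- Start the job whenever a file appears in this directory;
     leave it idle while the directory is empty -->
<key>QueueDirectories</key>
<array>
    <string>/var/spool/mywork</string>
</array>
```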
We also have interval timers. Now, some of you may have sat there in the past with cron and calculated out the exact intervals you wanted, to run every five minutes: so you say 5, 10, 15, 20. No, no, you don't need to do that anymore. Just say run it every 300 seconds and launchd will do the rest of the work for you. And in fact, if you want to be weird and say every 137 seconds, you can do that. Also, if you want the more cron-like semantics, you can still specify a calendar-based interval timer, so that you can run, say, on the third of every month.
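The two timer styles just mentioned correspond to two plist keys; a job would use one or the other. The specific numbers here are illustrative, and the keys are from launchd.plist(5).

```xml
<!-- Periodic: run every 300 seconds -->
<key>StartInterval</key>
<integer>300</integer>

<!-- Or calendar-based, cron-style: run on the third of every month
     (unspecified fields act as wildcards) -->
<key>StartCalendarInterval</key>
<dict>
    <key>Day</key>
    <integer>3</integer>
</dict>
```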
Now, dependencies. Let's first talk about the old way. Since we've been talking a little bit about system boot-up, it's important to talk about dependencies, because we came from a world with SystemStarter. In SystemStarter, dependencies were explicitly declared by the startup item writer. Well, in reality, this didn't work out well. What ended up happening is that one of the following things would occur. Dependencies were overstated: people weren't sure, so they'd just throw some more in there until it worked.
Dependencies were sometimes understated: a developer didn't know, and it just happened to work on his system; he missed something, but it wouldn't work for someone else. And the meaning of a dependency was often vague. Well, I need the network. What does that mean? Do I need a DNS server? Do I just need an IP address on an interface? I don't know. And that made it difficult for people both to decide what dependencies to export and what dependencies to consume.
Also, dependencies were sometimes just implicitly assumed via other dependencies. Well, I need, let's say, directory services. And there were some startup items out there, which we found inside the company, that assumed that if they needed directory services, the network would already be up, whatever that means. Well, that was a problem and we needed to fix it. So how did we solve all this?
In Mac OS X, we've essentially inlined dependencies. We made the observation that, well, you don't really need the network: you need the ability to talk to configd, and configd knows when the network's up and can tell you when that changes. We noticed that some people need to find out when disks come and go, when devices or disks are mounted or unmounted. Well, you don't need to wait for all the disks to be mounted; you need to talk to diskarbitrationd and find out when these events happen.
So the observation was that IPC was the key. What we do now is register all these IPC handles of configd and other daemons with launchd, and we get all those registrations done at boot-up. Then we allow daemons to start up. And because the communication handles are already registered with the system, daemons, when they start talking to each other, can find each other, because their handles are out there.
And this allows us to boot up really, really fast, because we can start a lot of daemons in parallel. As they're initializing themselves and start talking to other daemons, it doesn't matter if the other daemon isn't done initializing itself; it won't answer its IPC query until it's ready. And that way, with the thundering herd, eventually the dust settles, and your system's booted.
Now, what this means, though, as a daemon writer, is that if you expect people to depend on you, you need to declare your sockets and your communication channels in your plist. Because if you don't, people can start up, they'll try to connect to your socket, and it'll fail. And then you end up with all these degenerate code paths that probably haven't been tested in a very long time. So, please, please just declare your sockets in your plist and everything will just be groovy.
Now, if we want a case study of declaring sockets in plists, we can talk about ssh-agent. In the past, there was an interesting dependency problem people had trying to start their ssh-agent. What people would do is put a lot of complicated shell logic in their shell startup files to figure out if ssh-agent was running; if it was, use the existing copy, and if it wasn't, start a new one. Well, all of this is actually a lot easier with launchd now, with a small patch to ssh-agent, and I'll demonstrate this later in a demo.
But now it's just: you put a plist in your home directory saying, launch ssh-agent when somebody connects to this socket, and ssh-agent will then start up and accept whatever connection came in. It's now a lot simpler for a user to use ssh-agent. And if you want to actually play with this, you can go to one of the OpenDarwin developers, Landon, and grab the patch and the plist and play with it yourself.
Now, this ssh-agent we've talked about needs to get its socket from launchd and then demux and dequeue this IPC. Well, how does it do that? We have an IPC API for talking to launchd. It's a very simple runtime-type-information-based object graph system to support message passing. That's all it's designed for. It's not designed to be Core Foundation or any of these other stratosphere-level frameworks; it's just IPC: hello, goodbye, that kind of thing. Here are the simple C APIs. As you can see, the first one, really the one and only call for talking to launchd, is launch_msg. It takes an opaque object in and returns you an object from launchd. An example of creating an opaque object, as we see here: I'd like to create a new integer, and you get an object back. And I'd like to get from the object what integer is in there. There you go. It's really simple.
Now, the semantics of launch_msg and launch_data_t. launch_data_t just represents an object graph; it could be a single object, like a string or a number, or it could be a more complicated tree. The launch_msg API is synchronous in the common case. It returns NULL if there's some kind of IPC failure in talking to launchd, and you can check errno if that happens. You can also get asynchronous messages back from launchd, if you've requested them, by passing NULL to launch_msg. The first message that was asynchronously sent to you will be returned, and you can keep calling in a loop until you've drained the queue of asynchronous messages. At that point NULL is returned, and you can check whether errno is non-zero to differentiate an actual error from there simply being no more asynchronous messages.
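A common use of launch_msg is the check-in a daemon like ssh-agent performs to collect the sockets launchd created on its behalf. Here is a minimal sketch of that flow, assuming the launch(3) API; error handling is abbreviated.

```c
#include <launch.h>
#include <stdio.h>

int main(void)
{
    /* Check in with launchd: send the CHECKIN request... */
    launch_data_t checkin = launch_data_new_string(LAUNCH_KEY_CHECKIN);
    launch_data_t resp = launch_msg(checkin);
    launch_data_free(checkin);

    if (resp == NULL) {
        perror("launch_msg");  /* IPC failure itself: errno is set */
        return 1;
    }
    if (launch_data_get_type(resp) == LAUNCH_DATA_ERRNO) {
        fprintf(stderr, "check-in failed: %d\n",
                launch_data_get_errno(resp));
        launch_data_free(resp);
        return 1;
    }

    /* ...the reply is a dictionary; the Sockets key holds the file
       descriptors launchd is already listening on for us. */
    launch_data_t socks = launch_data_dict_lookup(resp,
                                                  LAUNCH_JOBKEY_SOCKETS);
    if (socks != NULL)
        printf("got %zu socket group(s) from launchd\n",
               (size_t)launch_data_dict_get_count(socks));

    launch_data_free(resp);
    return 0;
}
```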
Now, let's dive a little bit deeper into launch_data_t. What does it support? It supports dictionaries, key-value-pair-based entries. It supports plain old arrays, if you don't actually need keys. It supports file descriptors as a unique object type; it's not just a number, it represents a file descriptor. Integers, real numbers, Booleans, strings, opaque data. This is all that we needed, and actually a little bit more, for talking to launchd.
Again, this is just IPC we're talking about. We have get and set for the basic types. That's all we need. For dictionaries, the only thing we needed was the ability to insert, look up, remove, and iterate. Again, we're just talking to launchd; we're not trying to solve all the world's problems. Arrays: get and set at an index, and get the count.
Now, stepping back a bit from the APIs, I'm going to talk about the XML plist. There are only two keys that are actually required in a launchd property list. One is the label, which uniquely identifies the job. The other is what to launch. And by default, if you don't specify the OnDemand key, you will default to being on-demand, because we'd just like to push you to be that way. Because, again, it helps you help us. But the common case is people who include the label, the program arguments, and the OnDemand key with OnDemand set to false, so that they can have their daemon running all the time.
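An always-running job of the common kind just described might look roughly like this plist (the label and program path here are hypothetical, made up for illustration):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
	<key>Label</key>
	<string>com.example.hellod</string>
	<key>ProgramArguments</key>
	<array>
		<string>/usr/local/libexec/hellod</string>
	</array>
	<key>OnDemand</key>
	<false/>
</dict>
</plist>
```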
But if you do have the OnDemand key set to true, we have some optional keys you can use to specify how to launch you. As we talked about earlier, there are the sockets or the timers or watching a file. There are also many other optional keys, as we've talked about earlier. There are even things like the user name and group name if you have a daemon-y type launchd job.
Now, sockets. Sockets are actually a complicated thing to set up, but we tried to have a lot of sane defaults. We default, for example, to the stream type, because that's the common case, but if you need a datagram socket, you can set that up. It can contain a lot of details, like: I want to connect to this machine, and then a file descriptor will be created. Or maybe you want to listen on this IP address with only IPv6 and a datagram socket; well, you can specify that too. And to repeat myself, since I think it's important for this talk, all these configuration details of how to specify your property list are in the launchd.plist man page. Now, once you have your sockets set up, we get to an interesting thing. What we do is we actually take this property list and do kind of a quick transform of it over to launch_data_t, since it maps really well. The only difference, though, is that we take the socket specifications and turn them into launch_data_t file descriptor objects, and that has a very interesting and powerful effect. It makes it so launchd is not aware of socket types at all. It's just an opaque file descriptor.
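A Sockets stanza overriding those defaults, as described above, might look like this fragment (key names are from the launchd.plist man page; the entry name and service number are made up for illustration):

```xml
<key>Sockets</key>
<dict>
	<key>Listeners</key>
	<dict>
		<!-- A service name or port number to listen on (hypothetical) -->
		<key>SockServiceName</key>
		<string>12345</string>
		<!-- Override the stream default with a datagram socket -->
		<key>SockType</key>
		<string>dgram</string>
		<!-- Listen on IPv6 only -->
		<key>SockFamily</key>
		<string>IPv6</string>
	</dict>
</dict>
```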
This future-proofs launchd such that you could have a long-running system, and if you wanted to have a new file descriptor type, as maybe a kernel extension, you could do that: load the kernel extension, create this new socket, and hand it over to the running launchd, and now it can launch your job on demand. It also means, since you can programmatically talk to launchd, you could come up with a new file descriptor type that launchd hasn't heard about and just hand it over and say, hey, when this becomes readable, launch this job. So we're very excited about this particular aspect of launchd and how people might potentially use it.
But other than that, we take the XML plist that's now transformed, and we can do something like a submit-job as an example message. Messages are simply dictionaries with one key, and the value is the message. So the key might be submit job, and the value is the launch_data_t that now represents the XML property list. You can also do something like a remove-job. That's again a dictionary with one key saying remove; the value is the string representing the label. You can do other things like get-jobs. And get-jobs is so simple it doesn't even take an argument, so the way we do that is we just throw a string at launchd saying, hey, this is the message. It's a string.
It says get jobs. If you want a specific job, again, back to the dictionary: hey, here's the key, the value is the label, and you can get a specific job and find out about its attributes. Checking in is how a native launchd job gets its configuration parameters and its file descriptors back from launchd and then goes about its business.
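The check-in just described could be sketched like this (macOS-only `<launch.h>`; error handling trimmed; a sketch, not a definitive implementation):

```c
/* A native launchd job checking in to get its config and sockets back. */
#include <stddef.h>
#include <launch.h>

void check_in(void) {
    launch_data_t req = launch_data_new_string(LAUNCH_KEY_CHECKIN);
    launch_data_t resp = launch_msg(req);
    launch_data_free(req);

    if (resp == NULL)
        return; /* IPC failure; errno has the details */

    /* Pull the Sockets dictionary out of the returned configuration. */
    launch_data_t sockets =
        launch_data_dict_lookup(resp, LAUNCH_JOBKEY_SOCKETS);
    (void)sockets; /* ...look up your keys and demux from here... */

    launch_data_free(resp);
}
```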
There are also things like start and stop. If we rewind a few slides to where I pointed out that a label and program arguments are all that you need: well, if you don't specify the OnDemand key, your job's just going to sit there not running. And if you really want, you can manually poke a job and say, hey, launchd, start that job, and launchd will start it. Or if you want to say stop it, launchd will stop it. But that's primarily meant for testing purposes. If you have an on-demand job and you want to manually start it up to make sure that it works, fine, but your keys and your criteria should ideally launch your job on demand.
Now, to go into an actual programming example of how you would talk to launchd: again, we're going to allocate a dictionary for the message. We're going to insert a string, com.example.helloD, and start job is the key. And here we go: launch_msg(), send the message and get a response. And now we can free the message and the response. It's really that simple to talk to launchd.
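Written out, the steps just walked through look like this (macOS-only `<launch.h>`; the job label is the one from the slide):

```c
/* The start-job example: allocate a dictionary, insert the label
 * under the start-job key, send it, then free message and response. */
#include <launch.h>

void start_hellod(void) {
    /* Allocate a dictionary for the message... */
    launch_data_t msg = launch_data_alloc(LAUNCH_DATA_DICTIONARY);

    /* ...insert the label string with start-job as the key... */
    launch_data_dict_insert(msg,
                            launch_data_new_string("com.example.helloD"),
                            LAUNCH_KEY_STARTJOB);

    /* ...send the message and get a response... */
    launch_data_t resp = launch_msg(msg);

    /* ...and free the message and the response. */
    launch_data_free(msg);
    if (resp != NULL)
        launch_data_free(resp);
}
```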
Now to rehash. launchd: we see this as the future. It means less work for you. You're pre-daemonized from the get-go. If you just want to sleep forever, you can do that; there's nothing else you need to do to become an official daemon as far as we're concerned. If you've gone native with us, just check in and go. That's the only message you really need to send and get back. Other than that, you're done talking to launchd. There's automatic restarting if you're worried about your daemon staying up. Again, a very powerful feature that many people will appreciate.
There are more flexible criteria than ever before to get your job running. We hope that you can take advantage of it to help us help everybody save system resources. We also have the ability to monitor lots of file descriptors if you want. So if you want half a dozen connections scattered around the universe and to be able to launch on demand whenever any of them becomes readable, we can do that. And finally, user agents, like I talked about with the SSH agent: that is a powerful concept that we haven't ever had before with daemons like inetd or init. So let's jump into a demo about this SSH agent and some of the other powerful things launchd can do. So, system two. Yeah. Okay, so just to show you what can happen with the SSH agent, we've dropped a property list describing it into the system. And when we log in now, we have an SSH agent ready to launch on demand. And I'll show you with a quick ps output that it's not running, but when we say ssh-add, it'll get launched on demand.
Let's see, that's readable, right? All right. No SSH agent, but if we do a ps piped to grep ssh... This is the fun of demos. Library/LaunchAgents. Yeah. Apparently it wasn't... oh, I'm sorry. One of the things I haven't done that I was going to demonstrate is how this environment variable gets set. It's actually loaded, as I can show you right now with launchctl list. But the environment variable that ssh-add uses to talk back to the agent isn't set.
And if you remember before, I said that people needed a lot of shell logic to figure out whether the SSH agent is running, or whether to start it. Well, we don't need that anymore. In fact, the only thing you need to do at the top of your shell startup scripts now is say launchctl export.
And what that did, just to run it again and show you, is a quick shell-script-style output of all the environment variables in launchd. And now we've set them in our own local shell. And the important one would be this SSH_AUTH_SOCK that came from launchd. And now we can say ssh-add.
And now the SSH agent is running. And the -l flag is the flag we created to say launch on demand. And just to prove that that is the case, we can say list, there's the key; and if we kill that, and then list, now there are no more identities. And obviously a new agent is running; oops, 371. So yeah, a new agent is running. Now, just to show you what this looked like: this same file can go in your home directory, but I just dumped it in the system-wide one.
This is what we used. Here are the program arguments to launch ssh-agent on demand. As you can see, they're pre-tokenized, which saves everyone a lot of work, because that way we don't need to launch a shell for every daemon we launch; we can launch the process directly. And for the sockets, again, I mentioned earlier that we have a lot of defaults, like SOCK_STREAM. In this case, SecureSocketWithKey implies a Unix datagram socket, and the key is the environment variable we want set. And this is all we need; oh, this and ServiceIPC, which contractually obligates that this guy will do a check-in with launchd. But other than that, this is all we needed to get the SSH agent running. Now let's move on and show some general launchd and launchctl stuff that you can look at on your own Tiger system. We can move to the stuff that Apple provides.
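Reconstructed from the demo, that ssh-agent plist would look roughly like this (the label and program path are hypothetical; the -l flag and the SecureSocketWithKey and ServiceIPC keys are the ones shown on screen):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
	<key>Label</key>
	<string>org.example.ssh-agent</string>
	<key>ProgramArguments</key>
	<array>
		<string>/usr/bin/ssh-agent</string>
		<string>-l</string>
	</array>
	<key>ServiceIPC</key>
	<true/>
	<key>Sockets</key>
	<dict>
		<key>Listeners</key>
		<dict>
			<key>SecureSocketWithKey</key>
			<string>SSH_AUTH_SOCK</string>
		</dict>
	</dict>
</dict>
</plist>
```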
So if we look at the Postfix example, we can see that we launch Postfix; in fact, we specify where exactly the program is with the Program key, but the actual arguments are right here. And what this -e 60 says is that after 60 seconds of nothing to do, Postfix will exit.
And that wasn't even something we needed to add to Postfix; they already had that built in. And again, here are the queue directories. We just watch this directory, and any time something shows up in there, we launch Postfix. Another interesting example, actually using a combination of launchd criteria, is the cron example. Now we... oh, that's right.
We have cron, the thing we're running. And if we notice the lack of the OnDemand key, it defaults to on-demand, so this is an on-demand job. What we did tell launchd to do is try running this at least once. And what that does is, when cron is run at boot, it looks around, particularly at its /etc/crontab, and if it finds that it has no work to do, no user crontabs, no system crontabs, it exits. And then we can take advantage of these launch-on-demand criteria: to watch this file, and to watch this directory to see if it goes non-empty. If either of these criteria fires, we run cron again; it goes out and looks around, finds some work to do, and then sticks around and does its cron thing. But again, if it finds no work to do, it'll exit, and it'll launch on demand the next time this file changes. Another example that I'd like to demonstrate is the... oh, the nmbd, I believe... oh, sorry, smbd. So this one's a little bit interesting. The front part isn't as interesting as the next part. To demonstrate how the Sockets dictionary works, we have two keys here; notice netbios and direct. These keys are up to you, the programmer, to use. You can use these keys to describe whatever you want; ideally, each should be a protocol. In this case, these are two separate protocols. And what happens is, when you check in, you get the Sockets dictionary back, and you can look up these keys and find the file descriptors that actually correspond to that protocol. So this dictionary will end up allocating some file descriptors, probably just one, since we're only doing IPv4. But again, here's the service name. We'll look this up, translate it to a port, bind a file descriptor, and that'll get passed along when this guy checks in. Direct, again: we'll look up a different port; it represents a different protocol, so we stuck it over here, and we bound it to IPv4 specifically.
So that's an example of potentially having one daemon handle multiple protocols using the configuration file syntax.
Are there any other fun examples to show here? Oh, an example of the periodic, calendar-based stuff. We actually have a few extra keys here, which are interesting. One is the low-priority I/O. That's a feature we have in launchd. It tells the kernel, well, it gives a hint to the kernel, that this job is not terribly important as far as its disk I/O is concerned.
So try to make it second tier as far as disk access is concerned. And the Nice key does the same thing for the CPU. Then we have the plain old cron stuff from before: at 3:15 in the morning, let's run this job. And finally, let's see. Oh, I should at least show the simplest example, as I demonstrated earlier.
Here's a job that just... here's the label, here are the program arguments, OnDemand false. And this is how the kernel event agent starts. So I think that's enough plists to show. The only other fun thing to show is that you can use launchctl to talk to launchd. And there's a whole bunch of things you can do. You can say load to load a property list. So, for example, you could say load with the nmbd plist in /System/Library/LaunchDaemons, and that's how you would load a property list into launchd. If it were disabled, which is a key, you can use -w to remove the Disabled key and then load it up. Because at boot, what happens is this whole directory, /System/Library/LaunchDaemons, gets evaluated to decide whether we want to load each job, and if the Disabled key is true, we won't load it; the -w flag removes the Disabled key. Some other fun things you can do: for example, you can list all the jobs, and you can look at launchd's environment; yeah, export is one way of doing that. But if you wanted to, for example, say setenv foo bar, now you can say getenv foo, and that came all the way from launchd. We set it over there, it came back, and you can even unset it. This is useful for any job launching out of launchd, if you want to have something set globally and not have to set it in every plist.
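Put together, the launchctl subcommands mentioned here look like this on a Tiger system (the plist filename is illustrative; the subcommands are the ones named in the talk):

```shell
# Load a job's plist, clearing its Disabled key with -w
launchctl load -w /System/Library/LaunchDaemons/smbd.plist

launchctl list            # list all loaded jobs
launchctl export          # dump launchd's environment as shell commands
launchctl setenv foo bar  # set a variable in launchd's environment
launchctl getenv foo      # read it back from launchd
launchctl unsetenv foo    # and remove it again
```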
You can do things like adjust launchd's limits, if maybe you want to turn core dumps on and not have to do it on a per-job basis. You can use the launchctl command to adjust launchd itself. And let's see... here's what launchd is logging, the following log levels. If you wanted to adjust its logging, you can use the log command: log level debug, and now it's going to log debug messages.
Let's see, help. All these things here are adjusting launchd at kind of a global level, not at a per-job level. You can also... oh, one of my favorite things to demonstrate: if we use the lsof command to look at launchd, you can see that standard out and standard error are /dev/null. But now we can use launchctl to point standard out at /var/log/launchd.log, and do the same thing for standard error. And now if we use lsof to look at the file descriptors, we can see that those files are indeed opened up to that. And now any daemon that launches out of launchd and doesn't manually override its standard out and standard error will inherit this log file, which can be useful sometimes if you just want to get the standard out and standard error of every background process on the system. So I think that's it for the demo. If we can get back to the slides. Thank you. So for more information, we have documents, sample code, and other resources available at the following URL. And you should talk to Craig Keithley. Oh, there's his email address.