Advanced Troubleshooting for System Administrators - WWDC 2006

Information Technologies • 1:03:53

Mac OS X Server has an incredibly rich set of configuration and diagnostics options available. Find out what the optimal preference settings are to help prepare for the unexpected, and discover monitoring techniques that will make troubleshooting easier and more efficient. Learn from those who interact with system logs on a regular basis.

Speaker: Nicole Jacque

Unlisted on Apple Developer site

Transcript

This transcript was generated using Whisper, it has known transcription errors. We are working on an improved version.

Good morning. Welcome to Advanced Troubleshooting for System Administrators. My name is Nicole Jacque, and I work with the AppleCare Enterprise Support Engineering Group. And if you haven't noticed, the title and the description for this session, they're pretty broad. As much as I'd like to, there's really not a way to cover troubleshooting the entire OS X and OS X Server systems in one session. There's not a magic silver bullet that I can tell you that you can go back and fix every problem that you're ever going to have. There's not a magical key combination or hidden checkbox that says, make things work. Sorry. Okay. Feature request.

So what we're going to do today is focus on some areas of troubleshooting that my group sees very commonly from our enterprise customers, and we're going to talk about the tools that you can use to troubleshoot these parts of the operating system and these issues, so that you can either fix these problems yourself, or you can talk more effectively to us so we can fix them.

And even though I said that there's no magic command, no silver bullet, there's one thing that we're going to be talking about throughout the entire session, you're going to see it keep coming up, and that is DNS. You need to have DNS running, working, and you need to have forward and reverse records for all your servers, and the relevant service records for any Active Directory domain controllers that you might have in your environment.

So the areas that we are going to talk about: we're going to talk a little bit just about logging in general, just to make sure we're all on the same page. Then we're going to move on to Directory Services troubleshooting. So this is going to be an opportunity for you to get your questions in, it just will not be in this room. So we're going to start just by talking very briefly about logging.

So since you're all advanced system administrators, you probably know where the logs are. /var/log is where most of your standard Unix-type utilities and processes put their logs. That's where your system log is, that's where your console stuff goes, that's where the secure log is.

/Library/Logs is where a lot of OS X-specific processes put their logs. And then if you have some user-specific logging, that will usually go in the Library/Logs directory in the user's home directory. Now, every process can handle its own logging. It can put a log file in any one of these directories, and it can log at any level of verbosity that it wants to.

Processes may have totally separate ways of turning logging on and off, or of setting different levels of logging. But one thing that a lot of processes have in common is that they use the syslog system to do their logging. And you can actually configure this. The configuration file is /etc/syslog.conf, and it looks something like this.

But let's break that down a little bit. You'll see a lot of these pairs of something dot something. And so the first part of that is called the facility, and that's just a fancy name for the part of the OS that's doing the logging. The second part of that is the level of logging that you want to do.

You have a lot of different levels of logging, all the way from none up through debug, and in this progression, the lower you go down on this chart, the more logging that you get. You can string these pairs along, put them together with semicolons, and then, at the end of the line, you specify where you would like the logging to go. So, in this default syslog configuration, you'll see what goes into the system log. Feel free to change that, to make more information go into the system log, or possibly less. Or you can make your own totally new log, and log whatever you want to it.

You can also use asterisks for wildcards. So in this case, kern.* is going to log every single level for the kernel facility. Even more useful, you can use the asterisk as a wildcard for the facility. So *.debug is going to log messages at the debug level for every facility.

This can be really useful if you have a problem in the OS, and you're really not sure where the problem is. You just want to get as much information out of the system as you can. So you can set *.debug logging, and hopefully whatever process it is that's having the problem logs to syslog, and you'll get some information about it in your system log.
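To make the facility.level syntax a little more concrete, here is a minimal Python sketch that splits one rule into its selector pairs and its destination. It is only an illustration of the syntax described above, not how syslogd itself parses the file, and the rule and the log path in it are hypothetical.

```python
def parse_syslog_line(line):
    """Split one syslog.conf rule into (selectors, action).

    selectors is a list of (facility, level) pairs; action is the
    destination named at the end of the line.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank line or comment
    # The selector field and the destination are separated by whitespace.
    selector_field, action = line.split(None, 1)
    selectors = []
    for pair in selector_field.split(";"):
        facilities, level = pair.rsplit(".", 1)
        # Several facilities may share one level, separated by commas.
        for facility in facilities.split(","):
            selectors.append((facility, level))
    return selectors, action.strip()


# Hypothetical rule: every kernel level, plus debug from every facility.
rule = "kern.*;*.debug\t/var/log/debug.log"
print(parse_syslog_line(rule))
```

Seeing a rule broken into pairs like this can make it easier to spot which facility or level is missing when a message never reaches the log you expected.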

So this is basically just to get us all on the same page. At various points in the presentation, I'm going to refer back to this. When I tell you you're going to want to configure syslog, this is what I'm talking about. And now it's time to move on to Directory Services. So like I said, we're going to talk very widely about Directory Services, client, server. And one thing, just to give me sort of an idea, how many people in the audience use Open Directory? Okay, quite a few. How many people use Active Directory?

How many people use something else? OK. Just out of even more curiosity, how many people use more than one directory system at a time? Okay, good to know. So the very first thing that I get asked when people start trying to troubleshoot something with directory services is they'll say, "You know, I was somewhere, I was in Workgroup Manager or something, and I got this error message, and it was something like -14002. So what is that?" Well, you can actually look up all those error codes. They're in the man page for DirectoryService. That has a very brief description of them. There's a little bit more information in the Open Directory reference on the developer site.

So any time you get these errors, you can look them up. It might give you some more information. And we'll be talking a little bit more about some of these errors throughout the presentation. So the other basic information about directory services that you need to know is, well, where do all the files live? /Library/Preferences/DirectoryService is where the majority of the configuration files for Directory Service go. If you configure Directory Access, all those settings are stored here.

If you are using Open Directory, you're using a password server, and so there are some additional config files that are kept in /var/db/authserver. If you're using Kerberos, which could mean you're using Open Directory or Active Directory, or just some other Kerberos server, then you have an edu.mit.Kerberos file, which is your basic configuration for Kerberos. And if you're a server, you may also have a keytab file, /etc/krb5.keytab. That's where your service principals live.

So these are files that you can go and look at if you're curious as to how they're stored. A few of the files have a few options in them that aren't exposed in the GUI, so you can look at and change things there. Another reason people want to know where these are, though, is just that at some point, they need to start fresh with directory services.

And so removing these files is what you need to do to get back to a clean slate as far as directory services is concerned. Once you remove the files, you either need to reboot the system, or you can just restart directory services, and to do that, run sudo killall DirectoryService. launchd actually takes care of restarting it, so basically you just need to kill DirectoryService.

The next thing people want to do is get some information out of Directory Service. By default, it doesn't really log a whole lot of information. But you can actually turn on debug logging by sending the USR1 signal to DirectoryService with the killall command. This is essentially an on/off switch. Doing it once turns it on. Doing it again turns it off. If you restart, it will be back off.

If you want the debug logging to start automatically at startup, you need to create a file in the /Library/Preferences/DirectoryService folder called .DSLogAPIAtStart. You can use touch to do it, or whatever method you like. And the log file is going to go to /Library/Logs/DirectoryService, and it's going to be called DirectoryService.debug.log.

So, great, you turn on debugging, and you're going to get something that looks like this. Which is really exciting, and everybody looks at it and says, oh yeah, that totally solves my problem. So we're going to actually break this down a little bit more for you. Basically, what Directory Services spends most of its time doing is it gets questions and it answers them. So those are the two main things you're going to find in the debug log. So for a query, you'll see a timestamp.

You'll see the client or the process that called Directory Service, and you'll see that client's process ID. You'll see the actual call that's made to directory service, the plugin that's being called. In this case, it's LDAPv3, but it could be the Active Directory plugin, the NetInfo plugin, whatever plugin. And then you'll see the query itself. In this case, it's querying for John Smith with an exact match, and it has to be a user record.

So, directory service gets the query, and it answers. You'll get most of the same information here, except that instead of seeing the query, you'll see whether or not it found any results. So you'll either get number of found records, zero, or you'll get number of found records, however many records it found.

Now, the different plugins for Directory Service can choose to log even more information than that. One really good example is the Active Directory plugin, which actually does an awful lot of logging, especially during the process of binding to Active Directory. And it's actually pretty human readable. So in this case, you can see that it's trying to talk to some servers, and the servers aren't responding, they're missing some of their service records, and so it's picking different servers.

And here it's telling you that the network isn't reachable. So it will tell you a lot of information. If you are having a problem with the Active Directory plugin, particularly with binding, this is the first place to go. If you escalate a case, either through the bug reporter, through the developer channel, or to AppleCare, this is probably going to be the first thing we're going to ask you for.

So now that we've looked at that, you've configured your directory services, and you want to know, are they working? A really easy way to see if they're working is just to try and resolve a username. You can use the id command for this, and you can see in this case, DogCow exists.
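The same kind of lookup can be scripted. Here is a small Python sketch using the standard pwd module, on the assumption that the system's user lookups are routed through the configured directory services, as they are for id. The second username is a made-up example that should not exist.

```python
import pwd


def resolve_user(name):
    """Return (uid, gid, home) for name, or None if it cannot be resolved."""
    try:
        entry = pwd.getpwnam(name)
    except KeyError:
        # The system could not find this user in any source it searched.
        return None
    return entry.pw_uid, entry.pw_gid, entry.pw_dir


# "root" exists on any Unix system; the second name is hypothetical.
for candidate in ("root", "no-such-user-example"):
    print(candidate, resolve_user(candidate))
```

A None result for a user you know is in the directory is the scripted equivalent of id reporting no such user.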

Unfortunately, it looks like CyberDog does not. I think we have one CyberDog fan here. Any other CyberDog fans? All right, give a hand for CyberDog. So another way that you might find that you have no connectivity to your directory service, that you're not able to look up users, is you might go into something like Workgroup Manager, and instead of showing you a nice list of your users and groups and such, it gives you a nasty-looking error, which usually has an error number somewhere in the low -14000 range, like -14002, -14008, and so on. So the first thing that you might need to troubleshoot is, why am I not talking to the server?

You need to know what server you're talking to first. So if you're using Open Directory, you may have an Open Directory master and maybe a bunch of replicas. You can find out what servers Directory Service knows about by looking at the DSLDAPv3PlugInConfig.plist file. I know that's a mouthful. That's in your /Library/Preferences/DirectoryService folder. And you'll see a list of all the replicas in there.

Similarly, for Active Directory, you'll see a set of lists for every domain. You will have at least three different lists: one list of LDAP servers, one list of Kerberos servers, and one list of kpasswd servers. They may all be the same servers in those lists, but you'll have essentially three different lists. You may also have Global Catalog server lists in this file, and this is the ActiveDirectory.plist file, again in /Library/Preferences/DirectoryService.

So once you've figured out what servers you're talking to, the very first thing that I would like all of you to do before you go any further is check your DNS. You would not believe how many cases we get. I would say probably 90-some percent of the cases that we see, especially with the Active Directory plugin, involve a DNS problem. So before you go any further, check your DNS.
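As one way to check both directions at once, here is a rough Python sketch that resolves a name forward, resolves the resulting address back, and reports any mismatch. The resolver functions are passed in as parameters purely so the logic can be exercised without a live DNS server; the host names shown are placeholders.

```python
import socket


def check_dns(hostname, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return a list of problems found with forward and reverse DNS."""
    try:
        address = forward(hostname)
    except OSError as err:
        return ["no forward record for %s (%s)" % (hostname, err)]
    try:
        reverse_name = reverse(address)[0]
    except OSError as err:
        return ["no reverse record for %s (%s)" % (address, err)]
    problems = []
    # Compare case-insensitively and ignore a trailing dot.
    if reverse_name.rstrip(".").lower() != hostname.rstrip(".").lower():
        problems.append("%s resolves to %s, which maps back to %s"
                        % (hostname, address, reverse_name))
    return problems


# Example with stand-in resolvers (placeholder names and address):
print(check_dns("server.example.com",
                forward=lambda name: "10.0.0.5",
                reverse=lambda addr: ("other.example.com", [], [addr])))
```

Called with only a hostname, the function uses the system resolver, so an empty list means the forward and reverse records agree from that machine's point of view.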

The next thing that you might want to do is make sure that you can even get a connection between where your client is and where the server is. So for that, you can just do a telnet on port 389, because that's the standard LDAP port, and see if you can get connected.

So if you get something like this, the port is open and you can reach it. If you're not able to get that, that means something like the server's down, there's a firewall between you and your server, or the network is down between you and the server. You have some sort of connectivity issue to look at before you can get directory services working with that particular server.
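If you would rather script that connectivity test than run telnet by hand, a minimal Python sketch looks like this. The host name in the comment is a placeholder, and port 389 is the standard LDAP port mentioned above.

```python
import socket


def port_reachable(host, port=389, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable network, or name lookup failure.
        return False


# Hypothetical usage:
#   port_reachable("od.example.com")        # LDAP
#   port_reachable("od.example.com", 636)   # LDAP over SSL
```

Like telnet, this only tells you that something is listening on the port; it does not tell you that the LDAP server behind it is answering queries correctly.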

But assuming that you are able to successfully get a telnet connection here, you could try doing an LDAP query. You can use ldapsearch. We're going to talk more about ldapsearch later. But you can just do a basic query. What's important here, though, is that when you run ldapsearch, to make it a good test, use the same information that you used in Directory Access.

So use the same host name. If you're using host names in Directory Access, don't put the IP number here, because maybe you didn't fix your DNS. And use the same search base that you find configured in Directory Access, because a lot of times the problem is simply that the search base is messed up.
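Here is a small Python sketch that assembles such a query from the same host name and search base you would have entered in Directory Access. The host, search base, and filter are hypothetical; the flags are the standard OpenLDAP ldapsearch options for a simple bind (-x), server URI (-H), search base (-b), bind DN (-D), and password prompt (-W).

```python
import shlex


def ldapsearch_command(host, search_base, ldap_filter="(objectClass=*)", bind_dn=None):
    """Build an ldapsearch argument list from Directory Access style settings."""
    argv = ["ldapsearch", "-x", "-H", "ldap://%s" % host, "-b", search_base]
    if bind_dn:
        # Bind as a specific identity and prompt for its password.
        argv += ["-D", bind_dn, "-W"]
    argv.append(ldap_filter)
    return argv


# Hypothetical values; substitute what Directory Access actually shows.
cmd = ldapsearch_command("od.example.com", "dc=od,dc=example,dc=com", "(uid=dogcow)")
print(" ".join(shlex.quote(part) for part in cmd))
```

Building the command from one place makes it harder to accidentally test with an IP number or a search base that differs from what the client is really configured with.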

So once you've got Directory Service working, it's happy, it's resolving users, the next thing you can do is look in a little bit more detail at those users. You're all familiar with Workgroup Manager, I would hope. But from the command line, there's a utility called dscl. Now, one thing we're not going to talk about is lookupd. If you were all in the Open Directory session, you might have heard NetInfo and lookupd are going away. Don't be too sad.

So if you have not been using dscl and you've been using lookupd, now would be a really good time to start using dscl. Check it out, it's got a really good man page. And there's also information in the command line administration manual. But if you're more of a GUI person, there's the Inspector mode in Workgroup Manager. You can turn that on through preferences.

Here's a problem that we've been seeing quite a bit recently. You're using Active Directory, you're looking at Active Directory users and groups, and you know you just configured a home directory for them, and you've put them in some specific groups. And the classic symptom is that you go to the client, they try to log in, and they don't get their home directory, or maybe they aren't showing up as members of certain groups, so they're not getting their managed client settings based on those groups.

And yet you look in Active Directory Users and Computers, and it's there. And you look in dscl or Inspector mode in Workgroup Manager, and you're not seeing it. Well, the important thing to remember is that LDAP, just like any other server, has credentials that you need to use to access it, and it gives you different rights based on the credentials that you use. So the question is, what credentials is Directory Service using? That's going to depend on what plug-in you're using and how you're using it. If you're using the LDAP plug-in, by default it's anonymous access.

Now, if you're using a third-party server, something other than Open Directory, you very likely had to specify a particular user account in the security tab because your admin didn't want to allow anonymous access. So you may have that. If you're using the authenticated binding feature of Open Directory, then that actually creates a computer account in Open Directory, and it authenticates as that account. And if you're using the Active Directory plugin, that again is going to create a computer account in Active Directory and use that.

So this is a direction that OS X is moving more and more towards, you know, away from anonymous binding into more secure methods. So this is something that you're going to start seeing more coming in, for example, Leopard. But right now, this is primarily something that you Active Directory users are going to see.

How do you tell? Is this really a permissions issue? And if it is, what am I actually seeing? And if I'm not the person who has admin access to my Active Directory, how do I explain this to my Active Directory admin so that they can fix it for me? So we can use our old friend ldapsearch, and we can authenticate using a username, such as an admin username or just your regular account username.

And normally you would specify this in the regular distinguished name LDAP syntax. Well, that can be kind of tedious, especially in Active Directory where maybe they've gone kind of crazy with the nested OUs and you're down about five levels. So you can actually use a little bit shorter syntax and just say username@domain.

So once you do that, you'll get a listing of what you can see as the admin user. So that's great, but now what you need to find out is, what are you seeing as the computer account? Well, if you're going to connect as the computer account, you need to know the username, which is pretty easy, that's the computer name.

But how do you get the computer account's password? It turns out that in the Active Directory plist file, you'll find a key called AD Computer Password. And you'll see this data section. It looks kind of weird. Maybe it's the password, but you try using that, and no, it's not the password.

Well, it turns out that that's actually Base64 encoded. So you can decode it using the openssl utility, and you'll get the actual password that you can then plug in to an ldapsearch command. One thing to remember here is that you have to specify the computer account name.
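The decoding step itself is ordinary Base64, so it can be done with any Base64 tool. As an illustration, here is a short Python sketch using the standard base64 module. The sample value is generated from a placeholder string in the code; it is not a real computer password.

```python
import base64


def decode_plist_data(encoded):
    """Decode the text of a plist <data> element into raw bytes."""
    # <data> values are often wrapped and indented, so drop all whitespace.
    compact = "".join(encoded.split())
    return base64.b64decode(compact)


# Placeholder value standing in for what you would copy out of the plist.
sample = base64.b64encode(b"placeholder-password").decode("ascii")
print(sample)
print(decode_plist_data(sample))
```

Treat the decoded value as sensitive: anyone who has it can bind to the directory as that computer.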

And unfortunately, Active Directory will only allow distinguished name syntax for computer accounts. So you will actually have to use the CN= and so on syntax here. By the way, this is also a good way of troubleshooting cases where you have a machine that's been working with Active Directory, and suddenly it's not.

And one thing that might have happened is it's possible that somehow the password in Active Directory has been changed for the computer account. Either by some other computer that says it's that computer, or for example as we're going to talk about on the server side, by Samba. So you can use this technique to just test and make sure that you have a valid computer name and password for your computer account.

So once you've got connectivity to your directory server, and you've got lookups of users and groups and everything working, the next thing that everybody wants to be able to do is actually authenticate users. Apple provides a utility called dirt, which just stands for Directory Tool, and you can use this to test standard authentication and NTLM authentication. You need to supply -m and then the plugin/node. Keep in mind that if there is a space in the name of the plugin or the node, such as Active Directory or All Domains, you need to use quotes.

Then you need to supply the username and the password, and if you want to test NTLM authentication, it's -ANT. I also want to point out that there is a man page for dirt. It does say in it, though, that you can omit the password and it will prompt you for it. That actually is not the case. A bug is filed.

So if you want to test Kerberos authentication, on the other hand, we have the standard Kerberos utilities. kinit lets you test getting a ticket, or just get a ticket. klist lists any tickets that you might already have. kpasswd lets you change your password over the Kerberos protocol. These are great, but some of you are more GUI inclined, or maybe your users are more GUI inclined. So there's actually a nice little GUI utility that's often overlooked because it is in /System/Library/CoreServices. It's called Kerberos.app. And for some reason, people don't go there to look for applications very often. So this may be a good option for you.

If authentication isn't working, there are some config files that you may want to take a look at. If you're using Open Directory and password server, you're going to look in /var/db/authserver, and you're going to have authserverreplicas.local and authserverreplicas.remote. And if we take a look at one of these files, it looks pretty interesting, doesn't it?

A lot of As. So, turns out that that's not just a bunch of gibberish. That is actually Base64 encoded again. So if you look at the very first part of that data section, that is actually the IP number of your password server, or at least it ought to be.

So you can actually decode this, again with the OpenSSL command, and check and see whether or not that's actually the case. If it's not the case, you can delete this file, it'll get regenerated, but you may find that it still has the wrong IP number. And that is because you probably have an out-of-date config record on your server, and we'll talk about how and where to change that when we talk about Open Directory Server.
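Here is a rough Python sketch of that check. It rests on an assumption that is not confirmed here: that the decoded data starts with the address as plain text and is padded with zero bytes, which would also explain the long runs of "A" in the encoded form. The blob below is fabricated for illustration and is not taken from a real replica file.

```python
import base64


def leading_text(encoded):
    """Decode Base64 data and return the text before the first zero byte.

    Assumption: the address is stored as ASCII text at the start of the
    data, followed by zero padding.
    """
    raw = base64.b64decode("".join(encoded.split()))
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")


# Fabricated example: an address followed by zero padding.
fake_blob = base64.b64encode(b"10.0.0.5" + b"\x00" * 40).decode("ascii")
print(fake_blob)                # zero bytes encode as runs of "A"
print(leading_text(fake_blob))
```

If the text that comes out is not the address of your password server, that matches the out-of-date record situation described above.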

For Kerberos, you have the edu.mit.Kerberos file in /Library/Preferences. What's important here is that you should have a default realm. It's really important, especially if you're mixing Open Directory and Active Directory. In that case, the default realm is going to need to be the Active Directory realm. Otherwise, if you have multiple realms, make sure that the default realm is actually the domain or the realm that you want to be using primarily. Then, you should also have a realm section for every Kerberos realm that you're using. If you're using Active Directory, you may have lots of these, because you'll have one for every domain in your forest. If you're using Open Directory, you'll generally just have the one.

So, if you've done all this and it's still not working, and you're looking to take it to the next step to try and figure out what's going on, you can always go to tcpdump and see what's happening on the wire. There's lots of stuff going on on the wire, so really all you need to look at is Kerberos, which is port 88, LDAP, which is port 389, or if you're doing it over SSL, port 636, DNS, which is port 53, and if you're using Active Directory, the Active Directory global catalog, which is port 3268.

There's a little bit of information on how to use tcpdump on our developer site. It's developer article QA1176, Getting a Packet Trace. So check that out.
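To narrow a capture to just that traffic, you can hand tcpdump a filter expression built from those ports. Here is a small Python sketch that assembles one. The interface name and output path in the comment are placeholders, and -i, -s 0, and -w are the usual tcpdump options for interface, full-packet capture, and writing to a file.

```python
# Ports named in the session for directory-related traffic.
PORTS = {
    "Kerberos": 88,
    "LDAP": 389,
    "LDAP over SSL": 636,
    "DNS": 53,
    "AD global catalog": 3268,
}


def capture_filter(ports=PORTS):
    """Build a tcpdump filter expression matching any of the given ports."""
    return " or ".join("port %d" % p for p in sorted(set(ports.values())))


def tcpdump_command(interface, outfile, ports=PORTS):
    """Build a tcpdump argument list that saves matching packets to a file."""
    return ["tcpdump", "-i", interface, "-s", "0", "-w", outfile,
            capture_filter(ports)]


# Placeholder interface and path:
#   tcpdump_command("en0", "/tmp/directory.pcap")
print(capture_filter())
```

Drop the entries you do not use, for example the global catalog port if there is no Active Directory in your environment, to keep the trace smaller.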

We're now going to move away from talking about some just general directory service stuff, and we're going to talk about login troubleshooting. Because there's a bunch of stuff in login troubleshooting that's not strictly speaking directory services. It is stuff like getting the home directory mounted, managed client, portable home directory syncing. And so we're going to talk about that in this section.

So the first thing we're going to talk about is just getting that home directory mounted. And how many people in here have had a problem getting home directories mounted before with their clients? Yeah, pretty popular problem. So, just take a guess. What do you think is the first thing that I'm going to have you do?

You're all brilliant. So yeah, check your DNS, because your home directories are going to be trying to mount your home, or your client is going to be trying to mount your home directories based on the DNS record, and that's going to require both a forward record and a reverse record. So that's the first thing you want to do. But since you're all advanced system administrators, I know that's what you'll go and do.

And so the next thing you're going to want to look at is just, can you actually connect to that user's home directory manually, like doing Connect to Server on the client? It's a really simple thing, but that will catch a lot of stupid mistakes that I know never actually happen, things like the service not running on the file server, or permissions being changed on the file server.

Next, what you want to do is actually check the user record and the mount records, if there are any. In this case, we're talking more about Open Directory. You want to check that HomeDirectory and NFSHomeDirectory have the right domain name, and that they match the mount record. So what that should look like is you should have a HomeDirectory attribute, which should be an XML sort of string. And that should have, in the URL, the DNS name of your server.

You should have the same DNS name in your NFSHomeDirectory. If you have multiple DNS names for a server, don't go using one in one place and one in the other. Make sure they're both the same. It's also important that you don't have, for example, the DNS name in one of these and the IP number in the other, because it won't be able to match them up.
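As a sketch of that comparison, the following Python pulls the server name out of each attribute and checks that they agree. The attribute values are hypothetical examples in the usual shape of these attributes, an XML string containing a URL and a path under /Network/Servers; check what your own user records actually contain before relying on it.

```python
import re
from urllib.parse import urlparse


def home_directory_host(home_directory_xml):
    """Extract the server name from the <url> element of HomeDirectory."""
    match = re.search(r"<url>\s*(.*?)\s*</url>", home_directory_xml)
    return urlparse(match.group(1)).hostname if match else None


def nfs_home_host(nfs_home_directory):
    """Extract the server name from a /Network/Servers/<name>/... path."""
    parts = nfs_home_directory.strip("/").split("/")
    if len(parts) >= 3 and parts[0] == "Network" and parts[1] == "Servers":
        return parts[2]
    return None


def hosts_match(home_directory_xml, nfs_home_directory):
    """True only if both attributes name the same server."""
    a = home_directory_host(home_directory_xml)
    b = nfs_home_host(nfs_home_directory)
    return a is not None and b is not None and a.lower() == b.lower()


# Hypothetical attribute values for a user named dogcow.
home = "<home_dir><url>afp://server.example.com/Users</url><path>dogcow</path></home_dir>"
print(hosts_match(home, "/Network/Servers/server.example.com/Users/dogcow"))
print(hosts_match(home, "/Network/Servers/10.0.0.5/Users/dogcow"))
```

The second call shows the DNS name versus IP number mismatch described above: both values point at the same machine, but the strings do not match.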

So then these need to match the actual mount record. Again, you'll have the DNS name in a couple of places in the mount record. So if you check all of that and everything looks good, the next thing you need to do is troubleshoot everybody's favorite process: automount.

So one thing you can do is actually run automount manually. You can supply -m /automount/Servers and -mnt /private/var/automount/Network/Servers to give it the correct paths to match up with where the home directories are going to get mounted. And then you use the -d flag to actually get debug information.

Another thing you can do, if you go back to the syslog things we were doing before, is set *.debug, because automount logs to syslog. So you'll actually get a whole lot of debug information in the system log, or in some other log if you choose one. Another thing you can do is look at what options are being passed to automount when it starts up.

And it may be the case that changing those options may be helpful. There's a KBase article 303841, Resolving Login Issues with the Active Directory Plugin. Despite the name, this actually can apply to Open Directory as well. So what this will help you do is change a couple of options in there that may be causing your issue.

[Transcript missing]

But if you look in these network folders, you'll find the hierarchy. You'll have Network, Servers, the server name, the share point. And then, in the case where something has gone wrong, you'll find maybe a few user home directories. And they're not on the server. They're on the local client.

That's probably not a good thing. And if that's the case, you probably want to save the user's files rather than just delete them. Just an idea. But then you want to move or delete these files out of there, because if you get files in the place where the links are supposed to be created, from that point on, automount is never going to be able to create the link. So your home directories will never work after that point.

So look for what we call bogus directories in any of these locations. And if you were in the network file systems discussion, you heard that autofs is going to take over all of the duties of the automount process for Leopard. This will actually give us a bunch of new functionality and so on. So look for some exciting things in Leopard.

So, you get past all of that, you think your home directories are working well, and the users, you know, they're getting past the login window, but they get the lovely, soothing blue screen, and that's as far as they go. Or the desktop starts loading and it never quite finishes. So what you want to do is SSH in, because you don't have control of the machine from the GUI, and try top. See what processes are using a lot of CPU. See if they have the hung flag. See if you see processes running that shouldn't be running.

Another thing that can cause this kind of a symptom is just bad preferences in the user's home directory. So move aside their Library/Preferences folder and have them try to log in again. If that solves your problem, it's probably bad preferences, and you can actually use plutil with the -lint option to check for corrupt plist-type preferences.
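If you want to sweep a whole preferences folder at once, here is a minimal Python sketch that tries to parse every .plist file with the standard plistlib module and lists the ones that fail. It is a stand-in for running plutil -lint on each file, not a replacement for it, and the folder in the comment is just an example path.

```python
import os
import plistlib


def find_unreadable_plists(folder):
    """Return (filename, error) pairs for .plist files that fail to parse."""
    bad = []
    for name in sorted(os.listdir(folder)):
        if not name.endswith(".plist"):
            continue
        path = os.path.join(folder, name)
        try:
            with open(path, "rb") as handle:
                plistlib.load(handle)
        except Exception as err:  # plistlib raises several exception types
            bad.append((name, str(err)))
    return bad


# Example path for a user named dogcow:
#   find_unreadable_plists("/Users/dogcow/Library/Preferences")
```

Moving aside only the files this reports, rather than the whole folder, lets the user keep the preferences that are still good.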

If you think your problem is actually something with managed client, for example, your mobile accounts don't seem to be getting created right, or you're not getting the correct managed client preferences enforced, you can use some utilities that we give you to help debug this. MCXCacher is one, and that's kind of hidden away in /System/Library/CoreServices/mcxd.app/Contents/Resources/MCXCacher.

And in all the little commands here, I did not put that full path on all of them, but because that's not going to be in your normal path, you will want to include, you'll have to include all of that for it to work. I just didn't want to do it because it kind of took up the entire slide.

So if you want to test creating a mobile account, you can run MCXCacher -U username, and then you can use the -h option to supply the location where you want the home directory to be. So if you're going to have a home directory in a non-standard location, this is how you could specify it.

Also with MCXCacher, you can use the -f option, which essentially just makes your client unmanaged. And you can also force a refresh of the cache at the next login from the client side by using the -d option to dirty the cache. Another way of doing this is to delete any MCX cache objects that you might find in the local NetInfo database. You can also look at these cache objects just to see what MCX settings they have in them and see if they look correct, or maybe they're just totally out of date.

So, if this doesn't do it for you, you can actually get a bunch of information from the mcxd process by setting the debug output option using the defaults command. You want to say defaults write /Library/Preferences/com.apple.MCXDebug debugOutput, and then choose a level of logging from 0, which is none, to 3, which is the most verbose. All of this information is going to go to your system log.

The next thing, and this is something that was new in Tiger, is if you're doing portable home directories. So this is where you have the local home directory, the network home directory, and you're syncing between the two of them. So sometimes the sync database can just be corrupt. So in the case where either things are not syncing or things seem to be syncing incorrectly, you may just want to reset that database. So if you are syncing the library folder of the user's home directory, you can actually do this really easily.

Just hold down Option-Shift when you log in. If you're not doing that, then you're going to want to move aside the Library/Mirrors folder in the user's home directory. One thing to be aware of, though, is that this also holds the sync databases for .Mac. So you will be resetting that as well.

And then, like MCXD, you can actually get debugging out of MirrorAgent, which is the process that handles the syncing of the network home directory and the local home directory. And so again, that's a defaults command to com.apple.MirrorAgent. This has a slightly different range of logging options, 0 to 4. 4, though, is really, really verbose. I would say 3 is probably where you're going to want to start. Logs for this will actually go into the user's home directory in Library/Logs/MirrorAgent.log.

So now that we've talked about the client side of things, we're going to switch gears and talk about the server side of things. So, Open Directory is really just three servers all tied and integrated together. First, you've got an LDAP server, and that's basically slapd, which is your standard OpenLDAP server.

Then you've got the password server, which is basically Apple's mechanism for dealing with all different types of passwords except for Kerberos. And then to deal with Kerberos, we have your standard MIT KDC. The problems that you have with Open Directory Server pretty much all come down to one thing. And what do you think it might be?

You guys are catching on real fast. So, the very first thing is you want to check your DNS. And the thing to be aware of is that even if DNS is working now, was DNS working when you set up your server? Or at some other point when you made some other major change?

And so DNS is so important that we've actually made some changes in 10.4.6 for this. But the important thing here is after you've checked to make sure your DNS forward and reverse records are correct, check the output of hostname on the server. Make sure it's correct. Make sure it's your DNS hostname, and not, for example, your Bonjour hostname, or localhost, or some out-of-date DNS name. If it's not correct, you're going to want to fix that with scutil, and it's the scutil --set HostName command.
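As a rough illustration of that hostname check, here is a minimal Python sketch that sorts the output of hostname into the cases mentioned above. The function name and the labels it returns are hypothetical, and treating any name ending in .local as a Bonjour name is an assumption. It can't tell you whether a DNS name is out of date; for that you still have to compare against your forward and reverse records.

```python
def classify_hostname(name: str) -> str:
    """Roughly classify a hostname string (hypothetical helper, not an Apple tool)."""
    host = name.strip().rstrip(".").lower()
    if host in ("", "localhost") or host.startswith("localhost."):
        return "localhost"   # not usable for an Open Directory server
    if host.endswith(".local"):
        return "bonjour"     # assumption: .local means a Bonjour name
    if "." not in host:
        return "short"       # no domain part, so not fully qualified
    return "fqdn"            # looks like a fully qualified DNS name


if __name__ == "__main__":
    import socket
    # socket.gethostname() reports the name this machine currently uses.
    print(classify_hostname(socket.gethostname()))
```

Anything other than "fqdn" here would be a reason to go back and look at DNS and the hostname setting before troubleshooting further.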

And like I said, as of 10.4.6, we've made some changes. And so we've made the server a little bit smarter about resolving its host name. We also have made some changes so that if it appears that there is a problem with your host name, we will start putting log messages in your system logs saying, "You got a mismatch. Can you please fix it?"

What it should be is the FQDN, the fully qualified domain name. So there's a KBase article that details all of these changes. It's KBase Article 303697, Mac OS X Server 10.4.6, Changes in Server Host Name Discovery.

So, now let's look at each part of the server individually. For slapd, you've got your standard OpenLDAP config file in /etc/openldap. You've got your logs in /var/log/slapd.log. And you've got the actual database and transaction logs in /var/db/openldap/openldap-data.

One thing that you should be aware of is in addition to just the database for OpenLDAP in here, you also have all the transaction logs that are used by the Berkeley DB back end, which is what we use for our slapd database. These transaction logs, they're all 10 megs each.

And if you're making a lot of changes, you may get quite an accumulation of these transaction logs. And you may not actually need all of them. So if you're looking for, where has all my space gone? You can actually use the db_archive command to get a list of which ones are not needed. And if you supply the -d option to it, it will take care of deleting those transaction logs for you.
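If you want a quick sense of how much space those transaction logs are taking before you run db_archive, something like the following Python sketch would do it. It assumes the Berkeley DB log files follow the usual log.NNNNNNNNNN naming (log. followed by digits); the function name is hypothetical, and the script only reads file sizes, it never deletes anything.

```python
import os
from typing import Tuple


def transaction_log_usage(db_dir: str) -> Tuple[int, int]:
    """Return (number of transaction log files, total bytes) in db_dir.

    Assumption: transaction logs are regular files named "log." plus digits.
    """
    count = 0
    total = 0
    for entry in os.scandir(db_dir):
        suffix = entry.name[4:]
        if entry.is_file() and entry.name.startswith("log.") and suffix.isdigit():
            count += 1
            total += entry.stat().st_size
    return count, total
```

For example, calling transaction_log_usage on the OpenLDAP data directory (you would need read access to it) gives you a count and a byte total that you can compare against your free disk space.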

So digging deeper with slapd, we have the slapconfig command. This is actually what's used to set up your Open Directory server. The -enableslapdlog option will turn on logging to the slapd log. The -backupdb option is actually the command line equivalent of the archive of the Open Directory server that you see in Server Admin.

So if you're looking for a way to script a backup of the Open Directory databases, this is what you want to look at. Then you also have some standard LDAP utilities. slapcat basically will just export everything in the LDAP server out to an LDIF. slapadd just lets you add some records. And ldapsearch we've already talked about, that's just a way for you to directly query the LDAP server.

So if you need even more troubleshooting, because slapd is maybe hanging or dying on you, and you want to see what it is that's causing that, you can actually run slapd in debug mode. So what you want to do is stop slapd and restart it with the -d flag.

Problem is, if you try and just do that, launchd is going to be very vigilant and just restart it for you before you can turn it on with debugging. So before you try to turn it on with the -d flag, you're actually going to want to unload it from launchd using launchctl, so that launchd isn't sitting there trying to restart it on you.

When you're all done troubleshooting, you can have launchd load it back into its watched list just by changing the unload to load. So again, you can use the -d option. There's a lot of different levels of debugging. Unfortunately, they're rather complicated, and I couldn't fit them all in the slide. So check the slapd man page. It gives you a very nice, detailed explanation of what they are.

Moving on to Password Server. The database is stored in /var/db/authserver/authservermain. There's a config file in /Library/Preferences/com.apple.passwordserver.plist, and then there's also the /var/db/authserver/authserverreplicas file that we were talking about, which is the same as what you would see on the client.

The biggest issue that we see with password server is a configuration one, because there's really not a whole lot that you can change or configure specifically just with password server. You don't really interact very directly with password server. The number one issue we see is, surprise, surprise, a DNS issue. And what happens is, at some point, DNS has not been working quite right. And so in your com.apple.passwordserver.plist, you have, in the SASL realm, the wrong host name. Maybe you have localhost.

And usually where this happens is if you have a master and multiple replicas, and some of the replicas have the wrong name. This should actually be the DNS name of your Open Directory master. So all the replicas should have the same master listed there. And some of them, instead of having the DNS name, have something like localhost. So this is something that you would want to fix.
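If you have a lot of replicas, a comparison like this is easy to script once you've collected the realm value from each server. Here is a minimal Python sketch; it assumes you have already gathered the values into a dictionary by hand or by some other means, and the function name and the example server names are hypothetical.

```python
from typing import Dict


def find_bad_realms(realms_by_server: Dict[str, str], master_dns_name: str) -> Dict[str, str]:
    """Return the servers whose realm value is not the master's DNS name.

    realms_by_server maps a server label to the realm value read from
    that server's password server config (collected separately).
    """
    expected = master_dns_name.strip().lower()
    return {
        server: realm
        for server, realm in realms_by_server.items()
        if realm.strip().lower() != expected
    }


# Example with made-up names: the second replica has localhost.
collected = {
    "master": "od1.example.com",
    "replica1": "od1.example.com",
    "replica2": "localhost",
}
print(find_bad_realms(collected, "od1.example.com"))
```

The comparison is case-insensitive because DNS names are; any server that shows up in the result is one to go and fix.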

There's a few other utilities that you can use that interact with password server. mkpassdb, very powerful, lets you look at all the slots in password server, see all the password server users. It can help you figure out, for example, if you have users that aren't authenticating, and especially if you've maybe just done a big import, did they actually get a password set for them at all? Are they in password server at all?

So you can look and see if there's a list of them in mkpassdb. You can also do things like set passwords from there. This is the tool that you would use if you somehow lost your diradmin password. But in general, mkpassdb, NeST, which lets you set some configuration options for password server, and pwpolicy, which lets you set password server policy options from the command line, are things that generally you don't need to use. And if you're going to try something with them, you can either maybe magically fix your server, or completely break your server. So before you do anything, especially with something like mkpassdb or NeST, I highly recommend you back up your Open Directory.

And then finally, the Kerberos server has its own separate database, in /var/db/krb5kdc. Even though it's the same users and the same passwords, it's two separate databases. So one thing to be aware of is you can actually have passwords that somehow get out of sync. So you can have the case where Kerberos authentication works and regular authentication doesn't, or the other way around.

If that's the case, you need to go back and reset the password in whichever one has the wrong password. You have some standard Kerberos utilities, kadmin, kadmin.local, KT List. These are what you could use to, for example, reset the Kerberos password. But in general, you don't really need to use these very much. sso_util you can use to set up Kerberos single sign-on for services.

And last but not least, there are some configuration records in Open Directory that are used by clients to set up their config files, particularly for things like password server. The password server record stores the IP number of the password server. Now, if you have the case where you have the wrong IP number showing up on the clients in their password server config file, this is where you want to come to check and fix it. If you move servers around, change IP numbers and so on, you probably are going to want to double check here to make sure that this has gotten updated.

Same thing, the edu.mit.Kerberos file is also generated for Open Directory clients from a config record, and that stores things like the host names of all of your Kerberos servers, so basically all of your masters and replicas. So if you've been making a lot of changes there, you're going to want to check and make sure you've got the right names here.

So, now that we've talked lots and lots about directory services, it's time to talk about something else. And this is probably the next most common area that we get questions about, which is file services. So we're primarily going to talk about AFP and SMB here. And for AFP, on the server side, there's just not a whole lot of logging.

On the client side though, you can get quite a bit of detail. And you can use the defaults command to set a debug level and the syslog logging with com.apple.AppleShareClientCore, which is basically just your Apple file service client on your OS X clients. Keep in mind, the debug level can be set from zero to eight.

Pick an appropriate level. 8 is going to be awfully verbose, so you probably do not want to start at the highest level here. Also keep in mind that you need to change syslog to do *.debug logging in order to get this to go to your system log, and you need to either restart syslog by HUPing it, or restart your system, before the logging will start.

On the SMB side, Server Admin will let you set low, medium, high. And these basically correspond to the Samba logging levels of 0, 1, and 2. That's great, but Samba actually allows you to set log levels all the way up to 10. So you can actually go into the smb.conf file in the /etc directory and set that to whatever level you would like.
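To make that mapping concrete, here is a small Python sketch that turns either a Server Admin setting or a numeric level into the line you would put in smb.conf. The function names are hypothetical; "log level" is the standard Samba parameter name, and the low/medium/high mapping is the one described above.

```python
SERVER_ADMIN_LEVELS = {"low": 0, "medium": 1, "high": 2}


def samba_log_level(setting):
    """Translate a Server Admin name or a number into a Samba log level (0 to 10)."""
    if isinstance(setting, str):
        key = setting.strip().lower()
        if key not in SERVER_ADMIN_LEVELS:
            raise ValueError("unknown Server Admin level: %r" % setting)
        return SERVER_ADMIN_LEVELS[key]
    if not 0 <= setting <= 10:
        raise ValueError("Samba log level must be between 0 and 10")
    return setting


def log_level_line(setting):
    """Build the smb.conf line for the given setting."""
    return "log level = %d" % samba_log_level(setting)


print(log_level_line("high"))  # the most Server Admin will give you
print(log_level_line(10))      # the most Samba will give you
```

This only builds the line; editing smb.conf itself is still up to you.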

If you want to figure out where your actual shares are configured, they're really stored in NetInfo in the /config/SharePoints record. Samba also keeps a copy of these in the smb.conf file, so this is something that you may want to look at to make sure that the two are not out of sync.

In Leopard, the share points will probably be stored in a flat file on the local system because NetInfo is going away. You can use the sharing command to list share points, configure share points, and so on from the command line. And then, for just basic configuration, all of the settings for AFP are stored in /Library/Preferences/com.apple.AppleFileServer.plist.

Generally, you don't need to make any changes in here, but there are a few options that are not exposed in the GUI. An example of this is you can set the number of threads that are used by the file server. Increasing the number of threads can be a great way to make your performance better. But in general, if you see an option in there and you're not really sure what it does, you probably don't want to change it.

So one big problem that we see with file services is we get a lot of questions about authentication. And I'm going to give you one guess. What do you think the big thing that you need to check for-- I almost gave it away there. What is the big thing that you need to check?

Good guesses, but again, it's DNS. This is particularly important if you're doing Kerberos. So let's look at, first of all, AFP, two authentication types, standard and Kerberos. Generally, there's not really any reason to change this. You can change it in the pulldown in Server Admin, so if you want to use both or just Kerberos, you can set it.

One reason why you may want to change this is if you're in the situation where you are using Active Directory and you have users in a large number of groups. Active Directory stores that group information in the Kerberos ticket itself, which makes it sort of a non-standard Kerberos ticket.

And the ticket just gets too large for the AFP server. So Kerberos authentication fails. So you could actually use this to disable Kerberos authentication so that your clients can at least authenticate. Kerberos won't be working, but at the very least, your clients are authenticating and getting to their files.

SMB is a little bit more complicated because of course we've got more authentication methods. We've got the old LAN Manager authentication, which hopefully nobody's using. We've got NTLM v1 and v2. And finally we've got Kerberos. So again, this is something where you need to remember, you should test the different authentication methods separately. So to test NTLM authentication, you can use the dirt command. To test Kerberos, you can use kinit to try to get a ticket.

But it's important that your Samba config file is configured correctly for authentication to work. So the four options that are most important are security, realm, workgroup, and use spnego. For security, if you are using Active Directory and you're going to use SMB file services, make sure security is set to ADS.

Sometimes people will have it set to user or domain. If you have it set to domain, that, by default, will cause Samba to try to change the password periodically, which is fine, except it kind of forgets to tell the Active Directory plugin it's done that. And so, yeah, oops. Suddenly, your whole server can't authenticate. So usually you want to avoid that. Now, if you're not using Active Directory, you can use the user or the domain option.

If you're using Active Directory, make sure that you have the correct Kerberos realm here. It should be in all caps. And make sure that your workgroup reflects the NetBIOS name of the domain, not the DNS name. And use spnego. You pretty much always want that set to "yes" in order to have your client and your server work out what authentication method they want to use.
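Those four settings lend themselves to a quick sanity check. Here is a minimal Python sketch that reads the [global] section of an smb.conf and reports anything that doesn't match the advice above for an Active Directory setup. The function names are hypothetical, the parser is deliberately simplified (it ignores includes and line continuations), and treating a dot in the workgroup as a sign of a DNS name is an assumption.

```python
def parse_global(conf_text):
    """Collect key/value pairs from the [global] section (simplified parser)."""
    settings = {}
    in_global = False
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":          # skip blanks and comments
            continue
        if line.startswith("[") and line.endswith("]"):
            in_global = line[1:-1].strip().lower() == "global"
            continue
        if in_global and "=" in line:
            key, value = line.split("=", 1)
            # Normalize the key: lowercase, single spaces between words.
            settings[" ".join(key.lower().split())] = value.strip()
    return settings


def check_ad_settings(conf_text):
    """Return a list of problems for an Active Directory bound server."""
    s = parse_global(conf_text)
    problems = []
    if s.get("security", "").lower() != "ads":
        problems.append("security should be ads")
    realm = s.get("realm", "")
    if not realm or realm != realm.upper():
        problems.append("realm should be set, in all caps")
    workgroup = s.get("workgroup", "")
    if not workgroup or "." in workgroup:        # assumption: a dot means a DNS name
        problems.append("workgroup should be the NetBIOS domain name")
    if s.get("use spnego", "").lower() not in ("yes", "true", "1"):
        problems.append("use spnego should be yes")
    return problems
```

You could feed it the contents of your smb.conf, for example check_ad_settings(open("/etc/smb.conf").read()), and an empty list means the four settings look the way they should.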

If you're using Active Directory, use the dsconfigad -enableSSO command. That will set up all of your service principals. It will also edit your smb.conf file for you. And just one last thing that can be a troubleshooting tip is if you're having problems authenticating with Samba and the Active Directory plugin, you might have a corrupt /var/db/samba/secrets.tdb file. You need to unbind, delete it, and rebind.

If you're having problems with users not getting access to files that they should have access to, remember ACLs need to be enabled at the volume level. If you're going to look at them from the command line, you need to use the -le options with ls. And make use of the effective permissions inspector in Workgroup Manager, because ACLs can be pretty complex.

Finally, if you have issues with group membership, where users in a group are not getting access to something based on their group membership, the authoritative way of telling whether or not a user is in the group is to use dseditgroup with the checkmember option. This basically will just respond to you, "Yes, the user is in that group," or "No, they are not."

Also keep in mind that with Tiger, although we get you over the typical UNIX 16 group limit, Samba is still limited to 16 groups, except, you heard the little announcement that there's a universal version of OS X Server? The universal version has this fixed, so Samba will now be able to use more than 16 groups.

I take it some people had that issue. Just two last things that you should be aware of. If you're having weird refresh issues with the Finder on your clients, look for corrupt .DS_Store files or .DS_Store files that have the wrong permissions. And if you have issues with resource forks, because you have Mac clients and PC clients, and so you get the data and resource forks separated, you can rejoin them with /System/Library/CoreServices/FixupResourceForks.

And last but not least, if you still are having other weird issues with file services, you can use the tcpdump command to look at what's going on on the wire. You can use fs_usage to look at what's going on in the file system. And lsof will tell you what files are open and in use, and by whom. So, if you have any questions, here's the contact information.