
WWDC06 • Session 524

Xsan Best Practices

Information Technologies • 1:14:23

Xsan and storage area networks can easily be tailored for use in a variety of configurations. Come to this session to learn best practices on SAN setup and configuration, Xsan deployment, and effective administration of your SAN.

Speaker: JD Mankovsky

Unlisted on Apple Developer site

Transcript

This transcript was generated using Whisper; it has known transcription errors. We are working on an improved version.

Good afternoon. My name is JD Mankovsky. I manage Apple's enterprise consulting services team. So we actually do a lot of Xsan deployments, probably two or three a week. And I'm here to share with you a lot of the knowledge that my team has acquired over the beta cycle, as well as the 1.0 to 1.3 life of Xsan, and looking forward to Xsan 1.4. So how many of you here are running Xsan today? Wow. Excellent.

So thank you very much for that. This is pretty amazing. And so obviously you guys are pretty familiar with Xsan and the file system. But what I wanted to do, just to put things in context, is talk a little bit about the Xsan underpinnings first. But first let's go through what we're going to talk about today.

To make sure that it hopefully meets your needs. First we're going to quickly talk about how Xsan works and how it actually writes to the Xserve RAIDs, the storage. Then we're going to talk about some of the best Xsan volume configurations that we've seen around, as well as talk about affinities, talk about permissions, talk about how to optimize your switch and things not to forget when you configure those switches. As well as talk about some heterogeneous deployments.

So we're going to talk about some of the best Xsan deployments that we've done using the StorNext FX product. Quickly touch base on backup and then spend some time on high availability. Because I know a lot of you try to deploy Xsan in mission-critical environments. And I'm pleased to announce that today we actually have a solution for you in terms of mirroring your storage and making sure that you have zero points of failure in your Xsan deployment.

I figured that would pique your interest. So I put that at the end just to make sure you stick around. So how does Xsan work? Well, a lot of you are very familiar with how a traditional file system works. And every day when you plug in your iPod, when you plug in your FireWire hard drive, it just all works.

And a great example of that is HFS and UFS. Now, I don't know if any of you have tried to take a FireWire hard drive and you have two ports and hook up one to one machine and one to the other. That's not the kind of stuff you really want to do, especially if you don't want any data corruption. And we've seen that happen.

I mean, we have seen that happen, especially with Xserve RAID, where people will just hook up the RAID to a Fibre Channel switch and forget to do any LUN masking. And they have this beautiful volume show up on a couple of machines. And they're like, oh, cool. I've got my... I've got a cheap version of Xsan here. It ain't going to last very long, let me tell you.

Obviously, network attached storage, you know, AFP, SMB, NFS, those are great protocols. Trying to run Final Cut on top of that is not quite supported, though. Okay, so you really want something like Xsan to do that kind of deployment. Obviously, from a permission perspective, POSIX permissions and ACLs are fully supported with those protocols since we shipped Tiger.

Now, on the SAN file system, obviously, the disks are attached to all the compute, well, they're attached to the Fibre Channel switch, which is kind of hidden to the end user. All the end user sees is a volume, and they have no idea how the whole underpinnings work on the back end, and that's what we'll talk about very shortly.

The good news is, on the permission side, with Xsan 1.4 shipping very shortly, you can now have full ACL support in the 1.4 version. So that's great news. That's just a great new feature for the new product. So how does Xsan work? Well, here's a very simple Xsan deployment. Right now you have one client machine.

You have one or two Xserve metadata servers. You have LUN 1, which is your metadata, and you have three data LUNs. LUN 1 is always configured in a RAID 1 setup. We see a lot of customers who just make everything RAID 5. That is not a good idea. Okay, so make sure you always configure your metadata LUN as RAID 1.

Also, we see a lot of people try to take advantage of those extra four drives that are in the bay. Also not recommended. Okay, when you're writing data to the metadata LUN, the packet sizes are a lot different than data packet sizes. So yes, you have to spend half of a RAID dedicated to your metadata, and that's just the way it is. So don't try to run anything else on those four other disks. Use them as hot spares. You know, you get extra redundancy.

So you go into Xsan Admin and you select your first LUN as your journal and metadata. And obviously that's the orange color-coding you see here on screen. The client has absolutely no knowledge of the file system layout. So when the client wants to write data to the SAN, it will ask permission from the metadata controller: tell me where I need to write this 12K file, for example.

And so the client talks to the controller; the controller then accesses that RAID 1 metadata LUN, reads the catalog, and gives that information back to the client: you need to write that file to LUN 2, for example, at offset 100. And then the client will write that data directly to that data LUN. So what I'm trying to say here is that the clients will never hit the metadata LUN. Okay.

Only the metadata controller will hit the metadata LUN. So, for security purposes, you could potentially create a zone on your switch to prevent people from going into Disk Utility, for example, and doing some nasty things to the metadata LUN. Just a recommendation, something you can potentially do if you feel like people are, you know, a little crazy out there. Or just remove Disk Utility from all the client machines. That's another way of doing it.

So then the client finally just goes and writes to that data LUN. Another important point here is, as you can see, when you're writing directly to a file system, you've got direct access to the disk. Here, you pretty much have a four-step process to actually write data to the SAN.

So obviously, when we're using Xsan in a file serving type scenario, the performance is really never going to quite be as fast as direct attached. So just keep that in mind because you're having to go through the metadata. You're having to ask permission and ask for where do I write that file each time. And that obviously requires some CPU cycles and requires some time to get the information back.

Now, the beauty of the SAN is obviously when you start adding a lot more of those data LUNs. So, for example, if I create a data pool of two RAIDs, or two controllers in this case, two data LUNs, I am now striping my data across them, which means that when I'm actually writing that 12K file, again, the client will ask the metadata controller, where do I write the file? The metadata controller now will say, well, that 12K file, I'm going to split it in two chunks of 6K each.

So, you write 6K to one of the LUNs and 6K to the other LUN. So now you've basically, quote unquote, doubled your performance. Okay? From a theoretical perspective, we always say when you write to a data LUN, you're getting about 80 megabytes per second. Okay? So, now that we've striped two LUNs together, you're getting 160. Right?

If you have four LUNs, you're getting 320. Now, obviously, you can't go to like 2,000 or 4,000 megabytes per second; you run into some other issues down the road. But that's kind of the theoretical way of how to stripe your data and how to add more LUNs to gain more of that throughput.
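
To make that arithmetic concrete, here's a minimal Python sketch using the session's round numbers (80 MB/s per data LUN; the even chunk split mirrors the 12K example above, not Xsan's actual allocator):

```python
# Rough model of how striping multiplies throughput, using the session's
# round number of ~80 MB/s per data LUN. The real Xsan allocator works on
# stripe-breadth blocks; this only shows the arithmetic.

PER_LUN_MB_S = 80  # theoretical per-LUN rate quoted in the session

def pool_throughput(num_luns: int) -> int:
    """Aggregate theoretical throughput of a storage pool."""
    return PER_LUN_MB_S * num_luns

def split_write(file_kb: int, num_luns: int) -> list[int]:
    """Split a write across the pool's LUNs, like the 12K example:
    two LUNs each take a 6K chunk."""
    base, rem = divmod(file_kb, num_luns)
    return [base + (1 if i < rem else 0) for i in range(num_luns)]

for luns in (1, 2, 4):
    print(f"{luns} LUN(s): ~{pool_throughput(luns)} MB/s, "
          f"12K file split as {split_write(12, luns)}")
```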

So the point is that from a client perspective, it's totally transparent. So as you need more storage, you can easily grow: buy more Xserve RAIDs and grow those data LUNs by creating new storage pools. And it's totally independent of the client because, again, the client always talks through the metadata controller. So you're changing the metadata. The metadata has a new config file that says, hey, instead of writing from block zero to block one million, you're going to write out to block two million.

So instead of writing only up to block one million, I can now write data from block zero to block two million. And six months from now, you add a couple more RAIDs, and you can write to block three million. So you get the benefit of a global file system by using Xsan.

So that was just a quick high-level overview of how the file system works and how Xsan writes to the storage. When we engage customers and they tell us, hey, we want to deploy Xsan, we went to NAB, we love what we saw, we've got the checkbook, we're ready to spend some money.

We like that, right? Obviously, we all like that. But the most important thing to do first is to really understand, you know, the customer's requirements and their goals when they say, hey, I want to go spend, you know, half a million dollars or a million dollars on a SAN deployment.

Really understanding, you know, what they're using today, right? If they're using just a bunch of disks and they have direct-attached Xserve RAIDs on Final Cut machines, what are their requirements, what are their needs, right? Obviously, understanding if they're a 24 by 7 operation, if they're early adopters of new technologies, if they have any Mac experience. Obviously, we're getting to that.

We're getting a lot of new customers, as you saw, you know, and understanding if they're a Windows shop or a Mac shop. Those are all important key tidbits that will let you know how easy or how hard the Xsan adoption and deploying a SAN will be within that customer.

Also, the scope is really important. So understanding if it's a four-node SAN, a 10-node SAN, or a 60-node SAN, things are very, very different in terms of how you approach the deployment. So, how is their IT staff? How technical are they? By going in and meeting them, you can very quickly gauge how technical these people are and whether they've deployed SANs before, and you can bring that conversation to a whole other level.

It's really important to understand how they're going to react to this new technology. And like I said, really understanding if they know the Mac or if they don't know the Mac. What kind of training and certification are they willing to go through? That's something really important. So, you know, you can go and do the deployment. But once you leave, you go to another customer for the next deployment.

You don't want to start getting those calls, you know, on the weekends or at three o'clock in the morning because the SAN's down. Okay, so it's really important to make sure that you leave documentation and that the people know how to maintain the system. Now, obviously, if you sell 7 by 24 support and you're willing to do that, that's a whole different ballgame. Okay.

So training is important. Understanding how Xsan works is important. Understanding their workflow. How are they accustomed to working with their data? So if they're in a video editing scenario, how are they moving those files? How are they ingesting their data? Those are all important things to understand within their workflow. And obviously, a great question is: are the people that are going to be working on the SAN full-time employees?

Or are they freelancers that they hire when they have a big job coming in? Or do they just hire people off the street to come and do work? That's another thing to consider, because maybe those freelancers have never used a SAN before. They have to understand how to behave and how they write to the SAN, and if they copy files from an external hard drive onto the SAN, there are some technical points they need to really focus on. We'll talk about that a little later.

Uptime requirements are also very important. Downtime: understanding their downtime window. So when I talk to a customer and they tell me, you know, we have to be up and running 7 by 24, or we have a 6 hour backup window, you have to make sure that you build in the right tape backup system so you can fit within their 6 hour backup window. So those are all important points to bring up.

[Transcript missing]

I won't mention who it is, but it was literally four months later that, you know, they finally had the power. So we had to literally take each machine, configure it, turn it off, take another machine out of the box, and turn it off in turn, until they had proper cooling within the organization. We had another customer, same thing: we showed them the spreadsheet and we showed them the BTU requirements, and they told us, well, you know, that room that we had in mind just is not going to work. And you know what, it was great, because they took the time to go back, find the right room, and say, okay, we'll deploy this in four months. But unfortunately, most of the time when we work with those big agencies, they want the SAN yesterday. Okay, they have the money, they're ready, they've got the check, they're finally ready to spend it, and they don't care, right? They don't care. They're going live September 8th, and no matter what, it has to go, and it's like, make it happen. And those are the scary ones. Those are really the scary ones you have to be really careful about. Understanding their break-fix requirements.

Again, if they want someone on site within the next two hours, those are very expensive support contracts to put in place. With the Xserve and the Xserve RAID, we have a RAID spares kit. We have an Xserve kit. Make sure that when you deploy your SAN you have spares kits.

We make it very easy on the Xserve and on the RAID to quickly replace a motherboard, replace a power supply, replace a controller. Our recommendation is if the customer buys five Xserve RAIDs, get one spares kit. If they buy ten, buy a second spares kit. Our ratio usually is one to five. If you think about it, five Xserve RAIDs is ten controllers. Having one spare controller for each ten RAID controllers is probably a good number to have in mind. Plus, make sure you have plenty of spare disks.

And back on the cooling thing, make sure you monitor those Xserve RAIDs. So many times have we seen those Xserve RAIDs run at over 100 degrees Fahrenheit, and it really drastically lowers your mean time between failure on the disks. So make sure you monitor that.

Data migration is another great one. That's when the other bell, you know, the other alert goes off in my head, when the customer says, well, we'd like you to do data migration. That also is very time consuming. So make sure that you plan it properly, and that the customer, if possible, does a backup before you show up to deploy the SAN, so they have all their data backed up properly.

Obviously, budget is very important. If the customer has a fixed budget, it's a lot harder, because you have to make sure you don't forget any single cable, any single transceiver. So make sure you always build in kind of an N plus one situation, I call it. So when the customer buys 20 Power Macs and 20 Fibre Channel cards, make sure you put in an extra Fibre Channel card as a spare.

Because if a Fibre Channel card goes bad and they're a 7 by 24 shop, it's better for them to have a spare on hand. They can always call AppleCare afterwards and get a swap. But it's always good to have some spare transceivers, some spare cables, some spare Fibre Channel cards. Just make sure you have that built into your quote.

Another really important point is stay within a certified and tested environment. Try not to deviate too much from our best practices documentation that we have out there. Make sure to stay within that. If you go too far out, then you're going to run into issues. There's too many components in a SAN.

You've got the Power Macs, you've got the Fibre Channel switches, you have the cables, you have the RAIDs, you have this whole environment. And there are so many variables that you want to make sure that you pick the components that Apple has certified. So when I have a customer that tells me, hey, I'm interested in buying an MGE UPS.

Because the APC UPS is, you know, 100 bucks more. And I'm going to go and tell them, well, you know what, we've certified, you know, the APC UPS. You know, it's been tested. It works. You can do, you know, controlled shutdown. That's all been tested. We haven't done any testing with MGE. So we'd rather you get, you know, an APC UPS.

Right. So you have to be, you know, really honest and say, we'd rather just, let's just stay focused. And to make sure this thing works within your timeline. Let's stay within, you know, this tested environment that we feel comfortable and that we've been deploying for the past, you know, 12, 18 months.

Also, make sure you understand their future growth and future requirements. Okay, so when you buy a QLogic switch: the QLogic switch comes in an 8-port config, 12-port config, 16-port config, or, you know, the fully unlocked 20-port. If the customer is going to be deploying this within a small environment and you know they're never going to grow, that it's really just for two or three workstations, then fine.

They can buy, you know, the entry-level QLogic switch. But if they're going to grow the system in six months, just have them spend the money on the fully unlocked QLogic switch. Don't have them go and spend $2,000 for each four ports after that, because then the price gets exorbitant. So make sure you understand their requirements.

For the next year, for the next 18 months, to make sure you build this properly, right? If they have 10 machines and you're putting in a 24-port Ethernet switch for your metadata, but they're planning on deploying 30 machines in six months, just put in a 48-port Ethernet switch, right? Don't go and put in a 24-port and then cascade another 24-port. That's a bad idea, right? Especially because, as you know, the metadata network has to be very efficient and very fast.

So by cascading this, you're adding more latency. So just those kind of little tidbits that you just, you know, you want to make sure you get it right the first time. And again, set proper expectations, okay? Rome was not built in a day. So make sure they understand that.

And that's not always easy, especially, you know, in the movie industry and in the, you know, the TV industry because, again, they want everything now or yesterday. So let's talk a little bit about some third-party add-ons that we feel are a requirement every time you put a SAN together.

Before that, I just wanted to put up there the power distribution, to make sure you guys understand some of the requirements for some of the products up there. Okay, so you're putting in a SAN; that's a 7,000 watt product. Okay, so make sure you're aware of that. You put in a quad-core G5, you know, that's 550 watts. But that's while you're using the product. Okay, on startup, that thing will draw seven amps. Okay, so make sure that you don't put four machines on a 20 amp circuit, because guess what?

Your circuit is not going to... I mean, if you have to reboot those machines, your circuit will just blow. It's that simple. So make sure you keep that in mind. It's not just while I'm running it. It's what happens if the power goes down, like we had a few weeks ago with the heat. If the power goes down, the bring-up procedure is really important as well. And making sure that you can bring that up without bringing your entire building down is probably a really good idea.

On the cooling side, same thing. My recommendation: just put a spreadsheet together and make sure people really understand it. This is another really funny story. We're in the data center and we're like, damn, it's pretty hot in there. We're looking at the temperature of the RAIDs and they're like 90 degrees. We're looking at the air conditioning up there and we're like, that thing's not blowing. It's not cooling, or barely.

Well, yeah, because the air conditioning was pointing towards the air conditioning sensor in the room. So basically they were cooling the sensor and the sensor was like, oh, cool, it's really cool in this room. But the rest of the room was like running at over 90 degrees. So it seems really basic, but it happens every week. It just happens. So I thought I'd share that with you. The customer's not in the room, by the way.

Hardware setup. Again, you're not going to run into this with the Mac Pro line, but this is something that we ran into, and I'm sure a lot of you have Power Mac G5s and are using them today, obviously. So when you BTO a Power Mac, BTO is built-to-order, so if you get it from the Apple Store, as you know, there's a 133 megahertz PCI-X slot and there are some slower PCI slots. So make sure that you put the Gigabit Ethernet NIC in there.

Thank God we have built-in video in the new Xserve.

So I want to personally thank Doug Brooks for that. But the video card is a 66 megahertz card. So if you don't need the video card, remove it. Because obviously, if you put in a Fibre Channel card that's a PCI-X card, the bus runs at the lowest common denominator between the VGA card and the PCI-X Fibre Channel card: it'll run at 66 megahertz. So just get rid of that video card. Use Apple Remote Desktop. It's fantastic.

When you put in your quote and you buy Xserve RAIDs, make sure you don't forget the cables. Okay, the cables come with the Xserve; there are no SFP-to-SFP cables that come with the Xserve RAID. So make sure you don't forget to add some additional SFP-to-SFP cables.

When you're going in and this is a new installation and the customer has to pull fiber optic cables, this is really, really important. Okay. I've been to customers where, first of all, they're like, yeah, you need a pair of cables. Their understanding is that one pair of cables will actually allow you to plug in both Fibre Channel ports in the back of the Power Mac. So we specify it in our statement of work.

You need two pairs of Fibre Channel cables. Because again, some people have never dealt with Fibre Channel cable. So they'll just take one cable, and they see two strands of optic, and they're like, okay, what do I do with that? Right. So just make sure you specify two pairs of optical cables. And the other thing we see is, if you see a customer pulling cable like the day before you show up, you know, through the ceiling or something, that's when you get a little concerned. Okay.

Once you deploy the SAN and the customer starts dropping frames, or the volume doesn't mount on the machine, it usually points back to who pulled your fiber and what happened while that fiber was pulled. Right. Who kicked the fiber when they were sitting at their desk? Okay. That's another thing that we've seen quite often, so make sure that Power Mac is in a corner and that the fiber coming in is well protected, so that when the editor is editing at the workstation, they're not kicking it with their feet and having you pull a whole new 100 meter strand of optical cable. That would not be good.

On the Xserve RAID, make sure that both Ethernet ports are plugged in. That's really important. You don't need a gigabit switch for the Xserve RAIDs. If you have some nice 100Base-T switches, and I'm sure plenty of you have old 100Base-T switches lying around, those are perfect to monitor your Xserve RAIDs and make sure that they're able to send notifications if something goes bad. That is something else that we see: you want to make sure your Xserve RAIDs are on the LAN network.

Okay, don't put them on the metadata network unless you set up a mail gateway of some sort so they can send notifications. With Xsan, you're going to want to make sure that you're sending notifications out, because you'd like to know if you have a drive that's getting ready to fail. It'd be nice to know that it is failing, right? And the email notification is a fantastic feature that we have built in, so make sure you take advantage of it.

When you deploy Xsan, the most important thing to have as part of your Xsan is a UPS. That is really important. Again, we meet with a lot of customers who tell us, "Oh, we've got central UPS. Everything is cool. No problem. We don't need any UPS." Well, I always take that with a grain of salt.

Too many times I ran into a situation where the customer said they had central UPS, and you know what, it wasn't the central UPS that failed, it was a fuse between the UPS, the generator, and where all the hardware was plugged in that blew, and the customer got into a pretty bad situation in terms of data loss. Because obviously when you go into the RAID and configure it, it says make sure you have a UPS before turning drive cache on.

Well, I mean, you have to make sure you have a UPS in the middle, right? So I would at least recommend having UPSs on your Xserve RAIDs and on your metadata servers. And obviously, make sure you have a scheduled shutdown and proper shutdown working. The Xserves will shut down using this network management card, which is that AP9617 you see up there. That's a network management card, and on the back of each APC UPS there's a slot that it goes into. It's about a $230 product; you slide it in there and you configure it. We'll talk about that in a second.

And then you install some software on the Xserves, and you have this PowerChute software which will gracefully shut down the Xserves when your power gets below a specific threshold. That basically means that, with the Xserves shut down, there's no more data being written to the SAN, so there's really no possibility of you losing any of your data. So that's really critical.

Another component is the AP9270, or AP9207, I'm sorry. That is what's called a serial concentrator. You guys have noticed that on the back of the Xserve RAID we've got this little serial port. Okay, you should use that. So you use the little cable, which is Apple part number TC547LL/A; it's called a simple signaling cable.

On the back of the APC serial concentrator you have eight ports. So what you do is you hook up that serial concentrator to the back of the UPS, and then you have eight ports available for you to plug in up to eight Xserve RAIDs. Okay, you only need one cable for each Xserve RAID. You don't need to plug in both.

Okay, so that will make sure that when the power goes down, it's going to send a flush-cache command to the RAID. And from then on, the data will be written directly to disk; it won't go through the drive cache anymore. And when the power goes down, no problem. First of all, the Xserves will be shut down, because they're shut down through the network management card. And then the RAIDs have all their data flushed. So you're good. You're good to go.

The most important thing that we also noticed is, if you have a network management card, it's obviously Ethernet. So that network management card plugs into a switch, a gigabit Ethernet switch, and then your Xserves plug into that gigabit Ethernet switch. Make sure that gigabit Ethernet switch is also plugged into the UPS. Because if it's not plugged in and the power goes down, how the hell is it going to send the signal to shut down the Xserves? Just little things. Just little things.

This is a fantastic user interface. So when you buy the network management card, make sure you configure it. Okay, it's not the easiest thing to configure, unfortunately. It's not like APC has a default IP address that you can just connect to. You have to basically hook up the Ethernet to your PowerBook, and you send this ARP request with the IP address and the MAC address.

The MAC address is actually on the card. There's also a little slip that comes with it that has that MAC address written on top of it. Make sure you keep that in your database. And so you basically send the IP address you want to assign in that ARP request.

Then you send a ping request, and the name and password are apc. Okay, so then you can telnet into it, configure the subnet, configure the router, configure the DNS. And then finally, you can bring up Safari and start using the web interface.
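
That bootstrap sequence roughly scripts out like this; a hedged sketch using the BSD arp and ping tools that ship with Mac OS X. The 113-byte ping size and the apc/apc login follow APC's documented procedure as best I can tell; treat the exact flags as assumptions and check the card's manual:

```python
# Hedged sketch of the APC network management card bootstrap described
# above: seed an ARP entry pairing the card's MAC with the IP you want,
# then ping it so the card adopts that address. Run as root.
import subprocess
import sys

def assign_nmc_ip(ip: str, mac: str) -> None:
    subprocess.run(["arp", "-s", ip, mac], check=True)   # static ARP entry
    # APC's docs call for a 113-byte ping to trigger the IP assignment
    # (an assumption here; verify against your card revision).
    subprocess.run(["ping", "-c", "5", "-s", "113", ip], check=True)
    print(f"Now telnet to {ip} (name/password 'apc') to finish setup.")

if __name__ == "__main__":
    assign_nmc_ip(sys.argv[1], sys.argv[2])  # e.g. 10.0.1.50 00:C0:B7:XX:XX:XX
```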

So remember we talked about simple signaling and using the simple signaling cable? Well, you want to make sure that you enable the simple signaling shutdown. Because by default, it's disabled. So make sure you go into that web interface and enable the simple signaling shutdown. If not, it's not going to give you anything. What I also like about this web interface is once you configure everything, you can actually simulate a power failure. And it'll actually calculate exactly how much uptime you have.

So that's just a beautiful tool built in. Make sure you run this so you can tell your customer, hey, you know what, you've got about 15 minutes uptime. There's also email notification built into the web interface for the APC. Make sure you configure that. Because what it'll do is that every month, it'll basically... run a self-test and send you an email saying everything is good.

I like those emails. Everything is good. Perfect. And so it just runs a self-test to make sure... because obviously the batteries have to be replaced. Every year, year and a half, you have to make sure you replace your batteries on your UPS. It just doesn't keep on working.

Upon power failure. So what happens if there is a power failure? Is everyone familiar with the bring-down and bring-up procedure of a SAN? Yes, very important. So when you shut down a SAN, obviously the first thing you want to do is bring down all the workstations. Stop the SAN if you can, obviously, with Xsan Admin. Bring all the workstations down, the client machines.

Then you can shut down the Xserves, the metadata servers, and then you can shut down the RAIDs and the Fibre Channel switch, if need be. On the bring-up procedure, you want to do the exact opposite. First you want to bring up the QLogic switch, or your Fibre Channel switch. You want to wait until that is fully booted up. Then you want to bring up the RAIDs.

Make sure the RAIDs are fully booted up, and then you want to bring up your metadata servers. And the reason for that is those Xserves boot pretty fast now, as you've noticed. If the Xserve boots up and it doesn't see the switch and it doesn't see the RAIDs, it's going to freak out. So make sure that you do that properly. And we have this fantastic feature built into Apple Remote Desktop, under System Setup, where you can set the wait for startup after power failure. So set that to two minutes.

Now obviously you want to make sure you tell the customer that. Because when they hit the power button on the front of the Xserve, it's not going to boot up. It's going to wait for 30 seconds, 60 seconds, and they're going to be like, "It's not working. It's not working." And then all of a sudden it's just going to start up. So set that up so that your RAIDs come up, your QLogic switch or your Fibre Channel switch comes up, and then your metadata controllers come up, and then all your other servers and Power Mac clients.

On the tape library side, I just wanted to make a quick point. How long does it take to back up 6 terabytes with an LTO-3 device? Anyone know? It's about 26 hours. So make sure that, again, when you have a conversation with a customer about their backup strategy, you keep that in mind.
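
A quick sanity check on that number; a back-of-the-envelope sketch assuming LTO-3's roughly 80 MB/s native rate and an effective rate of about 65 MB/s once overhead is counted (the effective rate is an assumption that happens to land near the 26 hours quoted):

```python
# Back-of-the-envelope backup window: capacity / effective drive rate.
# LTO-3 is ~80 MB/s native; ~65 MB/s effective (assumed, to account for
# small-file overhead and load/seek time) lands near the session's ~26
# hours for 6 TB.

def backup_hours(capacity_tb: float, rate_mb_s: float) -> float:
    mb = capacity_tb * 1_000_000  # decimal TB -> MB
    return mb / rate_mb_s / 3600

print(f"{backup_hours(6, 65):.1f} hours")  # ~25.6 hours, close to the quote
print(f"{backup_hours(6, 80):.1f} hours")  # ~20.8 hours at native rate
```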

Also, our recommendation is, when you hook up an LTO-3 device, you usually have a server running BakBone or running Atempo. And what we usually do is we either add a second Fibre Channel card into that backup machine and just plug the Exabyte tape device directly into the back of that machine.

Or what we'll do is at least put the management arm, the one-gig management arm, directly on the back of the machine, and then put the IBM LTO-3 tape library mechanism on the Fibre Channel switch. We've seen some issues where we have a Fibre Channel balancing driver, and there have been some issues with BakBone where the backup might fail if you don't do that.

So our recommendation is: just keep it simple, and hook up the tape backup device directly to the back of the server that will be acting as your backup server. And then again, when you quote those types of backup devices, a lot of them have an LC connector on the back. So they require optical cables. So make sure you don't forget the transceivers and the optical cables, and put those on the quote. Or when you buy that, just don't forget the optical cables that go with it.

So some of the best practices. When I have a customer that tells me, hey, you know, we're going to be putting in a 40 or 50 terabyte volume, the most important thing is to make sure that you understand what kind of video, what kind of codec, they're going to be using.

You know, is it going to be DV25? Is it going to be uncompressed HD? Obviously, DV25 is 3.5 megabytes per second, and uncompressed 10-bit HD is 165 megabytes per second. So depending on their requirements, you'll want to configure your storage pools differently.

Okay. When the customer has a requirement for 40 terabytes, just don't build one single storage pool and put all the LUNs in there. That's a really bad idea. Okay. Unless they're doing HD, right? And then they need the throughput. But if they're doing DV25, it's probably better to build... well, let's assume you have 12 data LUNs.

It's probably better to build, say, three storage pools of four data LUNs each, right? So you can take advantage of the rotation capabilities in Xsan Admin, the rotate allocation strategy, and take advantage of the parallelism when it writes the data onto the SAN.
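
Here's a sketch of that planning arithmetic, reusing the session's figures (3.5 MB/s for DV25, 165 MB/s for uncompressed 10-bit HD, 80 MB/s per data LUN); real pool sizing also weighs rotation and capacity, so this only covers the throughput side:

```python
# Pick a pool layout from the codec's data rate: smallest LUN count whose
# aggregate theoretical rate covers the streams. Rates are the session's
# round numbers, not measurements.

PER_LUN_MB_S = 80
CODEC_MB_S = {"DV25": 3.5, "Uncompressed 10-bit HD": 165}

def luns_per_pool(codec: str, streams: int = 1) -> int:
    need = CODEC_MB_S[codec] * streams
    luns = 1
    while luns * PER_LUN_MB_S < need:
        luns += 1
    return luns

print(luns_per_pool("DV25", 8))                    # 1 LUN covers 8 DV25 streams
print(luns_per_pool("Uncompressed 10-bit HD", 2))  # 2 HD streams need 5 LUNs
```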

What we've also seen is customers who already have Xserve RAIDs. Obviously, the RAID has been around for many years now. So we go into customers, and customers have bought RAIDs with 180 gig drives, RAIDs with 250 gig drives, RAIDs with 400 gig drives, and they're buying some new RAIDs with 500 gig drives.

So how do you build your SAN with a mixture of different drive sizes? [Transcript missing]

Well, you've now put your critical catalog, your data, right, on drives that have been around for about three to four years. You have no idea if they were just sitting on someone's desk running at 90 degrees Fahrenheit for the past four years. Where has that RAID been, right? Just looking at the dust probably built up in it will give you a pretty good idea.

Okay, so just make sure that you're upfront with the customer and say, you know what, why don't we get three new RAIDs of 500 gig drives, let's just build a clean new set with brand new 500 gig hard drives, and let's just shelve those 180 gig drives from now on. Okay. In RAID Admin, if you haven't seen it in 1.5, there's actually an uptime readout.

It'll actually tell you how long those drives have been up, how many hours they've been running. So if you see anything over 30,000 hours, it's probably time to replace them. It's probably a good idea. Okay, so just keep that in mind. We had a customer who had three Xserve RAIDs of 250 gig drives and three Xserve RAIDs of 400 gig drives. And they were like, well, how do we do this? Well, obviously, since we're striping all the data, if you build one storage pool, you're not going to take advantage of all the storage available. Right.

So what we decided to do is just build one storage pool with five LUNs of 250 gig drives and one storage pool with six LUNs of 400 gig drives. And that worked pretty well. We asked them about deploying affinities, potentially also doing an affinity, and they opted not to do that, because affinities, and we'll talk about that, get a little bit more complicated in some situations. Network configuration: DNS.

DNS is essential in an Xsan configuration. So make sure that DNS is working, or that the customer has DNS working, before you show up. Or if you have to set up DNS, make sure that's the first thing you do. Spanning tree: make sure they turn that off, or have PortFast enabled. Make sure they don't have any QoS. And especially, you know, if you sell them a 3Com switch or an Asante switch for the metadata, that's great. Or an HP switch.

On those, spanning tree is going to be turned off by default. But if the customer wants to use Cisco equipment, make sure that you're either Cisco certified or you're going to be talking to their Cisco guy. Make sure they turn all that stuff off, because you have to make the metadata switch pretty much as dumb as a layer 2 switch to make sure that you're getting the best performance on your metadata network.

Deploy a directory. Okay, so I know you can probably run without a directory in a small environment: make every user user 501, and everything's happy. But as soon as you create a new user, and that user is user 502, the next time he creates a folder on the SAN, he'll have access and no one else will. So just make it the default: build an Open Directory environment and configure it properly.

If possible, I'd rather have the DNS and the DHCP separate, if possible. Not always the case. If the customer has a specific, very tight budget, not always possible, but usually it's a good thing. On the Final Cut Studio side, obviously, we don't have any bandwidth reservation built into Xsan.

So what I would recommend is to use the limit real-time video setting and put a specific number in there, so that you don't have, again, a freelance editor going in, putting up like 12 tracks of video, wanting to have a little bit of fun, and taking all the bandwidth from all the other users.

So usually, though, it's the kind of question that we'll ask the customer, which is, "Well, how many tracks of video are you really going to, each workstation, what are you going to be using? Two tracks, three tracks, four tracks?" And then we'll set up that bandwidth limit in the Final Cut application.

And then obviously, once the SAN is up and running, make sure you use the Xsan Tuner tool. That'll give you a pretty good idea, at least in terms of making sure the cabling is good and there's no bad optical cable or bad transceiver or something going on. But the real test is Final Cut. That's really your real test.

So work with the editors. Once you have the SAN up and running, make sure they start doing editing and kind of get into their workflow so they can actually do some real testing. And make sure that they're not dropping frames within the environment that they're going to be working in every day.

So some of the best configurations: you know, limit the number of volumes. We've run into situations, at the early stages of Xsan, you know, the 1.0 days, where we were like, well, maybe we should do one volume for ingest, one volume for editing, and one volume for playout. It was a bad idea. Okay. So our recommendation usually is one big volume for ingest and editing, and then another volume potentially for playout.

Again, it all depends on the customer requirements. And obviously, if they're going live to air, on their playout volume, like a TV, you know, broadcast scenario, you don't want the editing people to take the bandwidth and start dropping frames while you're playing on air. So those are the kind of situations where you probably want a separate playout volume.

Another thing you can do is, even though we don't have any bandwidth limitation, you can potentially go into the QLogic fabric and limit the port to a gig per second. As you know, we can go up to two gigs on the ports. But if you're only doing DV25, or a lower-throughput codec, you can just go into the QLogic switch and limit at least your clients to one gig per second, which would be plenty for many of the codecs that you're using out there.

What I also see a lot is editors with their FireWire hard drives plugging in and copying large chunks of data from the Finder over to the SAN. And that's something you have to be aware of, because when you're copying a large amount of data from a local hard drive over to the SAN, the Finder is like, oh, wow, I've got all that throughput available. Let me just copy that as fast as I can.

And what happens is that it'll kill the bandwidth of all the other editors that are working, and then they'll start dropping frames. So make sure you're aware of that. And again, by limiting the workstations to one gig per second, that's something you can prevent, because then they can't go beyond 100 megabytes per second on each workstation.
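
A quick check of what fits under such a cap, assuming a 1 Gb/s port works out to roughly 100 MB/s and reusing the codec rates from earlier:

```python
# What fits under a 1 Gb/s port limit (~100 MB/s per client, as above)?
# Codec rates reuse the session's figures.

PORT_CAP_MB_S = 100  # rough usable rate of a rate-limited 1 Gb/s FC port
CODEC_MB_S = {"DV25": 3.5, "Uncompressed 10-bit HD": 165}

def streams_under_cap(codec: str) -> int:
    return int(PORT_CAP_MB_S // CODEC_MB_S[codec])

for codec in CODEC_MB_S:
    # DV25: 28 streams fit; uncompressed 10-bit HD: 0, it needs 2 Gb/s
    print(f"{codec}: {streams_under_cap(codec)} stream(s) per limited port")
```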

Don't cut corners. I mean, I see that way too much. I see people asking me, hey, JD, could we just buy one metadata server and just use one of the Power Macs or one of the Mac Pros as my second metadata controller? That's a bad idea. Because, you know, Software Update comes up.

Oh, would you like to reboot your machine? And you had no idea that the SAN actually failed over to the failover metadata controller, which is now one of your editors' workstations. And the editor just decides to reboot the machine. Not a situation you want to get into. So make sure you spend the money and get two Xserves as your metadata servers. Make sure you configure the metadata LUN as RAID 1. That's really important.

And again, you need about 10 gig of space for 10 million files. So that is something you need to be aware of. We had a customer who was doing... well, you know when a company sues another company, they get all those big hard drives full of Exchange PST files and a lot of Word documents, and they scan all that information with a bunch of servers, and then all of that gets transferred onto a large storage volume.

So we installed that, and this customer had, or potentially would have, up to 600 million files. Okay. So then that RAID 1 configuration with two 500 gig drives is not going to work. So make sure that you have enough space available on your metadata LUN to sustain some of those extreme deployments.
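
That rule of thumb works out to about 1 KB of metadata per file; here's a minimal sketch of the sizing check, with the 500 GB usable RAID 1 figure taken from the example above:

```python
# The session's rule of thumb: ~10 GB of metadata per 10 million files,
# i.e. about 1 KB per file. A RAID 1 pair of 500 GB drives gives ~500 GB
# usable, so 600 million files blows well past it.

KB_PER_FILE = 1  # derived from 10 GB per 10 million files

def metadata_gb(num_files: int) -> float:
    return num_files * KB_PER_FILE / 1_000_000

for files in (10_000_000, 600_000_000):
    need = metadata_gb(files)
    fits = "fits" if need <= 500 else "does NOT fit"
    print(f"{files:,} files -> ~{need:,.0f} GB metadata "
          f"({fits} on a 500 GB RAID 1 LUN)")
```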

Make sure you take advantage of striping. Make sure you leverage the multipathing that's built into our driver. Again, combine LUNs with drives of similar characteristics. Make sure you don't combine 400s with 500s, or 250s with 500s. Make sure you do it right. Here's an example of a typical deployment.

You've got your metadata on one, you've got your audio storage pool on another one, and here you would basically create an affinity for the audio storage pool. You have your video storage pool, which is comprised of four LUNs, and so that would give you, from a total throughput, approximately 320 megabytes per second on the video pool.

Again, if people are doing audio work, the audio characteristics are very different from the video world, so make sure you don't have 10 audio workstations sharing the same SAN with a bunch of video people, because that's going to cause a lot of issues, because the audio files are usually very small files, and it will cause drop frames on your SAN.

Obviously they're all grouped in one volume. But again, with the beauty of affinities, you can have a folder called Audio, and you can have the audio guys write directly to the Audio folder, and then they have a dedicated LUN for their audio work.

Directory integration. So a lot of that stuff is going to change, obviously, in the Xsan 1.4 timeframe, thanks to ACLs. But up until Xsan 1.4, what's really important is this: what we usually do is create a group called editors. And let's say it's GID 1025.

And we put all the users in that editors group, and we make it the primary group ID. So we change the primary group ID from 80 to 1025. Obviously this is not a mobile home user, but what you do is you then go in and create /Users. And you can see that it'll basically create a local home at /Users/michael in this case, and /Users/dan for Dan in this example.

And then what you would do, and again, that was not a supported configuration, but the good news is you won't have to do that anymore because you've got ACLs. But what we would do is go in and change the defaults write of the Finder and some other little fun apps to make sure that for any user that would create a folder on the SAN, the group assignment on that folder would be 1025.

Because obviously when you create a folder in the Finder, the user has read/write access, but the group has read-only access. So you need to make sure that the owner has read/write access, as well as your Xsan group, which is all your editors. So this is the way to do that.

We had a support article as well, which had a little AppleScript droplet that would change permissions on your SAN. But this was a lot easier. What we would do is go into the user default template and change that. So every time a new user logged in, it would already be done; they wouldn't have to do anything. And we had a little package that we just pushed over ARD that would change all that on all the workstations. So it would be done once and for all.
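
Something in the spirit of that droplet could look like this; a hedged sketch, not the support article's actual script, with a hypothetical mount point and the example GID 1025:

```python
# Walk a SAN folder, force the editors group and group read/write, in the
# spirit of the permissions fix described above. Run as root; the path and
# GID are placeholders.
import os
import stat

EDITORS_GID = 1025                    # the example group from the session
SAN_ROOT = "/Volumes/MySAN/Projects"  # hypothetical SAN folder

def fix_tree(root: str, gid: int) -> None:
    """Give the editors group ownership and read/write on everything."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):  # leave symlinks alone
                continue
            st = os.stat(path)
            os.chown(path, st.st_uid, gid)  # keep owner, set group
            os.chmod(path, st.st_mode | stat.S_IRGRP | stat.S_IWGRP)

fix_tree(SAN_ROOT, EDITORS_GID)
```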

Again, ACLs are beautiful. Don't have to deal with that anymore. Implementing affinities. Affinities are fantastic, especially if you want to make sure that the two HD workstations that you have in your environment have a dedicated set of RAIDs and that the other guys that are doing DV or SD are not impeding on the bandwidth available to your uncompressed HD workstations.

And what affinities do is let you basically assign a set of LUNs, create a folder on the volume, and say, this is my fast affinity. And anyone who's writing to that fast affinity has a dedicated set of Xserve RAIDs and data LUNs available to them. So if you're doing 2K... we had a church that was doing 2K editing.

What we did is we basically built a set of four data LUNs, set up an affinity. And their 2K editor is like happy as can be. They're doing all their editing and no drop frames whatsoever. And then all the other people that are doing SD, they have their own dedicated storage pool as well.

The most important thing to know about affinities is, well, let me go to the next slide. When you create an affinity, and you have a movie or a file in the affinity, if you move that file out of the affinity into another folder, it doesn't actually move the file's data. Okay, so that's really important.

So, let's assume you want to set up an ingest affinity. Okay, you want to make sure that when you're doing ingest, you have a dedicated 160 megabytes per second of throughput. So you basically take two data LUNs, set up an affinity, and you call it Ingest. That's your ingest folder. And you have one or two workstations doing your ingest work. And then you have another folder where your editors start editing.

So you can see that the data is still in the ingest affinity. So then, what happens when you move the file? Well, the editor takes that ingested file and moves it into another folder. It doesn't actually move that file's data. Okay, it moves the header of the file, but the data is still in the ingest affinity.

So then the ingest guy starts ingesting again. The editor starts editing the file you just ingested. And basically, there's no bandwidth available: the guy's ingesting, the editor's dropping frames. So there are a couple of ways to get around that. First, you can hold down the Option key so it actually does a real copy. You can copy the file locally. Or... anyone here have Xsan support?

Yes, you should all have... all the hands should have been raised. You should all have Xsan support when you deploy Xsan. And actually, the AppleCare team has a little script that you can set up as a cron job. What it'll do, every night or however you set it up in the crontab, is basically move that file and reassemble it properly. So that's something that's available.
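
AppleCare's script isn't reproduced here, but the idea, re-copy the bytes so they land in the destination folder's affinity and then swap the copy into place, sketches out like this (paths are hypothetical):

```python
# A plain Finder move between affinity folders only relocates the file's
# header, so this re-copies the data and swaps the copy into place,
# letting the bytes land in the destination folder's affinity. Only a
# sketch of the idea, not the AppleCare cron script itself.
import os
import shutil

def real_move(src: str, dst_dir: str) -> None:
    dst = os.path.join(dst_dir, os.path.basename(src))
    tmp = dst + ".inprogress"
    shutil.copy2(src, tmp)  # data is rewritten under dst_dir's affinity
    os.rename(tmp, dst)     # atomic swap into the final name
    os.unlink(src)          # drop the original extents

real_move("/Volumes/MySAN/Ingest/clip001.mov", "/Volumes/MySAN/Editing")
```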

On the QLogic optimization settings, that's another fun one. How many people here have more than one QLogic switch in their fabric? OK. So a very important point about your switches, right? Again, assume there's a power failure. And those two switches are interconnected using 10 gig cables, right? You've got the 10 gig interconnects on the 5200s.

Assume those two switches come up at the same time. You've got this thing called the domain ID. Okay, remember the old SCSI days, right? Where you'd go to the back of your hard drive and assign an ID from one to seven? Okay, with a little wheel. Remember those days?

Same thing here. Once you bring up your SAN for the first time, you want to make sure that you lock the domain ID of each QLogic switch. The default is always one. So make sure that you set your first switch to one, your second switch to two, your third switch to three, and your fourth switch to four. Because on a power failure, if they all come back up at the same time, they're all going to take the ID of one. And then they're not going to know what the hell they're doing.

So that's the first thing. And then obviously on the client side, you want to make sure that you turn on I/O Stream Guard and disable device scan. That's on the clients and obviously on the metadata servers. On the RAIDs, you don't want to do that. You don't need I/O Stream Guard on any target. You want it on the initiators.

Here's an example of a three-SANbox configuration. And you can see we have ISLs, like eight ISLs, between the three SANbox 2 setups. So that's really important. Make sure you configure the SANbox switch as well. By default, it's got an IP address; just make sure you go in and monitor it. Make sure you set up the notifications on that as well. It's really important.

There are a lot of different configurations, again, depending on how many ports you need. And that's all available on the QLogic website. On the ADIC side, there are some really interesting capabilities there. So I'll tell you a story about one customer we did. It was an all-Windows shop, but they love the Xserve RAIDs for the cost, and they do a lot of web serving.

So what they did is they have Windows servers for their web serving, because I guess they're using IIS or something. Bad decision, but anyway. And what they wanted is to use Xserve RAID and StorNext FX. So we basically set that up. And for web serving, it's fantastic, because when you need to upload new content, again, because of the SAN, you just have to upload it once, and then everything gets distributed to all the IIS or Apache web servers that are serving those web pages. So it makes it really neat.

On StorNext FX 2.7, they actually added Active Directory support. Prior to that, they only supported NIS and PCNFSD. How many people know what PCNFSD is? Yeah, that's what I thought. One person. There you go. You must have been around for a long time. NIS is another fun one. At the time, you could do an LDAP-to-NIS gateway and everything, but thank God that's all over now. AD and LDAP are fully supported with StorNext 2.7, which integrates very well with our directory services.

There's a lot of tweaking on the PC side. If you're configuring a PC, make sure you go into the Ethernet NIC and disable QoS, on the servers or on the workstations, whatever machines they might be. Just make sure you go into the NIC and turn off the QoS. There are a lot of caching parameters; we usually just leave those at their defaults.

How to repair your SAN volume. Obviously, this applies to Xsan 1.3, but I'll just refer you to the article number; just make sure you follow those guidelines. And obviously, if a problem is detected, you want to unmount all the volumes, stop the SAN, and then run cvfsck.

Backing up your SAN and your config file. So many times have I seen people set up a SAN and not go in and do a cvgather on the metadata servers. This is basically your SAN information: how you've configured those data LUNs, how you've configured your metadata. So make sure that you do a cvgather, or at least back up the config file and store it. Burn a CD, send it to your mother, email it to your friend.

Just store it in multiple locations to make sure that you have a backup. And every time you make a change to your configuration, make sure you go back and run another cvgather. Our recommendation is to basically set up a cron job and back up your log files on a regular basis.

Because if something happens, if an error gets detected, that error is going to repeat, repeat, repeat, and it's going to fill up your logs, and you're not going to be able to see what actually happened when the error occurred, because the log will fill up afterwards with a whole bunch of "an error occurred, something happened." You want to keep track of that. So make sure you back this up on a regular basis.
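
A minimal cron-able sketch of that advice. The paths assume Xsan 1.x's /Library/Filesystems/Xsan layout and a hypothetical destination; cvgather remains the preferred tool when you can run it:

```python
# Snapshot the Xsan config and logs to a timestamped folder, in the spirit
# of the cron-job advice above. Paths follow Xsan 1.x's layout as an
# assumption; ideally DEST lives on another machine.
import shutil
import time

SOURCES = [
    "/Library/Filesystems/Xsan/config",  # volume .cfg files
    "/Library/Filesystems/Xsan/data",    # per-volume logs live under here
]
DEST = "/Backups/xsan"  # hypothetical destination

stamp = time.strftime("%Y%m%d-%H%M%S")
for src in SOURCES:
    name = src.strip("/").replace("/", "_")
    shutil.copytree(src, f"{DEST}/{stamp}/{name}")
print(f"Saved Xsan config/log snapshot under {DEST}/{stamp}")
```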

Backup is essential. Too many times do I see people just say, hey, I just have enough money to buy a SAN, and I got no money to spend on disk-to-disk or disk-to-disk-to-tape. If this is your bread and butter, you better make sure that you have another RAID or that you have a tape backup and you can take that data offsite. It's not because you're in a RAID 5 configuration that something can't happen.

Something can always happen. I've done work in a data center where I had sprinklers over my head. If those sprinklers turn on, you're pretty much guaranteed your data is not going to be around.

On the backup matrix, we've done a lot of work with Atempo and BakBone, so those are the two solutions we recommend. Just make sure you use an enterprise backup solution. Again, don't try to cut corners and spend $300 on BRU. It ain't going to work. So just stick with enterprise solutions; BakBone and Atempo work great.

So now let's talk about high availability. What I wanted to spend a few minutes on is Vicom. Vicom is a company that has put together a fantastic product that allows you to do full mirroring on your SAN. Now, there have been other vendors out there that have done that. They're not in business anymore, because they tried to do too many things, right?

They tried to be a fiber channel switch vendor as well as do the mirroring, and they were selling their product in the $50,000 to $100,000 range. Most customers don't have that kind of money to spend on mirroring. So what Vicom did is stay very specific and very focused on the mirroring engines.

It's plug-and-play, real-time mirroring. The installation is pretty simple. And what it gives you is no single point of failure in your SAN deployment. The back panel is pretty straightforward: you have an in port and an out port. You've got dual power supplies, so everything is fully redundant. You've got dual Ethernet, dual serial. And it fits very nicely in a 1U box.

They will be shipping their new Vicom admin user interface very shortly, which will allow you to configure this. And here's a typical example of a small video broadcast deployment; I'll show you a bigger one in just a few minutes.

What you have is basically two sets of RAIDs that are fully mirrored. Obviously, what you want to do is mirror one half of one RAID to the other side, and one half of the other RAID back the other way; make sure you have that set up. The two sets of RAIDs can be in different locations, so you could have one set in one part of the building and the other set in another part of the building.

Again, just think of the sprinklers. You have your Vicom appliances. You've got some LUN masking, which I'm not showing here, but you do that LUN masking at the fiber channel switch level. And then you have your workstations. And again, you have your metadata servers, and each of their fiber channel ports is plugged into each switch.

So make sure you have redundancy. Each client is hooked up to each of the QLogic switches. So in reality, you've got no single point of failure whatsoever. You've got two Vmirror engines per side, so you've now built a fully redundant system. And we've done some pretty interesting testing, which I'll tell you about in just a second.

Here's a similar deployment, but more on the IT side, for an IT data center. If you're running some kind of database server, that's another perfect scenario. And again, because of the cost of the Xserve RAID, at $1.83 per gigabyte or less, the Vmirror fits perfectly into this. Now, granted, as you know, the Xserve RAID does not have redundant controllers. But the reason we designed the Xserve RAID that way is for sheer performance and throughput, right?

The RAID does a fantastic job with uncompressed HD, and that's really what it was designed for: very, very fast throughput. And now with Vmirror, you can add this box in the middle, which gives you that redundancy for your playout deployments and your on-air deployments, to make sure you get a fully redundant system.

On the deployment guidelines, from a throughput perspective, a dual Vmirror will give you about 350 megabytes per second. If it's a large deployment, with people going into the uncompressed HD world, what we usually do is put one Vmirror engine per pair, or per set, of Xserve RAIDs.

Obviously, if the customer has lower bandwidth requirements, you can scale that down. So, just something to consider, depending on the customer requirements. Here's a real deployment. This is with a, again, a three-letter... anyone watch football? There you go. So this is a deployment that we did about a month and a half ago.

And you can see here you've got the seven Vmirror engines. You've got your back-end QLogic setup, and you've got ISL connections, the 10 gig ISLs, between the two. So again, if either of those QLogic switches went down, it wouldn't matter: you've got the 10 gig ISLs, and you've got the sets of Vmirrors.

What we did, just for the heck of it, is we literally pulled three RAID controllers out while we were ingesting into the system. And there was not even a single blip in the system. So that's pretty neat. It also means that if you have to do some maintenance on the RAIDs, you can do maintenance on seven of them, say to update firmware, for example.

You still have your other seven RAIDs fully working. Then you bring your mirror back up, and you can do maintenance on the other set. So it gives you that 24/7 operation without any downtime, right?

So that's huge. And then obviously on the front end, we've got another stack of QLogic switches, in this case for all the client workstations. You'll notice that they're not interconnected; there are no 10 gig ISLs between those four sets of switches. And again, if any of the ports or any of the switches went bad, you've got full redundancy on the front end as well.

So, to wrap up with some best practices: make sure you've got DNS working. If your Xsan Admin tool is slow, it's probably a DNS issue. Another thing: make sure your NTP time is synchronized properly; that is really essential. If you can connect to time.apple.com, that's great, but if you're setting up an environment that's not connected to the outside world, make sure you configure an NTP server properly (a quick sketch follows). I would also recommend not launching Final Cut projects off the SAN. Copy them locally, run them locally, and then copy them back, or set up autosave to save them back to the SAN.
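For the NTP piece, here's a quick sketch on Mac OS X using systemsetup; time.apple.com is just the example, so substitute your internal NTP server if you're off the public network:

    # Point the machine at an NTP server and turn network time on:
    sudo systemsetup -setnetworktimeserver time.apple.com
    sudo systemsetup -setusingnetworktime on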

Make sure you don't connect the two metadata servers to the same power distribution circuit; it's not a good idea. Same thing with the Xserve RAIDs: make sure they're on different circuits. Make sure you have UPS, and take advantage of the email notifications. On the Xsan 1.4 side, I'm not going to spend a lot of time; it's all published on the website, but ACLs are the big feature there.

And as a wrap-up, I wanted to thank you for using Xsan. Just plan your deployment properly, and let us know if we can help. My team is all around the country. You can go to our website, apple.com/services, or email consulting services at apple.com, and we'll be glad to help you with your Xsan deployment. Thank you.