Starting and Stopping Xsan in 1.4.2
There's been a lot of buzz about the effectiveness of the Xsan 1.4.2 update. On the systems we manage, we've never encountered a more robust version of the software, on both Tiger and Leopard. Console reporting is much improved, and MDC failover is far more reliable.
That's not to say that there aren't some nasty bugs. Most notably, machines will fall out of both the Setup list and the Client list in Xsan Admin, even though the SAN is functioning perfectly fine on those machines. The annoying part comes when you want to use the Xsan Admin GUI to unmount the volume from a particular client, but can't because you don't see it!
If possible, a reboot of the client usually does the trick. Sometimes jogging the serial number and resaving the config in the Setup tab will get a client you see in the Setup tab to show up in the Client tab.
But I'm writing this to explain a new way to do something that was easy before 1.4.2 and has since been taken away from us: restarting the Xsan processes without rebooting. Sometimes that is all that's needed.
In 1.4.1 and earlier, we accomplished this by issuing the following command in the Terminal:
[code] sudo /System/Library/StartupItems/acfs/acfs restart [/code]
The hostconfig file on Tiger machines (located in /etc) used to contain a laundry list of system-level processes that needed to be launched at startup. In it we had an "ACFS=-YES-" tag that told the machine to look for a startup script within StartupItems. The acfs script, inside the acfs folder within StartupItems, essentially launched the fsmpm process that runs on all nodes. fsmpm, in turn, launched fsm if the node was a controller. Using the restart switch with the acfs script just killed these processes and started them up again, which was helpful for refreshing the Xsan software without rebooting the machine.
Well, those times are gone. In Leopard, the StartupItems folder is starkly empty (except for code that hasn't been rewritten for Leopard). The /etc/hostconfig file also contains the ominous comment "# This file is going away" in its first line. Even in Tiger, once 1.4.2 is installed, we don't see the acfs script that used to be there.
What has replaced all this? launchd.
And for good reason. launchd is the first process launched by the OS, and is basically responsible for running and maintaining the state of every other process. In 1.4.2, launchd launches a new process called xsand.
xsand replaces the cumbersome acfs startup script. It launches very close to startup time, and knows to launch fsmpm, which in turn launches fsm if the machine is a controller. And unlike the acfs script, which launched fsmpm and called it a day, xsand will also monitor the fsmpm and fsm processes and relaunch them in case they crash.
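To see whether that chain of processes is actually up on a given machine, a small helper along these lines can summarize a process listing. This is only a sketch: the function name xsan_procs is made up for illustration, and it assumes the three daemons appear in the listing under exactly the names used above (xsand, fsmpm, fsm).

```shell
# Hypothetical helper: reads a process listing on stdin (one command name
# or path per line) and reports which of the three Xsan daemons it contains.
xsan_procs() {
  awk '
    {
      line = $0
      sub(/^[ \t]+/, "", line)          # trim leading whitespace
      sub(/[ \t]+$/, "", line)          # trim trailing whitespace
      n = split(line, parts, "/")       # keep only the last path component
      name = parts[n]
      if (name == "xsand" || name == "fsmpm" || name == "fsm") seen[name] = 1
    }
    END {
      split("xsand fsmpm fsm", want, " ")
      for (i = 1; i <= 3; i++)
        printf "%s: %s\n", want[i], ((want[i] in seen) ? "running" : "not running")
    }'
}
```

On a client we would expect xsand and fsmpm to show up; on a controller, fsm as well. One way to feed it is:

[code] ps -axc -o command | xsan_procs [/code]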
Because of this, the Xsan processes are far more reliable. xsand was also written from the ground up to be more verbose about what it is doing.
But what if we want to kill the fsmpm, fsm and xsand processes properly to give Xsan a swift kick without rebooting the machine?
All we need to do is "unload" the xsand job from launchd's laundry list. We do this using the companion command to launchd: launchctl.
So to stop Xsan on a machine, we would type:
[code] sudo launchctl unload -w /System/Library/LaunchDaemons/com.apple.xsan.plist [/code]
The -w switch marks the job as disabled, which ensures that the xsan job will not reload until we want it to, even after a reboot. So, to get things started again soon after, we issue:
[code] sudo launchctl load -w /System/Library/LaunchDaemons/com.apple.xsan.plist [/code]
The only difference in the second command is "load." We want to get the Xsan software back in the good graces of launchd.
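To confirm which state the job is in after either command, we can look for it in the output of launchctl list. The helper below is a rough sketch: the name xsan_job_loaded is hypothetical, and the default label com.apple.xsan is an assumption based on the plist's filename, so check the Label key inside the plist if nothing matches.

```shell
# Hypothetical helper: reads `launchctl list` output on stdin and succeeds
# if any line contains the given label.
# The default label is an ASSUMPTION taken from the plist filename.
xsan_job_loaded() {
  grep -F -q -- "${1:-com.apple.xsan}"
}
```

Since the job is a system daemon, the listing has to come from root's launchd:

[code] sudo launchctl list | xsan_job_loaded && echo "loaded" || echo "not loaded" [/code]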
Now we have a reliable way of restarting Xsan on a machine without rebooting.
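For convenience, the two commands can be wrapped in one small shell function. Treat this as a sketch rather than a polished tool: the names restart_xsan and run, and the DRY_RUN and PAUSE variables, are all made up here, and the five-second default pause between unload and load is a guess, not a tested value.

```shell
# Path to the Xsan launchd job, as used in the commands above.
PLIST="/System/Library/LaunchDaemons/com.apple.xsan.plist"

# Hypothetical helper: print the command when DRY_RUN is set, otherwise run it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper: unload the job, pause briefly, then load it again.
# PAUSE (seconds) is an assumed setting; the 5-second default is a guess.
restart_xsan() {
  run sudo launchctl unload -w "$PLIST" || return 1
  sleep "${PAUSE:-5}"
  run sudo launchctl load -w "$PLIST"
}
```

Setting DRY_RUN=1 before calling restart_xsan prints the two launchctl commands instead of executing them, which is a safe way to check the function before using it for real. Stopping at the first failure means we never try to load a job that did not unload cleanly.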
Just one caution. We shouldn't issue this command on an active MDC. This would basically yield the same result as cutting power to it. I guess if you wanted to test failover, this is one way you could do it.
Please write back with corrections or successes!