Xsanity Sanity for Apple's Xsan and Final Cut Server.
  
Sunday, May 19 2013 @ 03:22 AM EDT
Starting and Stopping Xsan in 1.4.2

 
MattG
Xsan Master


Joined: 15 Apr 2005
Posts: 456

Posted: Thu Jan 31, 2008 1:42 am    Post subject: Starting and Stopping Xsan in 1.4.2

There's been a lot of buzz about the effectiveness of the Xsan 1.4.2 update. For the systems we manage, we've never encountered a more robust version of the software, on both Tiger and Leopard. The console reporting is far more thorough, and MDC failover reliability has greatly improved.

That's not to say that there aren't some caustic bugs. Most specifically, machines will fall out of both the Setup lists and the Client lists in Xsan Admin, even though the SAN is functioning perfectly fine on those machines. The annoying part here is when you want to use the Xsan Admin GUI to unmount the volume from a particular client, but can't since you don't see it!

If possible, a reboot of the client usually does the trick. Sometimes jogging the serial number and resaving the config in the Setup tab will get a client you see in the Setup tab to show up in the Client tab.

But I write this to explain a new way to do something that was easy before 1.4.2 and has since been taken away from us: restarting the Xsan processes without rebooting. Sometimes that is all that's needed.

In 1.4.1 and earlier, we accomplished this by issuing the following command in the Terminal:

Code:
 sudo /System/Library/StartupItems/acfs/acfs restart


The hostconfig file for Tiger machines (located in /etc) used to contain a laundry list of system-level processes that needed to be launched at startup. In here we had an "ACFS=-YES-" tag that told the machine to look for a startup script within StartupItems. The acfs script, inside the acfs folder, located in the StartupItems folder, was essentially a script that launched the fsmpm process that runs on all nodes. fsmpm, in turn, launched fsm if the node was a controller. Using the restart switch for the acfs script just killed these processes and started them up again, which was helpful to refresh the Xsan software without rebooting the machine.
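For anyone still on Tiger with 1.4.1 or earlier, here is a minimal sketch of checking that tag by hand. The function name acfs_enabled is made up for illustration; it only reads the file and changes nothing.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a hostconfig-style file contains
# the ACFS=-YES- tag at the start of a line.
acfs_enabled() {
    # $1: file to inspect (defaults to /etc/hostconfig)
    grep -q -e '^ACFS=-YES-' "${1:-/etc/hostconfig}" 2>/dev/null
}

if acfs_enabled; then
    echo "ACFS startup item is enabled"
else
    echo "ACFS startup item is not enabled (or the file is missing)"
fi
```

Pass a path as the first argument to acfs_enabled to test a copy of the file rather than the live one.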

Well, those times are gone. In Leopard, the StartupItems folder is starkly empty (except for code that hasn't been rewritten for Leopard). The /etc/hostconfig file also contains the ominous comment "# This file is going away" in its first line. Even in Tiger, we don't see the acfs script that used to be there.

What has replaced all this?

launchd

And for good reason. launchd is the first process launched by the OS, and is basically responsible for running and maintaining the state of every other process. launchd is the launcher of a new process in 1.4.2, called xsand.

xsand replaces the cumbersome acfs startup script. It launches very close to startup time, and knows to launch fsmpm, which in turn launches fsm if the machine is a controller. And unlike the acfs script, which launched fsmpm and called it a day, xsand will also monitor the fsmpm and fsm processes and relaunch them in case they crash.

Because of this, the Xsan processes are far more reliable, and xsand was written from the ground up to be more verbose about what it is doing.
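To see which of these processes are alive on a given machine, something like the following should do. It is only a sketch: xsan_procs is a made-up name, and it assumes the processes show up in ps under the names used above (xsand, fsmpm, fsm).

```shell
#!/bin/sh
# Hypothetical helper: reads command names (one per line) on stdin and
# prints which of the three Xsan process names appear among them.
xsan_procs() {
    # Split on "/" and compare the last field, so a full path ending
    # in fsmpm counts the same as a bare "fsmpm".
    awk -F/ '$NF == "xsand" || $NF == "fsmpm" || $NF == "fsm" { print $NF }' | sort -u
}

# Feed it the command column of the process table.
ps -A -o comm | xsan_procs
```

On a client we would expect to see xsand and fsmpm listed; on a controller, fsm as well.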

But what if we want to kill the fsmpm, fsm and xsand processes properly to give Xsan a swift kick without rebooting the machine?

All we need to do is "unload" the xsand job from launchd's laundry list. We do this using the companion command to launchd: launchctl.

So to stop Xsan on a machine, we would type:

Code:
 sudo launchctl unload -w /System/Library/LaunchDaemons/com.apple.xsan.plist


The -w switch ensures that the xsan job will not reload until we want it to, even after reboot. So, to get things started again, soon after, we should issue:

Code:
 sudo launchctl load -w /System/Library/LaunchDaemons/com.apple.xsan.plist


The only difference in the second command is "load." We want to get the Xsan software back in the good graces of launchd.

Now we have a reliable way of restarting Xsan on a machine without rebooting.

Just one caution. We shouldn't issue this command on an active MDC. This would basically yield the same result as cutting power to it. I guess if you wanted to test failover, this is one way you could do it.
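The two commands can be wrapped into one small script. This is a sketch, not something tested on every setup: the run and restart_xsan names and the DRY_RUN variable are made up here, and the 5-second pause between unload and load is a guess. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch: restart Xsan by unloading and reloading its launchd job.
# Do NOT run this for real on an active MDC.
PLIST=/System/Library/LaunchDaemons/com.apple.xsan.plist

# With DRY_RUN=1 (the default) commands are printed, not executed.
# Set DRY_RUN=0 to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

restart_xsan() {
    run sudo launchctl unload -w "$PLIST" || return 1
    # Assumed pause to let the Xsan processes exit; 5 seconds is a guess.
    run sleep 5
    run sudo launchctl load -w "$PLIST"
}

restart_xsan
```

Saved as a script, running it with DRY_RUN=0 in the environment performs the actual restart.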

Please write back with corrections or successes!

MattG
francisyo
Xsan Master


Joined: 06 Nov 2006
Posts: 80

Posted: Thu Jan 31, 2008 10:07 pm    Post subject:

Hey MattG,

Thank you very much for sharing this. To be frank, this is the first I've heard of a Unix command that can restart Xsan without restarting the metadata controller (I'm not a Unix guy; I only know the basics). I immediately tried it on my failover metadata controller (not the active one), and it was successful. Thanks to you. Previously, if there was a problem with my Xsan and I felt I should restart it, I would restart both metadata controllers (the active one and the failover), following the usual Xsan rules: unmount the clients first, then the volumes, and so on.

Before this gets long, I have one question, though I don't know if it is related to restarting Xsan. Before I ran the command there were no errors in the logs, but after the restart I found a bunch of errors in my MDC logs. Here they are:

Feb 1 11:04:35 MDC servermgrd: xsan: index_of_fsmvol_named: SNFS Generic Error
Feb 1 11:04:35 MDC servermgrd: xsan: [3285/7D5ABB0] ERROR: index_of_fsmvol_named(VOLUME NAME): SNAdmin_NSListFsm(0) returned -1, error Broken pipe
Feb 1 11:04:35 MDC servermgrd: xsan: index_of_fsmvol_named: SNFS Generic Error
Feb 1 11:04:35 MDC servermgrd: xsan: [3285/7D5ABB0] ERROR: index_of_fsmvol_named(VOLUME NAME): SNAdmin_NSListFsm(0) returned -1, error Broken pipe
Feb 1 11:04:35 MDC servermgrd: xsan: index_of_fsmvol_named: SNFS Generic Error
Feb 1 11:04:35 MDC servermgrd: xsan: [3285/7D5ABB0] ERROR: index_of_fsmvol_named(VOLUME NAME): SNAdmin_NSListFsm(0) returned -1, error Broken pipe
Feb 1 11:04:35 MDC servermgrd: xsan: index_of_fsmvol_named: SNFS Generic Error
Feb 1 11:04:35 canopus servermgrd: xsan: [3285/7D5ABB0] ERROR: index_of_fsmvol_named(VOLUME NAME): SNAdmin_NSListFsm(0) returned -1, error Broken pipe

Right now, I don't know what this error message means. Can you help me, please?

Thanks
keithkoby
Xsan Master


Joined: 04 Apr 2006
Posts: 140

Posted: Fri Feb 01, 2008 12:07 pm    Post subject:

Hey Matt,

I'm glad to hear about your success with 1.4.2. There are a large number of posters here (and in other places on the internets) who have had issues with 1.4.2 crashing and controllers not failing over successfully. It's nice to know about the fix to get the clients back in the admin GUI lists (something that plagued my setup), but I'd also like to know about the special sauce ;) you have that is making your 1.4.2 systems stable. Any advice?

Thanks Matt!
Keith
MattG
Xsan Master


Joined: 15 Apr 2005
Posts: 456

Posted: Fri Feb 01, 2008 11:08 pm    Post subject:

There is no special sauce.

If folks are having issues with 1.4.2, it most probably has to do with improper configuration.

What we really should be discussing is why specific issues are happening, rather than making blanket statements about whether a version works or not.
colbru
partially protected


Joined: 06 Feb 2007
Posts: 8

Posted: Mon Feb 04, 2008 5:17 am    Post subject: Volumes not mounting after acfs restart

Hi MattG

Thank you very much for this very informative posting.

I'm still running 1.4.1 (never change a running system, as they say).

I've tried the acfs restart on 2 of my clients. It restarted the service fine, but my previously mounted volumes did not remount automatically. (On a reboot the volumes remount as they should.)

I need to remount from Xsan Admin.

Is this "normal"?

Here are the last few lines of output that acfs restart gives me:
Starting fsmpm
fsmpm started
Starting cvfsd
cvfsd started
Mounting Xsan File System volumes
(null)
MattG
Xsan Master


Joined: 15 Apr 2005
Posts: 456

Posted: Mon Feb 04, 2008 11:35 am    Post subject:

Good question.

Mounting the Xsan Volume actually happens on the client end. Even though you are pushing a button in the Xsan Admin program, you are essentially sending two commands to the client, and modifying one config file over there. They are as follows:

Code:
sudo mkdir /Volumes/volumename


This creates a mountpoint for the volume.

Code:
sudo mount_acfs volumename /Volumes/volumename


This mounts the volume at the mountpoint.

Then, the automount.plist file within /Library/Filesystems/Xsan/config is modified so that the AutoMount key for that volume is set to "rw", which will do the steps above automatically on next reboot.
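The two commands above can be put together as a small function for remounting by hand after a restart. This is a sketch only: mount_xsan_volume and DRY_RUN are names made up here, MyVolume is a placeholder, and mkdir gets a -p so it does not complain when the mountpoint folder is already there. It does not touch automount.plist.

```shell
#!/bin/sh
# Sketch of the two client-side steps: create the mountpoint, then
# mount the volume on it. With DRY_RUN=1 (the default) it only prints
# the commands; set DRY_RUN=0 to run them.
mount_xsan_volume() {
    vol=$1
    if [ -z "$vol" ]; then
        echo "usage: mount_xsan_volume volumename" >&2
        return 2
    fi
    mnt="/Volumes/$vol"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "sudo mkdir -p $mnt"
        echo "sudo mount_acfs $vol $mnt"
    else
        sudo mkdir -p "$mnt" && sudo mount_acfs "$vol" "$mnt"
    fi
}

# MyVolume is a placeholder; substitute the real volume name.
mount_xsan_volume MyVolume
```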

The automount feature is very robust in version 1.4.2! Therefore, if an end user accidentally (or intentionally) ejects the Xsan Volume, it pops back up again very quickly.

That is also why, when you try to unmount an Xsan Volume with

Code:
sudo umount /Volumes/xsanvolume

the volume also pops back in place in a few seconds.

However, as expected, if you modify the automount.plist file on that client and change the AutoMount key to "no", then execute the umount command above, the volume will stay unmounted.
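To watch this behavior for yourself, a quick check of the mount table before and after the umount is enough. A minimal sketch, assuming the usual "device on /Volumes/name (options)" layout of mount output; xsan_is_mounted is a made-up name and xsanvolume is the placeholder used above.

```shell
#!/bin/sh
# Hypothetical helper: reads `mount` output on stdin and succeeds if
# something is mounted at /Volumes/<name>.
xsan_is_mounted() {
    # -F treats the volume name as a plain string, not a pattern.
    grep -qF " on /Volumes/$1 "
}

if mount 2>/dev/null | xsan_is_mounted xsanvolume; then
    echo "xsanvolume is mounted"
else
    echo "xsanvolume is not mounted"
fi
```

Run it a few seconds after the umount: with the AutoMount key still set to "rw" it should report the volume as mounted again, and with the key set to "no" it should not.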

By the way, this rigid mounting phenomenon is also why Qmaster rendering nodes and other clustering software that needs to "hard path" to the volume now work properly with Xsan 1.4.2.

If a hard path already exists for the volume, Xsan now does further testing to see if it's a valid mountpoint. If the folder has correct permissions and doesn't have any nested folders, it assumes it's a residual mountpoint and will mount the Xsan volume to it. This didn't happen in previous versions and was the bane of existence for folks trying to make Xsan work with Qmaster.
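To get a feel for the "no nested folders" half of that test, here is an illustration. It is not Xsan's actual logic, just a sketch of the idea; looks_like_residual_mountpoint is a made-up name, and the permissions half is left out because it depends on what counts as correct on your setup.

```shell
#!/bin/sh
# Illustration only: succeeds if $1 is an existing folder with no
# folders directly inside it. Permissions are deliberately not checked.
looks_like_residual_mountpoint() {
    dir=$1
    [ -d "$dir" ] || return 1
    # Any subfolder directly inside $dir disqualifies it.
    [ -z "$(find "$dir" -mindepth 1 -maxdepth 1 -type d 2>/dev/null)" ]
}

# volumename is a placeholder, as in the commands above.
if looks_like_residual_mountpoint /Volumes/volumename; then
    echo "plausible leftover mountpoint"
else
    echo "missing, or has nested folders"
fi
```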
Chief Technician
Been around the blocks


Joined: 12 Jan 2009
Posts: 22

Posted: Mon Jan 12, 2009 3:20 pm    Post subject: This Let Me Unmount a Volume that the GUI Wouldn't

MattG wrote:
However, as expected, if you modify the automount.plist file on that client

Located in /Library/Filesystems/Xsan/config
MattG wrote:
and change the AutoMount key to "no", then execute the umount command above,

sudo umount /Volumes/<volName>
MattG wrote:
the volume will stay unmounted.

Indeed it does. This allowed me to unmount a volume that the GUI would not unmount. Thanks for this bit!
All times are GMT - 5 Hours