Excellent Responses. Many Thanks.
------------------ Original query ----------------
Our Alphas cease to provide certain functionality whenever an NFS
file system server goes down. On occasion, this will lock up the machine
completely. This feature has endured across two versions of OSF, 2.0 and
3.2, so my first guess is that it's either OSF or an NFS switch that needs
to be set. I am using the usual mount parameters; e.g., from fstab:
/data0@csrp /data0 nfs rw,bg 0 0
The Alphas are model 3000/600s running OSF/1 V3.2. The file system servers are
these Alphas and IBM RS/6000s running AIX 3.2.5. The AIX machines do not
have this problem. On both platforms, NFS dutifully announces that a
server is not responding, etc. But the Alphas, in addition, hang on
most commands, which continue processing as soon as the server is
brought back on line and servicing client requests.
What is happening here? What should be done to correct it?
---------------------------------------------------
The prevailing wisdom, summarised by Bernhard Schneck:
Thou Shalt Not Mounteth into the root directory.
but more exactly:
Rule 1. The parent directory of the NFS mount point shall not be root.
Rule 2. An NFS mount point shall not share its parent directory with
        another NFS mount point.
        - passable exception: if the NFS mounts are from the same
          NFS server, it probably will not be problematic.
The most complete explanation and solution are from Paul David Fardy below,
but I also include selected others for relevant insight and comment:
=====
From: Paul David Fardy <pdf_at_morgan.ucs.mun.ca>
This is a common problem with NFS. Many programs (the best example
being /bin/pwd) search up the directory tree to find the full pathname
of the current directory. On the way up, a program can encounter
a remote filesystem root directory and hang if the host is unavailable.
The process often gets into an uninterruptible state.
Take an example with 3 systems--Athos, Porthos, and Aramis--sharing user
files. Athos mounts the following file systems.
/u1@porthos on /nfs/u1
/u2@porthos on /nfs/u2
/u3@aramis on /nfs/u3
When user Bob runs "pwd" from /nfs/u1/bob, pwd does the following search.
A. i-node = stat(".") # I don't know my name, but I do know my number.
B. chdir .. and scan for i-node # Directory must be in parent somewhere
C. found "bob" matching i-node # I now know my path ends in "bob",
D. i-node = stat(".") # but I still don't know my parent's name.
E. chdir .. and scan for i-node #
F. found "u1" matching i-node # I now know my path ends in "u1/bob".
G. i-node = stat(".") #
H. chdir .. and scan for i-node #
I. found "nfs" matching i-node # I now know my path ends in "nfs/u1/bob"
J. i-node = stat(".") #
K. chdir .., scan for i-node #
L. found "." matching i-node # Must be the root directory, we're done.
When it works, pwd prints "/nfs/u1/bob". But NFS hanging can occur if
Aramis is unreachable. If we look back at those scans with full
knowledge (as opposed to pwd's view), we know the following.
Step A stats /nfs/u1/bob
Step B scans /nfs/u1
Step E scans /nfs
Step H scans /
The problem is that Step E scans /nfs, and in the process it could attempt to
access /nfs/u3 (whether it does so before finding "u1" depends on the order of
the entries in the /nfs directory). If Aramis is unreachable, then pwd will hang.
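To make the mechanism concrete, here is a rough C sketch of the upward-scan
algorithm described above. It is illustrative only (not the OSF/1 getwd()
source), it ignores mount-point corner cases, and it does not chdir back to
where it started. The important detail is the lstat() of every sibling entry
during the scan: if one of those entries is the mount point of an NFS server
that is down, the lstat() blocks and the whole process hangs.

#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Sketch of the classic getwd() loop: walk up with chdir(".."),
 * scanning each parent directory for the entry whose i-node matches
 * the directory we just came from. */
int sketch_getwd(char *path, size_t len)
{
    struct stat here, up, sb;
    struct dirent *de;
    DIR *dp;
    char name[1024] = "";
    char tmp[1024];

    if (stat(".", &here) < 0)
        return -1;

    for (;;) {
        if (chdir("..") < 0 || stat(".", &up) < 0)
            return -1;

        /* stat(".") matching stat("..") means we have reached "/". */
        if (up.st_dev == here.st_dev && up.st_ino == here.st_ino)
            break;

        /* Scan the parent for the entry matching "here".  This touches
         * every entry, including other NFS mount points, so the lstat()
         * below can block if that entry's server is unreachable. */
        if ((dp = opendir(".")) == NULL)
            return -1;
        while ((de = readdir(dp)) != NULL) {
            if (lstat(de->d_name, &sb) < 0)   /* potential NFS hang */
                continue;
            if (sb.st_dev == here.st_dev && sb.st_ino == here.st_ino) {
                snprintf(tmp, sizeof(tmp), "/%s%s", de->d_name, name);
                strcpy(name, tmp);
                break;
            }
        }
        closedir(dp);
        here = up;
    }

    strncpy(path, name[0] ? name : "/", len - 1);
    path[len - 1] = '\0';
    return 0;
}

The mount layouts below work by ensuring this scan never has to pass through
a directory that also contains another server's mount point.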
The Solution: (that dreaded word) Segregation
There are two models that clear this NFS problem. The one I suggested
earlier separates every mount point.
/u1@porthos on /nfs/u1/nfs
/u2@porthos on /nfs/u2/nfs
/u3@aramis on /nfs/u3/nfs
In this model, no two NFS mount points share a common parent. Another
model we've used in the past separates mount points based on the
serving host.
/u1@porthos on /nfs/porthos/u1
/u2@porthos on /nfs/porthos/u2
/u3@aramis on /nfs/aramis/u3
In this model, u1 and u2 share a common parent, but they're both served
from the same host. If u2 were going to hang, then it's not likely that
u1 is reachable in the first place.
In either scheme, a program like pwd run from any network filesystem
would have to go up to /nfs then down a directory to encounter
another remote file system. Very few programs do this; no program
should.
I prefer the first model because
a) it generates shorter paths (df is less likely to wrap
and more likely to fit on a screen)
and
b) the second model is redundant (when you move a disk you have
to change the mount directories and the symbolic links that
hide the mount points along with the entry in /etc/fstab).
--
I'd also like to emphasize that the mount points on the client should be
made world readable and executable. I've encountered the following
behaviour on both SunOS and MIPS systems with NFS-mounted home
directories.
login: pdf
Password:
No directory! Logging in with home=/
$ pwd
$ cd ~pdf
$ pwd
/nfs/u1/nfs/pdf
I believe the problem might be that login runs as root when it attempts
chdir(~pdf). Perhaps some combination of the permission on the mount
point and root privileges on NFS filesystems produces the error. It
makes little sense because we allow root equivalency on these filesystems.
I only know that the following commands fix the problem.
umount /nfs/u1/nfs
chmod -R a+rx /nfs/u1
mount /nfs/u1/nfs
=====
From: Bernhard.Schneck_at_Physik.TU-Muenchen.DE
The problem is with how getwd/getcwd do their job ... they repeatedly
go to the parent directory, stat(2) all files there, until they find
that stat(".") == stat("..").
If one of the files they want to stat is on an NFS-mounted volume whose
server is down, the stat will hang.
getwd/getcwd are called from an incredible number of programs ... *sigh*.
=======
From: Jon Reeves <reeves_at_zk3.dec.com>
You probably want to add "intr" to your options; you might want to add
"soft", but with "rw" that's living dangerously. Our NFS gurus have
referred to a soft-mounted file system as a "corrupt" file system.
More important, you should reconsider your mount point; by having a
hard-mounted NFS file system at the root level, you ensure that every
getcwd, most opens, command searches, ... will run across that mount point
and will "stat" it and then hang. If at all possible, you should create
a directory specific to the NFS mounts, then mount under that directory
(e.g., /server-name/data0). You could use a soft-link to point there.
It's partly a matter of luck, depending on the actual order of directory
entries, whether this will cause you a problem. This might explain the
behavior on the other machines, or you might be using a different mount
point, or perhaps they default to "intr,soft" mounts.
Incidentally, "bg" only affects the action at the time of the initial mount.
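Putting these suggestions together with the original query, one possible
(purely illustrative) rewrite of the fstab line is shown below; the
/nfs/csrp/data0 mount point is just an example of a location outside the
root directory, and only "intr" has been added to the options:
/data0@csrp /nfs/csrp/data0 nfs rw,bg,intr 0 0
With "intr", commands that block on the dead server can at least be
interrupted from the keyboard rather than hanging uninterruptibly.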
=======
Many thanks to those who responded:
Mike Iglesias <iglesias_at_draco.acs.uci.edu>
John Stoffel <john_at_WPI.EDU>
Jon Reeves <reeves_at_zk3.dec.com>
Bernhard.Schneck_at_Physik.TU-Muenchen.DE
David R Courtade <drc_at_amherst.com>
Paul David Fardy <pdf_at_morgan.ucs.mun.ca>
Khalid Paden <khalid_at_FNAL.FNAL.GOV>
Jason Yanowitz <yanowitz_at_eternity.cs.umass.edu>
"Richard L Jackson Jr" <rjackson_at_portal.gmu.edu>
-----------------------------------------------------------------------
Neil R. Smith, Research Assoc./Comp.Sys.Mngr. neils_at_csrp.tamu.edu
Climate System Research Program 409/862-4342
Dept. of Meteorology, Texas A&M Univ., USA 409/862-4132 FAX
-----------------------------------------------------------------------