SUMMARY: NFS down & uninterruptible processes

From: lombardi emanuele <lele_at_mantegna.casaccia.enea.it>
Date: Tue, 07 Apr 1998 11:13:32 +0200 (MET DST)

Hi,

Here are the answers I got about the problems that arise when you use
NFS and the NFS server suddenly goes down. In such cases, some
processes (those using the remote files) go into a state called
uninterruptible and can no longer be killed. You also can't umount the
failed NFS mount point, because those processes are still alive on it
(device busy).
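
To see which processes are stuck, the state column of ps can be filtered.
A minimal sketch, assuming a ps that accepts "-o pid,stat,comm" and marks
uninterruptible sleep with a leading "D"; some systems use a different
letter, so check ps(1) and pass that letter as the first argument. The
name list_uninterruptible is hypothetical.

```shell
# list_uninterruptible: hypothetical helper, not a standard command.
# Reads "PID STAT COMMAND" lines on stdin and prints those whose state
# field starts with the given letter (default "D", an assumption).
list_uninterruptible() {
    awk -v letter="${1:-D}" 'index($2, letter) == 1 { print }'
}

# Usage (left as a comment so the block has no side effects):
#   ps -eo pid,stat,comm | list_uninterruptible
```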


        lamullikin_at_CCGATE.HAC.COM
The only way to kill uninterruptible processes is to reboot.
If you are using automount, then mount with the "soft" option, so that
if NFS doesn't come up, your system will not hang.
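
The "soft" option is given at mount time. A sketch, assuming a mount
command that accepts "-t nfs" and "-o" (some systems spell the type flag
differently); server:/export/home and /mnt/home are placeholders, and
nfs_mount_cmd is a hypothetical helper. It only prints the command so it
can be reviewed before being run as root. With a soft mount, programs get
I/O errors once the retries run out, so they must cope with failed reads
and writes.

```shell
# nfs_mount_cmd: hypothetical helper that prints, but does not run,
# an NFS mount command with the given comma-separated options.
nfs_mount_cmd() {
    # $1 = options, $2 = server:/path (placeholder), $3 = local mount point
    echo "mount -t nfs -o $1 $2 $3"
}

nfs_mount_cmd soft server:/export/home /mnt/home
# prints: mount -t nfs -o soft server:/export/home /mnt/home
```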
     

        Ana Pedraz <Ana.Pedraz_at_Digital.com>
In my previous job we had a very complicated NFS network setup, and
I found that the only way to get things working again was to run
"kill -HUP automount-pid-number".
Give this a try.
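
One way to find that PID is to filter ps output by command name. A
sketch, assuming the daemon's command name contains "automount" and that
ps accepts the POSIX "-o pid= -o comm=" form; automount_pid is a
hypothetical helper. What the daemon does on SIGHUP depends on the
implementation, so check its man page first.

```shell
# automount_pid: hypothetical helper. Reads "PID COMMAND" lines on stdin
# and prints the PID of the first command whose name contains "automount".
automount_pid() {
    awk '$2 ~ /automount/ { print $1; exit }'
}

# Usage (as root; left as a comment so nothing is signalled by accident):
#   pid=$(ps -e -o pid= -o comm= | automount_pid)
#   [ -n "$pid" ] && kill -HUP "$pid"
```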


        Lois Bennett <lois_at_genome.wi.mit.edu>
We never found a way around that. We now use amd instead of the distributed
automounter. Instead of direct NFS mounts, it mounts the remote filesystems
in a special directory and then creates links at the locations where the
filesystems are expected. That way, when a machine fails, the mount point
isn't busy, so the machine doesn't lock up.
The only problem we have had is teaching the developers not to use the
output of the pwd command in their code. The pwd command gives the
actual mount point, and if the filesystem is not mounted and you cd to the
actual mount point, it won't mount, and then it can't mount because the mount
point is busy.
If that doesn't make sense, it's probably related to the 120 Gig restore we
just had to do because a RAID failed!!
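
The difference between the path users expect and the real mount point can
be reproduced with an ordinary symlink. A sketch using a temporary
directory; the names amd_mounts, server1 and home are made up for the
demonstration, and no real amd or NFS mount is involved. "pwd -L" reports
the logical path (through the link) and "pwd -P" the physical one, which
is the one that should not be hard-coded.

```shell
# show_paths: hypothetical demo. Builds a symlink layout like the one
# described above and prints the logical, then the physical, directory.
show_paths() {
    demo=$(mktemp -d) || return 1
    mkdir -p "$demo/amd_mounts/server1/home"             # stands in for the real mount point
    ln -s "$demo/amd_mounts/server1/home" "$demo/home"   # the path users expect
    ( cd "$demo/home" && pwd -L && pwd -P )              # subshell: caller's cwd is untouched
    rm -rf "$demo"
}

show_paths
```

The first line printed ends in /home, the second in
/amd_mounts/server1/home.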


        "Spalding, Steve" <SSPALDIN_at_mem-ins.com>
I've run into the same problem before. We have two Alpha servers; one
holds our users' home directories, and we NFS-mount them on the other
Alpha. The problem appears when I take down the Alpha that actually has
the home directories but forget to have all of the users get off the
other machine and umount the NFS file system first.
It seems like I have to do something different every time, but what I
remember doing is running the umount command on the NFS file system even
though it hangs, and then breaking out of it; the file system eventually
unmounts.
I think that you may have a different scenario than mine, but you
might try the above.


        XJ WANG <xuejinw_at_shubertorg.com>
To kill such a process, type kill -QUIT PID


        "Biggerstaff, Craig T" <Craig.T.Biggerstaff_at_USAHQ.UnitedSpaceAlliance.com>
It may be that you need to mount the NFS filesystem with the option
"intr".


Thanks to all of you, gurus.

Emanuele


-- 
 Emanuele Lombardi
 mail:  AMB-GEM-CLIM ENEA Casaccia
        I-00060 S.M. di Galeria (RM) 
        ITALY
 mailto:lele_at_mantegna.casaccia.enea.it
 tel	+39 6 30483366
 fax	+39 6 30483591
Received on Tue Apr 07 1998 - 12:54:23 NZST
