shutdown problems from Horsnell T. on 2000-12-19 (tru64-unix-managers)

From: Horsnell T. <tsh_at_mrc-lmb.cam.ac.uk>
Date: Mon, 18 Dec 2000 14:44:15 +0000 (GMT)

Hi Managers,

I'm trying to develop a reliable automatic shutdown procedure for a
group of Alphas connected to a UPS. The command host gets the signal
from the UPS that power has failed, and then proceeds to issue
commands to the other hosts, using rsh, to shut them down.
The command host then shuts itself down.
This procedure generally works, but occasionally a host will
fail to shut down properly; sometimes even the command host doesn't
do so.
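
For concreteness, the procedure described above might be sketched roughly
as follows. This is only an illustration under stated assumptions, not the
script actually in use: the host names, the TIMEOUT value, the RSH and
REMOTE_CMD variables, and both function names are hypothetical. The one
idea it adds is a watchdog around each rsh call, so that a single
unresponsive host cannot stall the loop for the remaining hosts.

```shell
#!/bin/sh
# Hypothetical sketch: the host list, timeout, and variable names below are
# assumptions made for illustration only.
HOSTS="${HOSTS:-alpha1 alpha2 alpha3}"
TIMEOUT="${TIMEOUT:-60}"
RSH="${RSH:-rsh}"
REMOTE_CMD="${REMOTE_CMD:-sync; sync; sync; halt}"

# Run a command, but kill it if it has not finished after $1 seconds.
# Returns the command's exit status (non-zero if it had to be killed).
run_with_timeout() {
    secs=$1
    shift
    "$@" &
    cmd_pid=$!
    # Watchdog: kills the command if it is still running after $secs seconds.
    ( sleep "$secs"; kill -9 "$cmd_pid" 2>/dev/null ) >/dev/null 2>&1 &
    watchdog_pid=$!
    wait "$cmd_pid"
    status=$?
    # The command has ended (or was killed), so the watchdog is not needed.
    kill "$watchdog_pid" 2>/dev/null
    wait "$watchdog_pid" 2>/dev/null
    return "$status"
}

# Send the shutdown command to every host in turn. Hosts whose rsh call
# failed or timed out are collected in $failed.
shutdown_hosts() {
    failed=""
    for host in $HOSTS; do
        if run_with_timeout "$TIMEOUT" $RSH "$host" "$REMOTE_CMD"; then
            echo "shutdown command sent to $host"
        else
            echo "WARNING: no clean response from $host" >&2
            failed="$failed $host"
        fi
    done
    [ -z "$failed" ]
}
```

On the command host, a call to shutdown_hosts would then be followed by
the host's own 'sync; sync; sync; halt'. Note that an rsh session to a
machine that is halting may well end with an error or be cut off, so a
warning from this sketch does not by itself mean that the remote shutdown
failed.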

I've had shutdown problems on and off ever since I've been using OSF
(6+ years), and I've never got to the bottom of it. I now use
'sync; sync; sync; halt' after warning users of impending shutdown,
which seems to succeed more often than 'shutdown -h' (is this dangerous?),
but occasionally even that hangs.
I wonder whether 'shutdown' hangs while trying to halt all processes
because of some outstanding NFS transfer: the UPS-supplied hosts may
well have disks NFS-mounted from machines which are not up at the time
the shutdown command is issued.
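
One way to test this suspicion would be to find out, before shutting
down, which NFS servers a host currently depends on. The sketch below is
an assumption-laden illustration: the function name nfs_servers is
hypothetical, and it assumes that 'mount' prints each NFS mount with a
first field of the form 'server:/path', which should be checked against
the real output on the hosts in question.

```shell
#!/bin/sh
# Hypothetical helper: reads 'mount'-style output on stdin and prints the
# distinct NFS server names, one per line. It assumes NFS mounts appear
# with a first field of the form server:/path.
nfs_servers() {
    awk 'index($1, ":/") > 0 { split($1, parts, ":"); print parts[1] }' |
        sort -u
}
```

It would be used as 'mount | nfs_servers'. Any server in that list which
is down when the shutdown command is issued would be a candidate for
causing the hang.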

Does anyone know of a rock-solid method of achieving a guaranteed
clean shutdown which will proceed without hanging?

What are the definitive steps that take place during a 'shutdown -h'
and a 'halt'? The man pages are a bit vague. For instance, at which point
are disks unmounted (if at all), and what happens if some process which
won't die has a file open on one of the disks? Does the umount stall?

Cheers,
Terry.

Terry Horsnell (tsh_at_mrc-lmb.cam.ac.uk)
I.T. Manager
Medical Research Council
Lab of Molecular Biology
Hills Road
CAMBRIDGE CB2 2QH
U.K.
Phone: +44 (0)1223 248011
Fax: +44 (0)1223 213556
Received on Mon Dec 18 2000 - 14:45:35 NZDT