System Hang and Network Boot--desperate

From: Thomas, Julie <jthomas_at_ercot.com>
Date: Wed, 11 Apr 2001 10:01:24 -0500

Hi there,

1) We have two ES40s in a cluster, Tru64 v5.1,
TruCluster 5.1. This morning when we arrived there
were about 2000 instances of the SAMBA daemon
running. They couldn't be killed since the other
node respawned them. All system resources were
consumed such that we couldn't log in, so we
attempted a hard reboot on each machine
individually. The boot hangs after a SCSI bus
reset and after the quorum disk is added to the
cluster.
Any ideas as to what has happened?

2) It has been suggested that I set up a boot
across the network since I have two identical
redundant machines at my site (the problem
children are at our backup site). Has anybody done
this? We're going to try to boot from CD-ROM
first, but I'd like to have the network boot as a
last resort and potential source of a "good"
kernel file. I'll be RTFMing, but any and all
pointers would be most appreciated. I'll summarize
when my blood pressure returns to normal.

TIA.

Julie L.C. Thomas
UNIX Systems Administrator
ERCOT
Taylor, TX
512.248.3967
Received on Wed Apr 11 2001 - 15:03:09 NZST
