T64/TCS 5.1 + CLSM Problems

From: Gunther Feuereisen <gunther_at_gfh.com.au>
Date: Thu, 05 Apr 2001 12:00:21 +1000

Hi,

I'm experiencing a really strange problem.

My setup:

3 node cluster (2 nodes on a GS320, 1 node on another GS320)
dual-HSG80 storage (multibus_failover enabled)
dual-KGPSAs from each node
dual-FX switches
Tru64 UNIX 5.1 with Patch Kit 0002
TruCluster 5.1

The above setup has been working fine for 3 months now.

The problem:

In the last couple of days we've been having strange problems with
CLSM.

When I shut down one node, the 'vold' daemon on each of the two
remaining nodes goes into a disabled state. Volumes are OK, but no LSM
configuration is viewable ('volprint' complains about no diskgroups
being imported).

If I manually run 'lsmbstartup' on a remaining node, everything is
fine on that node, but the other remaining node's 'vold' dies. If I
then run 'lsmbstartup' on that node as well, everything is fine there
too.

If I boot the node that was shut down, everything remains OK. The
other members' 'vold' daemons continue in an enabled state.

If I shut down the whole cluster and boot again, it all comes up OK.

I've tried shutting down the other members, and get the same effect.

The only change in the last week has been the replacement of a failed
KGPSA on one node (they each have dual KGPSAs), which was done on
the morning of the day we detected the problem. The problem may have
surfaced earlier without our noticing, though (we hadn't created
any LSM volumes for about 10 days).

Has anyone seen this kind of behaviour? I'm completely stumped at
this stage.

Thanks in advance,
gunther
Received on Thu Apr 05 2001 - 02:02:33 NZST
