SUMMARY: asemgr loooong delay after crashed node is restarted. from Blom, Wayne on 2000-04-27 (tru64-unix-managers)

From: Blom, Wayne <Wayne.Blom_at_au.faulding.com>
Date: Thu, 27 Apr 2000 09:45:35 +0930

Original question below.

Only two replies, thanks to Scott Skeate and Steve Hancock.

Two different answers:
1) The delay is caused by the disks resyncing after the restart that
followed the crash. (Scott)
2) It could be a bug, and there may be a patch. Steve advised applying
BL14 or higher to TCR 1.5. (Steve)

Thanks guys.

At this stage I will leave it. We are going to be upgrading in a month's time
and I don't anticipate another unplanned outage (naive, aren't I?).

> Gidday Gurus,
>
> Yesterday we had one of the dual rectifiers in one of our pair of clustered
> GS140s let go with an eardrum-blasting bang.
>
> Fortunately we had an engineer on site and we were up and running again in
> 3 hours. (It took that long to be sure which board had failed and to
> retrieve it from the office.) The systems are GS140s running Tru64 v4.0E.
>
> QUESTION: The cluster had failed over as planned, so that was a good test.
> (We are only 20% live at the moment.) When the dead system was restarted
> and the asemgr menu was started to allow us to fail back, it simply sat
> there for nearly 10 minutes, placing a dot on the screen every minute.
>
> Why?
>
> Did we pick the wrong system first or is it some other reason?
>
> PS. The asemgr command was issued on the cluster member that had not gone
> down. After CTRL-Cing it and then running the command on the node that had
> restarted, both systems' menus came up.
>
> As usual a summary will follow.
Thanks
Wayne Blom
System Specialist
Technical Development Healthcare
F H Faulding & Co Limited
Ground Floor
1 Station St
Hindmarsh SA 5007
Ph: +61 8 8241 8334
FAX: +61 8 8241 8357
Mobile: +61 0419 808 496
Email: wayne.blom_at_au.faulding.com
Received on Thu Apr 27 2000 - 00:17:48 NZST