Gidday Gurus,
Yesterday we had one of the dual rectifiers in one of our pair of clustered
GS140s let go with an eardrum-blasting bang.
Fortunately we had an engineer on site and we were up and running again in 3
hours. (Took that long to be sure of the board that had failed and retrieve
it from the office.) The systems are GS140s running Tru64 v4.0E.
QUESTION: The cluster failed over as planned, so that was a good test.
(We are only 20% live at the moment.) When the dead system was restarted and
the asemgr menu was started to allow us to fail back, it simply sat there for
nearly 10 minutes, placing a dot on the screen every minute.
Why?
Did we run asemgr on the wrong system first, or is there some other reason?
PS. The asemgr command was issued on the cluster member which had not gone
down. After we CTRL-C'd it and ran the command on the node that had
restarted, the menus on both systems came up.
As usual a summary will follow.
PPS: Have a great and meaningful Easter.
Thanks
Wayne Blom
System Specialist
Technical Development Healthcare
F H Faulding & Co Limited
Ground Floor
1 Station St
Hindmarsh SA 5007
Ph: +61 8 8241 8334
FAX: +61 8 8241 8357
Mobile: +61 0419 808 496
Email: wayne.blom_at_au.faulding.com
Received on Thu Apr 20 2000 - 12:02:23 NZST