Problems with AS4100 Rackmount ??

From: Unix Administrator <unixadmin_at_gmd.com.pe>
Date: Thu, 26 Jun 1997 15:19:46 -0500

> Hi,
>
> Our configuration is two AlphaServer 4100 5/300 systems running DU3.2G,
> both installed in a rack. Each is configured with 2 KZPSAs, 1 KZPSC,
> 1 DE500, 1 video card, and 1 DNSES, plus 384 MB. The firmware was
> upgraded with the 3.9 Firmware Upgrade CDROM plus the 3.8-6 SRM console
> upgrade. The O.S. disk (root, /usr, swap) on both nodes is a mirror made
> up of two RZ28D-VW disks on the KZPSC RAID controller.
>
> They have been working fine for the past 4 months.
>
> Three weeks ago, we needed to install an additional card in one of the
> AS4100s (the upper one). The installation procedure was extremely
> careful, but suddenly the other server (which was running DU3.2G with
> 0 users at that time) stopped and showed the blue screen and the P00>>
> prompt. Having partially recovered from the shock, we booted the server
> again. The event log says that everything is OK, except that there was
> no shutdown event before the last startup event.
>
> The problem remained in the dark (we thought it had been due to human
> error or something like that) until one week ago, when we needed to
> add a card to the second AS4100 (the lower one).
>
> In the middle of the process (very, very, very meticulous), exactly the
> same thing happened as three weeks ago, except that the first thing we
> noticed was the error LED starting to flash on one of the O.S. mirror
> members (upper node).
>
> With the node still up (and the other one still open), we tried to use
> the swxcrmgr software for DU, but it generated a core file. We tried to
> run the program 4 or 5 more times, and then the machine went down to the
> blue screen and the P00>> prompt. The event log says the same as three
> weeks before: everything is fine. We replaced the "bad disk", rebuilt
> the mirror, and so far everything is OK.
>
> I think there is no reason for a node to go down when one and only one
> of the members of the mirror fails. What happened? No idea.
>
> Now we have two AS4100s that crashed strangely without an apparent
> cause. The PCM boards on both machines don't report problems.
>
> These crashes never gave us a single clue. Has anyone had a similar
> experience? Is there an FCO? Perhaps a tip for performing maintenance
> on rackmounted AS4100s? Is it recommended to turn off both Alphas for
> maintenance?
>
> Thanks for reading this far. I will certainly summarize the responses,
> if any.
>
>
> Regards,
> UNIX Admin
Received on Thu Jun 26 1997 - 22:33:56 NZST
