Greetings, alpha-managers. We are seeing strange crashes of AlphaServer 400s,
and while waiting for tech support to arrive I'd like to ask whether anyone
else has had similar experiences.
We have two Alphaserver 400s - nodeA and nodeB. Facts are as follows:
NodeA                            NodeB
DU 4.0b                          DU 4.0a
(rev 564)                        (rev 564)
Firmware rev while crashing:
6.3                              6.2
Firmware rev after replacing system board:
6.4                              N/A
Network cards:
DECchip 21140-AA rev. 1.2        DEC LeMAC Ethernet Interface
DECchip 21140-AA rev. 1.2        DEC TULIP Ethernet Interface
DECchip 21140-AA rev. 2.0        DECchip 21140-AA rev 2.0
Memory:
256 MB                           256 MB
NodeA was upgraded to include a Trident video card, 128MB of Dataram
memory, and a DE500-AA network card, about 3 weeks before it started
crashing. When it began crashing, it crashed several times in a short
period. DEC came in, looked at the core files, and was suspicious of memory.
They pulled all the memory, replaced it with DEC memory, and sent the pulled
memory for analysis. All of it tested clean. Meanwhile, the system crashed
twice more, with no activity on the machine, on about a 5 day cycle. DEC came
in, replaced the system board with one at a slightly newer rev, replaced the
Trident card with an ATI Mach64 video card, and put the original memory back
in. Since then, NodeA has been up for 9 days and seems stable.
NodeB was upgraded to include an ATI Mach64 video card, 128MB of some
memory, and a DE500-AA network card, 5 days before it began crashing.
Memory seems to be ruled out, as it was all pulled and tested, and nodeA
still crashed. The video cards were different during the crashes, so the only
commonality would be the fact that there *was* a VGA card in both nodes. The
network cards have not been ruled out, though nodeA has been running fine with
the DE500-AA since the system board swap. I am suspicious of some weird
interaction between the network card and the system board firmware rev. The
other weird thing is the (approximately) 5 day period we see between crashes.
Error msgs appended at end. If anyone has seen anything similar, please
get back to me with info. Thanks much!
--
Judith Reed
jreed_at_appliedtheory.com
(315) 453-2912 x335
===========================================================================
Error msgs:
----- EVENT INFORMATION -----
EVENT CLASS                   ERROR EVENT
OS EVENT TYPE                 302. PANIC
SEQUENCE NUMBER               2.
OPERATING SYSTEM              DEC OSF/1
OCCURRED/LOGGED ON            Wed Oct 8 23:05:28 1997
OCCURRED ON SYSTEM            nodeB
SYSTEM ID                     x0006000D    CPU TYPE: DEC 7000
SYSTYPE                       x00000000
MESSAGE                       panic (cpu 0): Machine check -
                              _Hardware error
********************************* ENTRY 3. *********************************
----- EVENT INFORMATION -----
EVENT CLASS                   ERROR EVENT
OS EVENT TYPE                 100. CPU EXCEPTION
SEQUENCE NUMBER               1.
OPERATING SYSTEM              DEC OSF/1
OCCURRED/LOGGED ON            Wed Oct 8 23:05:28 1997
OCCURRED ON SYSTEM            nodeB
SYSTEM ID                     x0006000D    CPU TYPE: DEC 7000
SYSTYPE                       x00000000
----- UNIT INFORMATION -----
UNIT CLASS                    CPU
Received on Thu Oct 09 1997 - 15:28:42 NZDT