B-cache errors on PC-164

From: <bantolovich_at_specialmetals.com>
Date: Fri, 16 Jul 1999 11:15:18 -0400

We are running a PC-164 based Alpha with DU4.0B as the operating system and
have been seeing a large number of memory error messages pop up in dxconsole.
Other than these messages, the machine seems to be running fine and in fact
is still quite speedy (relatively speaking) computationally. I analyzed the
error messages using the DECevent dia -R ... command and found many
entries such as:
===========================================================================
DECevent V2.3
******************************** ENTRY 1 ********************************
Logging OS 2. Digital UNIX
System Architecture 2. Alpha
Event sequence number 351.
Timestamp of occurrence 16-JUL-1999 08:54:50
Host name forge
System type register x0000001A EB164 or AlphaPC164
Number of CPUs (mpnum) x00000001
CPU logging event (mperr) x00000000
Event validity 1. O/S claims event is valid
Event severity 1. Severe Priority
Entry type 100. CPU Machine Check Errors
CPU Minor class 3. Bcache error (630 entry)
Entry Body Size: x00000068
Entry body:
           15--<-12 11--<-08 07--<-04 03--<-00 :Byte Order
0000: 00000038 00000018 80000000 00000060 *`...........8...*
0010: FFFFFF00 04F8C45F 00000000 00000086 *........_.......*
0020: FFFFFFF0 C5FFFFFF 00000000 00009400 *................*
0030: 00000000 00000000 00000001 00000000 *................*
0040: 00000000 00000000 00000000 00000000 *................*
0050: 00000000 00000000 00000000 00000000 *................*
0060: 5E3C7E25 00000000 * ....%~<^*
===========================================================================
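
For anyone who wants to examine the entry body byte by byte, here is a
minimal Python sketch that puts a few rows of the dump above back into
ascending address order. It relies only on the "Byte Order" header line,
which shows the highest-addressed longword on the left, and it assumes each
longword is little-endian, as is usual on Alpha. The names dump_to_bytes and
printable are mine and are not part of DECevent. The sketch does not try to
interpret any fields; it only reproduces the ASCII column as a sanity check.

```python
import re

# Three rows copied verbatim from the entry body above.
DUMP = """\
0000: 00000038 00000018 80000000 00000060 *`...........8...*
0010: FFFFFF00 04F8C45F 00000000 00000086 *........_.......*
0060: 5E3C7E25 00000000 * ....%~<^*
"""

# A row is a 4-digit hex offset, a colon, then one or more 8-digit longwords.
ROW = re.compile(r"^([0-9A-Fa-f]{4}):((?: [0-9A-Fa-f]{8})+)")


def dump_to_bytes(text):
    """Map each row offset to that row's bytes in ascending address order."""
    rows = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue
        offset = int(match.group(1), 16)
        longwords = match.group(2).split()
        # The rightmost longword holds the lowest addresses, so reverse the
        # row; each longword is assumed to be stored little-endian.
        rows[offset] = b"".join(
            int(word, 16).to_bytes(4, "little") for word in reversed(longwords)
        )
    return rows


def printable(data):
    """Render bytes the way the dump's ASCII column does ('.' if unprintable)."""
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in data)


rows = dump_to_bytes(DUMP)
for offset in sorted(rows):
    data = rows[offset]
    hex_bytes = " ".join(f"{b:02x}" for b in data)
    print(f"{offset:04X}: {hex_bytes}  {printable(data)}")
```

If the reconstructed ASCII text matches the column between the asterisks,
the byte order assumption holds for that row, and the bytes can then be
compared against whatever logout frame layout the hardware documentation
gives.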

On the face of it, this looks to this novice sysop like a hardware error,
which for us is quite bad because I doubt that this motherboard is even
available anymore, and a replacement is bound to be expensive. When I talked
with one of my Unix friends, she suggested that it might simply be a kernel
error passing itself off as a hardware error. Although this seems unlikely,
does anyone have any information or suggestions on how to go about
verifying that it actually is a hardware error? Any suggestions are greatly
appreciated.
Sincerely,
Bruce
bantolovich_at_specialmetals.com
Received on Fri Jul 16 1999 - 15:17:08 NZST
