ECC memory error

From: Shiu Tin <shiu_at_rsphysse.anu.edu.au>
Date: Thu, 02 Sep 1999 10:16:47 +1000 (EST)

One of our workstations (an AlphaStation 500/500) is reporting the infamous
ECC correctable memory error quite frequently. From previous postings, I
understand that the likely cause is a faulty memory module. However, the
system has two banks of memory installed (128MB from Digital and 256MB of
third-party memory). So, my question is:

How can I tell which one is causing the problem?

The errors seem to occur with the EI Address varying between
ffffff00064cxxxx and ffffff00064fxxxx.

A typical report in /var/adm/messages:

Sep 1 20:31:56 h-1nf vmunix: Machine Check error corrected by processor
Sep 1 20:31:56 h-1nf vmunix: Physical address of error ffffff00064c100f
Corrected ECC Error in Memory during D-Cache fill
Sep 1 20:31:56 h-1nf vmunix: Fill Syndrome = 0000000000009200
Sep 1 20:31:56 h-1nf vmunix: Single Bit error in Quadword 1 at bit<50> in a Data bit
Sep 1 20:31:56 h-1nf vmunix: EI Address = ffffff00064c100f
Sep 1 20:31:56 h-1nf vmunix: EI Status = fffffff0c5ffffff
Sep 1 20:31:56 h-1nf vmunix: Interrupt Status Reg = 0000000100000000
Sep 1 20:31:56 h-1nf vmunix: ECC Syndrome = 0000000000000000
Sep 1 20:31:56 h-1nf vmunix: Memory Port 0 Status Reg = 0000000000000000
Sep 1 20:31:56 h-1nf vmunix: Memory Port 1 Status Reg = 0000000000000000
Sep 1 20:31:56 h-1nf vmunix: CIA Error Status = 0000000000000000
Sep 1 20:31:56 h-1nf vmunix: CIA Error Reg = 0000000000000000
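
To see how tightly the failing addresses cluster, the "EI Address" lines
can be tallied with a short script. This is only a rough sketch and assumes
the log format shown above. The function names and the address mask are my
own assumptions: I am guessing that only bits <39:4> of the register carry
the physical address and that the remaining bits read as ones. It does not
tell me which bank an address belongs to, since that depends on how the
banks are mapped.

```python
import re
from collections import Counter

# Matches log lines such as "... vmunix: EI Address = ffffff00064c100f".
EI_ADDRESS = re.compile(r"EI Address\s*=\s*([0-9a-fA-F]{16})")

# ASSUMPTION: only bits <39:4> of the register hold the physical address;
# the other bits are taken to read as ones and are masked off here.
ADDRESS_MASK = 0xFFFFFFFFF0


def tally_ei_addresses(lines):
    """Count how often each (masked) EI Address appears in the log lines."""
    counts = Counter()
    for line in lines:
        match = EI_ADDRESS.search(line)
        if match:
            counts[int(match.group(1), 16) & ADDRESS_MASK] += 1
    return counts


def summarize(counts):
    """Describe the number of errors and the address range they span."""
    if not counts:
        return "no EI Address lines found"
    low, high = min(counts), max(counts)
    total = sum(counts.values())
    return (f"{total} errors at {len(counts)} distinct addresses, "
            f"range {low:#x}-{high:#x} "
            f"(megabyte {low >> 20} to {high >> 20})")


# Usage with one line copied from the report above; for a real run, pass
# the open log file instead, e.g. tally_ei_addresses(open(path)).
sample = ["Sep 1 20:31:56 h-1nf vmunix: EI Address = ffffff00064c100f"]
print(summarize(tally_ei_addresses(sample)))
```

If the tally shows every error inside one narrow address range, that at
least points to a single module rather than a general problem.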


Thanks in advance for your help,
Shiu
-----------------------------------------------------------------------
Shiu Tin Email: Shiu.Tin_at_anu.edu.au
Research School of Physical Sciences and Engineering
Australian National University Tel: 61 2 6249 0104
Canberra, ACT, Australia 0200 Fax: 61 2 6279 8103
-----------------------------------------------------------------------
Received on Thu Sep 02 1999 - 00:19:15 NZST
