SUMMARY: Memory errors

From: G. Dimitoglou <george_at_esa.nascom.nasa.gov>
Date: Tue, 04 Nov 1997 10:10:51 -0500

Hi,
I received a few answers to this question. Fortunately, the error has
stopped occurring since then. Thank you all for your responses.

Best,
====================================
George Dimitoglou (SAC)

SOHO ESA/NASA Project Scientist Team
NASA/Goddard Space Flight Center
Bldg. 26, G-1, Code 682.3
Greenbelt, MD 20771
george_at_esa.nascom.nasa.gov
====================================

                                SUMMARY
                                -------
- From: Olle Eriksson <olle_at_cb.uu.se>
These are correctable memory errors. As long as there are not too many
of them, they cause no problems.

- From: Jan Hansen <dkJanHan_at_europe.lego.com>
I have 10 AS500/500MHz systems with 1 GB of memory, and almost all of
them have this error. The system is able to correct single-bit errors.
According to Digital, it is acceptable to have x number of these per
month per MB. ??? If the error occurs at the same address every time,
you should replace the module.
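
The "same address every time" check can be sketched in a few lines of
Python. This is only an illustration: it assumes the console or log
messages have been saved as plain text and that each event carries an
"EI Address = ..." line in the format shown in the original posting
below. The function names are made up for this example.

```python
import re
from collections import Counter

# Matches lines such as "        EI Address = ffffff000d4c3d7f".
# Assumption: every logged event uses this exact label and 16 hex digits.
EI_ADDRESS = re.compile(r"EI Address\s*=\s*([0-9a-fA-F]{16})")


def count_error_addresses(log_text):
    """Count how often each EI Address value appears in the saved text."""
    return Counter(m.group(1).lower() for m in EI_ADDRESS.finditer(log_text))


def repeated_addresses(log_text, min_count=2):
    """Return only the addresses seen at least min_count times."""
    counts = count_error_addresses(log_text)
    return {addr: n for addr, n in counts.items() if n >= min_count}


# Hypothetical sample: two events at one address, one at another.
sample = (
    "        EI Address = ffffff000d4c3d7f\n"
    "        EI Address = ffffff000d4c3d7f\n"
    "        EI Address = ffffff0001234567\n"
)
print(repeated_addresses(sample))
```

An address that keeps showing up in the result is the case where
replacing the module is suggested; errors scattered over many addresses
are the "acceptable" kind described above.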


- From: "Huehls, Mark R." <huehlsm_at_indy.navy.mil>
Run uerf as root. If you look it up in the man pages, you will see how
to display just the memory events.


Original Posting:
------------------
Our AlphaStation 500/333MHz came from the factory with 128 MB of
memory. We have since added 256 MB for a total of ~384 MB.
It has been working with no problem for about 4 months. For the past
few days I have been getting the attached error message.
Has anyone seen something like this before? I think this is just a
mismatch of the cache registers and the hardware. But is it a problem
with the caching or the board?
With a simple calculation of the EI address, it appears to be in the
226th MB (mapped on the third-party boards we added).
Any clues or information will be appreciated. I will summarize.
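
The post does not say how the calculation was done, so the following is
only one possible version of it. It assumes that bits <39:4> of the EI
Address hold the physical address and that the remaining bits read as
ones (which would explain the leading "ffffff" and the trailing "f"),
and it uses 1 MB = 2^20 bytes. The names are made up for this example.

```python
MIB = 1 << 20  # 1 MB taken as 2**20 bytes (assumption)

# Assumption: only bits <39:4> of the EI Address are address bits;
# the upper bits and the low four bits are treated as filler.
ADDRESS_MASK = ((1 << 40) - 1) & ~0xF


def ei_address_to_physical(ei_address_hex):
    """Strip the assumed filler bits from a logged EI Address value."""
    return int(ei_address_hex, 16) & ADDRESS_MASK


def megabyte_index(physical_address):
    """Zero-based index of the megabyte that contains the address."""
    return physical_address // MIB


phys = ei_address_to_physical("ffffff000d4c3d7f")
print(hex(phys), megabyte_index(phys))  # prints: 0xd4c3d70 212
```

Under these assumptions the address falls in the 213th MB (index 212),
which is above the first 128 MB. That is not the same figure as the one
quoted above, so a different mask or megabyte size may have been used
there.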

Thank you.



Machine Check error corrected by processor
Physical address of error ffffff000d4c3d7f
Corrected ECC Error in Memory during D-Cache fill
        Fill Syndrome = 000000000000002a
Single Bit error in Quadword 0 at bit<12> in a Data bit
        EI Address = ffffff000d4c3d7f
        EI Status = fffffff0c4ffffff
        Interrupt Status Reg = 0000000100000000
        ECC Syndrome = 0000000000000000
        Memory Port 0 Status Reg = 0000000000000000
        Memory Port 1 Status Reg = 0000000000000000
        CIA Error Status = 0000000000000000
        CIA Error Reg = 0000000000000000
Received on Tue Nov 04 1997 - 16:45:11 NZDT
