Managers,
The errors reported by the kernel are, as the messages say, correctable
machine errors, i.e. correctable ECC memory errors. DEC support gave me
some valuable input on how to isolate the faulty SIMM. The console
firmware should be at level 2.2.13 or later, because these firmware
versions have better SIMM isolation code than the older ones.
Here's the procedure:
1: Shut down to console mode. Depending on the firmware level, you
might have to type 'set d_group field' (level < 2.2.13, I assume).
2: Type 'memory' to start the memory test (it might run for some time,
depending of course on how much memory is installed).
3: Type 'showit' to find out which SIMM(s) are failing.
A hard reset is necessary to stop the memory test.
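
Before taking the machine down for the memory test, it may help to see how
often the corrected errors actually occur. The following is only a rough
sketch, not something supplied by DEC support: it assumes Python is
available (on the system itself or on a machine the logs are copied to),
that each log entry sits on a single line in the file, and that the message
text matches the entries quoted in the original message below. The function
names are made up for illustration.

```python
import glob
from collections import Counter

# Assumption: this text marks the first line of each corrected-error report,
# as seen in the kern.log entries quoted in the original message.
MARKER = "Machine Check error corrected by"


def count_corrected_errors(lines):
    """Count corrected machine-check entries per syslog date (e.g. 'Mar 12')."""
    counts = Counter()
    for line in lines:
        if MARKER in line:
            fields = line.split()
            # A syslog line starts with month and day; skip malformed lines.
            if len(fields) >= 2:
                counts[" ".join(fields[:2])] += 1
    return counts


def scan_logs(pattern="/var/adm/syslog.dated/*/kern.log"):
    """Sum the per-day counts over every kern.log matching the pattern."""
    totals = Counter()
    for path in sorted(glob.glob(pattern)):
        with open(path, errors="replace") as handle:
            totals.update(count_corrected_errors(handle))
    return totals


if __name__ == "__main__":
    for day, count in scan_logs().items():
        print(f"{day}: {count} corrected machine check(s)")
```

A count that keeps growing from day to day would suggest running the memory
test sooner rather than later.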
Thanks to
alan_at_nabeth.cxo.dec.com (Alan Rollow at DEC)
Barb Baker <baker.barb_at_tchden.org>
Olle Eriksson <olle_at_cb.uu.se>
and last but not least,
Tor Inge Lillebøe <lilleboe_at_nwo.dec.com>
from DEC support (Norway), who supplied the error isolation procedure.
Original message follows.
--
******************************************************************
* Knut Hellebø | DAMN GOOD COFFEE !! *
* Norsk Hydro a.s | (and hot too) *
* Phone: +47 55 996870, Fax: +47 55 996342 | *
* Mobile Phone: +47 93092402 | *
* E-mail: Knut.Hellebo_at_nho.hydro.com | Dale Cooper, FBI *
******************************************************************
attached mail follows:
Managers,
The following entries keep appearing in
/var/adm/syslog.dated/*/kern.log (AlphaStation 600 5/266, DEC Unix 3.2C):
Mar 12 15:04:32 system vmunix: Machine Check error corrected by processor
Mar 12 15:04:33 system vmunix: Physical address of error 0000000000000000 in B-Cache during D-Cache fill
Mar 12 15:04:33 system vmunix: Fill Syndrome = 0000000000000000
Mar 12 15:04:33 system vmunix: Invalid Syndrome Value = 0000000000000000
Mar 12 15:04:33 system vmunix: EI Address = 0000000000000000
Mar 12 15:04:33 system vmunix: EI Status = 0000000000000000
Mar 12 15:04:33 system vmunix: Interrupt Status Reg = 0000000000000000
Mar 12 15:04:33 system vmunix: ECC Syndrome = 0000000000000000
Mar 12 15:04:33 system vmunix: Memory Port 0 Status Reg = 0000000000000000
Mar 12 15:04:33 system vmunix: Memory Port 1 Status Reg = 0000000000000000
Mar 12 15:04:33 system vmunix: CIA Error Status = 0000000000000000
Mar 12 15:04:33 system vmunix: CIA Error Reg = 0000000000000000
Mar 12 15:04:33 system vmunix: Low order address = 0000000000000000
Mar 12 15:04:33 system vmunix: High order addres = 0000000000000000
Mar 12 15:04:33 system vmunix: CIA ERR = 0000000000000000
Mar 12 15:04:33 system vmunix: CIA ERR STAT = 0000000000000000
Mar 12 15:04:33 system vmunix: CIA ERR MASK = 0000000000000000
Mar 12 15:04:33 system vmunix: CIA ECC_SYN = 0000000000000000
Mar 12 15:04:33 system vmunix: CIA MEM ERR0 = 0000000000000000
Mar 12 15:04:33 system vmunix: CIA MEM ERR1 = 0000000000000000
Mar 12 15:04:33 system vmunix: CIA PCI ERR0 = 0000000000000000
Mar 12 15:04:33 system vmunix: CIA PCI ERR1 = 0000000000000000
Mar 12 15:04:33 system vmunix: EISA bridge NMI status & control = 0000000000000000
Any idea what's going on?
Received on Thu Mar 13 1997 - 17:02:22 NZDT