cpu panic on PWS600au

From: Dirk Hufnagel <hufnagel_at_mps.ohio-state.edu>
Date: Tue, 26 Feb 2002 10:06:16 -0500

Yesterday a PWS600au I manage panicked and dropped to the SRM prompt.
For a while now I have been getting what I thought were memory errors
from that machine, and I was in the process of figuring out which
memory modules to replace. But it had never panicked before, and now
I am no longer sure whether these really are memory errors or
something worse. I attach the corresponding output from the
binary errorlog and /var/adm/messages. I would appreciate
any help deciphering them.

BTW, I got a few hundred binary errorlog entries like 1387
within the last few weeks, but the machine never panicked.
They usually occurred under high load, which made me suspect
the memory.

Thanks

        Dirk Hufnagel


**** V3.3 ********************* ENTRY 1387 ********************************


Logging OS 2. Digital UNIX
System Architecture 2. Alpha
Event sequence number 317.
Timestamp of occurrence 25-FEB-2002 17:09:39
Host name hostna

System type register x0000001E Systype 30. (Miata)
Number of CPUs (mpnum) x00000001
CPU logging event (mperr) x00000000

Event validity 1. O/S claims event is valid
Event severity 1. Severe Priority
Entry type 100. Machine Check Error - (major class)
                                   1. - (minor class)



========================
Raw Event Data Dump
========================

Entry# (record in file) 1387.

Entry Body Size: x00000240
Entry body:

           15--<-12 11--<-08 07--<-04 03--<-00 :Byte Order
  0000: 3C7AB623 00060101 0007001E 013D0240 *@.=.........#.z<*
  0010: 00000006 00000000 00003266 6C616C63 *hostna..........*
  0020: 00000000 1A010064 00000000 00000001 *........d.......*
  0030: 00000000 000002C0 00000000 00000000 *................*
  0040: 00000000 0000020F 000001A0 00000118 *................*
  0050: 00000000 00000000 00000000 00000000 *................*
  0060: 00000000 00000000 00000000 00000000 *................*
  0070: 00000000 00000000 00000000 00000000 *................*
  0080: 00000000 00000000 00000000 00000000 *................*
  0090: 00000000 00000000 00000000 F38A427F *.B..............*
  00A0: 00000000 00005200 FFFFFC00 004C8A50 *P.L......R......*
  00B0: 00000000 00000000 00000000 00000257 *W...............*
  00C0: FFFFFC00 004C8310 00000001 00000016 *..........L.....*
  00D0: FFFFFC00 004C8790 1F1E1615 14020100 *..........L.....*
  00E0: FFFFFC00 004C8600 FFFFFC00 004CC1D0 *..L.......L.....*
  00F0: FFFFFFFF FFF8C800 FFFFFC00 004C89C0 *..L.............*
  0100: 00000000 00F0380C 00000000 00F00270 *p........8......*
  0110: 00000000 00000000 0000020F 06600001 *..`.............*
  0120: FFFFFFFF A3E6FA38 00000001 1FFFF090 *........8.......*
  0130: FFFFFC00 004C89F0 00000000 0B804000 *.@........L.....*
  0140: 00000000 0D53FA38 FFFFFC00 006A7570 *puj.....8.S.....*
  0150: 00000000 00000000 FFFFFC00 004CC1D0 *..L.............*
  0160: 00000000 00018000 00000000 00000000 *................*
  0170: 00000041 62020000 00000000 80000000 *...........bA...*
  0180: 00000000 00000000 00000000 00000000 *................*
  0190: 00000000 000140D0 00000001 423E4C1C *.L>B.....@......*
  01A0: 00000000 00000000 FFFFFF00 0001CD4F *O...............*
  01B0: FFFFFFFF F8F7FEFF FFFFFFFF F7FFEFFF *................*
  01C0: FFFFFFF0 05FFFFFF 00000000 00009F9F *................*
  01D0: 00000000 00000000 FFFFFF00 1CAB795F *_y..............*
  01E0: FFFFFFFF 80000080 00000000 00000000 *................*
  01F0: 00000000 00000B93 00000000 00000010 *................*
  0200: 00000000 0EE28FC0 00000000 0000F3F3 *................*
  0210: 00000000 07060000 00000000 58000000 *...X............*
  0220: 00000000 00000000 00000000 0000E002 *................*
  0230: 003C7E25 00000000 FFFFFFFF 80140000 *............%~<^*
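
A note for anyone reading the raw dump by hand: the "Byte Order" header says each row prints bytes 15 down to 0 from left to right, so the rightmost hex word holds the lowest-addressed bytes. The Python sketch below is one possible way to turn a row back into bytes in address order, the printable-ASCII column, and the two 64-bit quadwords. The function name decode_dump_line and the little-endian quadword interpretation are assumptions made for illustration; they are not part of any errorlog tool.

```python
def decode_dump_line(line):
    """Decode one 'OFFSET: w3 w2 w1 w0 *ascii*' row of the raw event dump.

    Returns (offset, raw_bytes, ascii_text, quadwords), where raw_bytes[0]
    is byte 0 of the row (the rightmost byte printed in the hex columns).
    """
    offset_text, rest = line.split(":", 1)
    # Keep only the hex words; drop the *...* ASCII column if present.
    words = rest.split("*", 1)[0].split()
    # The row is printed byte 15 first, so reverse to get address order.
    raw = bytes.fromhex("".join(words))[::-1]
    # Non-printable bytes are shown as '.', as in the dump itself.
    text = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in raw)
    # Assumption: each row holds two little-endian 64-bit quadwords.
    quads = [int.from_bytes(raw[i:i + 8], "little") for i in range(0, len(raw), 8)]
    return int(offset_text.strip(), 16), raw, text, quads


if __name__ == "__main__":
    row = "  0090: 00000000 00000000 00000000 F38A427F"
    offset, raw, text, quads = decode_dump_line(row)
    # The low quadword at offset 0x90 matches the first "pal temp" value
    # printed in /var/adm/messages.
    print(hex(offset), [f"{q:016x}" for q in quads])
```

Decoded this way, the quadwords starting at offset 0x90 line up with the "pal temp" values in the /var/adm/messages output, which suggests the messages are a formatted view of the same machine check frame.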



**** V3.3 ********************* ENTRY 1388 ********************************


Logging OS 2. Digital UNIX
System Architecture 2. Alpha
Event sequence number 318.
Timestamp of occurrence 25-FEB-2002 17:09:39
Host name clalf2

System type register x0000001E Systype 30. (Miata)
Number of CPUs (mpnum) x00000001
CPU logging event (mperr) x00000000

Event validity 1. O/S claims event is valid
Event severity 1. Severe Priority
Entry type 302. ASCII Panic Message Type
                                  -1. - (minor class)

SWI Minor class 9. ASCII Message
SWI Minor sub class 1. Panic

ASCII Message panic (cpu 0): System Uncorrectable
                                      Machine Check




Feb 25 17:22:05 hostna vmunix: Machine Check SYSTEM Fatal Abort
Feb 25 17:22:05 hostna vmunix: Machine Check Code = 20f
Feb 25 17:22:05 hostna vmunix: PCI master abort error
Feb 25 17:22:05 hostna vmunix: pal temp[0-1] = 00000000f38a427f 0000000000000000
Feb 25 17:22:05 hostna vmunix: pal temp[2-3] = fffffc00004c8a50 0000000000005200
Feb 25 17:22:05 hostna vmunix: pal temp[4-5] = 0000000000000257 0000000000000000
Feb 25 17:22:06 hostna vmunix: pal temp[6-7] = 0000000100000016 fffffc00004c8310
Feb 25 17:22:06 hostna vmunix: pal temp[8-9] = 1f1e161514020100 fffffc00004c8790
Feb 25 17:22:06 hostna vmunix: pal temp[10-11] = fffffc00004cc1d0 fffffc00004c8600
Feb 25 17:22:06 hostna vmunix: pal temp[12-13] = fffffc00004c89c0 fffffffffff8c800
Feb 25 17:22:06 hostna vmunix: pal temp[14-15] = 0000000000f00270 0000000000f0380c
Feb 25 17:22:06 hostna vmunix: pal temp[16-17] = 0000020f06600001 0000000000000000
Feb 25 17:22:06 hostna vmunix: pal temp[18-19] = 000000011ffff090 ffffffffa3e6fa38
Feb 25 17:22:06 hostna vmunix: pal temp[20-21] = 000000000b804000 fffffc00004c89f0
Feb 25 17:22:06 hostna vmunix: pal temp[22-23] = fffffc00006a7570 000000000d53fa38
Feb 25 17:22:06 hostna vmunix: shadow[0-1] = 0000000000000000 0000000000000000
Feb 25 17:22:06 hostna vmunix: shadow[2-3] = 0000000000000000 0000000000000000
Feb 25 17:22:06 hostna vmunix: shadow[4-5] = 0000000000000000 0000000000000000
Feb 25 17:22:06 hostna vmunix: shadow[6-7] = 0000000000000000 0000000000000000
Feb 25 17:22:06 hostna vmunix: Address of excepting instruction = fffffc00004cc1d0
Feb 25 17:22:06 hostna vmunix: Summary of arithmetic traps = 0000000000000000
Feb 25 17:22:06 hostna vmunix: Exception mask = 0000000000000000
Feb 25 17:22:06 hostna vmunix: Base address for PALcode = 0000000000018000
Feb 25 17:22:06 hostna vmunix: Interrupt Status Reg = 0000000080000000
Feb 25 17:22:07 hostna vmunix: CURRENT SETUP OF EV5 IBOX = 0000004162020000
Feb 25 17:22:07 hostna vmunix: I-CACHE Reg Tag parity error = 0000000000000000
Feb 25 17:22:07 hostna vmunix: D-CACHE error Reg = 0000000000000000
Feb 25 17:22:07 hostna vmunix: Effective VA = 00000001423e4c1c
Feb 25 17:22:07 hostna vmunix: reason for D-stream = 00000000000140d0
Feb 25 17:22:07 hostna vmunix: EV5 Secondary Cache address = ffffff000001cd4f
Feb 25 17:22:07 hostna vmunix: EV5 Secondary Cache TAG/Data parity = 0000000000000000
Feb 25 17:22:07 hostna vmunix: EV5 BC_TAG_ADDR = fffffffff7ffefff
Feb 25 17:22:07 hostna vmunix: EV5 EI_STAT_ADDR Phys addr of Xfer = fffffffff8f7feff
Feb 25 17:22:07 hostna vmunix: Fill Syndrome = 0000000000009f9f
Feb 25 17:22:07 hostna vmunix: EI_STAT reg = fffffff005ffffff
Feb 25 17:22:07 hostna vmunix: LD_LOCK = ffffff001cab795f
Feb 25 17:22:07 hostna vmunix: PYXIS_DMA_DATA = 0000000000000000
Feb 25 17:22:07 hostna vmunix: CIA/PYXIS ERR = ffffffff80000080
Feb 25 17:22:07 hostna vmunix: PCI BUS Master state machine generated Master Abort
Feb 25 17:22:07 hostna vmunix: CIA/PYXIS ERR STAT = 0000000000000010
Feb 25 17:22:07 hostna vmunix: CIA/PYXIS ERR MASK = 0000000000000b93
Feb 25 17:22:08 hostna vmunix: CIA/PYXIS ECC_SYN = 000000000000f3f3
Feb 25 17:22:08 hostna vmunix: CIA/PYXIS MEM ERR0 = 000000000ee28fc0
Feb 25 17:22:08 hostna vmunix: CIA/PYXIS MEM ERR1 = 0000000058000000
Feb 25 17:22:08 hostna vmunix: CIA/PYXIS PCI ERR0 = 0000000007060000
Feb 25 17:22:08 hostna vmunix: CIA/PYXIS PCI ERR1 = 000000000000e002
Feb 25 17:22:08 hostna vmunix: ISA bridge NMI status & control = 0000000000000000
Feb 25 17:22:08 hostna vmunix: CIA/PYXIS PCI ERR2 = ffffffff80140000
Feb 25 17:22:08 hostna vmunix: panic (cpu 0): System Uncorrectable Machine Check
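
To compare many such entries (for example the few hundred earlier ones), it may help to pull the register values out of the messages file mechanically. The following Python sketch assumes the "name = hex value(s)" layout seen above. The names parse_registers and set_bits are hypothetical helpers, and masking to the low 32 bits assumes the leading ffffffff is sign extension of a 32-bit register. It only lists which bits are set; what each bit means has to come from the chipset documentation.

```python
import re

# Matches e.g. "... vmunix: CIA/PYXIS ERR = ffffffff80000080"
# and pairs such as "pal temp[2-3] = fffffc00004c8a50 0000000000005200".
LINE_RE = re.compile(
    r"vmunix: (?P<name>.+?) = (?P<values>[0-9a-f]+(?: [0-9a-f]+)*)\s*$"
)


def parse_registers(lines):
    """Map each register name to its list of integer values."""
    regs = {}
    for line in lines:
        match = LINE_RE.search(line)
        if match:  # lines without "name = value" are skipped
            regs[match.group("name")] = [
                int(v, 16) for v in match.group("values").split()
            ]
    return regs


def set_bits(value, width=32):
    """Return the positions of the set bits in the low `width` bits."""
    value &= (1 << width) - 1  # assumption: upper bits are sign extension
    return [bit for bit in range(width) if (value >> bit) & 1]


if __name__ == "__main__":
    sample = [
        "Feb 25 17:22:05 hostna vmunix: Machine Check Code = 20f",
        "Feb 25 17:22:05 hostna vmunix: PCI master abort error",
        "Feb 25 17:22:07 hostna vmunix: CIA/PYXIS ERR = ffffffff80000080",
    ]
    regs = parse_registers(sample)
    for name, values in regs.items():
        print(name, [hex(v) for v in values], [set_bits(v) for v in values])
```

Running the same parser over each logged machine check and comparing the sets of bits would show whether the earlier entries carry the same error bits as the one that caused the panic.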
Received on Tue Feb 26 2002 - 15:05:52 NZDT
