What do these CPU exceptions mean?

From: Scott Brewster <scott_at_sessb.its.dias.qut.edu.au>
Date: Thu, 16 Sep 1999 12:29:07 +0000

Hi,

We have a machine reporting CPU exceptions (a list of recent exceptions is
attached at the end of the message). What do these exceptions mean?
As far as I can tell, they have caused the machine to crash at least once.

After the crash, Compaq replaced the motherboard unit (which has everything
except the main memory and PCI cards), but the CPU exceptions persist.
Could the main memory be faulty?

Type of machine: Digital Personal WorkStation 600au
OS: Digital Unix 4.0D PK3
Firmware revision: 7.0-10
Memory: 4 x 256 MB (1 GB total)
Disks: 3 channel SWXCR RAID controller + internal SCSI bus

Scott

--------

Near the time of the crash the exceptions were happening more often,
perhaps ten or twenty that day, but now they are occurring once every few
days.

dia reports the exceptions like this (the values in the Entry body field
sometimes change):

******************************** ENTRY 1 ********************************


Logging OS 2. Digital UNIX
System Architecture 2. Alpha
Event sequence number 2.
Timestamp of occurrence 16-SEP-1999 05:17:26
Host name bluejay

System type register x0000001E Systype 30. (Miata)
Number of CPUs (mpnum) x00000001
CPU logging event (mperr) x00000000

Event validity 1. O/S claims event is valid
Event severity 1. Severe Priority
Entry type 100. CPU Machine Check Errors

CPU Minor class 3. Processor Correctable Error (630)

Entry Body Size: x00000068
Entry body:

          15--<-12 11--<-08 07--<-04 03--<-00 :Byte Order
 0000: 00000038 00000018 80000000 00000068 *h...........8...*
 0010: FFFFFF00 33C8CF4F 00000000 00000086 *........O..3....*
 0020: FFFFFFF0 C5FFFFFF 00000000 00001A00 *................*
 0030: 00000000 00000000 00000001 00000000 *................*
 0040: 00000000 00000000 00000000 00000000 *................*
 0050: 00000000 00000000 00000000 00000000 *................*
 0060: 5E3C7E25 00000000 * ....%~<^*
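
For anyone trying to read the entry body by hand: the "Byte Order" header
shows that each row is printed with byte 15 on the left and byte 0 on the
right, while the ASCII column on the right runs from byte 0 upward. The
following is only a rough sketch of how one dump row could be turned back
into bytes in ascending address order. The function names are made up for
illustration, and the row layout is assumed to be exactly as printed above.

```python
def decode_dump_line(line):
    """Return (offset, bytes in ascending address order) for one dump row."""
    # Drop the trailing "*...*" ASCII column, keep "offset: word word ...".
    hex_part = line.split("*", 1)[0]
    offset_text, words_text = hex_part.split(":", 1)
    offset = int(offset_text.strip(), 16)
    # The leftmost word holds the highest-addressed bytes, so join the
    # words as printed and then reverse the whole byte string.
    data = bytes.fromhex("".join(words_text.split()))[::-1]
    return offset, data


def printable(data):
    """Render bytes the way the ASCII column does ('.' for non-printables)."""
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in data)


def quadwords(data):
    """Split the row into little-endian 64-bit values (Alpha is little-endian)."""
    return [int.from_bytes(data[i:i + 8], "little") for i in range(0, len(data), 8)]


if __name__ == "__main__":
    row = " 0000: 00000038 00000018 80000000 00000068 *h...........8...*"
    offset, data = decode_dump_line(row)
    print(hex(offset), printable(data), [hex(q) for q in quadwords(data)])
```

Comparing the output of printable() with the ASCII column of the same row
is a quick check that the row was read in the right order. What the
individual quadwords mean is not covered here.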


At the time of the crash:

******************************** ENTRY 27 ********************************


Logging OS 2. Digital UNIX
System Architecture 2. Alpha
Event sequence number 14.
Timestamp of occurrence 02-SEP-1999 18:33:33
Host name bluejay

System type register x0000001E Systype 30. (Miata)
Number of CPUs (mpnum) x00000001
CPU logging event (mperr) x00000000

Event validity 1. O/S claims event is valid
Event severity 1. Severe Priority
Entry type 302. ASCII Panic Message Type

SWI Minor class 9. ASCII Message
SWI Minor sub class 1. Panic

ASCII Message panic (cpu 0): Processor Machine Check
                                       

******************************** ENTRY 28 ********************************


Logging OS 2. Digital UNIX
System Architecture 2. Alpha
Event sequence number 13.
Timestamp of occurrence 02-SEP-1999 18:33:33
Host name bluejay

System type register x0000001E Systype 30. (Miata)
Number of CPUs (mpnum) x00000001
CPU logging event (mperr) x00000000

Event validity 1. O/S claims event is valid
Event severity 1. Severe Priority
Entry type 100. CPU Machine Check Errors

CPU Minor class 1. Processor Uncorrectable Error (670)

Entry Body Size: x00000208
Entry body:

          15--<-12 11--<-08 07--<-04 03--<-00 :Byte Order
 0000: 000001A0 00000118 00000000 000002C0 *................*
 0010: 00000000 00000000 00000000 00000098 *................*
 0020: 00000000 00000000 00000000 00000000 *................*
 0030: 00000000 00000000 00000000 00000000 *................*
 0040: 00000000 00000000 00000000 00000000 *................*
 0050: FFFFFFFF A85C4000 00000000 00000000 *.........@\.....*
 0060: FFFFFC00 003FA9D0 00000000 000002B8 *..........?.....*
 0070: 00000000 00000400 00000000 00005200 *.R..............*
 0080: 00000000 00000000 FFFFFFFF A85C7838 *8x\.............*
 0090: 1F1E1615 14020100 FFFFFC00 003FA2F0 *..?.............*
 00A0: FFFFFC00 003F9818 FFFFFC00 003FA710 *..?.......?.....*
 00B0: FFFFFC00 003FA940 FFFFFC00 003FA570 *p.?.....@.?.....*
 00C0: 00000000 00F00270 FFFFFFFF FFF8DA00 *........p.......*
 00D0: 00000098 06700009 00000000 00F0380C *.8........p.....*
 00E0: 00000000 11FFD980 00000000 00000000 *................*
 00F0: 00000000 39018000 FFFFFFFF A85C75D0 *.u\........9....*
 0100: FFFFFC00 00561FE0 FFFFFC00 003FA970 *p.?.......V.....*
 0110: FFFFFC00 003F9818 00000000 05C3BA38 *8.........?.....*
 0120: 00000000 00000000 00000000 00000000 *................*
 0130: 00000000 00000000 00000000 00018000 *................*
 0140: 00000000 00000000 00000041 62020000 *...bA...........*
 0150: FFFFFFFF FF8000A0 00000000 00000000 *................*
 0160: FFFFFF00 0001D04F 00000000 00014890 *.H......O.......*
 0170: FFFFFF80 2D8D6FFF 00000000 00000000 *.........o.-....*
 0180: 00000000 00000C00 FFFFFF00 1961227F *."a.............*
 0190: FFFFFF00 1961227F FFFFFFF9 45FFFFFF *...E....."a.....*
 01A0: 00000000 00400000 00000000 00000000 *..........@.....*
 01B0: 00000000 00000000 00000000 00000000 *................*
 01C0: 00000000 020C0000 00000000 00000B93 *................*
 01D0: 00000000 58910000 00000000 0001D540 *@..........X....*
 01E0: 00000000 00008240 00000000 02010002 *........@.......*
 01F0: 00000000 00008240 00000000 00000000 *........@.......*
 0200: 5E3C7E25 00000000 * ....%~<^*
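
To put numbers on how often the exceptions occur, the dia report could be
saved to a text file and the entries counted per day. This is a minimal
sketch and not a dia feature. It assumes every entry has a "Timestamp of
occurrence" line followed later by an "Entry type" line, as in the entries
above. The function name and the default of entry type 100 are choices made
for this example.

```python
import re
from collections import Counter
from datetime import date

MONTHS = {"JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
          "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12}

# Field labels copied from the dia entries shown above.
TIMESTAMP = re.compile(r"Timestamp of occurrence\s+(\d{2})-([A-Z]{3})-(\d{4})")
ENTRY_TYPE = re.compile(r"Entry type\s+(\d+)\.")


def events_per_day(report_text, entry_type=100):
    """Count entries of the given type per calendar day."""
    counts = Counter()
    current_day = None
    for line in report_text.splitlines():
        m = TIMESTAMP.search(line)
        if m:
            day, month, year = m.groups()
            current_day = date(int(year), MONTHS[month], int(day))
            continue
        m = ENTRY_TYPE.search(line)
        if m and current_day is not None:
            if int(m.group(1)) == entry_type:
                counts[current_day] += 1
            current_day = None  # wait for the next entry's timestamp
    return counts


if __name__ == "__main__":
    import sys
    with open(sys.argv[1]) as report:
        for day, n in sorted(events_per_day(report.read()).items()):
            print(day.isoformat(), n)
```

The script takes the name of the saved report as its only argument and
prints one line per day with the number of entry type 100 events.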


uerf reports much less information, typically something like:

********************************* ENTRY 1. *********************************

----- EVENT INFORMATION -----

EVENT CLASS ERROR EVENT
OS EVENT TYPE 100. CPU EXCEPTION
SEQUENCE NUMBER 2.
OPERATING SYSTEM DEC OSF/1
OCCURRED/LOGGED ON Thu Sep 16 05:17:26 1999
OCCURRED ON SYSTEM bluejay
SYSTEM ID x0007001E
SYSTYPE x00000000

----- UNIT INFORMATION -----

UNIT CLASS CPU
Received on Thu Sep 16 1999 - 02:44:34 NZST
