SUMMARY: Halt Code on AlphaStation 255/233

From: Robert Benites <benites_at_cs.unca.edu>
Date: Mon, 26 Jun 2000 14:05:00 -0400 (EDT)

Thanks to those who responded:

Joe Fletcher <joe_at_meng.ucl.ac.uk>
Dr. Tom Blinn <tpb_at_doctor.zk3.dec.com>
John James <jrjame_at_acxiom.com>

My original post is at the very end of this message. I neglected to
mention in my initial post that I had used both uerf and the event
viewer to check the binary errorlog. There were no entries for this
crash recorded there. Also, I do examine the error log on a regular
basis, and it was the first place I looked.

I've included Tom Blinn's informative response below, but I fear my
case falls into the category he describes as a "fatal error," for
which there is "no recovery."

Since the PALcode went to the trouble of reporting this as a
"Halt code 7" (as opposed to some other code value, such as 2), I
assumed the various codes were documented. I guess not...

Thanks...

-- bb

> "PAL" is the privileged architecture library -- it's the really low
> level code (delivered as part of the console firmware) that deals
> with hardware at a very low level, for example, in interrupt
> handling, memory management and the like.
>
> While the CPU was running in "PAL mode" (that is, running this
> low-level code), there was a hardware error detected by circuitry in
> the CPU. Since there are no error recovery routines implemented in
> the PALcode itself (it relies on the operating system to help with
> error recovery) the system just halted (it's a fatal error, there is
> no recovery).
>
> Since the system spends almost no time running PALcode, in the usual
> case an intermittent CPU error will not make the system halt;
> instead it will be captured in the error logs. I assume you don't
> examine the error log on a regular basis, or that's the first place
> you might have looked.
>
> Be that as it may, you might check the error logs (use uerf, dia,
> or ca, depending on what you've got installed; uerf probably
> supports your old hardware, dia might, ca probably doesn't).
>
> You might never find out what caused the problem, but it's a
> hardware fault and it may start recurring if it's due to a marginal
> component somewhere in the system.
>
> Tom

-------------------------<Original post follows>-------------------------

> I have an AlphaStation 255/233 running:
>
> Tru64 UNIX 5.0 (with patch kit #2 installed)
> Firmware revision: 7.0
> PALcode: UNIX version 1.46
>
> I came in this morning to find the hardware console with the following
> messages:
>
> Halt code 7
> Machine check while in PAL Mode
> PC=19364
>
> I can't find anything in the Tru64 managers archives or at Compaq's
> web site.
>
> The system booted fine, but I'd still like to know what caused the
> problem.
Received on Mon Jun 26 2000 - 18:06:14 NZST
