Hi Managers,
Over the last two weeks, our Digital AlphaPC 164LX 533 MHz machine
has crashed three times. After analyzing the binary error log and
the crash-data files, it looks like the cause is the same in
each case. In fact, the symptoms look virtually identical to a case
posted in the archives on Nov 14, 1997.
In the poster's case, the diagnosis was:
FINAL CONCLUSION: The most probable cause is over heating of the Bcache Simms.
To put it simply, in each case I see the following--exactly the same as the
referenced case, except for one slight(?) difference:
SOME INFO FROM THE CRASH DATA FILES: What you may see if you have this problem:
1) a line in the crash data file that has: EI_STAT reg = fffffff014ffffff
(in our case it appears as: EI_STAT reg = fffffff005ffffff)
2) a panic string of: "Processor Machine Check"
3) lines in succession with the following:
Machine Check Processor Fatal Abort
Machine Check Code = 98
Machine Check Code = 98
Processor detected hard error
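
The one difference noted in item 1 can be pinned down by XOR-ing the two EI_STAT values to see exactly which bits differ. Below is a minimal Python sketch, assuming the "EI_STAT reg = <16 hex digits>" line format shown above. The helper names are hypothetical, and what the differing bits actually signify would still have to be looked up in the processor's hardware reference manual.

```python
import re

# Line format taken from the crash-data excerpt above; names are illustrative only.
EI_STAT_RE = re.compile(r"EI_STAT reg = ([0-9a-fA-F]{16})")

def parse_ei_stat(line):
    """Return the EI_STAT value from a crash-data line as an int, or None."""
    m = EI_STAT_RE.search(line)
    return int(m.group(1), 16) if m else None

def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [i for i in range(64) if (diff >> i) & 1]

archived = parse_ei_stat("EI_STAT reg = fffffff014ffffff")  # value from the 1997 post
ours = parse_ei_stat("EI_STAT reg = fffffff005ffffff")      # value from our crashes

print(f"xor  = {archived ^ ours:016x}")
print("bits =", differing_bits(archived, ours))
```

For the two values quoted above, this reports an XOR of 0000000011000000, i.e. the registers differ only in bits 24 and 28.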
My question: can I therefore confidently conclude that the problem is the same as the one
diagnosed in the previously posted case? Or might there be other possibilities,
and if so, how might I diagnose them? Overheating is possible, though
the overall room environment has not changed in the past year. Perhaps a case fan
is not working properly, or could the Bcache SIMMs simply be wearing out (the
machine is 2.5 years old and does a lot of memory-intensive number crunching)?
Thanks,
Kevin Tyle <kevin_at_meso.com>
MESO, Inc.
Troy, NY USA
Received on Tue Feb 01 2000 - 19:25:15 NZDT