SUMMARY: ES40: severe trouble (continuously reboots unexpectedly)

From: MMP Wolfgang Rupp <wolfgang.rupp_at_mm-packaging.com>
Date: Tue, 27 Nov 2001 15:06:00 +0100

A late summary, but Compaq finally figured it out.

----
The machine was an ES40 ser. 2 with 4xEV6.7/667. The problems were
continuous reboots and the following in /var/adm/messages:
Nov  7 14:41:05 dubaan2 vmunix: panic (cpu 0): kn600_softerr_intr():
 Correctable  Errors not supported on EV6 pass 1
----
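For anyone wanting to check how often a machine has hit this panic, a
minimal sketch along these lines could scan the messages file. The
pattern is an assumption based only on the single log line quoted
above, and the function name find_softerr_panics is made up for this
illustration; other systems may log the panic differently.

```python
import re

# Assumed pattern, derived only from the one log line quoted above.
PANIC_RE = re.compile(r"panic \(cpu (\d+)\): kn600_softerr_intr\(\)")


def find_softerr_panics(lines):
    """Return a (timestamp, host, cpu) tuple for each matching panic line.

    Hypothetical helper: assumes syslog-style lines that begin with
    "Mon DD HH:MM:SS host ...".
    """
    hits = []
    for line in lines:
        match = PANIC_RE.search(line)
        if match is None:
            continue
        fields = line.split()
        timestamp = " ".join(fields[:3])  # month, day, time
        host = fields[3]
        hits.append((timestamp, host, int(match.group(1))))
    return hits
```

It could be run against the log with something like
find_softerr_panics(open("/var/adm/messages", errors="replace")).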
Short version: it was NOT a hardware problem but a firmware problem.
Detailed version:
Most suggestions were that this was a hardware problem, and that
we had EV6 pass 1 CPUs. Compaq said that although the error indicates
the CPUs are pass 1, EV6.7/667 cannot be pass 1.
It turned out to be a firmware problem. When CPU 0 was replaced
after a "real" failure, the new one was pass 2.6 (instead of 2.02,
I think). No firmware update had been made with the CPU swap. The
older firmware did not recognise the new CPU and reported it to the
system as "unknown EV6", which led to the errors and crashes we
experienced.
They upgraded the firmware to Version 6. Now the CPU is correctly
recognised at SRM level, and the machine has been stable since.
As an additional measure, we moved the new pass 2.6 CPU to a
different slot, so that one of the original CPUs is in slot 0.
Wolfgang Rupp
Received on Tue Nov 27 2001 - 14:08:09 NZDT