Power supply confusion.

From: Lawrence, Kenneth E ERDC-CHL-MS Contractor <Kenneth.E.Lawrence_at_erdc.usace.army.mil>
Date: Wed, 09 Jul 2003 16:58:05 -0500

Hello Admins,

I have an AS4100 running 4.0E with patches.
It has all 4 CPUs, 2 GB of memory,
and 2 power supplies (specifically 0 and 2).

The system panicked this morning.
The "show power" command at the SRM prompt points to an intermittent problem
with power supply 2. I am willing to believe this.

BUT, dia says:


  System type register x00000016 Alpha 4000/1200 Series
  Number of CPUs (mpnum) x00000004
  CPU logging event (mperr) x00000000

  Event validity 1. O/S claims event is valid
  Event severity 1. Severe Priority
  Entry type 100. CPU Machine Check Errors

  CPU Minor class 2. 660 Entry

  Software Flags x0000000300000000
                                       IOD 0 Register Subpkt Pres
                                       IOD 1 Register Subpkt Pres
  Active CPUs x0000000F
  Hardware Rev x00000000
  System Serial Number <Deleted>
  Module Serial Number
  Module Type x0000
  System Revision x00000000

  Machine Check Reason x0208 Fatal Environmental Event Interrupt
                                       
                                       
  Environmental Entry ---> System Environmental Register Follows

  ======================== =====================================


  Sys Environmental Regs x000017CB Function Reg<15:8>: x00000017
                                       Failure Reg <7:0> : x000000CB
>> Invalid Pwr Supply 0 Status Bits Sequence
>> Power Supply 1 Present and Ok
>> Invalid Pwr Supply 2 Status Bits Sequence
                                       System Fans are OK
>> PROBLEM with CPU Fan 0 and 2
                                       Temperature is OK

  PALcode Revision Palcode Rev: 1.21-26

As you can see, the decoded Environmental Register bits claim that power
supplies 0 and 2 have invalid status bits and power supply 1 (which doesn't
exist) is present and ok.
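
For anyone who wants to check the raw value by hand, here is a minimal sketch
in Python that splits the register the same way the dia output labels it:
Function Reg in bits <15:8> and Failure Reg in bits <7:0>. The function name
split_env_register is made up for illustration. The meaning of the individual
bits is not shown in the dia output above, so the sketch only prints them and
does not try to interpret them.

```python
def split_env_register(value):
    """Split the 16-bit environmental register into its two byte fields.

    Field boundaries are taken from the dia output: Function Reg<15:8>
    and Failure Reg<7:0>. The function name is hypothetical.
    """
    function_reg = (value >> 8) & 0xFF  # bits 15..8
    failure_reg = value & 0xFF          # bits 7..0
    return function_reg, failure_reg


# Raw value reported by dia as "Sys Environmental Regs x000017CB"
function_reg, failure_reg = split_env_register(0x17CB)

print(f"Function Reg<15:8>: x{function_reg:08X}")
print(f"Failure Reg <7:0> : x{failure_reg:08X}")

# Individual bits, most significant first. Per-bit meanings are not
# given in the dia output, so none are assumed here.
print(f"Function Reg bits : {function_reg:08b}")
print(f"Failure Reg bits  : {failure_reg:08b}")
```

This reproduces the x00000017 and x000000CB values dia printed, so the split
itself looks consistent. Whether each bit is decoded correctly is a separate
question.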

So the real question is: does this suggest that the monitoring circuits are
giving false readings? Or should I trust the SRM and buy a replacement power
supply?

As a follow-up question: does anyone know where I can find a quality, yet
inexpensive, replacement power supply?


Thanks!!

Ken Lawrence <mailto:lawrenk_at_wes.army.mil>
BAE Systems
USAE-Engineer Research & Development Center
Coastal & Hydraulics Lab
Unix System Administrator?
601.634.3813
Received on Wed Jul 09 2003 - 21:58:48 NZST
