mysterious HSZ 40/50/70 problems

From: <Peter.Braack_at_degussa.de>
Date: Thu, 23 Apr 1998 15:24:57 +0200

Dear managers,

This is only a brief description; I don't want to waste bandwidth in case
nobody has any ideas:

We have 14 HSZ40/50/70 controllers connected to various Alphas (1000, 2100,
4100), with or without TruCluster. We use DU 3.2d and 4.0b with all patches.
HSOF is V3.1-Z4, V5.1-Z4 and V7.0. All HSZ40/50 are single controllers; the
HSZ70s are dual-redundant.

During the last 3 months we had five crashes due to inaccessible units (some
RAID sets, some single disks) on HSZs. The CLI was dead and hszterm didn't
work either. Only one or more resets brought the HSZ back to life. I can
provide some logs, but mostly it happens without a trace.

Each affected HSZ was replaced by Digital, and the failure never occurred
again. So far Digital has no clue what was happening, but they seem to be
suspicious of our console manager setup. Every server and HSZ console port is
connected to a DECserver 700 (we have more than one, and the failures occurred
on nodes connected to different DECservers) and is managed from a single host
with the Polycenter Console Manager V1.7.

Has anyone heard about similar problems?

Thanks in advance,
Peter

Peter Braack
Unix System Administrator
Degussa AG, Frankfurt, Germany
Peter.Braack_at_degussa.de
Received on Thu Apr 23 1998 - 15:27:28 NZST
