Hello managers,
I've been looking around the Compaq web site for insight into these error
messages but haven't had much luck. Maybe I'm looking in the wrong
place. First of all, our hardware is a three-member cluster consisting of a
GS80 and two 4100s, all connected via a Memory Channel hub. Also on a shared
SCSI bus is an HSZ50 for system and application disks. Our databases reside
on an HSG80, which is shared via a Fiber Channel hub using loop topology.
Finally, we have several crates of third-party Fiber Channel disks, also
shared via a Fiber Channel hub using loop topology. Software: we are
running Tru64 V5.1 patch_kit #2 on all but member03, which is the lead member
in a rolling upgrade to patch_kit #3, on TruCluster V5.1.
Now the problem. Our /var/adm/messages file is filling up with these
messages:
May 4 12:16:20 mem01 vmunix: DRD register failed against 157 returned 61.
May 4 12:16:20 mem01 vmunix: DRD register failed against 159 returned 61.
and occasionally you'll see a:
May 4 12:55:55 mem01 vmunix: DRD register failed against 157 returned 5.
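With 40 disks involved, a tally per HWID and return code may be easier to
share than the raw log. The following is only a minimal sketch, assuming
every message has exactly the format of the three sample lines above. The
function name tally_drd_failures is hypothetical (it is not a Tru64 tool),
and the script does not try to interpret what the return codes mean.

```python
import re
from collections import Counter

# Pattern assumed from the three sample lines above; other kernel
# versions or patch kits may word the message differently.
DRD_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]+)\s+(?P<host>\S+)\s+vmunix:\s+"
    r"DRD register failed against (?P<hwid>\d+) returned (?P<code>\d+)\."
)


def tally_drd_failures(lines):
    """Count DRD register failures per (host, HWID, return code)."""
    counts = Counter()
    for line in lines:
        match = DRD_RE.match(line.strip())
        if match:
            key = (match["host"], int(match["hwid"]), int(match["code"]))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    # The three lines quoted in this message, used as sample input.
    sample = [
        "May 4 12:16:20 mem01 vmunix: DRD register failed against 157 returned 61.",
        "May 4 12:16:20 mem01 vmunix: DRD register failed against 159 returned 61.",
        "May 4 12:55:55 mem01 vmunix: DRD register failed against 157 returned 5.",
    ]
    for (host, hwid, code), n in sorted(tally_drd_failures(sample).items()):
        print(f"{host} HWID {hwid} returned {code}: {n}")
```

To run it against the real log, the sample list could be replaced with the
lines of /var/adm/messages, for example by passing an open file object to
tally_drd_failures.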
We've gotten them before, on all three cluster members, but only during
system boot, and never this many. They are occurring on member01 (the GS80),
which was booted first. The other two members of the cluster (the 4100s)
got a few of these during boot, then they stopped; on member01 they
continue. The 157 and 159 are the HWIDs of two of the third-party Fiber
Channel disks. The rest of the disks show up as well, but since we have 40 I
didn't list all the error messages, or my mail would have been a mile long.
The disks are flashing as these messages occur.
Furthermore, I can read data from these disks on all members without
problems, and writes occur at normal speeds on some of the disks, but even
the smallest write to others can take up to a minute to complete.
Does anybody know what the DRD subsystem might be having trouble with?
Are these error messages documented anywhere? The errors usually stop; this
time, why did they stop on the other members but continue to occur on this
member? Can I stop them, and if so, how?
Any help would be appreciated.
Jim Fitzmaurice
jpfitz_at_fnal.gov
UNIX is very user friendly. It's just very particular about who it makes
friends with.
Received on Fri May 04 2001 - 18:33:55 NZST