Problems with swxcr utilities

From: Jim Fitzmaurice <jpfitz_at_fnal.gov>
Date: Tue, 24 Aug 1999 15:17:49 -0500

Managers,

        I have a 4100 running 4.0D and a few questions, all swxcr related.

        I have a bad drive in my SWXCR array. I know there is a problem with the
drive because it unmounted itself, and any attempt to re-mount it results in
an I/O error. However, swxcrmgr shows all drives as "Optimal". It does report
the drive in question as having "127 miscellaneous errors", yet the drive is
still labeled optimal. Any ideas why?

        I'm also attempting to get swxcrmon 1.1.24.3 running properly on my
machine. Some of the flags are not behaving as they should (according to the
documentation). I am using:

        # swxcrmon xcr0 -l -n -m -i 10

It appears the "-l" option works, as it created a log and is logging info to
it. The "-i" option is also working: the monitoring interval is 10 minutes.
Now for the problems. The "-n" option is supposed to report an error only the
first time it occurs, yet root is getting the same mail message every 10
minutes. Then there is the "-m" option, which controls who gets mail. You
have to create a file, /usr/opt/swxcr/swxcrmon.maillist, and put in it a list
of who you want to get swxcrmon mail. I have created the file with only my
address in the list, yet mail is going only to root. That is the default if
you don't create the file. (I checked for typos; there are none.) Any ideas
why?
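
One way to rule out problems with the mail list file itself is a small shell
check like the sketch below. It is not part of the SWXCR utilities:
check_maillist is a hypothetical helper name, and the idea that swxcrmon needs
a readable regular file free of stray characters is an assumption, not
something taken from the documentation. It uses only standard commands
(ls, od).

```shell
# Hypothetical helper, not part of the SWXCR utilities.
# Checks that the mail list file exists and is readable, then dumps it
# so that hidden characters become visible.
check_maillist() {
    # Default path is the one named in this message; pass another to override.
    file="${1:-/usr/opt/swxcr/swxcrmon.maillist}"
    if [ ! -f "$file" ]; then
        echo "missing: $file"
        return 1
    fi
    if [ ! -r "$file" ]; then
        echo "not readable: $file"
        return 1
    fi
    # Show owner and permissions.
    ls -l "$file"
    # od -c prints every byte, so carriage returns (\r) and other
    # stray characters show up explicitly.
    od -c "$file"
    echo "ok: $file"
}
```

Calling check_maillist with no argument checks
/usr/opt/swxcr/swxcrmon.maillist. A \r in the od output would point to
DOS-style line endings, which a visual check for typos would not reveal.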

        Finally, these are the errors reported to root and in the log by swxcrmon:

        The hard disk at channel 0, target 0 had 127 miscellaneous errors.
        Shelf failure on channel 0.
        Shelf failure on channel 1.
        Shelf failure on channel 2.
        Shelf failure on channel 2.

The last error is the only one that repeats every 10 minutes. All other
drives in the array seem to work O.K. The first error makes perfect sense,
since I already knew the drive had a problem, but what do these "Shelf
failure" messages mean, and why does the one on channel 2 keep repeating? I
haven't located any documentation for evaluating error messages.

        Is anyone else using a SWXCR RAID array, and has anyone seen anything
similar? Any help would be greatly appreciated. Thanks.

Jim Fitzmaurice
jpfitz_at_fnal.gov

UNIX is very user friendly. It's just very particular about who it makes
friends with.
Received on Tue Aug 24 1999 - 20:18:49 NZST
