SUMMARY: Strange RAID problem

From: David Gadbois <gadbois_at_cyc.com>
Date: Wed, 26 Apr 1995 18:50-0500

The original message:

    Date: Fri, 7 Apr 1995 00:01 CDT
    From: David Gadbois <gadbois_at_cyc>
    
    I have a 2100 running OSF/1 3.0 with 4 RZ28-VAs hooked to a SWXCR-EB
    in a RAID 0 configuration. On two occasions, the SWXCR has reported
    read errors once each on two different drives in the group. Both
    times the SWXCR marked the drive as failed and took itself offline.
    Both times I powered the system down, pulled the failed drive,
    reseated it, and used the SWXCR configuration software to mark the
    failed drive as optimal. The first time I did this the filesystem
    on the SWXCR mounted, and I was able to do a level 0 vdump of the
    whole thing.
    
The resolution is a bit embarrassing. In poking around the system box,
a colleague and I had apparently incorrectly reseated the ribbon SCSI
cable connecting the SWXCR and the drive cabinet in the front of the
box. This left the cable dangling in front of a fan so that it was
knocked about by the blades. This caused the read errors as the fan
nicked off pieces of the cable's insulation. Our CE, Jim Meyers, after
determining that the SWXCR board was not at fault, quickly
troubleshot the problem and replaced the cable with a new one after
giving me a lecture about programmers fiddling with hardware. I would
have complained about a design that would allow such a thing to happen,
but I was too busy eating crow.


Thanks to:
Todd Bedell {77299} <tbb_at_swl.msd.ray.com>
Eugene Redner <redner_at_apache.ENET.dec.com>
SYSTEM_JB_at_unode1.nswc.navy.mil
Jeff Gilman <jcg_at_swl.msd.ray.com>

--David Gadbois
Received on Wed Apr 26 1995 - 19:50:56 NZST