To answer my own question, it's as easy as:
1. Removing the failed drive and replacing it with another
one, as long as the system was not shut down after the
drive failed.
2. sys_check is a script (I hadn't checked that), so one can
read it. Otherwise, the command /usr/lbin/ra200info gives
all the info I wanted.
Thanks to Robert Aldridge, Ahmed Mohamed, Bluejay Adametz,
Derk Tegeler, Werner Hahling and Andreas Maagdenberg.
Responses in full, in case they are helpful to someone else:
***************************
         Derk Tegeler 
sys_check is a script you can read.
ra200info xcr0 -f
where xcr0 is the first controller.
Rgds,
Derk
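For anyone who wants to script this, here is a minimal sketch (not something posted in the thread) that pulls the failed drives out of the report. It assumes failed drives are marked as *channel,target*, the way they are in the sys_check output quoted in the original post further down. The function name failed_drives is made up for illustration.

```shell
# failed_drives: hypothetical helper, not part of sys_check or ra200info.
# Reads a report on stdin and prints each drive marked *channel,target*
# (the failed-drive marker seen in the quoted output) once, as "channel,target".
failed_drives() {
    awk '{
        line = $0
        # Pull out every *N,N* marker on the line, left to right.
        while (match(line, /\*[0-9]+,[0-9]+\*/)) {
            print substr(line, RSTART + 1, RLENGTH - 2)
            line = substr(line, RSTART + RLENGTH)
        }
    }' | sort -u
}

# Example (assumes ra200info is installed and xcr0 is the first controller):
#   ra200info xcr0 -f | failed_drives
```

The sort -u is there because the same drive shows up in several sections of the report.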
There is an online SWXCR management utility; it also comes
on a floppy and references "unix" on the label. Install that,
and you'll have an X Windows utility that can display your
drive status and handle replacement of a failed drive.
The latest version of the utility I have isn't V5 aware, so
it looks for V4-style device names. To get it to work, I had
to figure out the device numbers for (for example) my
/dev/rdisk/dsk0c device, then create a /dev/rre0c device with
those same numbers so the utility could find the controller.
- Bluejay Adametz
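The device-node workaround described above could look roughly like this sketch. It only prints the mknod command rather than running it, so the numbers can be checked first. The helper names dev_numbers and alias_cmd are made up, the sketch assumes the usual `ls -l` layout where a device node shows "major, minor" in place of the file size, and it assumes the raw device is a character device (hence the c).

```shell
# dev_numbers: hypothetical helper. Reads one line of `ls -l` output for a
# device node on stdin and prints "major minor".
dev_numbers() {
    # Turn the comma into a space so "19, 38" and "19,38" split the same way.
    awk '{ gsub(/,/, " "); print $5, $6 }'
}

# alias_cmd: hypothetical helper. Prints (does not run) the mknod command
# that would create NEW_NODE with the same numbers as EXISTING_NODE.
# Usage: alias_cmd NEW_NODE EXISTING_NODE
alias_cmd() {
    nums=$(ls -l "$2" | dev_numbers)
    echo "mknod $1 c $nums"
}

# Example, using the device names from the message above:
#   alias_cmd /dev/rre0c /dev/rdisk/dsk0c
```

Review the printed command, then run it as root if the numbers look right.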
                                                 
                                                 
swxcrmgr is the utility on the floppy. Yes, the system would
need to be down to use swxcrmgr.
Usually with this controller there should be a Unix
"monitoring" interface (probably on another floppy), but it
would be read-only: not a utility where you can actually
manipulate the controller.
As long as the drive shows "FAILED" you _should_ be okay just
to swap it out and replace it. The controller should
automatically rebuild with no intervention. (sys_check should
indicate that the drive is rebuilding, then show it become
optimal again.)
Robert
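If you want to watch for the array coming back, a small polling loop is one way to do it. This is only a sketch: the function names are made up, and it assumes the report prints the word DEGRADED for an affected logical drive, as in the sys_check output quoted in the original post. The wording shown during a rebuild is not given in this thread, so the loop only waits for DEGRADED to disappear and may need adjusting.

```shell
# not_degraded: hypothetical helper. Returns 0 when the report on stdin
# does not contain the word DEGRADED, 1 when it does.
not_degraded() {
    if grep 'DEGRADED' >/dev/null; then
        return 1
    else
        return 0
    fi
}

# wait_for_array: hypothetical helper. Re-runs a report command every
# INTERVAL seconds until its output no longer shows DEGRADED.
# Usage: wait_for_array INTERVAL COMMAND [ARGS...]
wait_for_array() {
    interval=$1
    shift
    until "$@" | not_degraded; do
        sleep "$interval"
    done
}

# Example (assumes the command's output includes the logical drive status):
#   wait_for_array 60 /usr/lbin/ra200info xcr1 -f
```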
The only way I know of to replace the drive while the system
is online is the swxcrmgr command.
Try running swxcrmgr from the console terminal while you are
logged in as root.
If it is not there, then you have to take the system down,
run ecu, and then run swxcrmgr from a floppy.
It is very easy to replace the failed drive either way: just
"FAIL" the drive, remove it, put in a new one, then make it
OPTIMAL.
Hope that helps.
MKMA
swxcrmgr is a standalone utility to set up and configure the
SWXCR RAID controllers.
Since the machine is still running even with the dead drive
in it, one can assume that this is either RAID5 running in
degraded mode, RAID5 with a hot spare (which is now in use),
or RAID3 (a shadow set now not shadowing). I very much assume
this to be RAID5!
To replace the drive, simply pull the dead one out, wait
approx. 5 minutes (to let the controller settle all its
checks), then insert a new (or working) drive.
NOTE:
The replacement drive NEEDS to be the same capacity as the
dead one or larger (an identical type is probably best).
After replacement the recovery happens automatically; it is
very visible when looking at the drive bay.
IMPORTANT:
Do NOT under any circumstances power the system off and on.
If you do, the fault condition will be written into the
SWXCR's configuration and you will need the standalone
swxcrmgr utility to correct the situation.
good luck
regards
Werner Hahling
Systems Analyst and Support
(ex. digital Equipment)
North Queensland Newspapers
*******************************************
ORIGINAL POST:
--- Tru64 User <tru64user_at_yahoo.com> wrote:
> Greetings,
> 
> Physically saw a dead drive on a BA350 shelf. Ran
> sys_check and determined exactly which RAID5 set was
> affected.
> Quick questions:
> 1. What commands does sys_check run (or should I run
> from the command line) to get the output below?
> 2. Are there HOWTO procedures for replacing a drive
> (SWXCR) somewhere?
> 	- Does the system need to go down for this,
> or can it be done online, as with HSZ controllers?
> 3. Other command-line tools for SWXCR I should be
> aware of?
> 
> Thanks in advance.
> Richard
> 
> 
> ***** Logical Drive Information for xcr1 *****
> 
> Drive groups on this controller:
> --------------------------------
>    Group 0 : <0,0><0,1><0,2><0,3>*0,4*<0,5><0,6>
>    Group 1 : <1,0><1,1><1,2><1,3><1,4><1,5><1,6>
> 
>  Logical drives configured:
> ---------------------------
>    Logical                                Drive
>    Drive    RAID    Size      Cache       Group     Current
>    Number:  Level:  (in MB):  Policy:     Spanned:  Status:
>    ---------------------------------------------------------
>      0      5       24546     WRITE THRU    0       DEGRADED
>      1      5       24546     WRITE THRU    1       OPTIMAL
> 
> SWXCR xcr1 physical drive parameters: 
> 
> RAID Array 200 Controller Family Information Utility V1.03
> Copyright (c)1997,2000 by Compaq Computer Corporation,
> all rights reserved.
> 
> 
> 
> 
> ***** Physical Device Information for xcr1 *****
> 
> Channel,                                          Wide/     Controller
> Target:   Vendor:   Description:       Firmware:  Narrow:   Status:
> -----------------------------------------------------------------------
>  <0,0>    DEC       RZ29B    (C) DEC   0014        N        OPT
>  <0,1>    DEC       RZ29B    (C) DEC   0014        N        OPT
>  <0,2>    DEC       RZ29B    (C) DEC   0014        N        OPT
>  <0,3>    DEC       RZ29B    (C) DEC   0014        N        OPT
>  *0,4*    DEC       RZ29B    (C) DEC   0014        N        FLD
>  
>  ***** Status Check Summary for xcr1 *****
> 
>    FAILED DISKS FOUND: 1
>       Failed Disks: *0,4*
> 
Received on Tue Aug 14 2001 - 13:15:57 NZST