Hello managers ...
I am having some problems with two Tru64 5.0a boxes, and a shared disk 
array between them.  I'll try to make it brief, but if someone has any 
insight as to what could be causing these troubles, I would be grateful.
The shared disk array [six disks, striped and mirrored] is attached to both 
boxes, and we are running an ad-hoc cluster environment using 
heartbeat.  One box is the master and has the disk array, or shelf [using 
AdvFS and LSM], imported and mounted, serving user email, home directories, 
etc.
Our problem is that the shelf keeps becoming unavailable.  It still shows as 
mounted when you run df, but if you try to cd into one of the directories 
that reside on the shelf, you get a 'Permission Denied' error.  I can 
deport and then re-import the shelf, and it becomes available again, but 
only for a few minutes.
I have previously deleted and then recreated the AdvFS file domains and 
file sets, then restored from tape, and it ran for about 24 hours.  Since 
the first crash after that, it will only run for a few minutes.  I am 
running verify as I type.
I have checked the uerf logs but I don't see anything that jumps out at me.  
There are some CAM SCSI errors [samples below], but I don't know if they are 
indicative of the problem.  I am thinking maybe a bad disk, but would 
appreciate any advice.
Thanks for your time!
Aaron G. Sword
SunLit Surf
Example 1 - this machine usually has the shelf imported and in use
********************************* ENTRY    16. *********************************
----- EVENT INFORMATION -----
EVENT CLASS                             ERROR EVENT
OS EVENT TYPE                  199.     CAM SCSI
SEQUENCE NUMBER                287.
OPERATING SYSTEM                        DEC OSF/1
OCCURRED/LOGGED ON                      Fri Jul 20 09:14:12 2001
OCCURRED ON SYSTEM                      wickb
SYSTEM ID                 x00060009     CPU TYPE:  DEC 2100
SYSTYPE                   x00000000
PROCESSOR COUNT                  3.
PROCESSOR WHO LOGGED      x00000000
----- UNIT INFORMATION -----
CLASS                         x0037
SUBSYSTEM                     x0000     DISK
BUS #                         x0002
--------------------
Example 2 - this machine usually has the shelf imported and in use
********************************* ENTRY    21. *********************************
----- EVENT INFORMATION -----
EVENT CLASS                             ERROR EVENT
OS EVENT TYPE                  199.     CAM SCSI
SEQUENCE NUMBER                225.
OPERATING SYSTEM                        DEC OSF/1
OCCURRED/LOGGED ON                      Thu Jul 19 23:34:30 2001
OCCURRED ON SYSTEM                      wickb
SYSTEM ID                 x00060009     CPU TYPE:  DEC 2100
SYSTYPE                   x00000000
PROCESSOR COUNT                  3.
PROCESSOR WHO LOGGED      x00000000
----- UNIT INFORMATION -----
CLASS                         x0022     DEC SIM
SUBSYSTEM                     x0000     DISK
BUS #                         x0000
                               x0000     LUN x0
                                         TARGET x0
Received on Sat Jul 21 2001 - 04:19:47 NZST