Hi Managers. I'd like to get people's ideas on my current situation. I've
got a Quantum Viking II 9.1 GB SCSI drive, part of an AdvFS domain, that
is causing error messages in the uerf logs. The server is running
DU4.0D. The errors started around 11:30pm last night and continued
until the system was shut down around 8:00am this morning. I
ran scu on the drive, using the verify media option, and no errors were
reported. According to error messages from various sources, it appeared
that write operations were causing the disk to fail. That seemed
consistent with scu returning no errors, since it was performing a
read-only test.
The interesting thing is, I've had the server up and running for a few
hours now, and it seems to be working fine. We've done an incremental
backup of home areas and mail, all of which sit in the domain that this
drive is a member of, and that's completed without a hitch.
I've copied the first error log entry for this drive below, to show
others what we're getting from uerf. The same error kept occurring until
the machine was powered down. I'd just like to find out whether I do have
a drive problem, and should therefore send it off as a warranty job, or
whether I should be pointing the finger elsewhere (SCSI controller,
cables, etc.). The drive box holds four disks, all of which are ultra wide.
I only know of the scu command to test the drive - should I be using
something else?
Thanks,
----- EVENT INFORMATION -----
EVENT CLASS ERROR EVENT
OS EVENT TYPE 199. CAM SCSI
SEQUENCE NUMBER 1.
OPERATING SYSTEM DEC OSF/1
OCCURRED/LOGGED ON Tue Mar 7 23:33:11 2000
OCCURRED ON SYSTEM <host>
SYSTEM ID x0006000D CPU TYPE: DEC 7000
SYSTYPE x00000000
----- UNIT INFORMATION -----
CLASS x0000 DISK
SUBSYSTEM x0000 DISK
BUS # x0001
x0050 LUN x0
TARGET x2
----- CAM STRING -----
----- CAM STRING -----
ROUTINE NAME cdisk_check_sense
Device aborted command - parity error?
----- CAM STRING -----
ERROR TYPE Hard Error Detected
----- CAM STRING -----
DEVICE NAME QUANTUM VIKING II 9.1WSE4110
----- CAM STRING -----
Active CCB at time of error
----- CAM STRING -----
CCB request completed with an
error
ERROR - os_std, os_type = 11, std_type = 10
----- ENT_CCB_SCSIIO -----
*MY ADDR x050B3A00
CCB LENGTH x00C0
FUNC CODE x01
CAM_STATUS x00C4 CAM_REQ_CMP_ERR
SIM QFRZN
AUTOSNS_VALID
PATH ID 1.
TARGET ID 2.
TARGET LUN 0.
CAM FLAGS x00000482
CAM_QUEUE_ENABLE
CAM_DIR_OUT
CAM_SIM_QFRZDIS
*PDRV_PTR x050B36A8
*NEXT_CCB x00000000
*REQ_MAP x01C3EE00
VOID (*CAM_CBFCNP)() x004E8CA0
*DATA_PTR x87F20000
DXFER_LEN x00004000
*SENSE_PTR x050B36D0
SENSE_LEN x40
CDB_LEN x06
SGLIST_CNT x0000
CAM_SCSI_STATUS x0002 SCSI_STAT_CHECK_CONDITION
SENSE_RESID x2E
RESID x00003E00
CAM_CDB_IO x0000000000000020200E000A
CAM_TIMEOUT x0000003C
MSGB_LEN x0000
VU_FLAGS x4000
TAG_ACTION x20
----- CAM STRING -----
Error, exception, or abnormal
_condition
----- CAM STRING -----
ABORTED COMMAND - Target aborted
_command
----- ENT_SENSE_DATA -----
ERROR CODE x0070 CODE x70
SEGMENT x00
SENSE KEY x000B ABORTED CMD
INFO BYTE 3 x00
INFO BYTE 2 x00
INFO BYTE 1 x00
INFO BYTE 0 x00
ADDITION LEN x0A
CMD SPECIFIC 3 x00
CMD SPECIFIC 2 x00
CMD SPECIFIC 1 x00
CMD SPECIFIC 0 x00
ASC x47
ASQ x00
FRU x00
SENSE SPECIFIC x000000
ADDITIONAL SENSE
0000: 00000000 00000000 00000000 00000000 *................*
0010: 00000000 00000000 00000000 00000000 *................*
0020: 00000000 00000000 00000000 00000000 *................*
0030: 7E250000 00005E3C 00000000 00000000 *..%~<^..........*
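
For anyone reading the entry above, here is a rough sketch of how its key fields could be decoded. The values are copied from the log. The tiny lookup tables cover only the codes seen here, taken from the SCSI-2 specification. One assumption is labeled in the code: that uerf prints the CAM_CDB_IO bytes last-to-first. That reading is not confirmed by any documentation I have to hand, but it gives a 32-block WRITE(6), which matches the CAM_DIR_OUT flag and the 16 KB DXFER_LEN.

```python
# Values copied from the uerf entry above.
CAM_CDB_IO = "0000000000000020200E000A"  # as printed by uerf
CDB_LEN = 0x06
DXFER_LEN = 0x4000
RESID = 0x3E00
SENSE_KEY, ASC, ASCQ = 0x0B, 0x47, 0x00

# Lookup tables limited to the codes seen in this entry (SCSI-2 values).
SENSE_KEYS = {0x0B: "ABORTED COMMAND"}
ASC_ASCQ = {(0x47, 0x00): "SCSI PARITY ERROR"}
OPCODES_6 = {0x08: "READ(6)", 0x0A: "WRITE(6)"}


def decode_cdb6(hex_field, cdb_len):
    """Decode a 6-byte CDB.

    ASSUMPTION: uerf prints the CDB bytes last-to-first, so the field is
    reversed before the first cdb_len bytes are taken.
    """
    raw = bytes.fromhex(hex_field)
    cdb = raw[::-1][:cdb_len]
    opcode = cdb[0]
    # A 6-byte CDB carries a 21-bit logical block address.
    lba = ((cdb[1] & 0x1F) << 16) | (cdb[2] << 8) | cdb[3]
    # In a 6-byte CDB a transfer length of 0 means 256 blocks.
    blocks = cdb[4] or 256
    return OPCODES_6.get(opcode, hex(opcode)), lba, blocks


if __name__ == "__main__":
    op, lba, blocks = decode_cdb6(CAM_CDB_IO, CDB_LEN)
    print("command     :", op, "lba", lba, "blocks", blocks)
    print("block size  :", DXFER_LEN // blocks, "bytes")
    print("transferred :", DXFER_LEN - RESID, "bytes before the abort")
    print("sense key   :", SENSE_KEYS.get(SENSE_KEY, hex(SENSE_KEY)))
    print("asc/ascq    :", ASC_ASCQ.get((ASC, ASCQ), "not in table"))
```

Read this way, the entry describes a write that the target aborted after 512 of 16384 bytes, with the drive reporting a parity error on the SCSI transfer, which agrees with the "parity error?" hint in the first CAM string.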
------------
Greg Roberts
Computer Systems Officer
Dept. of Electrical & Electronic Engineering
The University of Western Australia
NEDLANDS WA 6907 Australia
Ph : +61-08-9380-7366
Fax : +61-08-9380-1065
Email : gregr_at_ee.uwa.edu.au
Received on Wed Mar 08 2000 - 05:14:32 NZDT