SCSI problems (bus resets)

From: Thomas Strandenaes <thomas.strandenaes_at_adm.uit.no>
Date: Tue, 02 Mar 1999 19:17:27 +0100

Dear list,
we've been having problems with NSR 5.2 and performing clones (manual
clones - so this should not be the known 'automatic clone' problem with NSR
5.2) from an Overland LXS 4115 to a TZ87. Recently I discovered a number of
events on the SCSI bus that might be related to these problems. The
following is a typical 'uerf -R -r 100-199 -o full' entry (since I'm not
used to interpreting uerf-output, I'm including the whole entry):

===================================================

----- EVENT INFORMATION -----

EVENT CLASS ERROR EVENT
OS EVENT TYPE 199. CAM SCSI
SEQUENCE NUMBER 753.
OPERATING SYSTEM DEC OSF/1
OCCURRED/LOGGED ON Tue Mar 2 15:32:40 1999
OCCURRED ON SYSTEM ulysses
SYSTEM ID x00050009 CPU TYPE: DEC 2100
SYSTYPE x00000000

----- UNIT INFORMATION -----

CLASS x0001 TAPE
SUBSYSTEM x0000 DISK
BUS # x0000
                              x0018 LUN x0
                                        TARGET x3

----- CAM STRING -----

ROUTINE NAME ctape_iodone

----- CAM STRING -----

ERROR TYPE Soft Error Detected (recovered)

----- CAM STRING -----
DEVICE NAME DEC TZ87 (C) DEC9514

----- CAM STRING -----

                                        Active CCB at time of error

----- CAM STRING -----

                                        CCB request completed with an error
ERROR - os_std, os_type = 11, std_type = 10


----- ENT_CCB_SCSIIO -----

*MY ADDR x0FF37580
CCB LENGTH x00C0
FUNC CODE x01
CAM_STATUS x00C4 CAM_REQ_CMP_ERR
                                        SIM QFRZN
                                        AUTOSNS_VALID
PATH ID 0.
TARGET ID 3.
TARGET LUN 0.
CAM FLAGS x00000080
                                        CAM_DIR_OUT
*PDRV_PTR x0FF37228
*NEXT_CCB x00000000
*REQ_MAP x03D94140
VOID (*CAM_CBFCNP)() x0057F2F0
*DATA_PTR x400D4000
DXFER_LEN x00008000
*SENSE_PTR x0FF37250
SENSE_LEN x48
CDB_LEN x06
SGLIST_CNT x0000
CAM_SCSI_STATUS x0002 SCSI_STAT_CHECK_CONDITION
SENSE_RESID x1B
RESID x00000000
CAM_CDB_IO x00000000000000008000000A
CAM_TIMEOUT x0000012F
MSGB_LEN x0000
VU_FLAGS x0000
TAG_ACTION x00

----- CAM STRING -----

                                        Error, exception, or abnormal
                                         _condition

----- CAM STRING -----

                                        RECOVERED ERROR - Recovery action
                                         _performed

----- ENT_SENSE_DATA -----

ERROR CODE x0070 CODE x70
SEGMENT x00
SENSE KEY x0001 RECOVER ERR
INFO BYTE 3 x00
INFO BYTE 2 x00
INFO BYTE 1 x00
INFO BYTE 0 x00
ADDITION LEN x25
CMD SPECIFIC 3 x00
CMD SPECIFIC 2 x00
CMD SPECIFIC 1 x00
CMD SPECIFIC 0 x00
ASC x0A
ASQ x00
FRU x00
SENSE SPECIFIC x000000
ADDITIONAL SENSE
0000: 2E004000 01DBD498 00100101 8C000002 *.@..............*
0010: 00000000 00030000 0045D903 00000000 *..........E.....*
0020: 00000000 00000000 00000000 00000000 *................*
0030: 00000000 00000000 7E250000 00005E3C *..........%~<^..*
0040: 00000000 00000000 *........ *

===================================================
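As a rough aid to reading the entry above, the following Python sketch decodes three of its fields: CAM_STATUS, CAM_CDB_IO and the sense key/ASC/ASQ triple. It is only a sketch under stated assumptions. The status bit values are the ones given in the CAM specification and are assumed to be what uerf decodes here. The CDB field is assumed to be printed with its bytes in reverse order; that assumption is supported by the decoded transfer length matching DXFER_LEN (x00008000). The lookup tables contain only the values that occur in this entry, and all function names are made up for illustration.

```python
# Bit values from the CAM specification (assumed to match uerf's decoding).
CAM_STATUS_MASK = 0x3F    # low six bits carry the request status code
CAM_REQ_CMP_ERR = 0x04    # request completed with an error
CAM_SIM_QFRZN = 0x40      # SIM queue is frozen
CAM_AUTOSNS_VALID = 0x80  # autosense data is valid

# Only the values that appear in the entry above (SCSI-2 names).
SENSE_KEYS = {0x1: "RECOVERED ERROR"}
ASC_ASCQ = {(0x0A, 0x00): "ERROR LOG OVERFLOW"}


def decode_cam_status(status):
    """Split a CAM status word into (status code, list of flag names)."""
    flags = []
    if status & CAM_SIM_QFRZN:
        flags.append("SIM_QFRZN")
    if status & CAM_AUTOSNS_VALID:
        flags.append("AUTOSNS_VALID")
    return status & CAM_STATUS_MASK, flags


def decode_cdb(field_hex, cdb_len):
    """Return the CDB bytes in command order.

    Assumption: uerf prints the CDB field byte-reversed, so the
    opcode is the last byte of the printed string.
    """
    return bytes.fromhex(field_hex)[::-1][:cdb_len]


def write6_transfer_length(cdb):
    """Transfer length (bytes 2-4, big-endian) of a WRITE(6) CDB."""
    if cdb[0] != 0x0A:
        raise ValueError("not a WRITE(6) command")
    return int.from_bytes(cdb[2:5], "big")


def describe_sense(key, asc, ascq):
    """Look up the sense key and ASC/ASCQ pair in the small tables above."""
    return (SENSE_KEYS.get(key, "unknown"),
            ASC_ASCQ.get((asc, ascq), "unknown"))


if __name__ == "__main__":
    code, flags = decode_cam_status(0xC4)
    print(hex(code), flags)
    cdb = decode_cdb("00000000000000008000000A", 6)
    print(cdb.hex(), write6_transfer_length(cdb))
    print(describe_sense(0x1, 0x0A, 0x00))
```

Read this way, the entry describes a 32768-byte WRITE(6) to the TZ87 that completed with CHECK CONDITION and a recovered-error sense code, which agrees with the "Soft Error Detected (recovered)" string in the log. It does not by itself show a bus reset.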

This typically occurs when the tape is being labelled and mounted in NSR,
and only with the TZ87 device. I've not seen this happen during backup, but
I'm not certain that it does not happen during backup operation. *However*,
NSR reports errors like this:

        "bus_reset() eei status dec: 12800, hex: 0x3200"

*during backup operation*. These errors cannot be seen in uerf/binerr.log!

The system is an AlphaServer 2000 5/300 with the backplane SCSI connected to
the internal cd-rom and the aforementioned two external tape drives (disks
are on separate SWXCR-controllers), and we've tried changing cables and
terminators. The total bus length should be less than three meters (approx.
2.5 meters in sheer cable, but it's hard to measure the 'cable length'
consumed by the devices so I'm not totally certain about this). Compaq
support has not yet been able to tell me whether the maximum allowable bus
length for this internal bus is 6 or 3 meters (! - and the manual is
actually ambiguous on this).

It would be most helpful if anyone could suggest what's going on on this
SCSI bus, and help me interpret the 'uerf' output.

With regards,

--
Thomas Strandenæs
Computing centre
University of Tromsø
NORWAY
Received on Tue Mar 02 1999 - 18:16:58 NZDT