DLT2000XT and errors with 'mt fsf'

From: Lamont Granquist <lamontg_at_raven.genome.washington.edu>
Date: Wed, 09 Dec 1998 16:20:30 -0800

We've recently noticed a problem with the backups that we are taking from
a Digital AlphaServer 4100 (Digital Unix 4.0D, patch kit 2) to a
DLT2000XT. The backups complete fine, but early in the backup something
appears to be getting "corrupted" sometimes. Some of our incrementals
only have 5-10 accessible dump files on them (out of 70+ which should have
been written). Attempts to go past a particular "corrupted" dump file
with 'mt fsf' don't work and result in I/O errors, e.g.:

% mt -f /dev/nrmt1h rewind
% mt -f /dev/nrmt1h fsf 1
% mt -f /dev/nrmt1h fsf 1
% mt -f /dev/nrmt1h fsf 1
% mt -f /dev/nrmt1h fsf 1
% mt -f /dev/nrmt1h fsf 1
% mt -f /dev/nrmt1h fsf 1
% mt -f /dev/nrmt1h fsf 1
% mt -f /dev/nrmt1h fsf 1
% mt -f /dev/nrmt1h fsf 1
/dev/nrmt1h fsf 1 failed: I/O error
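
Stepping through the tape by hand like this gets tedious with 70+ files. A
minimal sketch of a loop that does the same thing and reports how many files
could be skipped before the first failure follows. The names scan_tape, MT and
TAPE are made up for this illustration, and only the 'rewind' and 'fsf'
subcommands shown above are used:

```shell
#!/bin/sh
# Sketch only: MT and TAPE are illustrative, overridable variables.
# /dev/nrmt1h is the no-rewind device used in the transcript above.
MT=${MT:-mt}
TAPE=${TAPE:-/dev/nrmt1h}

# scan_tape MAX
# Rewind, then skip forward one file at a time, at most MAX times.
# Prints the number of skips that succeeded.
# Returns 0 if all MAX skips worked, 1 if a skip failed, 2 if rewind failed.
scan_tape() {
    max=$1
    n=0
    "$MT" -f "$TAPE" rewind || return 2
    while [ "$n" -lt "$max" ]; do
        if ! "$MT" -f "$TAPE" fsf 1; then
            echo "$n"
            return 1
        fi
        n=$((n + 1))
    done
    echo "$n"
    return 0
}
```

Something like 'scan_tape 80' would then stop at the first file that cannot
be skipped and print the number of skips that succeeded before it.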

Mounting the same tape on an HP-UX system with a similar DLT2000XT drive
results in the same problem on the same file. Backups are being taken of
remote systems via roughly:

ssh host 'dump | ssh tapehost dd of=/dev/nrmt1h obs=65536'

Where "tapehost" is the AlphaServer. The remote hosts are of various kinds:
roughly 70 partitions (and so 70 files on the DLT tape) spread across a
handful of completely different OSes (Irix, Solaris, Linux, HP-UX, Digital
Unix).

Questions:
  1. Could the drive be failing?
  2. Could the tapes be bad? (doubtful: a tape with only 5 or so passes failed)
  3. Could patch kit 2 be at fault?
  4. Is there a good way to automate verification of backups taken this way?

- and -

  5. Any suggestions on ways to improve performance by keeping data
     streaming to the DLT? I've thought of putting a buffer program in
     front of 'dd' which would just spool up 32 megs or so of data and
     then feed it to 'dd' as fast as possible...
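
For question 4, one possible shape for automated verification is to record a
checksum of each dump as it passes through the pipeline on the tape host, and
later read each tape file back and compare. This is only a sketch under
stated assumptions: the helper names tee_cksum, verify_tape and READER, the
manifest file, and the availability of 'mktemp -d' and 'mkfifo' on the tape
host are all assumptions for illustration. Whether the drive is left at the
start of the next file after 'dd' reads up to a file mark depends on the tape
driver, so an explicit 'mt fsf' between reads may be needed.

```shell
#!/bin/sh
# Sketch only: all names here are illustrative, not part of the current setup.
TAPE=${TAPE:-/dev/nrmt1h}
READER=${READER:-read_next_file}

# tee_cksum MANIFEST
# Copy stdin to stdout unchanged and append "checksum size" (cksum output
# for the same bytes) as one line to MANIFEST.
tee_cksum() {
    manifest=$1
    dir=$(mktemp -d) || return 1
    mkfifo "$dir/pipe" || { rm -rf "$dir"; return 1; }
    cksum < "$dir/pipe" >> "$manifest" &   # checksums the copy sent to the fifo
    tee "$dir/pipe"                        # passes the data on to stdout
    wait
    rm -rf "$dir"
}

# read_next_file
# Default reader: emit the next tape file on stdout.
read_next_file() {
    dd if="$TAPE" ibs=65536 2>/dev/null
}

# verify_tape MANIFEST
# Read files back in order with $READER and compare each against the
# corresponding MANIFEST line. Stops at the first mismatch and returns 1.
verify_tape() {
    manifest=$1
    n=0
    while read -r want; do
        n=$((n + 1))
        got=$("$READER" | cksum)
        if [ "$got" != "$want" ]; then
            echo "file $n: MISMATCH"
            return 1
        fi
        echo "file $n: ok"
    done < "$manifest"
}
```

On the tape host the write side would then become something like
'tee_cksum manifest | dd of=/dev/nrmt1h obs=65536', and after a rewind
'verify_tape manifest' would walk the tape in the same order.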

On the AlphaServer there are no errors during the writing process, but
when trying to read a backup tape the uerf error log records the following
(typical) event (associated with the "/dev/nrmt1h fsf 1 failed: I/O error"):

********************************* ENTRY 1. *********************************

----- EVENT INFORMATION -----

EVENT CLASS ERROR EVENT
OS EVENT TYPE 199. CAM SCSI
SEQUENCE NUMBER 16.
OPERATING SYSTEM DEC OSF/1
OCCURRED/LOGGED ON Wed Dec 9 10:02:16 1998
OCCURRED ON SYSTEM hoh
SYSTEM ID x00070016
SYSTYPE x00000000
PROCESSOR COUNT 4.
PROCESSOR WHO LOGGED x00000003

----- UNIT INFORMATION -----

CLASS x0001 TAPE
SUBSYSTEM x0000 DISK
BUS # x0003
                              x00E8 LUN x0
                                        TARGET x5

----- CAM STRING -----

ROUTINE NAME ctape_move_tape

----- CAM STRING -----

ERROR TYPE Hard Error Detected

----- CAM STRING -----

DEVICE NAME DEC DLT2000 8B37

----- CAM STRING -----

                                        Active CCB at time of error

----- CAM STRING -----

                                        CCB request completed with an error
ERROR - os_std, os_type = 11, std_type = 10


----- ENT_CCB_SCSIIO -----

*MY ADDR x9E039E80
CCB LENGTH x00C0
FUNC CODE x01
CAM_STATUS x00C4 CAM_REQ_CMP_ERR
                                        SIM QFRZN
                                        AUTOSNS_VALID
PATH ID 3.
TARGET ID 5.
TARGET LUN 0.
CAM FLAGS x000000C0
                                        CAM_DIR_NONE
*PDRV_PTR x9E039B28
*NEXT_CCB x00000000
*REQ_MAP x00000000
VOID (*CAM_CBFCNP)() x0057D140
*DATA_PTR x00000000
DXFER_LEN x00000000
*SENSE_PTR x9E039B50
SENSE_LEN x40
CDB_LEN x06
SGLIST_CNT x0000
CAM_SCSI_STATUS x0002 SCSI_STAT_CHECK_CONDITION
SENSE_RESID x23
RESID x00000000
CAM_CDB_IO x000000000000000100000011
CAM_TIMEOUT x00000E10
MSGB_LEN x0000
VU_FLAGS x0000
TAG_ACTION x00

----- CAM STRING -----

                                        Error, exception, or abnormal
                                         _condition

----- CAM STRING -----

                                        BLANK CHECK - No-data condition
                                         _occured

----- ENT_SENSE_DATA -----

ERROR CODE x0070 CODE x70
SEGMENT x00
SENSE KEY x0008 BLANK CHECK
INFO BYTE 3 x00
INFO BYTE 2 x00
INFO BYTE 1 x00
INFO BYTE 0 x01
ADDITION LEN x15
CMD SPECIFIC 3 x00
CMD SPECIFIC 2 x00
CMD SPECIFIC 1 x0E
CMD SPECIFIC 0 x30
ASC x00
ASQ x05
FRU x00
SENSE SPECIFIC x000000
ADDITIONAL SENSE
0000: 00D50080 00260E00 001C7939 00000000 *......&.9y......*
0010: 00000000 00000000 00000000 00000000 *................*
0020: 00000000 00000000 00000000 00000000 *................*
0030: 7E250000 00005E3C 00000000 00000000 *..%~<^..........*


-- 
Lamont Granquist                       lamontg_at_raven.genome.washington.edu
Dept. of Molecular Biotechnology       (206)616-5735  fax: (206)685-7344
Box 352145 / University of Washington / Seattle, WA 98195
PGP pubkey: finger lamontg_at_raven.genome.washington.edu | pgp -fka
Received on Thu Dec 10 1998 - 00:21:24 NZDT
