Barracuda 9 and Fujitsu 9Gb drives crashing 3.2c system

From: Nick Hill - RAL DCI Systems Group <NMH1_at_axprl1.rl.ac.uk>
Date: Tue, 20 May 1997 14:35:52 +0100 (BST)

I have logged a call with DIGITAL about this but I was wondering if anyone on
the list had seen anything similar.

Machine: Alphaserver 8400
O/S: Digital UNIX 3.2c with various patches


I have a serious problem with trying to attach some new 9Gb SCSI disk drives
housed in 16bit storageworks shelves to my Alphaserver 8400. Testing the disks
results in the machine crashing. I have tried two different types of 9Gb drive
and the same problem occurs with both. The disk drives I have tried are the
SEAGATE Barracuda 9 (ST19171W) and the Fujitsu M2949E-512. The Fujitsu disks are
the DIGITAL-produced OEM disk, part number SHUGA-ZZ.

The configuration is:

8400 <-> KFTHA I/O <-> DWLPA PCI <-> KZPSA FWD SCSI <-> 16bit Storageworks

I have tried using these disks under the AdvFS file system, both as a single-disk
domain and as a two-disk domain. The testing involves opening a file and
writing or reading 500Mb of data using the standard C calls write() and read().
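
For anyone wanting to reproduce this kind of load, a minimal sketch of such a
test is given here. It is not the actual test program: the function names, the
64Kb chunk size, the 0xA5 fill pattern and the pattern check on read are all
assumptions made for illustration. Only the use of plain write() and read() on
a large file comes from the description above.

```c
/* Sketch of a sequential write/read test using plain write() and read().
 * Function names, chunk size and fill pattern are illustrative assumptions. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK_SIZE (64 * 1024)   /* assumed size of each write()/read() call */
#define FILL_BYTE  0xA5          /* assumed fill pattern */

/* Write total_bytes of the fill pattern to path.
 * Returns 0 on success, -1 on any error. */
int write_test_file(const char *path, long total_bytes)
{
    static char buf[CHUNK_SIZE];
    long done = 0;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0) {
        perror("open");
        return -1;
    }
    memset(buf, FILL_BYTE, sizeof buf);
    while (done < total_bytes) {
        long left = total_bytes - done;
        size_t want = left < CHUNK_SIZE ? (size_t)left : (size_t)CHUNK_SIZE;
        ssize_t n = write(fd, buf, want);

        if (n <= 0) {            /* treat a zero-length write as a failure too */
            perror("write");
            close(fd);
            return -1;
        }
        done += n;               /* a short write is simply continued */
    }
    return close(fd);
}

/* Read path back to end of file, checking every byte against the pattern.
 * Returns the number of bytes read, or -1 on error or pattern mismatch. */
long read_test_file(const char *path)
{
    static char buf[CHUNK_SIZE];
    long total = 0;
    ssize_t n, i;
    int fd = open(path, O_RDONLY);

    if (fd < 0) {
        perror("open");
        return -1;
    }
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (i = 0; i < n; i++) {
            if ((unsigned char)buf[i] != FILL_BYTE) {
                fprintf(stderr, "pattern mismatch at byte %ld\n", total + (long)i);
                close(fd);
                return -1;
            }
        }
        total += n;
    }
    if (n < 0) {
        perror("read");
        close(fd);
        return -1;
    }
    close(fd);
    return total;
}
```

A 500Mb run would call write_test_file(path, 500L * 1024 * 1024) on a file in
the domain under test, followed by read_test_file(path) on the same file.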

At some point during the tests, or shortly after a test has completed
(sometimes it can be triggered by a sync command), the console shows output
similar to the following. Exact output is not available because, when the
machine crashes, it is unable to produce a dump due to the locked SCSI buses,
so the message buffers are lost:

cam_logger: CAM ERROR packet
cam_logger: bus 2 target 4 lun 0
cdisk_reset_rec err
Recovery failed
Hard Error Detected
UNKNOWN
cam_logger: CAM ERROR packet
cam_logger: bus 2 target 5 lun 0
cdisk_reset_rec err
Recovery failed
Hard Error Detected
UNKNOWN


loads of advfs I/O errors


then loads of

hw_sg_alloc: rmalloc failed

then

KZPSA adapter misc error
pzaintr: KZPSA adapter misc error, ars=0x10, afar=0x0. afpr=0x617



cam_logger: CAM ERROR packet
cam_logger: bus 2
spo_misc errors
Adapter reinit failed
asr=0x200
cam logger: CAM ERROR packet
cam logger: bus 2
spo_adap reinit
Adapter State couldn't be set
pza_read_log_regs: KZPSA ...
cam_logger: CAM ERROR packet
cam_logger: bus 2
spo_misc errors
Adapter has died, must reboot to bring back to life

then an O/S panic:

simple lock: time limit exceeded


another time:

cam_logger: CAM ERROR packet
cam_logger: bus 7
spo_misc errors
Adapter miscellaneous error occurred, resetting adapter...

cam_logger: CAM ERROR packet
cam_logger: bus 7
spo_misc errors
SCSI bus is being reset due to severe errors
hw_sg_alloc: rmalloc failed
spo_map_load_ccb: spo_map_load_ccb failed
hw_sg_alloc: rmalloc failed

simple lock: time limit exceeded.


When the machine crashes, the SCSI buses that the new disks are on are locked
up and the drive activity lights on the disks are permanently on.

I have SEAGATE ELITE 9 Gb drives on this machine and they have been working
fine most of the time for about 18 months. I have had the occasional Elite 9
disk lock up and the machine crash, but on these occasions just the Elite disk
drops offline; the other disks on the SCSI bus are fine.

Should these disks work OK on my system? The Fujitsu disk is, after all, the
new DEC rz40 9Gb drive in a slightly different package. I need to resolve
this issue as I shortly need to buy lots more 9Gb drives to add to the system
and would like to know that they will work before spending the money!


Nick Hill

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
DCI, Rutherford Appleton Laboratory, Tel: +44 (0)1235-445598
Chilton, Didcot, Oxon, OX11 0QX, England. Fax: +44 (0)1235-446626

N.M.Hill_at_rl.ac.uk http://www.cis.rl.ac.uk/people/nmh1/contact.html
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Received on Tue May 20 1997 - 16:12:58 NZST