Dear all,
I have recently bought a DEQ KZPCM dual-port Ultra/Wide SCSI adapter
and a couple of DEC DS-RZ1EF-VW U/W disks. The disks are installed in
a DS-BA656 deskside shelf with a 180Watt power supply, and the
controller is in a 433 MHz PC164 rig running DU4.0D.
Some years ago I wrote my own disk exerciser which writes random
data of random record lengths to random positions on the raw disk,
reads it back and compares the results. I use this exerciser
to test new disks before I install them in my main systems,
and did just that with the above controller/disk pair.
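
For illustration only, the way such an exerciser might choose each transfer can be sketched as follows. This is not the original exerciser: `struct transfer`, `random_below`, `pick_transfer` and the `disk_blocks` parameter are hypothetical names, and the 512-byte block size and 65536-byte maximum record length are assumptions borrowed from the attached mini-test.

```c
#include <assert.h>
#include <stdlib.h>

#define SECTOR 512      /* bytes per disk block (assumed, as in the mini-test) */
#define BUFLEN 65536    /* largest single transfer (assumed, as in the mini-test) */

/* One randomly chosen transfer: where it starts and how long it is. */
struct transfer {
    long startblock;    /* first 512-byte block of the transfer */
    long nbytes;        /* record length in bytes, 1..BUFLEN */
};

/* Return a pseudo-random value in 0..n-1.  Two rand() calls are combined
   so the range is wide enough even where RAND_MAX is only 32767.  The
   modulo introduces a slight bias, which does not matter for a stress test. */
static long random_below(long n)
{
    unsigned long r;

    assert(n > 0);
    r = ((unsigned long)rand() << 15) ^ (unsigned long)rand();
    return (long)(r % (unsigned long)n);
}

/* Pick a transfer that fits entirely inside a disk of disk_blocks blocks. */
struct transfer pick_transfer(long disk_blocks)
{
    struct transfer t;
    long nblocks;

    assert(disk_blocks >= BUFLEN / SECTOR);
    t.nbytes = 1 + random_below(BUFLEN);
    nblocks = (t.nbytes + SECTOR - 1) / SECTOR;     /* blocks touched, rounded up */
    t.startblock = random_below(disk_blocks - nblocks + 1);
    return t;
}
```

A caller would seed the generator with `srand()`, then for each result of `pick_transfer()` write, read back and compare `nbytes` bytes starting at `startblock`.
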
I found that under certain circumstances, the compare failed,
suggesting that data was either mis-written or mis-read.
The last byte of a transfer was delivering zero instead of the
correct value. No errors were logged anywhere.
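
A compare step that distinguishes this signature (a single zero byte at the very end of the transfer) from more general corruption could look roughly like the sketch below. The function names `count_mismatches` and `last_byte_zeroed` are hypothetical and are not part of the attached test.

```c
#include <assert.h>
#include <stddef.h>

/* Count bytes that differ between what was written and what was read back.
   *first receives the index of the first mismatch, or n if there is none. */
size_t count_mismatches(const char *wrote, const char *got, size_t n, size_t *first)
{
    size_t i, bad = 0;

    assert(wrote != NULL && got != NULL && first != NULL);
    *first = n;
    for (i = 0; i < n; i++) {
        if (wrote[i] != got[i]) {
            if (bad == 0)
                *first = i;
            bad++;
        }
    }
    return bad;
}

/* True when the only damage is a zero in the final byte of the transfer. */
int last_byte_zeroed(const char *wrote, const char *got, size_t n)
{
    size_t first;

    return n > 0
        && count_mismatches(wrote, got, n, &first) == 1
        && first == n - 1
        && got[n - 1] == 0;
}
```
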
The test fails repeatedly at the same point, on either of the
two disks, even with a replacement controller.
The fault goes away if I swap the controller for a different
type (KZPBA single-port U/W) or if I change the disk (a 3rd-party IBM job).
I'm bothered that this may be a timing problem within the controller,
and am reluctant to use it for fear of ending up with corrupt
filesystems.
Because this device is deemed to be 'dead-on-arrival',
i.e. has never worked, rather than 'failed-in-service', DEQ
refuse to handle the problem, even though the device is under warranty,
and insist that I pursue it through my vendor.
If anyone out there has this controller/disk combination on their
system and is willing to spend an hour overwriting a filesystem
with -1, would you be prepared to try the attached mini-test?
Somehow I have to determine whether it's them or me.
TIA,
Terry.
Terry Horsnell (tsh_at_mrc-lmb.cam.ac.uk)
Computer Manager
Medical Research Council
Lab of Molecular Biology
Hills Road
CAMBRIDGE CB2 2QH
U.K.
================================================================================
/*********
********************************************************************************
BEWARE. THIS PROGRAM WRITES TO RAW DISK DEVICES AND CAN DESTROY YOUR FILESYSTEMS
********************************************************************************
I have a program which does writes of random data with random
record lengths to random startblocks on the raw disk device,
then reads it back and compares it. It is meant as a disk exerciser.
I discovered that under certain circumstances, the
test fails when run with DEC DS-RZ1EF-VW U/W disks (which seem to announce
themselves as RZ2EA-LA at boot time) on a KZPCM dual-port U/W
SCSI adapter. The failure happens on each of the two available DEC
disks, on either channel of the adapter, but not on an IBM Ultra/Narrow
disk. The failure also disappears if the KZPCM is replaced by a KZPBA adapter.
I've put together this small C program which reproduces
the problem on my machine. It seems to occur at certain
record-length/start-block combinations, is independent of the
data-content and seems to require two transfers in sequence to cause the fault.
The test occasionally succeeds, but almost always fails.
T. Horsnell (tsh_at_mrc-lmb.cam.ac.uk)
Lab of Molecular Biology
Hills Road
Cambridge CB2 2QH
UK
Phone +44 (0)1223 248011
Fax +44 (0)1223 213556
Scenario:
Mar 4 13:12:53 test vmunix: Alpha boot: available memory from 0x9cc000 to 0x7f16000
Mar 4 13:12:53 test vmunix: Digital UNIX V4.0E (Rev. 1091); Thu Mar 4 13:04:37 GMT 1999
Mar 4 13:12:53 test vmunix: physical memory = 128.00 megabytes.
Mar 4 13:12:53 test vmunix: available memory = 117.95 megabytes.
Mar 4 13:12:53 test vmunix: using 483 buffers containing 3.77 megabytes of memory
Mar 4 13:12:53 test vmunix: Digital AlphaPC 164 432 MHz
Mar 4 13:12:53 test vmunix: Firmware revision: 4.9
Mar 4 13:12:53 test vmunix: PALcode: Digital UNIX version 1.22
Mar 4 13:12:53 test vmunix: pci0 at nexus
Mar 4 13:12:53 test vmunix: fta0 DEC DEFPA FDDI Module, Hardware Revision 0
Mar 4 13:12:53 test vmunix: fta0 at pci0 slot 5
Mar 4 13:12:53 test vmunix: fta0: DEC DEFPA (PDQ) FDDI Interface, Hardware address: 08-00-2B-B9-FF-18
Mar 4 13:12:53 test vmunix: fta0: Firmware rev: 2.46
Mar 4 13:12:53 test vmunix: trio0 at pci0 slot 6
Mar 4 13:12:54 test vmunix: trio0: S3 Trio64V+ (SVGA) Plug-N-Play, 2.0 Mb
Mar 4 13:12:54 test vmunix: psiop0 at pci0 slot 7
Mar 4 13:12:54 test vmunix: Loading SIOP: script c0000b00, reg 82120000, data 405daa70
Mar 4 13:12:54 test vmunix: scsi0 at psiop0 slot 0
Mar 4 13:12:54 test vmunix: rz0 at scsi0 target 0 lun 0 (LID=0) (FUJITSU M2954S-512 0147)
Mar 4 13:12:54 test vmunix: isa0 at pci0
Mar 4 13:12:54 test vmunix: gpc0 at isa0
Mar 4 13:12:54 test vmunix: ace0 at isa0
Mar 4 13:12:54 test vmunix: ace1 at isa0
Mar 4 13:12:54 test vmunix: lp0 at isa0
Mar 4 13:12:54 test vmunix: fdi0 at isa0
Mar 4 13:12:54 test vmunix: fd0 at fdi0 unit 0
Mar 4 13:12:54 test vmunix: pci1000 at pci0 slot 9
Mar 4 13:12:54 test vmunix: itpsa0 at pci1000 slot 0
Mar 4 13:12:54 test vmunix: IntraServer ROM Version V2.0 (c)1998
Mar 4 13:12:54 test vmunix: scsi1 at itpsa0 slot 0
Mar 4 13:12:55 test vmunix: itpsa1 at pci1000 slot 1
Mar 4 13:12:55 test vmunix: IntraServer ROM Version V2.0 (c)1998
Mar 4 13:12:55 test vmunix: scsi2 at itpsa1 slot 0
Mar 4 13:12:55 test vmunix: rz16 at scsi2 target 0 lun 0 (LID=1) (DEC RZ2EA-LA (C) DEC N1H1) (Wide16)
Mar 4 13:12:55 test vmunix: rz17 at scsi2 target 1 lun 0 (LID=2) (DEC RZ2EA-LA (C) DEC N1H1) (Wide16)
Mar 4 13:12:55 test vmunix: tu0: DECchip 21140: Revision: 2.2
Mar 4 13:12:55 test vmunix: tu0: auto negotiation capable device
Mar 4 13:12:55 test vmunix: tu0 at pci1000 slot 2
Mar 4 13:12:55 test vmunix: tu0: DEC TULIP (10/100) Ethernet Interface, hardware address: 00-06-2B-00-23-F9
Mar 4 13:12:55 test vmunix: tu0: auto negotiation off: selecting 100BaseTX (UTP) port: half duplex
Mar 4 13:12:55 test vmunix: lvm0: configured.
Mar 4 13:12:55 test vmunix: lvm1: configured.
Mar 4 13:12:55 test vmunix: kernel console: trio0
Mar 4 13:12:55 test vmunix: dli: configured
Mar 4 13:12:55 test vmunix: vm_swap_init: warning /sbin/swapdefault swap device not found
Mar 4 13:12:55 test vmunix: vm_swap_init: swap is set to lazy (over commitment) mode
Mar 4 13:12:55 test vmunix: fta0: Link Unavailable.
Mar 4 13:12:55 test vmunix: fta0: Link Available.
Mar 4 13:13:05 test vmunix: Environmental Monitoring Subsystem Configured.
Mar 4 13:13:12 test vmunix: SuperLAT. Copyright 1994 Meridian Technology Corp. All rights reserved.
***********/
#include <stdio.h>
#include <stdlib.h>     /* exit() */
#include <unistd.h>     /* lseek(), read(), write() */
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#define BUFLEN 65536
/* Set your raw test device here: */
/*** #define DEV "/dev/rrz16c" ***/
int fd;
int i;
char inbuff[BUFLEN];
char outbuff[BUFLEN];
/* Write nbytes from outbuff at startblock, read them back into inbuff,
   and report every byte that differs. */
void rwc(int startblock, int nbytes)
{
    printf("startblock=%d, nbytes=%d\n",startblock,nbytes);
    if (lseek(fd, (off_t)512*(off_t)startblock, SEEK_SET) < 0)
    {
        perror("Write-seek failed");
        exit(1);
    }
    if (write(fd,outbuff,(size_t)nbytes) != nbytes)
    {
        perror("Write failed");
        exit(1);
    }
    if (lseek(fd, (off_t)512*(off_t)startblock, SEEK_SET) < 0)
    {
        perror("Read-seek failed");
        exit(1);
    }
    if (read(fd,inbuff,(size_t)nbytes) != nbytes)
    {
        perror("Read failed");
        exit(1);
    }
    for (i=0; i<nbytes; i++)
    {
        if (inbuff[i] != outbuff[i])
            printf("Cmp failed. i=%d wrote %d read %d\n",i,outbuff[i],inbuff[i]);
    }
}
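int main(void)
{
    /* Fill the write buffer with -1 (all bits set) and clear the read buffer. */
    for (i=0; i<BUFLEN; i++)
    {
        inbuff[i]=0;
        outbuff[i]=-1;
    }
    if ( (fd=open(DEV,O_RDWR,0)) < 0)
    {
        perror("Open failed");
        exit(1);
    }
    /* Two transfers in sequence seem to be needed to provoke the fault. */
    rwc(2140067,56563);
    rwc(77263,25089);
    return 0;
}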
Received on Tue Mar 09 1999 - 16:16:18 NZDT