-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Good afternoon,
This is my first post to the list, so please pardon any "newbie" faux
pas I may commit.
I have the following DEC Alpha cluster configuration, which seems to be
experiencing I/O problems with the RAID array. At least, that's
where I think the problem is. I can't find any solid evidence in the
error logs or alerts pointing to a cause. Maybe
someone out there can help!
The systems were running normally (I say that based only upon
"experience", not empirical evidence) until Monday of this week.
Since Monday, writes to the array have been a factor of 10 slower. I
can't find any errors, and the cabinet is happily humming along.
A matched pair of CPUs:
- ------------------------
Digital UNIX V4.0D (Rev. 878); Mon Mar 23 16:40:56 EST 1998
Digital UNIX TruCluster V1.5 (Rev. 270); 12/30/97 20:36
Digital UNIX V4.0D Worksystem Software (Rev. 875)
System Type: DEC1000A_5
Number of CPUs: 1 Type: EV56 Speed: 333 Mhz Cache: 2.0 MB
Memory size: 1024 MB
Raid Array 450 w/HSZ-50 (32MB Cache) filled with RZ28D-VW 2.1GB disks
KZPDA-AA "FWSE SCSI Card" in each system
Memory Channel Interface between CPUs
Systems are used for Oracle 7.3.4 Parallel Server, database on RAW
devices
Picture:
  -------       -------       -------
  |CPU A|       |CPU B|       |RAID |
  |     |-MemCh-|     |       |ARRAY|
  |     |       |     |       | 450 |
  |     |       |     |       |     |
  |KZPDA|-SCSI--|KZPDA|-SCSI--|HSZ50|
  -------       -------       -------
The KZPDAs are connected to each other, then to the array (differential
SCSI).
The array has 4 RAID5 raidsets defined, using disks as follows:
HSZ> show raidset
Name      Storageset      Uses          Used by
- -----------------------------------------------
RAID1     raidset         DISK410       D3
                          DISK510
                          DISK610
RAID2     raidset         DISK110       D5
                          DISK210
                          DISK310
RAID3     raidset         DISK100       D6
                          DISK200
                          DISK300
RAID4     raidset         DISK420       D4
                          DISK520
                          DISK620
Question:
When I compare the time to write a 16 MB file to the local system disk
in the CPU cabinet against the time to write it to a filesystem on the
array, I would expect the array to be a little slower due to the shared
SCSI bus. But look at these times:
System Disk:
- ------------
#time dd if=/dev/zero of=foobar bs=16k count=1024
1024+0 records in
1024+0 records out
real 0.8
user 0.0
sys 0.7
RAID Array:
- -----------
#time dd if=/dev/zero of=foobar bs=16k count=1024
1024+0 records in
1024+0 records out
real 46.4
user 0.1
sys 0.6
0.8 seconds compared to 46.4?! That can't be correct.
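For scale, the arithmetic behind those two runs can be sketched as
follows. This assumes the only inputs are the dd parameters (1024
records of 16 KB, i.e. 16 MB) and the "real" times reported above. The
helper name throughput_mb_s is made up for illustration; it is not a
system utility.

```shell
#!/bin/sh
# Hypothetical helper (illustrative name): convert one dd run into MB/s.
# Arguments: block size in KB, number of blocks, elapsed "real" seconds.
throughput_mb_s() {
    awk -v bs_kb="$1" -v count="$2" -v secs="$3" \
        'BEGIN { printf "%.2f\n", (bs_kb * count / 1024) / secs }'
}

throughput_mb_s 16 1024 0.8     # system disk
throughput_mb_s 16 1024 46.4    # filesystem on the RAID array
```

That works out to roughly 20 MB/s for the system disk against about
0.34 MB/s for the array, a ratio of 58 to 1.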
Can someone with a similar configuration run this and see if it's
"normal"?
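To make the comparison easier to repeat, here is a minimal sketch of
the same test wrapped in a script. It assumes a POSIX-style shell; the
function name write_test, its arguments, and the scratch file name are
illustrative choices, not anything supplied by the system. The dd
parameters are the same as in the runs above.

```shell
#!/bin/sh
# Sketch only: repeat the 16 MB dd write test in a given directory.
# write_test is an illustrative name, not a system tool.
# Arguments: target directory, number of runs (defaults to 3).
write_test() {
    dir=$1
    runs=${2:-3}
    out="$dir/ddtest.$$"    # scratch file, removed after each run
    i=1
    while [ "$i" -le "$runs" ]; do
        echo "run $i in $dir"
        # Same parameters as the test above: 1024 records of 16 KB.
        dd if=/dev/zero of="$out" bs=16k count=1024 || return 1
        rm -f "$out"
        i=$((i + 1))
    done
}
```

Saved as a script with a final line such as write_test "$1" 3, it could
be run under time once against a directory on the system disk and once
against a directory on the array; dividing the "real" figure by the
number of runs gives an average per write.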
Does anyone have any ideas? There have been no O/S-related changes.
Minor database changes have been reversed. In fact, we've restored
the system to its state of one week ago (before the performance
degradation) with minimal improvement.
Thanks.
Pete
-----BEGIN PGP SIGNATURE-----
Version: PGP for Personal Privacy 5.0
Charset: noconv
iQA/AwUBNmiD/j20lAOOvtjpEQI/mACeKwdTFuqu0Rxo23sm7SR5LwFNWbgAnAp2
0Bb+wf+m0kDxYWE5fUbPRnOF
=Gi7C
-----END PGP SIGNATURE-----
Received on Fri Dec 04 1998 - 21:53:57 NZDT