I received several great replies. I forgot to mention that the event
occurred during batch processing rather than interactive processing, and
the real issue appears to be large disk queues of pending I/O waiting for
the current I/O to complete. We will continue to train the users to
spread the I/O out over more buses and devices.
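
As a rough illustration of how sustained disk queueing could be picked out
of sampled data, here is a minimal Python sketch. It assumes the samples
have already been reduced to (device, queue_depth) pairs in time order;
that input format, the function name, the device names, and the threshold
are all hypothetical and are not taken from any tool's actual output.

```python
from collections import defaultdict


def longest_queue_runs(samples, threshold):
    """Return, per device, the longest run of consecutive samples whose
    queue depth exceeds `threshold`.

    `samples` is an iterable of (device, queue_depth) tuples in time order
    (a hypothetical, pre-parsed format).
    """
    current = defaultdict(int)   # length of the run in progress per device
    longest = defaultdict(int)   # longest run seen so far per device
    for device, depth in samples:
        if depth > threshold:
            current[device] += 1
        else:
            current[device] = 0
        longest[device] = max(longest[device], current[device])
    return dict(longest)


# Hypothetical samples: rz16 queues up for two intervals, rz17 stays idle.
samples = [
    ("rz16", 2), ("rz17", 0),
    ("rz16", 9), ("rz17", 1),
    ("rz16", 12), ("rz17", 0),
    ("rz16", 1),
]
runs = longest_queue_runs(samples, threshold=4)
print(runs)
```

A device whose longest run covers most of a batch window is a candidate
for moving some of its I/O to another bus or device.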

I have several multi-CPU GS140s on which the users say they have
performance issues. We have a lot of collect data from before, during, and
after these so-called periods of poor performance. During a typical event
there is usually more than 4 GB of available memory while the CPUs are more
than 80% idle. The homegrown application reads a few NFS-mounted
filesystems and several local files, and creates an output file. The two
areas I am
still investigating are disk I/O and network traffic. Both of these numbers
run high for sustained periods during an event, but we still have
headroom, as demonstrated by the data and by the times when the users don't
complain. The network is comprised of kgpsa (gb NICs) getting really awesome
throughput, 800-900 mb/s sustained and sometimes even higher. On the
disk side, there can be 6-10 SCSI buses, and the users are currently being
trained to spread out the I/O. Two or three users will all run their jobs
using either the same disk or the same buses for both input and output.
But again, these disks are either HSZ50s or HSG80s, and the I/O is
substantial but well under other spikes that show up in the large volumes
of data. Of course these users all want RAID 5 devices even though they
admit RAID 0 is a smidgen better; we are treating that as a training
issue too. On an HSZ50 we monitored today an I/O load of
8600 mb/s from 1 RAID 5 disk device from 1 user, and 2 users got the I/O rate
from the same disk up to 9200 mb/s just before the first user's job
completed. The I/O and network are the only numbers that even raise an
eyebrow, while memory and CPU remain flat. Running sys_check -perf does
not generate any earth-shattering ideas either. Any new ideas to get
me out of my rut? The OS is 4.0F pk 6.
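
For the case where several users put their input and output on the same
bus, a small sketch like the following could list the buses that more than
one user is hitting at once. The (user, bus) pairs, the user names, and the
bus names are made up for illustration; how that mapping would be gathered
on a real system is left open.

```python
from collections import defaultdict


def shared_buses(job_paths):
    """Return {bus: sorted list of users} for buses used by more than one user.

    `job_paths` is an iterable of (user, bus) pairs (hypothetical input).
    """
    users_by_bus = defaultdict(set)
    for user, bus in job_paths:
        users_by_bus[bus].add(user)
    return {
        bus: sorted(users)
        for bus, users in users_by_bus.items()
        if len(users) > 1
    }


# Hypothetical job layout: two users share scsi2 for input and output.
jobs = [
    ("alice", "scsi2"),
    ("bob", "scsi2"),
    ("alice", "scsi5"),
    ("carol", "scsi7"),
]
print(shared_buses(jobs))
```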
TIA
pmob
Received on Fri Feb 08 2002 - 16:05:35 NZDT