Good day to all.
Well, I received some very quick, thorough and helpful replies from a
number of people. Many thanks to them; they are:
Martin Dusek
Kevin S. Partin
Marco Luchini
Alan Davis
Dr. Thomas P. Blinn
Dr PH Ng
Bruce Kelly
alan_at_nabeth
Well, since performance tuning is a dark and unpredictable science, the
responses were somewhat varied. I have attached a file with all the replies and
recommendations I received, for any who are interested. Having identified
which kernel parameters are possibly the cause, our approach will be to
change a few at a time and observe system behaviour. The first three we are
changing to be more suitable for a large-memory system (ours are currently
set at the defaults) are as per Martin Dusek's excellent and thorough advice:
....default UNIX kernel memory parameters are unusable with large
memory usage. Main reasons:
1. the system begins to page when the number of free pages gets under
vm-page-free-target (default 128), but this paging is very
"slow" - not aggressive - and it cannot keep pace with memory
demands. So free pages immediately get under vm-page-free-min
2. under vm-page-free-min (default 20) the paging becomes very
aggressive, but it is too late. Free memory often reaches
vm-page-free-reserved
3. under vm-page-free-reserved (default 10) only privileged tasks
can get memory until memory is freed
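To make the three thresholds above concrete, here is a minimal Python sketch that classifies a free-page count into the paging regime Martin describes. It is only an illustration of the description, not kernel code: the function names and regime labels are made up for this example, and the size conversion assumes the 8 KB page size of Alpha systems.

```python
# Defaults quoted in the advice above (all values are in pages).
VM_PAGE_FREE_TARGET = 128
VM_PAGE_FREE_MIN = 20
VM_PAGE_FREE_RESERVED = 10

# Assumption: 8 KB pages, as on Alpha hardware.
PAGE_SIZE_KB = 8


def paging_regime(free_pages,
                  target=VM_PAGE_FREE_TARGET,
                  minimum=VM_PAGE_FREE_MIN,
                  reserved=VM_PAGE_FREE_RESERVED):
    """Return a (hypothetical) label for the regime the system is in."""
    if free_pages < reserved:
        return "reserved"    # only privileged tasks can get memory
    if free_pages < minimum:
        return "aggressive"  # aggressive paging, but possibly too late
    if free_pages < target:
        return "gentle"      # slow paging that may not keep pace
    return "normal"          # enough free pages, no paging needed


def pages_to_kb(pages, page_size_kb=PAGE_SIZE_KB):
    """Convert a page count to kilobytes."""
    return pages * page_size_kb


if __name__ == "__main__":
    for free in (200, 100, 15, 5):
        print(free, "free pages ->", paging_regime(free))
    # With 8 KB pages the default target is only 1 MB of free memory.
    print("default target:", pages_to_kb(VM_PAGE_FREE_TARGET), "KB")
```

Under that page-size assumption the default target is 1 MB of free memory and the default minimum is 160 KB, which gives a feel for how little headroom the defaults leave on a 4GB machine once a large job starts allocating.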
I have also been in contact with Compaq technical support, who came up
with a very good and relevant document, "Hang: Avoiding System Hangs Due to
Low Free Memory Pages", which I have attached. The information within
basically supports changing the kernel parameters as per Martin's
recommendation and also gives suggested values.
Hopefully these changes will solve the problem (we will also be adding
more memory at some point) - if they do not, I will try some other
parameters and post the changes and results.
Thank you again to all.
Cheers
Jeremy Hibberd
-----Original Message-----
From: Hibberd, Jeremy
Sent: Thursday, September 16, 1999 10:42 PM
To: tru64-unix-managers_at_ornl.gov
Subject: System freeze - memory bottleneck
Hello All.
I have been monitoring this list for some time but never posted before, so
here goes.
We have a TruCluster (v 1.5) system comprising two AS8400 5/625 systems
running Tru64 V4.0D, each with 4GB memory and 12GB swap. They are running SAP
R/3 v4.0B (patch level) with an Oracle database (v 8.0.4). From time to
time (every few days), if a large report gets generated (e.g. 1GB), we see a
memory bottleneck (in vmstat) with a sharp rise in page outs (basically from
0 up to > 10K). It is at this point that the system "freezes" for up to 25
minutes - no one can log in, and we can't even get the system console. During
this time nothing gets logged by any processes - we have vmstat and iostat
cron jobs running every 5 minutes, and system monitoring with Patrol - all
logs show the last entry just prior to the freeze and start logging again
just after the system unfreezes. As such we are a bit in the dark as to what
actually happens during the freeze. Other resources seem fine during the
freeze - memory paging is the only one that goes through the roof.
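Since the only trace of a freeze is the hole it leaves in the 5-minute logs, one way to pin down when each freeze happened and roughly how long it lasted is to scan the sample timestamps for gaps. The following is a minimal Python sketch of that idea; the function name, the one-minute slack and the sample times are all made up for illustration, and extracting the timestamps from the actual vmstat/iostat logs is left out.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps,
              interval=timedelta(minutes=5),
              slack=timedelta(minutes=1)):
    """Return (last_before, first_after) pairs for every hole in the samples.

    A hole is any pair of consecutive samples further apart than the
    expected cron interval plus a little slack.
    """
    ordered = sorted(timestamps)
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > interval + slack:
            gaps.append((prev, cur))
    return gaps


if __name__ == "__main__":
    # Illustrative sample times only, not taken from the real logs.
    samples = [
        datetime(1999, 9, 16, 10, 0),
        datetime(1999, 9, 16, 10, 5),
        datetime(1999, 9, 16, 10, 10),
        datetime(1999, 9, 16, 10, 35),  # first sample after a 25-minute hole
        datetime(1999, 9, 16, 10, 40),
    ]
    for before, after in find_gaps(samples):
        print("no samples between", before, "and", after,
              "(", after - before, ")")
```

The gap length is only an upper bound on the freeze duration, since the freeze can start any time after the last good sample.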
Things are getting a bit hot regarding this issue and I would really
appreciate anyone's opinions/thoughts/abuse/anything.....
Thanks in advance
Jeremy
PS - Sorry if I have missed anything etc... Give me a yell if I can provide
anything - stats etc.
Received on Wed Sep 22 1999 - 00:36:51 NZST