I have 2 DS10 systems running TruCluster 5.1A. I've built my own version
of sendmail (8.11.6) in order to take advantage of SMTP AUTH and mail
access list control. Periodically, sendmail refuses to accept
connections because the load average on the system exceeds
10. sendmail is running on only one of the cluster members, as it is a
non-cluster-aware version. If I run "top" on the cluster member that has
sendmail running, I rarely notice anything out of the ordinary -- load
averages are 0.15 - 2.7, usually fluctuating below 1. When the
load average goes high, it usually lasts one or two minutes at most,
although once when I was away, it stayed high for an hour. When sendmail
refuses to accept connections, any user who attempts to send mail gets a
server unavailable message. My users are extremely unhappy when this
occurs, even though it is generally momentary. The cluster members are
connected to MA8000 storage with a single HSG80 controller.
The cluster was recently placed on-line to replace an old AlphaStation 600
5/266 running 4.0D. I never had this situation occur with the old
AlphaStation. Besides sendmail, the cluster is running samba, CAP, and two
instances of apache, each of which I have tuned to a maximum of 6 servers.
There are typically about 20 people logged on, mostly running pine.
Does anyone know how I can eliminate the load average problem? I've tried to
run ps when the load average is high, but I can't see anything causing the
problem. I've thought about trying to force all the other services (web,
samba, CAP) to run on the other cluster member.
Or is there some sendmail configuration that will have it ignore the load
average?
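
Since the spikes usually last only a minute or two, running ps by hand
tends to miss them. Below is a minimal sketch of how the capture could be
automated, assuming Python is available on the member running sendmail.
The threshold, sampling interval, log path, and ps column list are all
hypothetical values chosen for illustration, not anything taken from the
systems described above, and the "load average:" pattern assumes the usual
uptime output format.

```python
import re
import subprocess
import time

THRESHOLD = 8.0                       # hypothetical trigger level
INTERVAL = 10                         # hypothetical seconds between samples
LOG_PATH = "/var/tmp/loadspike.log"   # hypothetical log location


def parse_load_average(uptime_output):
    """Return the first load average reported by `uptime`, or None."""
    match = re.search(r"load averages?:\s*([0-9.]+)", uptime_output)
    return float(match.group(1)) if match else None


def snapshot(command):
    """Run a command and return its combined output as text."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


def watch():
    """Sample the load forever; log a process listing whenever it is high."""
    while True:
        uptime_text = snapshot(["uptime"])
        load = parse_load_average(uptime_text)
        if load is not None and load >= THRESHOLD:
            with open(LOG_PATH, "a") as log:
                log.write(uptime_text)
                # Column list is an assumption; adjust to the local ps.
                log.write(snapshot(
                    ["ps", "-e", "-o", "pid,ppid,pcpu,vsz,time,args"]))
                log.write("\n")
        time.sleep(INTERVAL)

# To start monitoring, call watch() (for example from a background job).
```

Each time the load crosses the threshold, the log gets the uptime line and
a full process listing, which can be compared against the times sendmail
refused connections.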
Thanks for any help,
Dirk Kleinhesselink
System and Network Administrator
Keck Center for Integrative Neuroscience/Physiology
UCSF
Received on Tue Dec 04 2001 - 23:23:27 NZDT