I received 4 replies from the following helpful folks:
speno_at_isc.upenn.edu
Nikola.Milutinovic_at_ev.co.yu
rjackson_at_portal.gmu.edu
oisin_at_sbcm.com
The original question, summarized, was: At times, our AS4100 running sendmail
[version 8.11.0, installed by us] starts to accumulate a lot of system
processes with a process state of 'U' which, according to the man page for
ps, means the process is in an "Uninterruptable Sleeping Process" state. You
can't kill -9 the processes, and they never seem to clear. The only way to
correct the problem is to reboot the system. We are running Tru64 4.0d.
What's going on?
The U process state appears to be a 'special' kernel state in which a process
must not be interrupted, lest data corruption result. Processes are put into
this state to prevent damage to the internal kernel data structures if the
operation being performed is not completed. That is also why kill -9 has no
effect on them.
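For anyone who wants to watch for these processes piling up before a reboot
becomes necessary, here is a minimal Python sketch that picks the 'U' state
entries out of saved ps output. The function name, the sample listing, and the
column names it looks for (PID, plus a state column called S, STAT or STATE)
are assumptions for illustration only; check the header your own ps prints and
adjust. It also assumes no column before the command contains embedded spaces.

```python
# Hypothetical helper: find processes in the 'U' (uninterruptible) state
# in text captured from ps. Column names are assumptions, not Tru64 facts.

SAMPLE = """\
   PID S COMMAND
   412 U sendmail: accepting connections
   415 I /usr/sbin/cron
   977 U sendmail: server relay
"""


def uninterruptible(ps_text, state_columns=("S", "STAT", "STATE")):
    """Return (pid, state, command) for each process whose state starts with 'U'."""
    lines = [ln for ln in ps_text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    pid_idx = header.index("PID")
    # First header field that looks like a state column.
    state_idx = next(i for i, name in enumerate(header) if name in state_columns)
    last = len(header) - 1  # the command is assumed to be the final column
    hits = []
    for ln in lines[1:]:
        # Split into at most len(header) fields so the command keeps its spaces.
        fields = ln.split(None, last)
        if len(fields) <= max(pid_idx, state_idx):
            continue  # short or malformed line
        if fields[state_idx].startswith("U"):
            command = fields[last] if len(fields) > last else ""
            hits.append((int(fields[pid_idx]), fields[state_idx], command))
    return hits


if __name__ == "__main__":
    for pid, state, command in uninterruptible(SAMPLE):
        print(pid, state, command)
```

Run periodically against fresh ps output, a growing count from this kind of
check would give some warning that the hang is building up.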
The answer, at least I hope, is that Compaq has a patch for this! According
to the patch Release Notes file (patch kit #7 for 4.0d), there is an AdvFS
bug that was corrected. Heck, there are LOTS of AdvFS bugs that were
corrected, so maybe one of them will get this problem! :) The relevant text
from the patch is:

    Fixes an operating system hang condition. The
    hang condition exists due to processes
    deadlocking in the AdvFS code.
So, tomorrow at dawn I will move our services onto our active cluster
production server and hope for the best.
I was led to this solution by the response from rjackson_at_portal.gmu.edu,
who sounds like he has a similar configuration to ours (about 60,000 users
and very heavy sendmail activity). Investigation of the Compaq problem
numbers he provided sure enough convinced us that it was time to install the
patch kit.
Thanks to everyone on the list for your assistance.
Received on Wed Oct 18 2000 - 03:00:06 NZDT