Hello again!
At least we found that we are not alone: Arrigo Triulzi reports the
same problem and suggests a possible cause:
> Join the club! Have exactly the same problem, identical. Posted to
> the mailing list a number of months ago and got no reply. I am
> running 4.0G on a DS20E. My feeling, gut feeling, is that it is a
> memory issue. The reason is that the last time it happened I went in
> via the serial console and noticed that on a 1Gb machine I had < 16Mb
> free. As I am running lazy swap (ie. /etc/swapdefault ->
> /etc/swapdefault-) I suspect an out of memory. Basically the kernel
> thread picks up the NFS request, it tries to get memory but there
> isn't any. Lazy swap rules are that "1st who asks gets killed" but
> hey, this is a kernel thread, can't nuke it, so NFS hangs.
>
> The above is pure, unadulterated, 100% speculation ;-)
>
> Incidentally, it happens with any client, Linux or Sun running NFSv3
> or NFSv2 and even moving from UDP to TCP.
As we have a very large memory configuration (16GB), memory should not
have been the problem for us. But a couple of times I found that some
machines had only 16MB free, even though absolutely nothing was running
on them. A 'vmstat -P' immediately showed that all the memory had been
eaten up by the UBC (Unified Buffer Cache)! A quick look into
dxkerneltuner/vm pinned down the problem: ubc_maxpercent was 100, so a
similar out-of-memory condition for the kernel threads could
theoretically occur (if the kernel does not reclaim UBC pages for
itself).
So as a first test I set the UBC max cache (ubc_maxpercent) to 85%, and
we will see over the next few days whether the problem recurs.
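
For anyone who prefers the command line to dxkerneltuner, the same change
can be sketched with sysconfig. This is a minimal sketch, not a tested
recipe: the names UBC_TARGET and apply_ubc_limit are made up for
illustration, and it assumes that ubc_maxpercent can be changed at run
time on your release (check before relying on it).

```shell
#!/bin/sh
# Sketch only: cap the Unified Buffer Cache (UBC) at 85% of memory on Tru64.
# UBC_TARGET and apply_ubc_limit are illustrative names, not from the post.
UBC_TARGET=85

apply_ubc_limit() {
    # sysconfig exists only on Tru64/Digital UNIX; bail out elsewhere.
    if ! command -v sysconfig >/dev/null 2>&1; then
        echo "sysconfig not found; this does not look like a Tru64 host" >&2
        return 1
    fi
    # Show the current value of the vm subsystem attribute.
    sysconfig -q vm ubc_maxpercent
    # Change it on the running kernel (assumption: the attribute is
    # runtime-reconfigurable on this release; the change is lost at reboot).
    sysconfig -r vm "ubc_maxpercent=${UBC_TARGET}"
}

apply_ubc_limit || echo "UBC limit not changed"
```

A value set with 'sysconfig -r' does not survive a reboot. To make it
permanent, the attribute has to go into the vm stanza of
/etc/sysconfigtab, which is what dxkerneltuner does for you.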
Thanks to Arrigo! I will summarize when the situation stabilizes.
--
Dr. Udo Grabowski email: udo.grabowski_at_imk.fzk.de
Institut f. Meteorologie und Klimaforschung II, Forschungszentrum Karlsruhe
Postfach 3640, D-76021 Karlsruhe, Germany Tel: (+49) 7247 82-6026
http://www.fzk.de/imk/imk2/ame/grabowski/ Fax: " -6141
Received on Wed Feb 05 2003 - 14:32:58 NZDT