System paralysed due to "Unable to obtain requested swap space"

From: Irene A. Shilikhina <irene_at_alpha.iae.nsk.su>
Date: Tue, 01 Dec 1998 14:32:27 +0600 (NSK)

This morning, when I came to my Alpha, a bitter surprise was waiting for me.
At once I noticed an unusual sound coming from a disk.
The whole dxconsole window was full of messages like these:

Dec 1 09:01:10 alpha vmunix: swap space below 10 percent free
Dec 1 09:01:11 alpha vmunix: Unable to obtain requested swap space

(I should say at once that we use immediate swap mode, since our
only swap partition is 160 Mbytes.)

I wasn't able to get a prompt in my DECterm, but at last I managed to log in
from an alphanumeric terminal, look at the processes consuming the most
system resources, and execute "su" (though it was awfully slow!).
Nevertheless, I FAILED to kill any of these processes; all I got was the
same messages plus another one:

Dec 1 09:17:11 alpha vmunix: fork/procdup: task_create failed. Code: 0x6

I struggled with the situation for about 20 minutes, but to no avail.
As far as I could see, nothing remained but a drastic step: trying
sync (also without success) and ... pressing the Off switch.

THIS TIME things settled without losses, but I am aware of all the damage
such an act can cause, and I would like to stay out of danger.

My analysis pointed to the user whose activity was the direct source of the
trouble: the time of the first system messages coincides with his starting
(last night!) several copies of a computation task that requires large
resources. (I've already talked with him and instructed him to be polite
towards the system...) Well, what worries me much more is what I can do
to avoid such things. For the moment, we cannot afford an additional swap
partition. So, I have two questions:

 - why didn't the system refuse the first task that caused trouble with
swap space (immediate mode!), or kill some process itself in such a
situation? According to the time stamps, the problem arose when the
first program started;
 - how can I avoid it in the future?

Additional information:
DU 3.2c and DEC 2000 model 300 (yes, they are old, I know).

Thanks,
Irene

P.S. Of course, I know the meaning of all these messages, and it's not the
first time that we have run short of swap space, but the question is
why it came to a complete standstill.

*************************************************************************
* Irene A. Shilikhina          e-mail: irene_at_alpha.iae.nsk.su
* System administrator,
* Institute of Automation & Electrometry,
* Siberian Branch of Russian Academy of Sciences,
* Novosibirsk, Russia
* http://www.iae.nsk.su/~irene
*************************************************************************
* The road to hell is paved with good intentions.
* Every cloud has a silver lining.
*************************************************************************
Received on Tue Dec 01 1998 - 08:33:42 NZDT
