System lockup "task_create failed" from Martyn Johnson on 1996-08-30 (tru64-unix-managers)

From: Martyn Johnson <Martyn.Johnson_at_cl.cam.ac.uk>
Date: Thu, 29 Aug 1996 17:08:32 +0100

I have a 3000/500 running as a fileserver and WWW server. It is running
V3.2D-1 with patches up to the 23 July 1996 consolidated patch kit.

Occasionally, the system will jam up in the middle of the night, with the
console continuously printing:

fork/procdup: task_create failed. Code: 0x6
Unable to obtain requested swap space

Externally, the system is serving files over NFS perfectly happily, but no
other network services are working. I cannot log in on the console. I have to
force a core dump and reboot.

The analysis of the core dump shows that there is actually plenty of swap
space free:

_kdbx_swap_start:

       Swap device name                Size     In Use       Free
-------------------------------- ---------- ---------- ----------
(null)                              358832k     74704k    284128k
                                     44854p      9338p     35516p
-------------------------------- ---------- ---------- ----------
Total swap partitions: 1            358832k     74704k    284128k
                                     44854p      9338p     35516p
_kdbx_swap_end:

This is about normal for this system. As far as I know, nothing unusual is
going on at the point at which it fails. There are no "users" on the system -
it is purely a server. The list of processes in the dump analysis looks
normal.
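
As a sanity check on the swap figures above, here is a minimal Python sketch
that confirms the kilobyte and page rows agree and works out how much of the
swap is free. The numbers are copied from the kdbx output; the 8k page size is
an inference from the ratio of the two rows, not something kdbx reports, and
the variable names are purely illustrative.

```python
# Figures copied from the kdbx swap summary in the dump analysis.
size_kb, in_use_kb, free_kb = 358832, 74704, 284128
size_pages, in_use_pages, free_pages = 44854, 9338, 35516

# Page size implied by the two rows (an inference, not stated by kdbx).
page_kb = size_kb // size_pages

# The kilobyte and page rows should agree, and in-use plus free
# should add up to the total size.
consistent = (
    size_kb == size_pages * page_kb
    and in_use_kb == in_use_pages * page_kb
    and free_kb == free_pages * page_kb
    and in_use_kb + free_kb == size_kb
)

# Share of the swap partition that was free when the dump was taken.
free_percent = round(100 * free_kb / size_kb, 1)

print(f"page size: {page_kb}k, consistent: {consistent}, free: {free_percent}%")
```

On these figures the rows are consistent and roughly four fifths of the swap
partition was free at the time of the dump.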

I have tried reporting this to the support centre, but I can only get a
knee-jerk reaction: "increase your swap space". Well, I could do that of
course, but since the amount I already have should be more than enough for the
workload, I have no basis for knowing how much to add! If there IS some
process demanding more swap space than is available, then (a) I want to know
what it is, and (b) I want just that process to die, not the whole system!

Personally I'm not convinced that it is a swap space problem at all.

Does anybody recognise the symptoms and know what the true cause might be? Can
anybody think of anything I might examine in the core dump which might give me
some clues?

-- 
Martyn Johnson      maj_at_cl.cam.ac.uk
University of Cambridge Computer Lab
Cambridge UK

Received on Thu Aug 29 1996 - 18:38:53 NZST
