vmunix hang... heap_thread: no space in heap - addendum from SSL_at_ALPHA.SUNQUEST.COM on 1996-03-20 (tru64-unix-managers)

From: <SSL_at_ALPHA.SUNQUEST.COM>
Date: Tue, 19 Mar 1996 22:39:53 -0700 (MST)

I am reposting the query below with a little more info:
I was able to get a crash dump and took a look at the crash-data.0 file.
There was a single process group (6289) with 725 entries (all running ksh)
in the kdbx process table...
However, the process group head process (i.e. 6289) does not show in this table!
(e.g. there is no entry in the process table for proc 6289 with ppid 1),
even though it does show in the first list (kernel process status list).
I am not yet sure if this is a kernel bug or if someone just started a crazily
respawning ksh???
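
One way to check a process table for this pattern is to tally entries by
process group and flag any group whose leader (pid equal to pgid) has no
entry of its own. This is only a minimal sketch: it assumes the table has
already been reduced to (pid, ppid, pgid, command) tuples, and both that
layout and the sample rows are hypothetical, not actual kdbx output.

```python
from collections import Counter

def summarize_groups(procs):
    """Return {pgid: (member_count, leader_present)}.

    procs: iterable of (pid, ppid, pgid, command) tuples.
    The tuple layout is an assumption made for this sketch.
    """
    procs = list(procs)  # allow generators; we iterate twice
    pids = {pid for pid, _ppid, _pgid, _cmd in procs}
    counts = Counter(pgid for _pid, _ppid, pgid, _cmd in procs)
    # A group leader is the process whose pid equals the group id.
    return {pgid: (n, pgid in pids) for pgid, n in counts.items()}

# Made-up sample: three ksh entries in group 6289, leader 6289 absent.
sample = [
    (7001, 6289, 6289, "ksh"),
    (7002, 7001, 6289, "ksh"),
    (7003, 7002, 6289, "ksh"),
    (1, 0, 1, "init"),
]

for pgid, (n, has_leader) in sorted(summarize_groups(sample).items()):
    status = "leader present" if has_leader else "leader missing"
    print(pgid, n, status)
```

A large group with a missing leader would match what the crash dump shows,
but it does not by itself say whether the cause is a kernel bug or a
respawning shell.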
-----------------------------------
hey, has anyone seen this one???

- DU 3.0
- alpha 2100
- 128m memory, 392m swap (lazy mode)
- 7 rz28's on 2 scsi busses, using both LSM and ADVFS

maxusers set to 256, maxuprc at 256
about 30-40 users on at time of hang - this is a new system,
and they will be running with at least twice that number of users when
it is up to speed.

running Mumps (the os/language, not the disease)
note: mumps uses a couple of big hunks of shared memory:
# ipcs -ma

Shared Memory:
T  ID  KEY   MODE         OWNER  GROUP   CREATOR  CGROUP  NATTCH  SEGSZ    CPID  LPID  ATIME     DTIME     CTIME
m  0   5732  --rw-------  root   system  root     system  2       524288   279   969   18:36:39  18:36:38  17:50:38
m  1   438   --rw-------  ix     rep23   ix       rep23   2       131584   546   547   17:51:08  no-entry  17:51:07
m  2   0     --rw-rw-rw-  root   system  root     system  4       387136   1073  1097  19:06:31  19:06:42  19:05:40
m  3   0     --rw-rw-rw-  root   system  root     system  4       5736640  1073  1097  19:06:31  19:06:42  19:05:40
m  4   0     --rw-rw-rw-  root   system  root     system  4       4737024  1073  1097  19:06:31  19:06:42  19:05:41
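
To see how much memory these segments account for in total, the SEGSZ
column can be summed. A minimal sketch, assuming the listing is pasted in as
text with the 15 whitespace-separated fields shown in the header; the
function name is made up for illustration.

```python
IPCS_OUTPUT = """\
m 0 5732 --rw------- root system root system 2 524288 279 969 18:36:39 18:36:38 17:50:38
m 1 438 --rw------- ix rep23 ix rep23 2 131584 546 547 17:51:08 no-entry 17:51:07
m 2 0 --rw-rw-rw- root system root system 4 387136 1073 1097 19:06:31 19:06:42 19:05:40
m 3 0 --rw-rw-rw- root system root system 4 5736640 1073 1097 19:06:31 19:06:42 19:05:40
m 4 0 --rw-rw-rw- root system root system 4 4737024 1073 1097 19:06:31 19:06:42 19:05:41
"""

SEGSZ_FIELD = 9  # 0-based position of SEGSZ in the header above

def total_segsz(text):
    """Sum SEGSZ (bytes) over the shared-memory rows of an ipcs listing."""
    total = 0
    for line in text.splitlines():
        fields = line.split()
        # Shared-memory rows start with "m" and carry all 15 columns.
        if len(fields) == 15 and fields[0] == "m":
            total += int(fields[SEGSZ_FIELD])
    return total

total = total_segsz(IPCS_OUTPUT)
print(f"{total} bytes ({total / 2**20:.1f} MiB)")
```

For the five segments listed, the sum is 11516672 bytes, roughly 11 MB.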

System hangs -
syslog mesgs at time of hang show:

Mar 19 17:36:15 sunlab vmunix: h_kmem_alloc_memory_: 0xffffffff8143a780: request is stalled
Mar 19 17:36:15 sunlab vmunix: h_kmem_alloc_memory_: 0xffffffff813b03c0: request is stalled
Mar 19 17:36:15 sunlab vmunix: heap_thread: no space in heap 0xfffffc00005fcfd0
->Mar 19 17:36:16 sunlab vmunix: Default heap is empty. Please increase the
->Mar 19 17:36:16 sunlab vmunix: configurable heappercent parameter and reboot.
Mar 19 17:36:16 sunlab vmunix: h_kmem_alloc_memory_: 0xffffffff814ce780: request is stalled
Mar 19 17:36:16 sunlab vmunix: heap_thread: no space in heap 0xfffffc00005fcfd0
Mar 19 17:36:16 sunlab vmunix: h_kmem_alloc_memory_: 0xffffffff8143a780: request is stalled
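
With a longer log it may help to count how often each message recurs and
with which hex value. A minimal sketch over the excerpt above, with the
wrapped "request is stalled" lines rejoined; the regular expressions are
written against these few sample lines only, and no meaning is assumed for
the hex values beyond what the messages print.

```python
import re
from collections import Counter

LOG = """\
Mar 19 17:36:15 sunlab vmunix: h_kmem_alloc_memory_: 0xffffffff8143a780: request is stalled
Mar 19 17:36:15 sunlab vmunix: h_kmem_alloc_memory_: 0xffffffff813b03c0: request is stalled
Mar 19 17:36:15 sunlab vmunix: heap_thread: no space in heap 0xfffffc00005fcfd0
Mar 19 17:36:16 sunlab vmunix: h_kmem_alloc_memory_: 0xffffffff814ce780: request is stalled
Mar 19 17:36:16 sunlab vmunix: heap_thread: no space in heap 0xfffffc00005fcfd0
Mar 19 17:36:16 sunlab vmunix: h_kmem_alloc_memory_: 0xffffffff8143a780: request is stalled
"""

PATTERNS = {
    "stalled": re.compile(r"h_kmem_alloc_memory_: (0x[0-9a-f]+): request is stalled"),
    "no_space": re.compile(r"heap_thread: no space in heap (0x[0-9a-f]+)"),
}

def tally(text):
    """Count (message kind, hex value) pairs in syslog text."""
    counts = Counter()
    for line in text.splitlines():
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[(kind, match.group(1))] += 1
    return counts

for (kind, value), n in sorted(tally(LOG).items()):
    print(kind, value, n)
```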

----- ok, vmunix, thanks for the suggestion about increasing heappercent,
but I guess I would like a clue about what I'm doing before I do it....
So, I rtfm, and the most informative entry is in the "System tuning manual":
heappercent
is "the virtual size of the kernel heap expressed as a percentage of physical
memory...kernel data structures are allocated from the kernel heap. the kernel
heap wires physical memory as the kernel data structures are allocated."
hmmm...
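
Taking the manual's definition at face value, the heap's virtual size is
just a percentage of physical memory, so the candidate settings can be
turned into sizes. A rough sketch, assuming "physical memory" means the full
128 MB on this box; the function name is made up, and 8 and 9 are simply the
values being considered here, not recommendations.

```python
PHYS_MEM_MB = 128  # physical memory from the system description

def heap_size_mb(heappercent, phys_mem_mb=PHYS_MEM_MB):
    """Virtual size of the kernel heap in MB for a given heappercent."""
    return phys_mem_mb * heappercent / 100

for pct in (8, 9):
    print(f"heappercent={pct}: {heap_size_mb(pct):.2f} MB")
```

On 128 MB each percentage point is worth 1.28 MB, so 8% gives 10.24 MB and
9% gives 11.52 MB of heap virtual size.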

So what is eating up this "heap"?
(and from my college logic class: if you take away 1, is it still a heap? ;-)
Should I increase it to 8%? 9%?
How can I be sure that I actually need to increase it, and that it is not
the result of some 'runaway' process... Is the problem related to ADVFS
requirements? (just a guess, but I like to blame things on ADVFS...)

TIA -

I will summarize!!!!!

Shanna Leonard
Unix Systems Specialist
Sunquest Information Systems
ssl_at_alpha.sunquest.com
Received on Wed Mar 20 1996 - 06:56:58 NZST
