SUM: OSF/1 v2.1 C++ prog crash system w/h_kmem_alloc_memory_: allocation failures

From: Tien LH Mai <tienm_at_amath.washington.edu>
Date: Tue, 24 Jan 1995 09:36:02 -0800 (PST)

On Thu, 12 Jan 1995, Tien LH Mai wrote:

>
> Dear OSFers,
>
> I have a program written in C++ that solves a system of coupled ordinary
> differential equations (ODEs) for a prescribed amount of time. The program has
> variable resolution in that the number of ODEs to be solved, and hence the
> memory needs of the program, is not determined until runtime. The number of
> ODEs is approximately maxM*maxN, where maxM and maxN are two parameters read in
> at runtime from a file "inputs.dat". Using these two values, the code then
> builds data structures of the appropriate size. (For example, there is a class
> called "coeff" which has a data member which is a pointer to an array; this
> array has approximately maxM*maxN elements.)
>
> The program crashes the system consistently, and each time I
> received crash-data.#, vmcore.# and vmunix.# files. From a quick scan of
> the "crash-data" files, it looks like a hardware/memory problem. They show
> the following errors consistently:
>
> h_kmem_alloc_memory_: allocation failures = 4500
> h_kmem_alloc_memory_: allocation failures = 4600
> panic: h_getfreehdr
> syncing disks... 4 4 4 1 done
> ..
> ..
> _proc_thread_list_begin:
> thread 0x86b5d8a0 stopped at [boot:1190 +0x4,0xfffffc000037a3bc]
> Source not available
> _proc_thread_list_end:
>
> Q1: Is this a known problem?
> Q2: Is there any recommendation?
>
> I'm running OSF/1 v2.1 w/96MB + 400MB swap
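
For readers following along, here is a minimal sketch of the kind of
runtime-sized structure described in the question. It is not the original
program: the class name Coeff, the helper read_coeff, the use of double
elements, and the assumption that maxM and maxN are the first two integers
in the input are all hypothetical. A std::vector stands in for the raw
pointer-to-array member mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the "coeff" class: storage for roughly
// maxM*maxN values, sized only once the parameters are known at runtime.
class Coeff {
public:
    Coeff(std::size_t maxM, std::size_t maxN)
        : maxM_(maxM), maxN_(maxN), data_(maxM * maxN, 0.0) {}

    std::size_t size() const { return data_.size(); }
    std::size_t bytes() const { return data_.size() * sizeof(double); }

    // Row-major access to element (m, n).
    double& at(std::size_t m, std::size_t n) {
        assert(m < maxM_ && n < maxN_);
        return data_[m * maxN_ + n];
    }

private:
    std::size_t maxM_;
    std::size_t maxN_;
    std::vector<double> data_;
};

// Read maxM and maxN (assumed here to be the first two whitespace-separated
// integers in the stream) and build a Coeff of the matching size.
Coeff read_coeff(std::istream& in) {
    std::size_t maxM = 0, maxN = 0;
    if (!(in >> maxM >> maxN)) {
        throw std::runtime_error("could not read maxM and maxN");
    }
    return Coeff(maxM, maxN);
}
```

Calling read_coeff on a std::ifstream opened on "inputs.dat" would size the
array from the file, and bytes() gives a rough idea of how the memory demand
grows with maxM*maxN.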


Thanks to "Dr. Tom Blinn, 603-881-0646" <tpb_at_zk3.dec.com> for the
following response. I applied Patch ID: OSFV20-073-3 and we seem
to be back in business.


OK, I've looked at the source code for the routine that issues those
messages and that does the panic -- it's the same routine,
src/kernel/vm/heap_kmem.c -- and it's not obvious what's going wrong,
other than that something is allocating lots of kernel memory and not
freeing it. That's what's panic-ing the kernel.

There is a lot of change in this module between V2.0 and V3.0, so there
MAY be a patch (or maybe not), but the edit history includes this comment:

> * Remove the previous change to vm_wait_for_space as it causes the heap
> * to block indefinitely in h_kmem_alloc_memory_(). Modify h_getfreehdr()
> * to try kernel_map when heap submap is full. Also issue warning message
> * when heap submap is empty.

My advice would be, if you've got access to a V3.0 system, try running your
stuff on it. And I think this MAY be your problem:

Patch ID: OSFV20-073-x

System panics after upgrading from V2.0 to V2.1 or build V2.1.
The panic is created by any program that repeatedly calls
munmap() and mmap() if mmap maps more than munmap unmaps.
Config file probably altered to include mapentries = large number.
This is a configurable parameter.

The program will either abort when it consumes all of swap space or
the system will panic: h_getfreehdr
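
The call pattern named in the patch description (mmap mapping more than
munmap unmaps, repeated in a loop) can be sketched as below. This is only an
illustration of the pattern on a POSIX system, not a reproduction of the
V2.1 panic, and nothing in the thread confirms that the original program
called mmap directly. The function name net_pages_left_mapped and its
parameters are hypothetical. The leftover regions are released before
returning so the sketch is harmless to run.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>
#include <vector>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

// Each iteration maps `map_pages` anonymous pages and unmaps only the first
// `unmap_pages` of them. Returns how many pages were still mapped after the
// loop, i.e. the net growth in mappings caused by the imbalance.
std::size_t net_pages_left_mapped(int iterations, std::size_t map_pages,
                                  std::size_t unmap_pages) {
    assert(map_pages > 0 && unmap_pages <= map_pages);
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));

    struct Region { void* addr; std::size_t len; };
    std::vector<Region> leftovers;
    std::size_t net_bytes = 0;

    for (int i = 0; i < iterations; ++i) {
        void* p = mmap(nullptr, map_pages * page, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) break;  // out of address space or memory

        std::size_t unmapped = 0;
        if (unmap_pages > 0 && munmap(p, unmap_pages * page) == 0) {
            unmapped = unmap_pages * page;
        }

        // Whatever was not unmapped stays behind as a separate mapping.
        const std::size_t left = map_pages * page - unmapped;
        net_bytes += left;
        if (left > 0) {
            leftovers.push_back({static_cast<char*>(p) + unmapped, left});
        }
    }

    // Clean up so the demonstration itself does not leak.
    for (const Region& r : leftovers) munmap(r.addr, r.len);
    return net_bytes / page;
}
```

With map_pages = 2 and unmap_pages = 1, every iteration leaves one page
behind, so the number of live mappings grows linearly with the iteration
count; with equal values nothing accumulates.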
Received on Tue Jan 24 1995 - 12:36:20 NZDT
