Summary: The incredible expanding kernel

From: Saul Tannenbaum <stannenb_at_emerald.tufts.edu>
Date: Fri, 27 Oct 1995 10:47:24 -0400

I wrote:

>I have a Digital Unix 3.0 system that has a kernel that appears
>to be growing without bounds. This is a relatively garden variety
>mail/web/internet access system, FDDI connected, using all AdvFS.
>It is heavily used, with peak periods seeing 150+ simultaneous users.
>
>When booted, the kernel is your basic 30-40 megs. Over time, it keeps
>growing. Now, with 11 days of uptime, it is 210 megs. Good thing this
>system has 512 megs of memory...

[...]

I've now discovered that the kernel panics with a "zalloc" error
when it hits about 250 megs. Prior to that we start seeing FDDI
DMA errors, and performance becomes completely atrocious.
Amusingly, the POLYCENTER Performance Solution Performance Advisor
tool suggests that the application (in this case, the kernel idle thread) be
redesigned to consume less memory. I heartily concur. The report also
suggests that the kernel is consuming too much CPU time and is unbalancing
the load. A snippet of the report is attached below for anyone else who
might be entertained by this.
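
For a sense of scale, the growth figures quoted above can be turned into a
rough rate. This is only a sketch: it assumes the growth is linear and takes
the midpoint of the reported 30-40 meg boot-time size as the starting point,
neither of which has been measured. The function names are illustrative.

```python
# Rough linear estimate of kernel growth, using the figures quoted above.
# Assumptions (not measurements): growth is linear, and the boot-time size
# is the midpoint of the reported 30-40 MB range.

BOOT_SIZE_MB = 35.0      # assumed midpoint of "30-40 megs"
CURRENT_SIZE_MB = 210.0  # reported size after 11 days of uptime
UPTIME_DAYS = 11.0
PANIC_SIZE_MB = 250.0    # approximate size at which the zalloc panic occurs


def growth_rate_mb_per_day(boot_mb, current_mb, days):
    """Average growth in MB per day over the observed uptime."""
    return (current_mb - boot_mb) / days


def days_until(target_mb, current_mb, rate_mb_per_day):
    """Days left before the kernel reaches target_mb at the given rate."""
    return (target_mb - current_mb) / rate_mb_per_day


rate = growth_rate_mb_per_day(BOOT_SIZE_MB, CURRENT_SIZE_MB, UPTIME_DAYS)
remaining = days_until(PANIC_SIZE_MB, CURRENT_SIZE_MB, rate)
print(f"average growth: {rate:.1f} MB/day")
print(f"estimated days until ~{PANIC_SIZE_MB:.0f} MB: {remaining:.1f}")
```

Under those assumptions the kernel gains roughly 16 megs a day, which would
leave only a few days between the 210 meg reading and the panic threshold.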

In responses, I've been asked:

- Do you run LAT? There are known memory leaks. I don't.

- Do you run XDM? If so, put "DisplayManager*terminateServer: true"
  in the xdm.config file. I don't think that's the problem, but I
  did it anyway.

- Do I run INN and NNTPLINK on this system? No.

- Do I use FDDI? It was asserted that there was a known memory leak
  in the FDDI drivers. I've asked Digital support about this
  and received no response.

I've also gotten one "me too" comment.

Any further responses would be more than welcome, as this is an
ongoing and serious problem.

Thanks to the following (and anyone else I might have missed) for
their responses:

     "Pat Huber, krl sysadmin, 818-395-4693" <pat_at_krl.caltech.edu>
     jtalloen_at_epo.e-mail.com
     "Donald L. Ritchey" <dritchey_at_cecc.chi.gov>
     nicolis_at_celfi.phys.univ-tours.fr
     Hugh Messenger <hugh_at_garply.com>
     "Scott Ruch - DTN 462-6082" <swr_at_unx.dec.com>
     System Janitor <hubcap_at_hubcap.clemson.edu>

     - Saul

CONCLUSION 1.
                                                                    {M0010}

      The application requires too much memory. Performance
      degradation occurred when the memory management subsystem
      had to access the disks for processes that need either data
      or code pages.

      Redesign the application or add memory.

      Total number of occurrences: 8

[...]
      EVIDENCE

                 Process w/ Largest Resident Memory Size
               -------------------------------------------
Pagefile  Free    User  Run          Page Flt  Resident
I/O Rate  Pages   Name  Command      Rate      Pages     Time
--------  ------  ----  -----------  --------  --------  ---------------
  25.773      19  root  kernel idle     0.000     30903  26-Oct 10:14:00
 120.420       9  root  kernel idle     0.000     31810  26-Oct 15:38:00
 178.182      20  root  kernel idle     0.000     31586  26-Oct 15:42:00
 173.696      18  root  kernel idle     0.000     31999  26-Oct 16:26:00
  33.590       9  root  kernel idle     0.000     31972  26-Oct 17:18:00
[...]
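
The Resident Pages figures in the report can be converted to megabytes for
comparison with the kernel sizes mentioned earlier. The sketch below assumes
the 8 KB page size used on Alpha systems; the report itself does not state
a page size, so treat the result as approximate. The helper name is
illustrative.

```python
# Convert the report's "Resident Pages" values to megabytes.
# Assumption: 8 KB pages (the report does not state the page size).

PAGE_SIZE_BYTES = 8192  # assumed page size

# Resident Pages values taken from the report rows above.
resident_pages = [30903, 31810, 31586, 31999, 31972]


def pages_to_mb(pages, page_size=PAGE_SIZE_BYTES):
    """Convert a page count to megabytes (1 MB = 1024 * 1024 bytes)."""
    return pages * page_size / (1024 * 1024)


for pages in resident_pages:
    print(f"{pages} pages = {pages_to_mb(pages):.1f} MB")
```

With 8 KB pages these samples fall between roughly 241 and 250 megs, which
is in line with the size at which the panic has been seen.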
-- 
Saul Tannenbaum, Manager, Academic Systems | "It's still rocket  
                stannenb_at_emerald.tufts.edu |    science" - Vint Cerf
Tufts University Computing and             |
                Communications Services    |http://www.tufts.edu/~stannenb
Received on Fri Oct 27 1995 - 16:55:10 NZDT
