? virtual memory parameters - system hangs up

From: Kim Greer <klg_at_dec4.mc.duke.edu>
Date: Tue, 25 Jul 2000 16:08:11 -0400 (EDT)

Hi,

To sum up the problem: free memory as shown by vmstat gradually decreases
to the point that no more processes can start. The system is then essentially
"frozen" or "locked up".

I've tried to include as much info as possible which makes for a rather long
message, because I realize you/we are not Clara Voyant.

Details:

  I have a 3000/700 machine with 320 MB memory and ~ 2 GB swap space (overkill
I'm sure). I recently did a full install from scratch of Tru64 5.0 on it,
replacing 4.0D. It ran great with 4.0D, but now I have some problem with
memory not getting freed up. I can't seem to find any error messages or anything
relevant in the log files.

  Over the space of, say, 3-4 hours with it running very few non-OS programs,
it eventually shows "free" in vmstat going from 23K down to 600 or less, at
which time it cannot start new processes. Note that programs already running
seem to pretty much continue to run. Example: you can ping the machine,
getting an "alive" response, or get some banner info back from telnet (this is
at times of very low "free" values) but which does not respond with the
"login:" prompt. rsh commands from other computers are unable to execute; dumps
continue to dump.

  I had trouble from the beginning with the default parameters in sysconfigtab
instituted after initial OS install setup, which I have several times told
dxkerneltuner to reset to default. I'm not really certain that it did,
though, after comparing the current sysconfigtab file with various versions
saved in /etc. (I surmise that "restore" only reverts to the values present at
startup of dxkerneltuner, undoing changes made during that session only.)
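
One way to check whether a reset really changed anything is to compare the attribute values in two copies of the file rather than eyeballing them. The following is a minimal Python sketch, not a Tru64 tool. It assumes the usual stanza layout (a "subsystem:" header line followed by "attribute = value" lines, with "#" starting a comment). The function names and the sample text are hypothetical and only there for illustration.

```python
def parse_stanzas(text):
    """Parse stanza-style text into {subsystem: {attribute: value}}."""
    stanzas = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments (assumed to start with '#')
        if not line:
            continue
        if line.endswith(":") and "=" not in line:
            # A header such as "vm:" opens (or reopens) a subsystem stanza.
            current = stanzas.setdefault(line[:-1].strip(), {})
        elif "=" in line and current is not None:
            name, value = line.split("=", 1)
            current[name.strip()] = value.strip()
    return stanzas


def diff_stanzas(old, new):
    """Return (subsystem, attribute, old_value, new_value) for every difference."""
    changes = []
    for subsystem in sorted(set(old) | set(new)):
        old_attrs = old.get(subsystem, {})
        new_attrs = new.get(subsystem, {})
        for attr in sorted(set(old_attrs) | set(new_attrs)):
            before = old_attrs.get(attr)
            after = new_attrs.get(attr)
            if before != after:
                changes.append((subsystem, attr, before, after))
    return changes


# Hypothetical sample contents standing in for a saved copy and the current file.
saved = """
vm:
    vm_swap_eager = 1
    vm_aggressive_swap = 1
"""
current = """
vm:
    vm_swap_eager = 1
    vm_aggressive_swap = 0
"""

for change in diff_stanzas(parse_stanzas(saved), parse_stanzas(current)):
    print(change)
```

An attribute that exists in only one of the two files shows up with None on the other side, which makes silently dropped settings easy to spot.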

  I've read through the "System Configuration and Tuning" docs at
http://tru64unix.compaq.com/faqs/publications/base_doc/DOCUMENTATION/V50_HTML/
... calculating values that seem in line with default values. I've searched
through http://www-archive.ornl.gov:8000 without finding an answer in general
for 5.0, or for patches.

  Both "vm_aggressive_swap = 1" and "vm_aggressive_swap = 0" have been tried
in combination with "vm_swap_eager = 1". I'm getting ready to try
        "vm_swap_eager = 0" with
        "vm_aggressive_swap = 0"


Note that I get:
        vmstat:
        Virtual Memory Statistics: (pagesize = 8192)
          procs memory pages intr cpu
          r w u act free wire fault cow zero react pin pout in sy cs us sy id
          5 141 27 42K 8998 4318 491K 55K 343K 1306 49K 0 553 1K 389 2 18 80

at the same time "dxsysinfo" says that I have "In-Use Memory" = 59% and "Available
Swap" = 94%. I would have thought that 8.99K divided by 23K (or 27K *immediately*
after reboot) would have indicated something more like 38% or 33% - but maybe it's
not that simple.
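
The arithmetic above can be spelled out as a small Python sketch. It assumes that the "K" suffix in vmstat output means 1000 pages (it may mean 1024, which shifts the percentages slightly). It also assumes that current free pages divided by free pages just after boot is a meaningful figure, which is the very point in doubt. The function names are illustrative only.

```python
PAGE_SIZE = 8192  # bytes, from the vmstat header "(pagesize = 8192)"


def pages_to_mb(pages, page_size=PAGE_SIZE):
    """Convert a page count to megabytes (1 MB = 2**20 bytes)."""
    return pages * page_size / 2**20


def free_ratio(free_now, free_at_boot):
    """Free pages now as a fraction of the free pages seen right after boot."""
    return free_now / free_at_boot


free_now = 8998
for baseline in (23_000, 27_000):  # "23K" and "27K", assuming K = 1000
    print(f"{free_now} / {baseline} = {free_ratio(free_now, baseline):.0%}")
print(f"{free_now} free pages = {pages_to_mb(free_now):.1f} MB")
```

Neither ratio is directly comparable to the dxsysinfo "In-Use Memory" figure unless both tools use the same denominator, which this sketch does not establish.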

At a system reboot, "vmstat 60" shows:
vmstat 60
Virtual Memory Statistics: (pagesize = 8192)
  procs memory pages intr cpu
  r w u act free wire fault cow zero react pin pout in sy cs us sy id
  3 100 22 19K 23K 3078 122K 26K 62K 1281 15K 0 632 440 225 14 12 74
  2 101 22 19K 23K 3173 73 86 153 0 142 0 71 285 134 3 3 94
  3 100 23 19K 22K 3195 370 52 147 0 89 0 236 300 402 2 5 92
  2 103 22 19K 23K 3210 2013 443 565 8 382 0 344 349 601 1 8 91
  2 103 22 19K 22K 3321 1063 226 394 0 154 0 46 132 91 0 3 97
  2 102 23 20K 22K 3353 1924 386 601 0 386 0 161 317 235 1 6 93
  2 103 22 20K 22K 3335 481 77 151 0 62 0 64 156 123 2 3 95
  2 103 22 21K 22K 3350 1893 227 962 0 229 0 68 134 110 5 4 92
  2 104 22 21K 22K 3350 795 212 173 0 164 0 18 87 77 0 3 97
  2 103 22 21K 22K 3351 1295 183 618 0 198 0 32 147 95 5 3 92
  2 103 22 21K 22K 3351 743 193 275 6 118 0 25 100 85 0 3 97
  2 104 23 21K 21K 3369 491 33 47 0 48 0 15 210 74 3 3 95
  2 106 22 21K 21K 3377 894 313 276 0 349 0 34 165 85 2 3 95
  2 107 23 22K 21K 3417 1092 236 327 0 238 0 296 228 521 1 7 91

showing that "free" is disappearing rather quickly. (These samples are at 60-second
intervals, as you can see from the "vmstat 60" command above.)
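
To put a number on "rather quickly", the "free" column of the samples above can be reduced to an average rate. This is a rough Python sketch under stated assumptions: "K" is taken to mean 1000 pages, the fifth field of each data row is "free" (matching the header "r w u act free ..."), and the decline is treated as linear. The helper names are made up for illustration.

```python
def parse_count(token, k=1000):
    """Turn a vmstat field such as '23K' or '3078' into an integer page count."""
    if token.endswith("K"):
        return int(token[:-1]) * k  # assumption: K = 1000, not 1024
    return int(token)


def free_column(lines, index=4):
    """Extract the 'free' field (5th column) from non-empty vmstat data rows."""
    return [parse_count(line.split()[index]) for line in lines if line.strip()]


def pages_per_minute(samples, interval_s=60):
    """Average change in free pages per minute between first and last sample."""
    elapsed_min = (len(samples) - 1) * interval_s / 60
    return (samples[-1] - samples[0]) / elapsed_min


rows = """
3 100 22 19K 23K 3078 122K 26K 62K 1281 15K 0 632 440 225 14 12 74
2 101 22 19K 23K 3173 73 86 153 0 142 0 71 285 134 3 3 94
3 100 23 19K 22K 3195 370 52 147 0 89 0 236 300 402 2 5 92
2 103 22 19K 23K 3210 2013 443 565 8 382 0 344 349 601 1 8 91
2 103 22 19K 22K 3321 1063 226 394 0 154 0 46 132 91 0 3 97
2 102 23 20K 22K 3353 1924 386 601 0 386 0 161 317 235 1 6 93
2 103 22 20K 22K 3335 481 77 151 0 62 0 64 156 123 2 3 95
2 103 22 21K 22K 3350 1893 227 962 0 229 0 68 134 110 5 4 92
2 104 22 21K 22K 3350 795 212 173 0 164 0 18 87 77 0 3 97
2 103 22 21K 22K 3351 1295 183 618 0 198 0 32 147 95 5 3 92
2 103 22 21K 22K 3351 743 193 275 6 118 0 25 100 85 0 3 97
2 104 23 21K 21K 3369 491 33 47 0 48 0 15 210 74 3 3 95
2 106 22 21K 21K 3377 894 313 276 0 349 0 34 165 85 2 3 95
2 107 23 22K 21K 3417 1092 236 327 0 238 0 296 228 521 1 7 91
""".splitlines()

samples = free_column(rows)
print(f"free: {samples[0]} -> {samples[-1]} pages over {len(samples) - 1} minutes")
print(f"average change: {pages_per_minute(samples):.0f} pages/minute")
```

Because vmstat rounds these values to the nearest K, the result is only a coarse estimate. A longer run, or one taken when "free" is low enough to be printed without the K suffix, would give a better figure.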

=======================================================
The system:

from "uerf -r 300":
EVENT CLASS OPERATIONAL EVENT
OS EVENT TYPE 300. SYSTEM STARTUP
SEQUENCE NUMBER 0.
OPERATING SYSTEM DEC OSF/1
OCCURRED/LOGGED ON Tue Jul 25 10:49:25 2000
OCCURRED ON SYSTEM dec4
SYSTEM ID x00060004 CPU TYPE: DEC 3000
SYSTYPE x00000000
MESSAGE Alpha boot: available memory from
                                         _0x1798000 to 0x14000000
                                        Digital UNIX V5.0 (Rev. 910); Wed Jul
                                         _5 14:13:51 EDT 2000
                                        physical memory = 320.00 megabytes.
                                        available memory = 296.42 megabytes.
                                        using 1221 buffers containing 9.53
                                         _megabytes of memory
                                        Firmware revision: 7.0
                                        PALcode: UNIX version 1.45
                                        DEC 3000 - M700
=======================================================

syslog files:

vmunix: Alpha boot: available memory from 0x1798000 to 0x14000000
vmunix: Digital UNIX V5.0 (Rev. 910); Wed Jul 5 14:13:51 EDT 2000
vmunix: physical memory = 320.00 megabytes.
vmunix: available memory = 296.42 megabytes.
vmunix: using 1221 buffers containing 9.53 megabytes of memory
vmunix: Firmware revision: 7.0
vmunix: PALcode: UNIX version 1.45
vmunix: DEC 3000 - M700

vmunix: vm_swap_init: swap is set to eager allocation mode
vmunix: datalink: links=128, macs=6

=======================================================

ps -o vsz
  VSZ
2.70M
2.70M

=======================================================
output of sysconfig -q vm:

vm:
ubc_minpercent = 10
ubc_maxpercent = 100
ubc_borrowpercent = 20
vm_max_wrpgio_kluster = 32768
vm_max_rdpgio_kluster = 16384
vm_cowfaults = 4
vm_segmentation = 1
vm_ubcpagesteal = 24
vm_ubcfilemaxdirtypages = -1
vm_ubcdirtypercent = 10
ubc_maxdirtywrites = 5
vm_ubcseqstartpercent = 50
vm_ubcseqpercent = 10
vm_csubmapsize = 1048576
vm_ubcbuffers = 256
vm_syncswapbuffers = 128
vm_asyncswapbuffers = 4
vm_clustermap = 1048576
vm_clustersize = 65536
vm_syswiredpercent = 80
vm_inswappedmin = 1
vm_page_free_target = 128
vm_page_free_swap = 74
vm_page_free_hardswap = 2048
vm_page_free_min = 20
vm_page_free_reserved = 10
vm_page_free_optimal = 74
vm_swap_eager = 1
swapdevice = /dev/disk/dsk0b, /dev/disk/dsk1b, /dev/disk/dsk6b
vm_page_prewrite_target = 256
vm_ffl = 1
ubc_ffl = 1
vm_rss_maxpercent = 100
anon_rss_enforce = 0
vm_rss_block_target = 74
vm_rss_wakeup_target = 74
dump_user_pte_pages = 0
kernel_stack_pages = 0
vm_min_kernel_address = 18446744071562067968
malloc_percpu_cache = 1
vm_aggressive_swap = 1
new_wire_method = 1
vm_segment_cache_max = 50
vm_page_lock_count = 0
gh_chunks = 0
gh_min_seg_size = 8388608
gh_fail_if_no_mem = 1
private_text = 0
vm_page_private_color = 0
private_cache_percent = 0
gh_keep_sorted = 0
gh_front_alloc = 1
delayed_swapon = 0
enable_yellow_zone = 0

Thanks for any help you can give. I really need to have this machine running
for more than a couple of hours at a time. Reboots are causing some nasty
problems for users, especially when the system freezes during edit sessions or
long simulation or image reconstruction jobs.

KG


Kim L. Greer klg_at_dec3.mc.duke.edu
Duke University Medical Center voice: 919-684-7223
Div. Nuclear Medicine POB DUMC-3949 fax: 919-684-7123
Durham, NC 27710
Received on Tue Jul 25 2000 - 20:10:35 NZST
