SUMMARY: paging/swapping improvements

From: Udo Grabowski <grabow_at_imk.fzk.de>
Date: Thu, 30 Sep 1999 14:13:55 +0200

Hello Managers !

The problem with swap and paging (original post below)
is entirely solved; warm thanks to Jeremy Hibberd and
Bryan Lavelle! Thanks also to Donn Aiken and Frank Wortner,
who suggested setting up an mfs or simply buying more
hardware, which, of course, would be the most effective
(though most costly) improvement...

The solution is to install patch kit #4,
which includes a couple of improvements to scheduler/kernel/
malloc/sysconfig-base, and rebuild the kernel. The tuning I
did was in the right direction; additionally enabling vm-agressive
also helps. The memory-demanding application now runs fast
while constantly paging, vmstat 1 shows that page-ins and -outs
are now balanced (compare to the original posting), and even when
approaching vm-swap-free-reserved no freeze of the system occurs.
Physical memory is also much better utilized because the process
no longer gets swapped out entirely. Sorry that I forgot
to mention that we are using lazy swap mode, so the process is
indeed not at the limit of available space.
---------------------------------------------------------------
load averages: 0.10, 0.14, 0.12
58 processes: 1 running, 1 waiting, 10 sleeping, 46 idle
CPU states: 9.2% user, 0.0% nice, 4.4% system, 86.3% idle
Memory: Real: 230M/620M act/tot Virtual: 1303M/2364M use/tot Free: 16M

  PID USERNAME PRI NICE SIZE RES STATE TIME CPU COMMAND
 1406 grabow 42 0 1756M 512M WAIT 4:01 11.70% <Kopra>
 1446 root 44 0 880K 344K sleep 0:06 0.90% <vmstat>
 1447 grabow 44 0 2712K 352K run 0:02 0.00% <top>
   21 root 44 0 1680K 49K sleep 0:02 0.00% <update>
  374 root 44 0 1848K 40K sleep 0:00 0.00% <snmpd>

Virtual Memory Statistics: (pagesize = 8192)
  procs memory pages intr cpu
  r w u act free wire fault cow zero react pin pout in sy cs us sy id
  2 70 25 72K 1746 4934 4434 0 4422 736 11 89 112 19 605 0 1 98
  2 70 25 72K 2053 4934 85 0 59 1 22 73 98 23 567 0 1 99
  2 71 24 72K 1978 4934 110 0 59 1 51 38 144 44 775 5 3 92
  2 71 24 72K 1887 4936 248 0 147 1 97 62 145 465 720 45 7 48
  2 71 24 72K 2016 4937 697 0 679 2 18 86 147 23 767 10 2 88
  2 71 24 72K 1991 4937 177 0 59 4 118 46 139 47 744 32 3 64
  2 71 24 72K 2024 4938 314 0 271 0 43 49 180 30 917 16 3 82
  3 71 23 73K 1051 4944 236 0 59 0 177 35 145 363 873 41 23 36
  2 71 24 72K 1802 4959 4699 0 4671 931 28 69 103 417 1K 20 18 62
  2 71 24 72K 2028 4959 199 0 190 125 9 86 123 56 683 33 2 65
  4 70 23 73K 767 4968 243 0 125 5 118 9 235 88 4K 50 28 22
  2 71 24 73K 1102 4977 3490 0 3469 354 21 231 97 416 1K 4 61 34
  2 71 24 73K 1134 4977 203 0 190 12 13 80 99 15 568 0 2 98
  2 71 24 73K 1115 4978 120 0 106 0 13 72 100 43 579 15 2 83
  2 71 24 73K 1152 4978 71 0 59 8 13 83 105 15 586 0 1 98
  2 71 24 73K 1188 4978 72 0 59 2 13 80 101 28 580 0 2 98
  2 69 26 73K 1212 4978 70 0 59 0 11 78 95 422 566 1 3 96
  2 71 24 73K 1249 4978 211 0 191 0 20 76 113 34 613 0 2 98
  2 71 24 73K 1301 4978 75 0 59 2 16 91 115 19 610 4 1 95
  2 71 24 73K 1357 4984 2427 0 2419 62 8 114 143 51 1K 26 22 52
  2 71 24 72K 1686 4984 73 0 59 0 14 105 125 295 635 5 2 93
-------------------------------------------------------------------
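The "balanced page-ins and -outs" claim can be checked directly from
the vmstat rows. A minimal sketch in Python, assuming the column order
of the vmstat header shown in this post; the function name
paging_totals is only illustrative, and values with a K suffix in the
pin/pout columns are not handled:

```python
# Column order as in the vmstat header of this post:
# r w u act free wire fault cow zero react pin pout in sy cs us sy id
PIN_COL, POUT_COL = 10, 11

def paging_totals(rows):
    """Sum the pin and pout columns over whitespace-separated vmstat rows."""
    pin = pout = 0
    for row in rows:
        fields = row.split()
        pin += int(fields[PIN_COL])
        pout += int(fields[POUT_COL])
    return pin, pout

# Three rows copied from the original post (patch kit 2) ...
before = [
    "2 72 24 72K 1676 4797 540 0 60 256 480 0 483 417 2K 15 6 78",
    "2 72 24 73K 1186 4797 677 0 192 14 485 0 490 23 2K 13 6 82",
    "2 72 24 73K 922 4797 489 0 60 16 427 0 266 15 1K 6 3 90",
]
# ... and three rows from the output above (patch kit #4).
after = [
    "2 70 25 72K 2053 4934 85 0 59 1 22 73 98 23 567 0 1 99",
    "2 71 24 72K 1978 4934 110 0 59 1 51 38 144 44 775 5 3 92",
    "2 71 24 72K 1887 4936 248 0 147 1 97 62 145 465 720 45 7 48",
]

print("before (pin, pout):", paging_totals(before))
print("after  (pin, pout):", paging_totals(after))
```

For these sample rows the totals are 1392 page-ins against 0 page-outs
before the patch, and 170 against 173 afterwards.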
The answers in detail:
-------------------------
Well, the similar kernel changes we made have not fixed our problem (you
might remember my original posting about this). We have come across a patch
which might be relevant (vm_perform_v40dbl11); it fixes a virtual memory
problem in DU 4.0D. Apparently it's included in pk#4 (patch 640, according
to Compaq tech support). We have a TruCluster (v1.5) system comprising
two AS8400 5/625 systems running Tru64 V4.0D, each with 4GB memory and 12GB
swap. They are running SAP R/3 v4.0B with an Oracle database (v8.0.4).
Current patch kit is #3.

  Patch for the fix:

BLITZ TITLE: DIGITAL UNIX V4.0D/E VM PERFORMANCE PATCH - which addresses the
manner in which the operating system was managing VM resources (e.g., page
swapping) for systems operating near the lower limit of available virtual
memory.

  We will be applying patch kit 4 shortly and I will keep you posted.

Jeremy Hibberd
-------------------------------------------------------------------
There are some virtual memory performance patches that I would suggest that
you get from Compaq and install. They force the system to start reclaiming
memory pages earlier and more aggressively. They are available for 4.0d
patch kit 3, not patch kit 2 (also available for 4.0e without a patch kit).
There used to be one for PK2, but I don't know if it's available anymore.
If you have a software contract with Compaq, log a call and tell them you
need the vm performance patch for 4.0d and your patch level. If you don't
have a contract, you can pay for a service call on a per-call basis.

Bryan
-------------------------------------------------------------------
My apologies if this sounds like a trite answer. It is not meant to be.

Were I in your position, I would seriously consider getting more memory. I
would also strongly consider increasing the amount of available paging
space. I think what you are seeing is a tremendous and sudden demand for
large amounts of memory scattered throughout the program. Given that your
program requires a large amount of virtual memory, its size of 1.7GB is
uncomfortably (in my opinion) close to your paging space size. If the page
space is fragmented, the system will have a difficult time servicing
requests for large amounts of additional paging space. Perhaps that is what
you are seeing here.

Sorry if this isn't the solution you are looking for, but I just think you
are trying to stuff too big a program into too little VM space. I wouldn't
mind being proved wrong, though. :-)

Frank
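
The headroom argument above can be put into numbers. A small sketch
using the figures from the top output in the original post; treating
top's "Virtual ... tot" value as the paging space size is an
assumption, and the summary notes that with lazy swap mode the process
is not actually at the limit:

```python
# Figures taken from the top output in the original post.
process_size_mb = 1738   # SIZE of <Kopra>
swap_total_mb = 2364     # "Virtual: .../2364M use/tot" (assumed ~ paging space)

headroom_mb = swap_total_mb - process_size_mb
ratio = process_size_mb / swap_total_mb

print(f"headroom: {headroom_mb} MB, process/paging space: {ratio:.0%}")
```

With these figures the process alone would account for roughly three
quarters of the paging space if every page needed backing there.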
-------------------------------------------------------------------
I'm really lousy at this stuff. Do you have any money to spend to upgrade
the hardware? If so, I would go with the obvious. More RAM, faster disks.
Would you be able to set up an mfs (Memory File System) to change how your
process allocates memory? I have never done this, so I'm not sure it will
be of help. Probably depends on how your program is structured.

Donn Aiken
Regents College
====================================================================
My original post:
====================================================================
We have an application with a high memory demand (~ 1.7 GB).
Our Dec 500au 4.0D (Patchkit 2 applied) is equipped with
640 MB RAM and 2.2 GB swap space, user/proc limits 2.2 GB.
Because the system freezes (as also reported a few days ago here
on the list), which occurs when only vm-page-free-reserved pages
are left, I've modified the vm-section parameters with dxkerneltuner:

Parameter               Pages   Size
vm-page-free-target      2048   approx. 16MB
vm-page-free-swap        1664   13MB
vm-page-free-optimal     1536   12MB
vm-page-free-min         1024   8MB
vm-page-free-reserved     768   6MB

(some ubc tuning also occurred, as recommended here on the list and in
the docs). What we observe now is that paging starts as requested at
free-target, but then the process very quickly demands the rest of the
pages, so that we still get down to the reserved limit -> FREEZE.
The second effect is that if hard swapping starts, the most demanding
process is swapped out -- exactly the one we want to keep running by
paging alone :-< ...
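
The page counts and sizes of the thresholds listed above are related
by the 8192-byte page size that vmstat reports in this thread. A
minimal sketch in Python; the dictionary and the function name
pages_to_mb are illustrative only:

```python
PAGE_SIZE = 8192  # bytes, from "pagesize = 8192" in the vmstat output

# vm-page-free-* thresholds in pages, as set with dxkerneltuner in the post
thresholds = {
    "target": 2048,
    "swap": 1664,
    "optimal": 1536,
    "min": 1024,
    "reserved": 768,
}

def pages_to_mb(pages, page_size=PAGE_SIZE):
    """Convert a page count to megabytes (1 MB = 1024 * 1024 bytes)."""
    return pages * page_size / (1024 * 1024)

for name, pages in thresholds.items():
    print(f"vm-page-free-{name:<9}{pages:>5} pages = {pages_to_mb(pages):g} MB")

# Distance between the start of paging (target) and the reserved limit
margin_pages = thresholds["target"] - thresholds["reserved"]
print(f"margin: {margin_pages} pages = {pages_to_mb(margin_pages):g} MB")
```

The margin between free-target and the reserved limit is only 1280
pages (10 MB), which the free column in the vmstat output below shows
shrinking by several hundred pages per one-second sample.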

So I tried to push vm-page-free-target up to several thousand
pages to start paging long before we are at the limit. But then
sys_check complains that our limit is at 2048 pages. I could not
find the parameter to increase this value in the docs. Is it vm_max_wired?
Will it help to increase this value? What else do we have to push up to keep
our process running while paging, not swapping?

Here are some stats from top and vmstat 1:
------------------------------------------------
load averages: 0.26, 0.06, 0.03 16:10:09
57 processes: 1 running, 1 waiting, 16 sleeping, 39 idle
CPU states: 0.0% user, 0.0% nice, 0.0% system, 99.9% idle
Memory: Real: 228M/621M act/tot Virtual: 995M/2364M use/tot Free: 12M

  PID USERNAME PRI NICE SIZE RES STATE TIME CPU COMMAND
21751 grabow 42 0 1738M 14M WAIT 3:55 7.60% <Kopra> shortly before freezing
22965 root 44 0 864K 335K sleep 0:10 0.90% <vmstat>
  548 root 42 0 2648K 212K sleep 0:01 0.30% <pim>
21635 grabow 44 0 2640K 344K run 0:05 0.00% <top>
   21 root 44 0 1624K 57K sleep 0:03 0.00% <update>
-------------------------------------------------
  procs memory pages intr cpu
  r w u act free wire fault cow zero react pin pout in sy cs us sy id
  2 72 24 72K 1676 4797 540 0 60 256 480 0 483 417 2K 15 6 78
  2 72 24 73K 1186 4797 677 0 192 14 485 0 490 23 2K 13 6 82
  2 72 24 73K 922 4797 489 0 60 16 427 0 266 15 1K 6 3 90 <- before freezing
  5 69 24 73K 767 4797 265 0 60 4104 205 0 194 4K 42K 5 3 92 <- after freeze
  2 70 26 71K 3047 4801 376 0 193 4572 131 12 100 489 713 0 0 100
  2 73 25 71K 2890 4827 383 73 91 256 127 0 124 616 715 3 4 93
  2 72 24 71K 2793 4820 753 146 164 256 229 0 183 3K 922 5 6 88
  2 72 24 72K 2461 4820 448 0 60 256 314 0 347 15 1K 1 3 96
  3 71 24 72K 1982 4820 542 0 60 256 451 0 469 411 2K 10 7 83
  2 72 24 73K 1425 4820 757 0 196 13 560 0 573 25 2K 16 6 79
  2 72 24 73K 871 4820 626 0 60 1888 567 0 546 287 2K 16 8 77 <- before freezing
  8 65 24 73K 767 4820 201 0 60 4040 140 0 263 29K 285K 2 3 96 <- after freeze (note the high context switch rate!)
  4 69 24 73K 934 4796 406 1 252 13K 90 17 168 32K 1K 0 0 100
  2 71 24 73K 1053 4794 202 7 60 2184 102 24 135 293 833 0 4 96
  2 70 24 71K 3088 4799 126 1 60 87 41 3 93 214 592 0 4 96
  2 70 24 71K 2786 4799 398 0 58 256 197 0 310 15 1K 1 4 95
  2 70 24 72K 2429 4799 583 0 58 256 435 0 353 18 1K 4 3 93
  2 70 24 72K 1965 4799 570 0 186 255 381 0 474 417 2K 12 7 80
  2 70 24 73K 1431 4805 578 0 58 255 520 0 526 25 2K 15 6 78
  2 70 24 73K 872 4805 616 0 58 143 558 1 576 15 2K 16 7 78 <- before freezing
  7 65 23 73K 767 4805 308 0 58 5562 249 0 283 28K 287K 1 3 96 <- after freeze
  2 69 25 71K 3039 4804 407 8 249 13K 45 2 106 32K 796 0 0 100
  2 70 24 71K 2895 4810 279 0 58 256 136 0 173 59 862 1 3 96
  2 70 24 72K 2446 4816 632 0 58 256 436 0 441 15 2K 4 5 91
  2 70 24 72K 2063 4816 402 0 58 256 345 0 398 20 1K 8 4 88
  2 70 24 73K 1512 4816 605 0 58 256 545 0 545 427 2K 16 9 76
  2 70 24 73K 943 4816 746 0 190 208 556 0 576 15 2K 16 6 78 <- before freezing
-------------------------------------------------
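The freezes in the table above coincide with the free column hitting
the vm-page-free-reserved value of 768 pages. A minimal sketch in
Python that flags such samples, assuming the vmstat column order shown
above; the function name at_reserved_floor is illustrative, and a K
suffix in the free column is not handled:

```python
RESERVED = 768  # vm-page-free-reserved as set in the post
FREE_COL = 4    # position of "free" in: r w u act free wire ...

def at_reserved_floor(rows, reserved=RESERVED):
    """Return the vmstat rows whose free-page count is at or below the limit."""
    return [row for row in rows if int(row.split()[FREE_COL]) <= reserved]

# Three consecutive samples copied from the table above
samples = [
    "2 72 24 73K 922 4797 489 0 60 16 427 0 266 15 1K 6 3 90",
    "5 69 24 73K 767 4797 265 0 60 4104 205 0 194 4K 42K 5 3 92",
    "2 70 26 71K 3047 4801 376 0 193 4572 131 12 100 489 713 0 0 100",
]

hits = at_reserved_floor(samples)
for row in hits:
    print("free pages at reserved limit:", row)
```

Of these three samples only the middle one (767 free pages) is
flagged; it is the row marked "after freeze" in the post.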
Dr. Udo Grabowski email: udo.grabowski_at_imk.fzk.de
Institut f. Meteorologie und Klimaforschung II, Forschungszentrum Karlsruhe
Postfach 3640, D-76021 Karlsruhe, Germany Tel: (+49) 7247 82-6026
http://www.fzk.de/imk/imk2/ame/grabowski/ Fax: " -6141
Received on Thu Sep 30 1999 - 12:15:59 NZST
