A: where has all the memory gone?

From: Ian 'Ivo' Veach <ivo_at_scsr.nevada.edu>
Date: Mon, 23 Oct 2000 15:07:43 -0700 (PDT)

Greetings -

It seems as if my "problem" can be explained primarily by the ubc taking
up a fair amount of memory for buffers/etc - as it should do for a fairly
normal system. Considering the function of this machine, it's not
surprising that it fluctuates.

The system does seem to bonk a bit when free memory gets down to zero.
However, most of the time, ubc usage stays fairly constant under a
heavy app, and I assume it's because the app is simply taking over some of
the ubc buffers for its own use. The system rarely goes to swap, so I
think the "slowdown factor" is simply waiting for some ubc buffers to
flush before they can be reused.

Thanks much to Dr. Tom and Alan (from Compaq), who immediately hit the
nail on the head and gave a quick lesson. You guys must get tired of
"winning first place" all the time on this list. 8^) Thanks also to:

        Tom Webster
        Nick Hill
        Ecce Potestas Casei
        Jim Belonis

who either offered up the same useful explanations or provided
alternative information.

Ivo
elric_at_nevada.edu

=====================================================================
ORIGINAL QUESTION
=====================================================================

I've got an AlphaServer 1000A that is running 4.0F. It has 576M of
memory, and is primarily running Apache + support (perl / PHP / cronolog /
cgiwrap / etc) + multiple WebCT instances + a couple of Java-based
applications.

In running 'top' to casually note heavy processes and load, I've noticed
that the "Free" memory reported tends to vary wildly. That is, sometimes
it has been 230MB free, sometimes it has been 37MB free (as it is now).
That would lead me to believe that there is one or more processes taking
up all that memory. But if I use 'ps aux' I cannot see anything remotely
definitive of where the memory is actually being used. Although I've seen
a general trend of decreasing memory, it has also gone back up.

So, my question is, is there a better way to tell what is going on? The
only thing I've found so far is vmstat, which is a little cryptic (time to
crack open the man page I guess) and doesn't seem to report memory usage
by process. Is top lying? Is there a memory leak (and how would I find
that)? Any pointers to get me started would be appreciated. Thanks in
advance!!

=====================================================================
RESPONSE (plus more correspondence)
=====================================================================

You seem to be assuming that the varying free memory is indicative of
some problem. It may not be.

Among the things Tru64 UNIX does in memory management is attempting to
use memory as a file system cache (sometimes called the unified buffer
cache). File pages are read into memory. If they are modified, they
are eventually flushed out of memory back into the file system. If they
are not modified, they may sit in memory for a long time until there is
a need for memory for some other purpose; then they may get discarded.
None of this shows up in user process memory accounting. Plus, there
are buffers for active I/O to raw devices, buffers for networking, and
all kinds of other uses of memory.

It sounds to me like you've got enough physical memory so that you've
got some free most of the time. That's good. And that's about all you
really should worry about -- you can't really control most of what is
being done inside the kernel's virtual memory subsystem unless you're
much more knowledgeable about Tru64 UNIX kernel internals than most
people are (including myself).

Do read the "vmstat" reference page. You might discover "vmstat -P"
which shows you what physical memory is being used for on the system
at the moment you issue the command. I suspect that the output that
"top" is generating is based on the data that you can examine in much
more detail with "vmstat -P".

> If most of that was being used by the ubc, and I requested memory,
> do you know what kind of performance hit I would see? The reason I
> started worrying about this in the first place is that I was indexing a
> website (swish-e, very memory intensive). I could watch it eat down the
> memory until top reported k memory free instead of M memory free - and
> accordingly, swish-e would really slow down. Maybe this has nothing to do
> with the ubc, and it was simply bonking because it was simply out of
> memory (although swap didn't seem like it was being used). In some cases,
> swish-e would core, claiming memory misallocation, and that was a concern,
> but my next thought was how to start with more memory free, which is how I
> got here: wondering where all that memory was going to.

This *is* interesting. Perhaps swish-e has problems with memory. I
have no experience with that program. What happens when you get lots
of memory demand is that pages that have been read in from the file
system get discarded if they have not been modified. Doing this, of
course, takes time. Depending on how swish-e is implemented and what
it is really doing, you may not get swapping activity; it may be that
all of the paging is coming in from the file system. Usually, you can
see this happening with the "vmstat" snapshot (the normal display).

In any case, good luck in your investigations, I would be curious to
know what you figure out.

> I'll look into what you said and let you know. I did note
> however, that top shows 270MB free memory now (instead of 37MB), and
> vmstat -P accordingly shows remarkably fewer pages being used for ubc (and
> more free pages). So maybe ubc is just eating up the free pages and it is
> taking time to deallocate them (large files being written to disk via
> pages?).

If file pages are memory mapped, they live in the UBC as well as in the
process' address space, but their HOME is the file system; if they get
modified, they still live in the UBC, and when memory is tight, any that
have not been accessed "in a while" get forced back out to the file system,
but this takes time. That may well be what you are seeing. It doesn't
show up as swapping activity, it's file system I/O.

=====================================================================
RESPONSE
=====================================================================

        Tru64 UNIX, like most modern UNIX systems, uses a unified
        buffer cache in which all available memory can be used
        for file system buffers, given the right load. Reading
        or writing a large file will use up lots of pages, but
        not show as process data space (since it isn't process
        data). This may be one source of the reduction in
        free memory.

        Buffers used for reading are very fluid and don't take
        much work to reclaim for process use. Write buffers
        tend to be dirty which requires I/O to clean them.
        On a busy Web server, I'd expect to see a lot of buffer
        cache get used from time to time.

        Look on the Freeware CDROM to see if there's a version
        of "vmubc". This is a graphical utility that can show
        how memory is being used.
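
The difference between read buffers and write buffers described above
can be pictured with a toy model. This Python sketch is only an
illustration and not the Tru64 kernel's actual reclaim algorithm: the
function name, the list-of-booleans page representation, and the
"clean pages first" policy are all assumptions made for the example.

```python
def reclaim(cache_pages, needed):
    """Toy model: reclaim up to `needed` cache pages, clean ones first.

    cache_pages: list of booleans, True meaning the page is dirty.
    Returns (pages_reclaimed, writes_required).
    """
    clean = sum(1 for dirty in cache_pages if not dirty)
    dirty = len(cache_pages) - clean
    # Clean pages can simply be discarded; no I/O is needed.
    from_clean = min(needed, clean)
    # Dirty pages must be written to disk before they can be reused.
    from_dirty = min(needed - from_clean, dirty)
    return from_clean + from_dirty, from_dirty


# Hypothetical cache: 6 clean (read) pages and 4 dirty (written) pages.
cache = [False] * 6 + [True] * 4
print(reclaim(cache, 5))  # satisfied from clean pages, no writes
print(reclaim(cache, 8))  # needs 2 dirty pages, so 2 writes come first
```

In this model a request that fits within the clean pages costs no disk
writes, while a larger one has to wait for flushes, which matches the
slowdown described in the summary at the top.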

=====================================================================
RESPONSE
=====================================================================

Top is actually telling you what is going on. The problem is that there
is an additional category that isn't listed -- but can be derived.

The "Real" memory in use figure is the amount of physical RAM used by
running processes. "Free" memory is totally unallocated memory.
Whatever physical memory is left after subtracting both is buffer cache --
the memory that the system uses for buffering I/O and caching data on
disks. Even on a relatively unloaded system, the free memory will tend to
be a pretty small number. In truth, the free figure isn't worth much.
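
As a rough illustration of that derivation, the Python sketch below
subtracts the "Real" and "Free" figures from total physical memory.
The 576 MB total and 37 MB free come from the original question; the
300 MB "Real" value is made up for the example. The remainder also
includes kernel memory, so treat it as an upper bound on the cache
rather than an exact figure.

```python
def estimate_buffer_cache_mb(total_mb, real_mb, free_mb):
    """Memory not accounted for by top's 'Real' or 'Free' figures."""
    remainder = total_mb - real_mb - free_mb
    if remainder < 0:
        raise ValueError("real + free exceeds total physical memory")
    return remainder


# 576 and 37 are from the original question; 300 is a hypothetical value.
print(estimate_buffer_cache_mb(576, 300, 37))  # prints 239
```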

Some variants of top (like the version for Linux) will show a "Buff"
figure. They also show another interesting figure "Shrd" for the
amount of memory marked as sharable. It shows you how much of the
memory on your system is locked to processes as non-shared memory
(the delta between Real and Shrd).

Hope this helps,

=====================================================================
RESPONSE - [note: My version of top, supposedly the latest stable,
doesn't seem to have this option. In fact, the src for top3.4 doesn't
seem to have a sorting option for decosf (only AIX, Sun, etc)]
=====================================================================

top
then hit M to sort by memory usage.

=====================================================================
RESPONSE - [note: I did all these, but was coming up way short -
explainable by the amount of memory being used internally]
=====================================================================

swapon -l shows swap file usage.

ps auxw and add up the VSZ column. That's virtual memory size.
           This may be higher than actual virtual memory usage if there is
           shared memory involved.

ps auxw and add up the RSS column (Resident Set Size). That shows RAM in use.

There are similar
ps -elaf
columns... RSS shows by default, but you may have to use some custom options
          -o something
          to get virtual memory.
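
Adding up the columns by hand gets tedious, so here is a minimal Python
sketch that totals VSZ and RSS from captured ps text. It finds the
columns by header name instead of assuming fixed positions. The sample
text is invented for illustration and is not real output from this
system. The unit handling is also an assumption: bare numbers are taken
as kilobytes, and K/M/G suffixes are accepted because some ps versions
print sizes that way.

```python
_UNITS_KB = {"K": 1, "M": 1024, "G": 1024 * 1024}


def parse_size_kb(text):
    """Convert '1500', '300K' or '1.5M' to kilobytes (bare numbers = KB)."""
    text = text.strip().upper()
    if text and text[-1] in _UNITS_KB:
        return float(text[:-1]) * _UNITS_KB[text[-1]]
    return float(text)


def sum_ps_columns(ps_output, columns=("VSZ", "RSS")):
    """Sum the named columns of ps-style text, located via the header line."""
    lines = [ln for ln in ps_output.splitlines() if ln.strip()]
    header = lines[0].split()
    index = {name: header.index(name) for name in columns}
    totals = {name: 0.0 for name in columns}
    for line in lines[1:]:
        # Limit the split so a command containing spaces stays in one field.
        fields = line.split(None, len(header) - 1)
        for name, i in index.items():
            try:
                totals[name] += parse_size_kb(fields[i])
            except (IndexError, ValueError):
                pass  # skip missing or non-numeric values such as "-"
    return totals


# Hypothetical sample, not captured from a real machine.
SAMPLE = """\
USER  PID %CPU %MEM  VSZ  RSS COMMAND
www   101  1.0  2.0 4000 1500 httpd -k start
www   102  0.5  1.0 4000 1.5M httpd -k start
root  200  0.0  0.1  800  300 cron
"""
print(sum_ps_columns(SAMPLE))
```

Pages shared between processes are counted once per process, so both
totals can overstate real usage, and neither includes memory used by
the buffer cache.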

It is possible that your RAM usage IS fluctuating wildly
if people are active. It should be stable if nothing much is going on.
There may be a LOT of RAM in use as disk I/O buffers
(up to essentially all RAM that isn't being used by processes directly).
I don't know how that factors into 'top' displays, but I haven't heard any
complaints about top if you get it working at all
(old non-64-bit versions are still around).


=====================================================================
RESPONSE
=====================================================================

If you haven't done any tuning of the system, then the UBC (unified buffer
cache) can consume as much memory as it likes. This will not show up under a
process in ps. This is not necessarily a bad thing, as the UBC is the
filesystem cache. If you let it, it uses free memory pages to cache
filesystem access. I guess for a system whose primary function is a web
server this isn't such a bad idea. If you issue a vmstat -P command and
look at the Managed Pages Break Down section, it will show where the memory
pages are being used, including the UBC.
Received on Mon Oct 23 2000 - 22:08:55 NZDT