Here is a summary of the responses to my query about vmstat and other
performance monitoring and tuning tools.
John P Speno suggested...
<URL:
http://www.unix.digital.com/faqs/publications/base_doc/DOCUMENTATION/V40D_HTML/AQ0R3FTE/TITLE.HTM>
This was a very interesting DU-specific explanation of how the virtual
memory system goes about its business. It also provides great insight
into what the statistics you can gather from vmstat really mean.
Kurt Carlson was very helpful as well, though his comments were more
specific to the task, so I expect they will be of less interest to the
list. I have included them below just in case.
Thanks to others for input as well.
Bryan Rank
>Hello,
>
>I am doing some rough benchmarking of some of my company's software on
>a couple of different classes of Alpha. Part of this process involves
>running vmstat in the background while concurrent processes are running.
>After all the processes have completed, I take their log files and match
>them against the vmstat log to derive things like the number of processes
>that were running, and cpu and memory info at that point in time. From this
>part of the test I hope to get an estimate of cpu utilization from the
>amount of idle time, and of memory consumption from the rate of paging.
>I am not particularly concerned with i/o performance here, as what I am
>benchmarking has to do with in memory (mmap'd) table lookups
io is generally the single most limiting factor for most systems,
unless the system is memory-rich enough to avoid io. even so, simple
things like database journal rolling or transaction dumps can
kill an application if the io is not carefully considered.
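The log-crunching step described in the question is mostly text processing. Here is a minimal sketch of the cpu side, assuming the last three fields of each vmstat line are us/sy/id (as on Digital UNIX; check your header line first). The file name and sample data below are fabricated for illustration:

```shell
# Summarize average cpu usage from a captured vmstat log.
# Assumption: the last three fields of each data line are us sy id.
# Fabricated sample data stands in for a real vmstat capture:
cat > /tmp/vmstat.log <<'EOF'
 3 45 12  80 15  5
 4 47 12  60 30 10
EOF

# Average the user, system, and idle columns over all samples.
awk '{ us += $(NF-2); sy += $(NF-1); id += $NF; n++ }
     END { printf "avg user=%.1f sys=%.1f idle=%.1f\n", us/n, sy/n, id/n }' /tmp/vmstat.log
# prints: avg user=70.0 sys=22.5 idle=7.5
```

The same pattern extends to the memory columns; only the field offsets change.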
>I went back in the list and read Tony Warner's "SUMMARY: memory
>allocation" but I am looking for a bunch of parsable snapshots to graph
>while the processes are busy doing their thing, not just one.
>
>Summary from Evi Nemeth's "UNIX System Administration Handbook"...
>Memory Usage. There are basically two numbers that quantify memory
>activity: the total amount of virtual memory ("act"), and the paging rate
>("pin"). The total amount of virtual memory is an indication of the
>total memory demand on a system. The paging rate is a reflection of what
>portion of that memory is actively used.
>
>My questions...
>
>o "act" is all vm, I was hoping to get a snapshot of just "active
> virtual memory". Since the inactive list has yet to be paged and
> still could be saved, I would consider this OK to lump in, but
> what about the Unified Buffer Cache (UBC). The man page states
> that vmstat doesn't report on it, then under the section on "act"
> it states that it does. Which is it? Does anyone know?
you hit it... ubc performance and tuning is fairly critical for
du, best measures i found for it (4.0b and older) were snaps from
dbx. here's what i used for digital in a data collection loop:
# run inside a data collection loop; $stamp, $INT, and $CNT are set elsewhere
echo "
dbx `date`" >> vmstat.$stamp
uptime >> vmstat.$stamp
sudo dbx -k /vmunix /dev/mem >> vmstat.$stamp << ! 2>&1
print "_dbx_ubc_pages_in_bytes: ", ubc_pages*8192
print "_dbx_vm_perfsum: ", vm_perfsum
quit
!
vmstat -s >> vmstat.$stamp
vmstat -M >> vmstat.$stamp
ipcs -ma >> vmstat.$stamp
echo "
vmstat `date` $INT $CNT" >> vmstat.$stamp
vmstat $INT $CNT >> vmstat.$stamp
there is some discussion on ubc benchmarking in the examples
directory of:
ftp://raven.alaska.edu/pub/sois/uaio.tar.Z
i never fully mastered it... just enough to adjust it by feel
for the particular application environment i was dealing
with (large oracle... ubc had to be throttled to reduce
double-buffering with oracle; that hurt non-oracle applications,
which required carefully splitting io across disks and controllers).
>o What's the best indicator of paging? "pin" seems to reflect the
> activity more closely than anything else, but "react" (hard page
> faults?) seems like the best indicator of memory consumption.
it depends. transaction-based applications can't really sustain
any paging. you want to avoid hard faults in the critical apps;
pin remains the best number to monitor to proactively avoid
hard faults. the tools in unix are poor for determining who
is faulting.
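Monitoring pin proactively, as suggested above, only needs a threshold filter over the captured log. A sketch, with the caveat that the pin column's position varies between vmstat versions (field 10 below is a placeholder; read it off your header line), and the sample data is fabricated:

```shell
# Flag intervals where the pagein rate ("pin") crosses a threshold.
# PINCOL is a guess at the field position; verify it against your
# vmstat header before trusting the output. Sample data is fabricated.
PINCOL=10
THRESH=50
cat > /tmp/vmstat.pin <<'EOF'
 3 45 12 0 0 0 0 0 0 12 0
 4 47 12 0 0 0 0 0 0 75 0
EOF

awk -v c=$PINCOL -v t=$THRESH '$c > t { print "high pin:", $c, "on line", NR }' /tmp/vmstat.pin
# prints: high pin: 75 on line 2
```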
>o In relating cpu utilization, I was planning on taking the ratio of
> idle time to system time, but even on a nice fast 8400, idle
> time seems to always be zero when you are running a bunch of
> memory/cpu hungry processes. I am rethinking this to be a ratio of
> (system_time/user_time); seems more interesting. Does this seem
> like a valid measure?
yes, it's more reasonable to use user time; idle time is meaningless.
the ratio of user to system time is a little questionable unless you
have a real good handle on what's running (a specific-purpose
system... no ad hoc people running things). high system time
by itself is an indication of something thrashing (paging,
ubc manipulation, excessive io, excessive system calls, ...).
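The system/user ratio being discussed can be pulled straight from the vmstat log. A sketch, again assuming the last three fields of each line are us/sy/id (check your header); the sample data is fabricated:

```shell
# Compute the system/user time ratio per sampling interval.
# Assumption: last three fields are us sy id; skip intervals with
# zero user time to avoid dividing by zero. Fabricated sample data:
cat > /tmp/vmstat.cpu <<'EOF'
70 20 10
40 50 10
EOF

awk '{ if ($(NF-2) > 0) printf "sys/user = %.2f\n", $(NF-1) / $(NF-2) }' /tmp/vmstat.cpu
# prints: sys/user = 0.29
#         sys/user = 1.25
```

A ratio that climbs over the run is the kind of thrashing signal described above.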
as you can probably imagine, "your mileage will vary" with
any system or application mix. good luck, kurt
Received on Fri Jun 04 1999 - 13:15:12 NZST