OpenVMS Performance Management
3.8.3 Controlling the Overhead
Two system parameters determine the maximum sizes for the two data
structures in the system header as follows:
- GBLPAGES---Defines the size of the global page table. The system
working set size as defined by SYSMWCNT must be increased whenever you
increase GBLPAGES.
- GBLSECTIONS---Defines the size of the global section table.
3.8.4 Installing Shared Images
Once an image has been created, it can be installed as a permanently
shared image. (See the OpenVMS Linker Utility Manual and the OpenVMS
System Manager's Manual, Volume 2: Tuning, Monitoring, and Complex
Systems.) Installing the image as shared saves memory whenever more
than one process is mapped to the image at the same time.
Also, use AUTHORIZE to increase the user's working set characteristics
(WSDEF, WSQUO, WSEXTENT) wherever appropriate, to correspond to the
expected use of shared code. (Note, however, that this increase does
not mean that the actual memory usage will increase. Sharing of code by
many users actually decreases the memory requirement.)
3.8.5 Verifying Memory Sharing
If physical memory is especially limited, investigate whether there is
much concurrent image activation that results in savings. If you find
there is not, there is no reason to employ code sharing. You can use
the following procedure to determine if there is active sharing on
image sections that have been installed as shareable:
- Invoke the OpenVMS Install utility (INSTALL) and enter the
LIST/FULL command. For example:
$ INSTALL
INSTALL> LIST/FULL LOGINOUT
INSTALL displays information in the following format:
DISK$AXPVMSRL4:<SYS0.SYSEXE>.EXE
LOGINOUT;3 Open Hdr Shar Priv
Entry access count = 44
Current / Maximum shared = 3 / 5
Global section count = 2
Privileges = CMKRNL SYSNAM TMPMBX EXQUOTA SYSPRV
- Observe the values shown for the Current/Maximum shared access
counts:
- The Current value is the current count of concurrent accesses of
the known image.
- The Maximum value is the highest count of concurrent accesses of
the image since it became known (installed). This number appears only
if the image is installed with the /SHARED qualifier.
The Maximum value should be at least 3 or 4. A lower value indicates
that overhead for sharing is excessive.
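The check in the last step can be automated once the LIST/FULL output has been captured as text. The following Python sketch is illustrative only: the function names and the default threshold of 3 (taken from the guideline above) are assumptions, and it reads only the "Current / Maximum shared" line in the layout shown in the sample display.

```python
import re

# Sample text copied from the INSTALL display shown above.
SAMPLE = """\
LOGINOUT;3 Open Hdr Shar Priv
Entry access count = 44
Current / Maximum shared = 3 / 5
Global section count = 2
"""


def parse_shared_counts(listing):
    """Return (current, maximum) shared access counts, or None if absent."""
    match = re.search(
        r"Current\s*/\s*Maximum shared\s*=\s*(\d+)\s*/\s*(\d+)", listing
    )
    if match is None:
        return None  # image was probably not installed /SHARED
    return int(match.group(1)), int(match.group(2))


def sharing_looks_worthwhile(maximum, threshold=3):
    """Apply the rule of thumb: Maximum should be at least 3 or 4."""
    return maximum >= threshold


if __name__ == "__main__":
    counts = parse_shared_counts(SAMPLE)
    if counts is not None:
        current, maximum = counts
        print("current:", current, "maximum:", maximum)
        print("worth sharing:", sharing_looks_worthwhile(maximum))
```

Treat the result as a starting point; as the note that follows explains, knowledge of the work load remains the better guide.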
Note
In general, your intuition, based on knowledge of the work load, is the
best guide. Remember that the overhead required to share memory is
counted in bytes of memory, while the savings are counted in pages of
physical memory. Thus, if you suspect that there is occasional
concurrent use of an image, the investment required to make it
shareable is worthwhile.
3.9 OpenVMS Scheduling
The scheduler uses a modified round-robin form of scheduling: processes
receive a chance to execute on a rotating basis, according to process
state and priority.
3.9.1 Time Slicing
Each computable process receives a time slice for execution. The time
slice equals the system parameter QUANTUM, and rotating the time slices
among processes is called time slicing. Once its
quantum starts, each process executes until one of the following events
occurs:
- A process of higher priority becomes computable
- The process is no longer computable because of a resource wait
- The process itself voluntarily enters a wait state
- The quantum ends
If there is no other computable (COM) process at the same priority
ready to execute when the quantum ends, the current process receives
another time slice.
3.9.2 Process State
A change in process state causes the scheduler to reexamine which
process should be allowed to run.
3.9.3 Process Priority
When required to select the next process for scheduling, the scheduler
examines the priorities assigned to all the processes that are
computable and selects the process with the highest priority.
Priorities are numbers from 0 to 31.
Processes assigned a priority of 16 or above receive maximum access to
the CPU resource (even over system processes) whenever they are
computable. These priorities, therefore, are used for real-time
processes.
Processes
A process receives a default base priority from one of the following:
- The /PRIORITY qualifier in the process's UAF record
- The DEFAULT record in the UAF
A process can change its priority using the following:
- $SETPRI system service.
- DCL command SET PROCESS/PRIORITY to reduce the priority of your
process. You need ALTPRI privilege to increase the priority of your
process.
A user requires GROUP or WORLD privilege to change the priority of
other processes.
Subprocesses and Detached Processes
A subprocess or detached process receives its base priority from:
- $CREPRC system service
- DCL command RUN
If you do not specify a priority, the system uses the priority of the
creator.
Batch Jobs
When a batch queue is created, the DCL command
INITIALIZE/QUEUE/PRIORITY establishes the default priority for a job.
However, when you submit a job with the DCL command SUBMIT or change
characteristics of that job with the DCL command SET QUEUE/ENTRY, you
can adjust the priority with the /PRIORITY qualifier.
With either command, increases are permitted only for submitters with
the OPER privilege.
3.9.4 Priority Boosting
For processes below priority 16, the scheduler can increase or decrease
process priorities as shown in the following table:
Stage  Description
1      While processes run, the scheduler recognizes events such as I/O
       completions, the completion of an interval of time, and so forth.
2      As soon as one of the recognized events occurs and the associated
       process becomes computable, the scheduler may increase the priority
       of that process. The amount of the increase is related to the
       associated event. (See footnote 1.)
3      The scheduler examines which computable process has the highest
       priority and, if necessary, causes a context switch so that the
       highest priority process runs.
4      As soon as a process is scheduled, the scheduler reduces its
       priority by one to allow processes that have received a priority
       boost to begin to return to their base priority. (See footnote 2.)
Footnote 1: For example, if the event is the completion of terminal I/O
input, the scheduler gives a large increase so that the process can run
again sooner.
Footnote 2: The priority is never decreased below the base priority or
increased into the real-time range.
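The boost-and-decay behavior described in the table can be modeled in a few lines. This Python sketch is a simplified model, not the scheduler's actual algorithm: it assumes that a boost is applied relative to the base priority and that the higher of the current and boosted values is kept. The function names and the boost amounts are hypothetical.

```python
REALTIME_FLOOR = 16  # priorities 16 through 31 are real-time


def boost(current, base, amount):
    """Priority after an event makes the process computable (stage 2).

    Assumption: the boost is added to the base priority and the result
    is kept only if it exceeds the current priority.
    """
    if base >= REALTIME_FLOOR:
        return current  # real-time processes never receive a boost
    boosted = max(current, base + amount)
    # Never increase a priority into the real-time range.
    return min(boosted, REALTIME_FLOOR - 1)


def on_scheduled(current, base):
    """Priority after the process is scheduled (stage 4)."""
    if base >= REALTIME_FLOOR:
        return current
    # Reduce by one, but never below the base priority.
    return max(base, current - 1)


def decay_trace(base, amount, times):
    """Priorities seen after one boost followed by `times` schedulings."""
    priority = boost(base, base, amount)
    trace = [priority]
    for _ in range(times):
        priority = on_scheduled(priority, base)
        trace.append(priority)
    return trace


if __name__ == "__main__":
    # A base-priority-4 process receives a hypothetical boost of 3.
    print(decay_trace(base=4, amount=3, times=5))
```

In this model, a boosted process drifts back to its base priority one step per scheduling, which matches the intent of stage 4 and footnote 2.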
3.9.5 Scheduling Real-Time Processes
When real-time processes (those with priorities from 16 to 31) execute,
the following conditions apply:
- They never receive a priority boost.
- They do not experience automatic working set adjustments.
- They do not experience quantum-based time slicing.
The system permits real-time processes to run until either they
voluntarily enter a wait state or a higher priority real-time process
becomes computable.
3.9.6 Tuning
From a tuning standpoint, you have very few controls you can use to
influence process scheduling. However, you can modify:
- Base priorities of processes
- Length of time for a quantum
All other aspects of process scheduling are fixed by both the behavior
of the scheduler and the characteristics of the work load.
3.9.7 Class Scheduler
The OpenVMS class scheduler allows you to tailor scheduling for
particular applications. The class scheduler replaces the OpenVMS
scheduler for specific processes. The program SYS$EXAMPLES:CLASS.C
allows applications to do class scheduling.
3.9.8 Processor Affinity
You can associate a process or the initial thread of a multithreaded
process with a particular processor in an SMP system. Application
control is through the system services $PROCESS_AFFINITY and
$SET_IMPLICIT_AFFINITY. The command SET PROCESS/AFFINITY allows bits in
the affinity mask to be set or cleared individually, in groups, or all
at once.
Processor affinity allows you to dedicate a processor to specific
activities. You can use it to improve load balancing. You can use it to
maximize the chance that a kernel thread will be scheduled on a CPU
where its address translations and memory references are more likely to
be in cache.
Maximizing the context by binding a running thread to a specific
processor often shows throughput improvement that can outweigh the
benefits of the symmetric scheduling model. Particularly in larger CPU
configurations and higher-performance server applications, the ability
to control the distribution of kernel threads throughout the active CPU
set has become increasingly important.
Chapter 4 Evaluating System Resources
This chapter describes tools that help you evaluate the performance of
the three major hardware resources---CPU, memory, and disk I/O.
Discussions focus on how the major software components use each
hardware resource. The chapter also outlines the measurement, analysis,
and possible reallocation of the hardware resources.
You can become knowledgeable about your system's operation if you use
MONITOR, ACCOUNTING, and AUTOGEN feedback on a regular basis to capture
and analyze certain key data items.
4.1 Prerequisites
It is assumed that your system is a general timesharing system. It is
further assumed that you have followed the workload management
techniques and installation guidelines described in Chapter 1 and
Section 2.4, respectively.
The procedures outlined in this chapter differ from those in Chapters
5 and 10 in the following ways:
- They are designed to help you conduct an evaluation of your system
and its resources, rather than to execute an investigation of a
specific problem. If you discover problems during an evaluation, refer
to the procedures described in Chapters 5 and 10
for further analysis.
- For simplicity, they are less exhaustive, relying on certain rules
of thumb to evaluate the major hardware resources and to point out
possible deficiencies, but stopping short of pinpointing exact causes.
- They are centered on the use of MONITOR, particularly the summary
reports, both standard and multifile.
Note
Some information in this chapter may not apply to certain specialized
types of systems or to applications such as workstations, database
management, real-time operations, transaction processing, or any in
which a major software subsystem is in control of resources for other
processes.
4.2 Guidelines
You should exercise care in selecting the items you want to measure and
the frequency with which you capture the data.
If you are overzealous, the consumption of system resources required to
collect, store, and analyze the data can distort your picture of the
system's work load and capacity.
As you conduct your evaluations, keep the following rules in mind:
- Complete the entire evaluation. It is important to examine all the
resources in order to evaluate the system as a whole. A partial
examination can lead you to attempt an improvement in an area where it
may have minimal effect because more serious problems exist elsewhere.
- Become as familiar as possible with the applications running on
your system. Get to know their resource requirements. You can obtain a
lot of relevant information from the ACCOUNTING image report shown in
Example 4-1. Compaq and third-party software user's guides can also be
helpful in identifying resource requirements.
- If you believe that a change in software parameters or hardware
configuration can improve performance, execute such a change
cautiously, being sure to make only one change at a time. Evaluate the
effectiveness of the change before deciding to make it permanent.
Note
When specific values or ranges of values for MONITOR data items are
recommended, they are intended only as guidelines and will not be
appropriate in all cases.
4.3 Collecting and Interpreting Image-Level Accounting Data
Image-level accounting is a feature of ACCOUNTING that provides
statistics and information on a per-image basis.
By knowing which images are heavy consumers of resources at your site,
you can better direct your efforts toward controlling them and the
resources they consume.
Frequently used images are good candidates for code sharing, whereas
images that consume large quantities of various resources can be forced
to run in a batch queue where the number of simultaneous processes can
be controlled.
4.3.1 Guidelines
You should be judicious in using image-level accounting on your system.
Consider the following guidelines when using ACCOUNTING:
- Enable image-level accounting only when you plan to invoke
ACCOUNTING to process the information provided in the file
SYS$MANAGER:ACCOUNTNG.DAT.
- To obtain accounting information only on specific images, install
those images using the /ACCOUNTING qualifier with the INSTALL commands
ADD or MODIFY.
- Disable image-level accounting once you have collected enough data
for your purposes.
- While image activation data can be helpful in performance analysis,
it wastes processing time and disk storage if it is collected and never
used.
4.3.2 Enabling and Disabling Image-Level Accounting
You enable image-level record collection by issuing the DCL command SET
ACCOUNTING/ENABLE=IMAGE.
Disable image-level accounting by issuing the DCL command SET
ACCOUNTING/DISABLE=IMAGE.
Note
The collection of image-level accounting data consumes CPU cycles. The
collected records can consume a significant amount of disk space.
Remember to enable image-level accounting only for the period of time
needed for the report.
4.3.3 Generating a Report
A series of commands like the following generates output similar to
that shown in Example 4-1.
$ ACCOUNTING /TYPE=IMAGE /OUTPUT=BYNAM.LIS -
_$ /SUMMARY=IMAGE -
_$ /REPORT=(PROCESSOR,ELAPSED,DIRECT_IO,FAULTS,RECORDS)
$ SORT BYNAM.LIS BYNAM.ORD /KEY=(POS=16,SIZ=13,DESCEND)
.
.
.
(Edit BYNAM.ORD to relocate heading lines)
.
.
.
$ TYPE BYNAM.ORD
4.3.4 Collecting the Data
Example 4-1 assumes that image-level accounting records have been
collected previously.
Example 4-1 Image-Level Accounting Report
From: 8-MAY-1994 11:09(1)  To: 8-MAY-1994 17:31
Image name(2)  Processor(3)   Elapsed        Direct(4)  Page(5)  Total(6)
               Time           Time                 I/O   Faults   Records
-------------------------------------------------------------------------
EDT            0 00:34:21.34  0 15:51:34.78       5030   132583       390
DTR32          0 00:19:30.94  0 03:17:37.48       7981    83916        12
PASCAL         0 00:15:19.42  0 01:04:19.57      38473   143107        75
MAIL           0 00:10:40.88  1 02:54:02.89      26139   106854       380
LINK           0 00:05:44.41  0 00:23:54.54       7443    57092       111
RTPAD          0 00:04:58.40  0 20:49:19.24        668     8004        72
LOGINOUT       0 00:04:53.98  0 02:01:31.81       2809    67579       893
EMACS          0 00:04:30.40  0 05:25:01.37        420     8461         1
MACRO32        0 00:04:26.22  0 00:14:55.00       1014    34016        46
BLISS32        0 00:03:45.80  0 00:12:58.87         98    32797         8
DIRECTORY      0 00:03:26.20  0 01:22:34.47       1020    27329       275
FORTRAN        0 00:03:13.87  0 00:14:15.08       1157    28003        47
NOTES          0 00:01:39.90  0 02:06:01.95       8011     6272        32
DELETE         0 00:01:37.31  0 00:57:43.31        834    25516       332
TYPE           0 00:01:06.35  0 00:28:58.26        406    14457       173
COPY           0 00:00:57.08  0 00:11:11.40       2197     4943        42
SHOW           0 00:00:56.39  0 00:24:53.22         23    11505       166
ACC            0 00:00:54.43  0 00:03:41.46        132     2007         7
MONITOR        0 00:00:53.91  0 02:37:13.84        159     5649        40
CALENDAR       0 00:00:43.55  0 00:30:15.52       1023     3557        25
PHONE          0 00:00:40.56  0 00:54:59.39         24     1510        33
ERASE          0 00:00:37.88  0 00:03:51.04        105     9873       113
LIBRARIAN      0 00:00:35.58  0 00:03:37.98       1134    10297        62
FAL            0 00:00:34.27  0 00:20:56.63        110     4596       122
SDA            0 00:00:27.34  0 00:09:28.68         52     4797         3
SET            0 00:00:27.02  0 00:02:30.28        160     9447       206
NETSERVER      0 00:00:26.89  0 02:38:17.90        263    10164       407
CDU            0 00:00:24.32  0 00:01:57.67         13    21906        17
VMSHELP        0 00:00:12.83  0 00:05:40.96        121     1943        14
RENAME         0 00:00:09.56  0 00:00:57.44          6     3866        47
SDL            0 00:00:09.55  0 00:01:19.78         11     3158         4
SUBMIT         0 00:00:08.14  0 00:01:08.50          9     2991        28
NCP            0 00:00:07.30  0 00:02:26.20          7     1765        16
QUEMAN         0 00:00:06.44  0 00:01:38.75        201     1561        20
This example shows a report of system resource utilization for the
indicated period, summarized by a unique image name, in descending
order of CPU utilization. Only the top 34 CPU consumers are shown. (The
records could easily have been sorted differently.)
- (1) Timestamps
The From: and To: timestamps
show that the accounting period ran for about 6 1/2 hours. You can
specify the /SINCE and /BEFORE qualifiers to select any time period of
interest.
- (2) Image Name
Most image names are
programming languages and operating system utilities, indicating that
the report was probably generated in a program-development environment.
- (3) Processor Time
Data in this column shows
that no single image is by far the highest consumer of the CPU
resource. It is therefore unlikely that the installation would benefit
significantly by attempting to reduce CPU utilization by any one image.
- (4) Direct I/O
In the figures for direct I/O, the two top images are PASCAL and MAIL.
One way to compare them is by calculating I/O operations per second.
The total elapsed time spent running PASCAL is roughly 3860 seconds,
while the time spent running MAIL is a little under 96843 seconds
(several people used MAIL all afternoon). Calculated on a time basis,
MAIL caused roughly 1/4 to 1/3 of an I/O operation per second, whereas
PASCAL caused about 10 operations per second. Note that by the same
calculation, LINK caused about five I/O operations per second. It would
appear that a sequence of PASCAL/LINK commands contributes somewhat to
the overall I/O load. One possible approach would be to look at the RMS
buffer parameters set by the main PASCAL users. You can find out who
used PASCAL and LINK by entering a DCL command:
$ ACCOUNTING/TYPE=IMAGE/IMAGE=(PASCAL,LINK) -
_$ /SUMMARY=(IMAGE,USER)/REPORT=(ELAPSED,DIRECT)
This command selects image accounting records for the PASCAL and LINK
images by image name and user name, and requests Elapsed Time and
Direct I/O data. You can examine this data to determine whether the
users are employing RMS buffers of appropriate sizes. Two fairly large
buffers for sequential I/O, each approximately 64 blocks in size, are
recommended.
- (5) Page Faults
As with direct I/O, page
faults are best analyzed on a time basis. One technique is to compute
faults-per-10-seconds of processor time and compare the result with the
value of the SYSGEN parameter PFRATH. A little arithmetic shows that on
a time basis, PASCAL is incurring more than 1555 faults per 10 seconds.
Suppose that the value of PFRATH on this system is 120 (120 page faults
per 10 seconds of processor time), which is considered typical in most
environments. What can you conclude by comparing the two values?
Whenever a process's page fault rate exceeds the PFRATH value,
memory management attempts to increase the process working set, subject
to system management quotas, until the fault rate falls below PFRATH.
So, if an image's fault rate is persistently greater than PFRATH, it is
not obtaining all the memory it needs. Clearly, the PASCAL image is
causing many more faults per CPU second than would be considered normal
for this system. You should, therefore, make an effort to examine the
working set limits and working set adjustment policies for the PASCAL
users. To lower the PASCAL fault rate, the process working sets must be
increased---either by adjusting the appropriate UAF quotas directly or
by setting up a PASCAL batch queue with generous working set values.
- (6) Total Records
These figures represent the
count of activations for images run during the accounting period; in
other words, they show each image's relative popularity. You can use
this information to ensure that the most popular images are installed
(see Section 2.4). For customer applications, you might consider using
linking options such as /NOSYSSHR and reassigning PSECT attributes to
speed up activations (see the OpenVMS Linker Utility Manual). Note that the number
of LOGINOUT activations far exceeds that of all other images. This
situation could result from a variety of causes, including attempts to
breach security, an open terminal line, a runaway batch job, or a large
number of network operations. More ACCOUNTING commands would be
necessary to determine the exact cause. At this site, it turned out
that most of the activations were caused by an open terminal line. The
problem was detected by an astute system manager who checked the count
of LOGFAIL entries in the accounting file. You can also use
information in this field to examine the characteristics of the average
image activation. That knowledge would be useful if you wanted to
determine whether it would be worthwhile to set up a special batch
queue. For example, the average PASCAL image uses 51 seconds of
elapsed time and the average LINK uses 13 seconds. You can therefore
infer that the average PASCAL and LINK sequence takes about a minute.
This information could help you persuade users of those images to run
PASCAL and LINK in batch mode. If, on the other hand, the average time
was only 5 seconds, batch processing would probably not be worthwhile.
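The arithmetic behind the Direct I/O, Page Faults, and Total Records discussions can be reproduced with a short script. The following Python sketch copies the PASCAL, MAIL, and LINK rows from Example 4-1; the helper names are illustrative and are not part of any OpenVMS utility.

```python
def to_seconds(field):
    """Convert an ACCOUNTING time field 'D HH:MM:SS.CC' to seconds."""
    days, clock = field.split()
    hours, minutes, seconds = clock.split(":")
    return (int(days) * 86400 + int(hours) * 3600
            + int(minutes) * 60 + float(seconds))


# name: (processor time, elapsed time, direct I/O, page faults, records)
ROWS = {
    "PASCAL": ("0 00:15:19.42", "0 01:04:19.57", 38473, 143107, 75),
    "MAIL":   ("0 00:10:40.88", "1 02:54:02.89", 26139, 106854, 380),
    "LINK":   ("0 00:05:44.41", "0 00:23:54.54", 7443, 57092, 111),
}


def io_per_second(name):
    """Direct I/O operations per second of elapsed time."""
    _, elapsed, direct_io, _, _ = ROWS[name]
    return direct_io / to_seconds(elapsed)


def faults_per_10s_cpu(name):
    """Page faults per 10 seconds of processor time (compare with PFRATH)."""
    processor, _, _, faults, _ = ROWS[name]
    return faults / to_seconds(processor) * 10


def avg_elapsed_per_activation(name):
    """Average elapsed seconds per image activation (accounting record)."""
    _, elapsed, _, _, records = ROWS[name]
    return to_seconds(elapsed) / records


if __name__ == "__main__":
    for image in ROWS:
        print(image,
              round(io_per_second(image), 2),
              round(faults_per_10s_cpu(image)),
              round(avg_elapsed_per_activation(image)))
```

Comparing the faults-per-10-seconds figure with the PFRATH value used in the discussion (120) makes the gap for PASCAL easy to see.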