OpenVMS Performance Management
Order Number: AA-R237C-TE
April 2001
This manual is a conceptual guide for experienced users responsible for
optimizing performance on OpenVMS systems. For information about
OpenVMS performance on AlphaServer GS80/160/320 systems, see the
OpenVMS on AlphaServer GS-Series Systems Configuration and
Performance Guidelines, available at
http://www.openvms.compaq.com/gsseries/index.html.
Revision/Update Information:
This manual supersedes the OpenVMS Performance Management, OpenVMS Version 7.2.
Software Version:
OpenVMS Version 7.3
Compaq Computer Corporation Houston, Texas
© 2001 Compaq Computer Corporation
Compaq, VAX, VMS, and the Compaq logo Registered in U.S. Patent and
Trademark Office.
OpenVMS is a trademark of Compaq Information Technologies Group, L.P.
in the United States and other countries.
All other product names mentioned herein may be the trademarks of their
respective companies.
Confidential computer software. Valid license from Compaq required for
possession, use, or copying. Consistent with FAR 12.211 and 12.212,
Commercial Computer Software, Computer Software Documentation, and
Technical Data for Commercial Items are licensed to the U.S. Government
under vendor's standard commercial license.
Compaq shall not be liable for technical or editorial errors or
omissions contained herein.
The information in this document is provided "as is" without warranty
of any kind and is subject to change without notice. The warranties for
Compaq products are set forth in the express limited warranty
statements accompanying such products. Nothing herein should be
construed as constituting an additional warranty.
The following are trademarks of Compaq Computer Corporation: Alpha,
ACMS, DDIF, DECdirect, DECnet, HSC, and MicroVAX.
The following are third-party trademarks:
Motif and UNIX are trademarks of The Open Group in the United States
and other countries.
ZK6491
The Compaq OpenVMS documentation set is available on CD-ROM.
This document was prepared using DECdocument, Version V3.3-1e.
Preface
This manual presents techniques for evaluating, analyzing, and
optimizing performance on a system running OpenVMS. Discussions address
such wide-ranging concerns as:
- Understanding the relationship between work load and system capacity
- Learning to use performance-analysis tools
- Responding to complaints about performance degradation
- Helping the site adopt programming practices that result in the
best system performance
- Using the system features that distribute the work load for better
resource utilization
- Knowing when to apply software corrections to system behavior, that
is, tuning the system to allocate resources more effectively
- Evaluating the effectiveness of a tuning operation; knowing how to
recognize success and when to stop
- Evaluating the need for hardware upgrades
The manual includes detailed procedures to help you evaluate resource
utilization on your system and to diagnose and overcome performance
problems resulting from memory limitations, I/O limitations, CPU
limitations, human error, or combinations of these. The procedures
feature sequential tests that use OpenVMS tools to generate performance
data; the accompanying text explains how to evaluate it.
Whenever an investigation uncovers a situation that could benefit from
adjusting system values, those adjustments are described in detail, and
hints are provided to clarify the interrelationships of certain groups
of values. When such adjustments are not the appropriate or available
action, other options are defined and discussed.
Decision-tree diagrams summarize the step-by-step descriptions in the
text. These diagrams should also serve as useful reference tools for
subsequent investigations of system performance.
This manual does not describe methods for capacity planning, nor does
it attempt to provide details about using OpenVMS RMS features
(hereafter referred to as RMS). Refer to the Guide to OpenVMS File Applications for that
information. Likewise, the manual does not discuss DECnet for OpenVMS
performance issues, because the DECnet-Plus for OpenVMS Network
Management manual provides that information.
Intended Audience
This manual addresses system managers and other experienced users
responsible for maintaining a consistently high level of system
performance, for diagnosing problems on a routine basis, and for taking
appropriate remedial action.
Document Structure
This manual is divided into 13 chapters and 4 appendixes, each covering
a related group of performance management topics as follows:
- Chapter 1 provides a review of workload management concepts and
describes guidelines for evaluating user complaints about system
performance.
- Chapter 2 lists postinstallation operations for enhancing
performance and discusses performance investigation and tuning
strategies.
- Chapter 3 discusses OpenVMS memory management concepts.
- Chapter 4 explains how to use utilities and tools to collect and
analyze data on your system's hardware and software resources. Included
are suggestions for reallocating certain resources should analysis
indicate such a need.
- Chapter 5 outlines procedures for investigating performance
problems.
- Chapter 6 describes how to evaluate system resource
responsiveness.
- Chapter 7 describes how to evaluate the performance of the
memory resource and how to isolate specific memory resource limitations.
- Chapter 8 describes how to evaluate the performance of the disk
I/O resource and how to isolate specific disk I/O resource limitations.
- Chapter 9 describes how to evaluate the performance of the CPU
resource and how to isolate specific CPU resource limitations.
- Chapter 10 provides general recommendations for improving
performance with available resources.
- Chapter 11 provides specific recommendations for improving the
performance of the memory resource.
- Chapter 12 provides specific recommendations for improving the
performance of the disk I/O resource.
- Chapter 13 provides specific recommendations for improving the
performance of the CPU resource.
- Appendix A lists the decision trees used in the various
performance evaluations described in this manual.
- Appendix B summarizes the MONITOR data items you will find useful
in evaluating your system.
- Appendix C provides an example of a MONITOR multifile summary
report.
- Appendix D provides ODS-1 performance information.
Related Documents
For additional information on the topics covered in this manual, you
can refer to the following documents:
- OpenVMS System Manager's Manual
- Guide to OpenVMS File Applications
- OpenVMS System Management Utilities Reference Manual
- Guidelines for OpenVMS Cluster Configurations
- OpenVMS Cluster Systems
For additional information about Compaq OpenVMS products and
services, access the Compaq website at the following location:
http://www.openvms.compaq.com/
Reader's Comments
Compaq welcomes your comments on this manual. Please send comments to
either of the following addresses:
- Internet: openvmsdoc@compaq.com
- Mail: Compaq Computer Corporation, OSSG Documentation Group,
ZKO3-4/U08, 110 Spit Brook Rd., Nashua, NH 03062-2698
How To Order Additional Documentation
Use the following World Wide Web address to order additional
documentation:
http://www.openvms.compaq.com/
If you need help deciding which documentation best meets your needs,
call 800-282-6672.
Conventions
In this manual, every use of DECwindows and DECwindows Motif refers to
Compaq DECwindows Motif for OpenVMS software.
The following conventions are also used in this manual:
- Vertical ellipsis (three dots stacked vertically): A vertical
ellipsis indicates the omission of items from a code example or command
format; the items are omitted because they are not important to the
topic being discussed.
- ( ): In command format descriptions, parentheses indicate that you
must enclose choices in parentheses if you specify more than one.
- [ ]: In command format descriptions, brackets indicate optional
choices. You can choose one or more items or no items. Do not type the
brackets on the command line. However, you must include the brackets in
the syntax for OpenVMS directory specifications and for a substring
specification in an assignment statement.
- |: In command format descriptions, vertical bars separate choices
within brackets or braces. Within brackets, the choices are optional;
within braces, at least one choice is required. Do not type the
vertical bars on the command line.
- bold text: This typeface represents the introduction of a new term.
It also represents the name of an argument, an attribute, or a reason.
- italic text: Italic text indicates important information, complete
titles of manuals, or variables. Variables include information that
varies in system output (Internal error number), in command lines
(/PRODUCER=name), and in command parameters in text (where dd
represents the predefined code for the device type).
- UPPERCASE TEXT: Uppercase text indicates a command, the name of a
routine, the name of a file, or the abbreviation for a system privilege.
- Monospace text: Monospace type indicates code examples and
interactive screen displays. In the C programming language, monospace
type in text identifies the following elements: keywords, the names of
independently compiled external functions and files, syntax summaries,
and references to variables or identifiers introduced in an example.
- - (hyphen): A hyphen at the end of a command format description,
command line, or code line indicates that the command or statement
continues on the following line.
- numbers: All numbers in text are assumed to be decimal unless
otherwise noted. Nondecimal radixes (binary, octal, or hexadecimal) are
explicitly indicated.
Chapter 1 Performance Management
Managing system performance involves being able to evaluate and
coordinate system resources and workload demands.
A system resource is a hardware or software component
or subsystem that is responsible for data computation or storage and is
under the direct control of the operating system. The following
subsystems are system resources:
- CPU
- Memory
- Disk I/O
- Network I/O
- LAN I/O
- Internet I/O
- Cluster communication other than LAN (CI, FDDI, MC)
In addition to this manual, specific cluster information can be found
in Guidelines for OpenVMS Cluster Configurations and OpenVMS Cluster
Systems.
Performance management means optimizing your hardware
and software resources for the current work load. This involves
performing the following tasks:
- Acquiring a thorough knowledge of your work load and an
understanding of how that work load exercises the system's resources
- Monitoring system behavior on a routine basis in order to determine
when and why a given resource is nearing capacity
- Investigating reports of degraded performance from users
- Planning for changes in the system work load or hardware
configuration and being prepared to make any necessary adjustments to
system values
- Performing certain optional system management operations after
installation
To help you understand the scope and interrelationship of these issues,
this chapter covers the following topics:
- A review of workload management concepts
- Guidelines for developing a performance management strategy
Because many different networking options are available, network I/O is
not formally covered in this manual. General performance concepts
discussed here apply to networking, and networking should be considered
within the scope of analyzing any system performance problem. You
should consult the documentation available for the specific products
that you have installed for specific guidelines concerning
configuration, monitoring, and diagnosis of a networking product.
Similarly, database products are extremely complex and perform much of
their own internal management. The settings of parameters external to
OpenVMS may have a profound effect upon how efficiently OpenVMS is
used. Thus, reviewing server-application-specific material is a must if
you are to understand and resolve a related performance issue
efficiently.
1.1 System Performance Management Guidelines
Even if you are familiar with basic concepts discussed in this section,
there are some details discussed that are specific to this process, so
please read the entire section.
1.1.1 The Performance Management Process
Long-term measurement and observation of your system are key to
understanding how well it is working and are invaluable in identifying
potential performance problems before they become so serious that the
system grinds to a halt and your business suffers. Thus,
performance management should be a routine process of monitoring and
measuring your systems to assure good operation through deliberate
planning and resource management.
Waiting until a problem cripples a system before addressing system
performance is not performance management; it is crisis
management. Performance management involves:
- Systematically measuring the system
- Gathering and analyzing the data
- Evaluating trends
- Archiving data to maintain a performance history
You will often observe trends and thus be able to address performance
issues before they become serious and adversely affect your business
operations. Should an unforeseen problem occur, your historical data
will likely prove invaluable for pinpointing the cause and rapidly and
efficiently resolving the problem. Without past data from your formerly
well-running system, you may have no basis upon which to judge the
value of the metrics you can collect on your currently poorly running
system. Without historical data you are guessing; resolution will take
much longer and cost far more.
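The value of archived data can be illustrated with a minimal sketch. It assumes you have already exported a metric (here, CPU-busy percentages) from your performance history into a list of numbers. The function name, the sample values, and the three-standard-deviation threshold are all hypothetical choices for illustration; they are not part of any OpenVMS tool.

```python
from statistics import mean, stdev


def deviates_from_baseline(history, current, threshold=3.0):
    """Return True if `current` lies more than `threshold` standard
    deviations from the mean of the archived `history` samples.

    The threshold of 3.0 is an illustrative assumption, not a
    recommended value.
    """
    if len(history) < 2:
        raise ValueError("need at least two archived samples")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All archived samples are identical; any change is a deviation.
        return current != mu
    return abs(current - mu) / sigma > threshold


# Hypothetical CPU-busy percentages archived while the system ran well.
baseline_cpu_busy = [41.0, 39.5, 43.2, 40.8, 42.1, 38.9, 41.7]

if __name__ == "__main__":
    for sample in (42.0, 85.0):
        flag = deviates_from_baseline(baseline_cpu_busy, sample)
        print(f"CPU busy {sample}%: outside baseline = {flag}")
```

Without the archived list there is nothing to compare a new sample against, which is the situation the preceding paragraph warns about.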
Upgrades and Reconfigurations
Some systems are so heavily loaded that the resource cost of additional
functionality in new software can push them beyond the maximum
load they were intended to handle, resulting in
unacceptable response times and throughput. If your system is running
near its limit now during peak workload periods, you want to ensure
that you take the steps necessary to avoid pushing your system beyond
its limits when you cannot afford it.
If your system is not a finely tuned, well-running machine, you are
advised to use caution when considering changes to anything. Your
system is already being pushed to, or beyond, its original designed
capacity if you have observed users complaining about:
- Slow response times
- Erratic system behavior
- Unexplained system pauses, hangs, or crashes
If this is the case, you need a performance audit to determine your
current workload and the resources necessary to adequately support your
current and possibly future workloads. Implementing changes not
specifically designed to increase such a system's capacity or reduce
its workload can degrade performance further. Thus, investing in a
performance audit will pay off by delivering a more reliable,
productive, available, and lower-maintenance system.
Many factors involved in upgrades and reconfigurations contribute to
increased resource consumption. Future workloads your system will be
asked to support may be unforeseeable due to changes in the system,
workload, and business.
Blind reconfiguration without measurement, analysis, modification, and
contingency plans can result in serious problems. Significant increases
in CPU, disk, memory, and LAN utilization demand serious consideration,
measurement, and planning for additional workload and upgrades.
1.1.2 Conducting a Performance Audit
The goals of a performance audit are to:
- Evaluate whether your systems are viable candidates for proposed
changes.
- Identify modifications that must be made.
- Ensure that planned and implemented changes deliver expected
results.
A proper performance audit will:
- Characterize CPU, disk, memory, and LAN utilization on the systems
under consideration before reconfiguration.
- Measure system activity after an installation.
Without scientific measurement before installation and modification, as
well as after, you will not acquire the data necessary to understand,
plan for, and resolve potential problems in the immediate as well as
distant future. Keep the following in mind:
- Measure, plan, understand, test, and confirm. To understand how
system workloads vary, you should perform measurements for one week, if
not longer, before installing your network.
- Take into account that workloads follow business cycles which vary
predictably throughout the day, the week, the month, and the year.
These variations may be affected by financial and legal deadlines as
well as seasonal factors such as holidays and other cyclic activity.
- Seek to identify periods of peak heavy loads
(relatively long periods of heavy load lasting approximately five or
more minutes). Understanding their frequency and the factors affecting
them is key to successful system planning and management.
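One way to locate peak heavy loads in collected data is sketched below. It assumes utilization samples taken at a fixed interval and treats each sample as covering one full interval. The function name, the 90 percent threshold, and the sample values are hypothetical; only the five-minute minimum duration comes from the guideline above.

```python
def find_sustained_peaks(samples, interval_seconds, threshold,
                         min_seconds=300):
    """Return (start_index, end_index) pairs, end inclusive, for runs of
    consecutive samples at or above `threshold` that last at least
    `min_seconds` (default: five minutes).

    Each sample is assumed to represent `interval_seconds` of activity.
    """
    peaks = []
    run_start = None
    for i, value in enumerate(samples):
        if value >= threshold:
            if run_start is None:
                run_start = i  # a new heavy-load run begins here
        elif run_start is not None:
            # The run ended at the previous sample; keep it if long enough.
            if (i - run_start) * interval_seconds >= min_seconds:
                peaks.append((run_start, i - 1))
            run_start = None
    # Handle a run that is still in progress at the end of the data.
    if run_start is not None:
        if (len(samples) - run_start) * interval_seconds >= min_seconds:
            peaks.append((run_start, len(samples) - 1))
    return peaks


# Hypothetical CPU utilization percentages sampled once per minute.
cpu_samples = [50, 95, 96, 97, 98, 99, 40, 95, 96, 30,
               92, 93, 94, 95, 96, 97]

if __name__ == "__main__":
    print(find_sustained_peaks(cpu_samples, 60, 90))
```

Short spikes, such as the two-minute burst in the middle of the sample data, are ignored; only runs of five minutes or more are reported.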
Peak Workloads and the Cyclic Nature of Workloads
You must first identify periods of activity during which you cannot
afford to have system performance degrade and then measure overall
system activity during these periods.
These periods will vary from system to system and from minute to
minute, hour to hour, day to day, week to week, and month to month.
Holidays and other such periods are often significant factors and
should be considered.
These periods depend upon the business cycles that the system is
supporting.
If the periods you have identified as critical cannot be measured at
this time, then measurements taken in the immediate future will have to
be used as the basis for estimates of the activity during those
periods. In such cases you will have to take measurements in the near
term and make estimates based on the data you collect in combination
with other data such as order rates from the previous calendar month or
year, as well as projections or forecasts. But factors other than the
CPU may become a bottleneck and slow down the system. For example, a
fixed number of assistants can only process so many orders per hour,
regardless of the number of callers waiting on the phone for service.
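The estimating approach described above can be sketched as simple arithmetic. The sketch assumes that utilization scales linearly with the order rate, which is a simplification that real workloads may not follow. It also models the non-CPU bottleneck from the example as a cap on the order rate. The function and parameter names and all figures are hypothetical.

```python
def estimate_peak_utilization(measured_utilization,
                              measured_orders_per_hour,
                              forecast_orders_per_hour,
                              max_orders_per_hour=None):
    """Estimate utilization (percent) during a critical period that
    cannot be measured directly.

    Assumes utilization is proportional to the order rate. If
    `max_orders_per_hour` is given, it models a limit outside the
    system (for example, staff capacity) that caps the order rate.
    """
    if measured_orders_per_hour <= 0:
        raise ValueError("measured order rate must be positive")
    effective_rate = forecast_orders_per_hour
    if max_orders_per_hour is not None:
        effective_rate = min(effective_rate, max_orders_per_hour)
    return measured_utilization * effective_rate / measured_orders_per_hour


if __name__ == "__main__":
    # Measured now: 30% CPU busy at 200 orders per hour.
    # Forecast for the critical period: 500 orders per hour.
    print(estimate_peak_utilization(30.0, 200, 500))
    # Same forecast, but staff can process at most 400 orders per hour.
    print(estimate_peak_utilization(30.0, 200, 500, max_orders_per_hour=400))
```

An estimate near or above 100 percent suggests that the critical period deserves direct measurement as soon as it can be observed.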
1.2 Strategies and Procedures
This manual describes several strategies and procedures for evaluating
performance, evaluating system resources, and diagnosing resource
limitations as shown in the following list:
- Develop workload strategy (Chapter 1)
  - Managing the work load
  - Distributing the work load
  - Sharing application code
- Develop tuning strategy (Chapter 2)
  - Automatic Working Set Adjustment (AWSA)
  - AUTOGEN
  - Active memory management
- Perform general system resource evaluation (Chapter 4)
  - CPU resource
  - Memory resource
  - Disk I/O resource
- Conduct a preliminary investigation of specific resource
limitations (Chapter 5)
  - Isolating memory resource limitations
  - Isolating disk I/O resource limitations
  - Isolating CPU resource limitations
- Review techniques for improving system resource responsiveness
(Chapter 6)
  - Providing equitable sharing of resources
  - Reducing resource consumption
  - Ensuring load balancing
  - Initiating offloading
- Apply specific remedy to compensate for resource limitations
(Chapter 10)
  - Compensating for memory-limited behavior
  - Compensating for I/O-limited behavior
  - Compensating for CPU-limited behavior
1.3 System Manager's Role
As a system manager, you must be able to do the following:
- Assume the responsibility for understanding the system's work load
sufficiently to be able to recognize normal and abnormal behavior.
- Predict the effects of changes in applications, operations, or
usage.
- Recognize typical throughput rates.
- Evaluate system performance.
- Perform tuning as needed.
1.3.1 Prerequisites
Before you adjust any system parameters, you should:
- Be familiar with system tools and utilities.
- Know your work load.
- Develop a strategy for evaluating performance.
1.3.2 System Utilities and Tools
You can observe system operation using the following tools:
- Accounting utility (ACCOUNTING)
- Audit Analysis utility (ANALYZE/AUDIT)
- Authorize utility (AUTHORIZE)
- AUTOGEN command procedure
- DCL SHOW commands
- DECamds (Compaq Availability Manager)
- DECevent utility (Alpha only)
- Error Log utility (ANALYZE/ERROR_LOG)
- Monitor utility (MONITOR)
On Alpha platforms, Compaq recommends using the DECevent utility
instead of the Error Log utility, ANALYZE/ERROR_LOG. (You invoke the
DECevent utility with the DCL command, DIAGNOSE.) You can use
ANALYZE/ERROR_LOG on Alpha systems, but the DECevent utility provides
more comprehensive reports.