HP OpenVMS Systems Documentation
OpenVMS Performance Management
Chapter 9
If you suspect... | Then... |
---|---|
The lock waiting is caused by file sharing | Attempt to reduce the level of sharing. |
The lock waiting results from user or third-party application locks | Attempt to influence the redesign of such applications. |
A high amount of locking activity in an SMP environment | Assign a CPU to perform dedicated lock management. |
Processes can also enter the LEF state or the other voluntary wait
states (common event flag wait [CEF], hibernate [HIB], and suspended
[SUSP]) when system services are used to synchronize applications. Such
processes have temporarily abdicated use of the CPU; they do not
indicate problems with other resources.
9.1.5.2 Involuntary Wait States
Involuntary wait states are not requested by processes but are invoked by the system to achieve process synchronization in certain circumstances:
The presence of processes in the MWAIT state indicates that there might be a shortage of a systemwide resource (usually page or swapping file capacity) and that the shortage is blocking these processes from the CPU.
If you see processes in this state, do the following:
$ MONITOR /INPUT=SYS$MONITOR:file-spec /VIEWING_TIME=1 PROCESSES
The most common types of resource waits are those signifying depletion of the page and swapping files as shown in the following table:
State | Description |
---|---|
RWSWP | Indicates a swapping file of deficient size. |
RWMBP, RWMPE, RWPGF | Indicates a paging file that is too small. |
RWAST | Indicates that the process is waiting for a resource whose availability will be signaled by delivery of an asynchronous system trap (AST). In most instances, either an I/O operation is outstanding (incomplete), or a process quota has been exhausted. |
You can determine paging and swapping file sizes and the amount of available space they contain by entering the SHOW MEMORY/FILES/FULL command.
The AUTOGEN feedback report provides detailed information about paging
and swapping file use. AUTOGEN uses the data in the feedback report to
resize or to recommend resizing the paging and swapping files.
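The resource wait table above can be kept as a small lookup when you review MONITOR PROCESSES output by hand. The following is a minimal sketch only: the helper names `explain_wait_state` and `summarize` are hypothetical, and the input is assumed to be a list of state names that you have already copied from the display.

```python
# Likely causes of common resource wait (RWxxx) states, as listed in the table above.
RESOURCE_WAIT_HINTS = {
    "RWSWP": "swapping file of deficient size",
    "RWMBP": "paging file that is too small",
    "RWMPE": "paging file that is too small",
    "RWPGF": "paging file that is too small",
    "RWAST": "waiting for AST delivery; usually an outstanding I/O "
             "or an exhausted process quota",
}


def explain_wait_state(state):
    """Return the likely cause for one resource wait state (hypothetical helper)."""
    key = state.strip().upper()
    return RESOURCE_WAIT_HINTS.get(key, "not covered by the table; investigate separately")


def summarize(states):
    """Count the observed wait states and pair each count with its likely cause."""
    counts = {}
    for state in states:
        key = state.strip().upper()
        counts[key] = counts.get(key, 0) + 1
    return {key: (n, explain_wait_state(key)) for key, n in counts.items()}


if __name__ == "__main__":
    # Illustrative input only; these are not measurements from a real system.
    for key, (n, cause) in summarize(["RWAST", "rwast", "RWPGF"]).items():
        print(key, n, cause)
```

States that the table does not list fall through to a generic message rather than a guess.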
9.2 Detecting CPU Limitations
The surest way to determine whether a CPU limitation could be degrading performance is to check for a state queue with the MONITOR STATES command. See Figure A-16. If any processes appear to be in the COM or COMO state, a CPU limitation may be at work. However, if no processes are in the COM or COMO state, you need not investigate the CPU limitation any further.
If processes are in the COM or COMO state, they are being denied access to the CPU. One or more of the following conditions is occurring:
If you suspect the system is performing suboptimally because processes are blocked by a process running at higher priority, do the following:
If you find that this condition exists, the remedy is to adjust the
process priorities. See Section 13.3 for a discussion of how to change
the process priorities assigned in the UAF, define priorities in the
login command procedure, or change the priorities of processes while
they execute.
9.2.2 Time Slicing Between Processes
Once you rule out the possibility of preemption by higher priority processes, you need to determine whether there is a serious problem with time slicing between processes at the same priority. Using the list of top CPU users, compare the priorities and assess how many processes are operating at the same priority. If you conclude that the priorities are inappropriate, refer to Section 13.3.
However, if you decide that the priorities are correct and will not
benefit from such adjustments, you are confronted with a situation that
will not respond to any form of system tuning. Again, the only
appropriate solution here is to adjust the work load to decrease the
demand or add CPU capacity (see Section 13.7).
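The comparison described in this section amounts to counting how many of the top CPU users share each priority. The following is a minimal sketch, assuming you have already collected (process name, priority) pairs by hand; the function name `crowded_priorities` and the `min_sharing` parameter are hypothetical.

```python
from collections import Counter


def crowded_priorities(processes, min_sharing=2):
    """Return {priority: count} for priorities shared by at least min_sharing processes.

    processes is an iterable of (name, priority) pairs taken from the list
    of top CPU users.
    """
    counts = Counter(priority for _name, priority in processes)
    return {prio: n for prio, n in counts.items() if n >= min_sharing}


if __name__ == "__main__":
    # Illustrative process names and priorities only.
    top_users = [("BATCH_1", 4), ("BATCH_2", 4), ("EDITOR", 6)]
    print(crowded_priorities(top_users))
```

A priority with many compute-bound processes is where time slicing is most likely to be felt; whether that count is a problem remains a judgment call.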
9.2.3 Excessive Interrupt State Activity
If you discover that blocking is not due to contention with other processes at the same or higher priorities, you need to find out if there is too much activity in interrupt state. In other words, is the rate of interrupts so excessive that it is preventing processes from using the CPU?
You can determine how much time is spent in interrupt state from the MONITOR MODES display. A percentage of time in interrupt state less than 10 percent is moderate; 20 percent or more is excessive. (The higher the percentage, the more effort you should dedicate to solving this resource drain.)
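The thresholds above can be written down as a simple classification. This is a sketch only: the function name is hypothetical, and the label "borderline" for the band between 10 and 20 percent is an assumption, because the text does not name that range.

```python
def classify_interrupt_time(percent):
    """Classify the percentage of CPU time spent in interrupt state."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    if percent >= 20.0:
        return "excessive"   # 20 percent or more
    if percent < 10.0:
        return "moderate"    # less than 10 percent
    return "borderline"      # assumption: the text does not label 10 to 20 percent


if __name__ == "__main__":
    for sample in (4.0, 12.5, 31.0):
        print(sample, classify_interrupt_time(sample))
```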
If the interrupt time is excessive, you need to explore which devices cause significant numbers of interrupts on your system and how you might reduce the interrupt rate.
The decisions you make will depend on the source of heavy interrupts.
Perhaps they are due to communications devices or special hardware used
in real-time applications. Whatever the source, you need to find ways
to reduce the number of interrupts so that the CPU can handle work from
other processes. Otherwise, the solution may require you to adjust the
work load or acquire CPU capacity (see Section 13.7).
9.2.4 Disguised Memory Limitation
Once you have either ruled out or resolved a CPU limitation, you need
to determine which other resource limitation produces the block. Your
next check should be for the amount of idle time: use the MONITOR MODES
command (see Figure A-17). If there is any idle time, another
resource is the problem and you may be able to tune for a solution. If
you reexamine the MONITOR STATES display, you will likely observe a
number of processes in the COMO state. You can conclude that this
condition reflects a memory limitation, not a CPU limitation. Follow
the procedures described in Chapter 7 to find the cause of the
blockage, and then take the corrective action recommended in
Chapter 10.
9.2.5 Operating System Overhead
If the MONITOR MODES display indicates that there is no idle time, your CPU is 100 percent busy, and you will find processes in the COM state on the MONITOR STATES display. One question remains: is the CPU being used for real work or for nonessential operating system functions? If operating system overhead is the cause, you may be able to reduce it.
Analyze the MONITOR MODES display carefully. If your system exhibits excessive kernel mode activity, it is possible that the operating system is incurring overhead in the areas of memory management, I/O handling, or scheduling. Investigate the memory limitation and I/O limitation (Chapters 7 and 8), if you have not already done so.
Once you rule out the possibility of improving memory management or I/O
handling, the problem of excessive kernel mode activity might be due to
scheduling overhead. However, you can do practically nothing to tune
the scheduling function. There is only one case that might respond to
tuning. The clock-based rescheduling that can occur at quantum end is
costlier than the typical rescheduling that is event driven by process
state. Explore whether the value of the system parameter QUANTUM is too
low and can be increased to bring about a performance improvement by
reducing the frequency of this clock-based rescheduling (see
Section 13.4). If not, your only other recourse is to adjust the work
load or acquire CPU capacity (see Section 13.7).
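To see why a larger QUANTUM reduces clock-based rescheduling, it can help to work out an upper bound on quantum-end events. The sketch below assumes that QUANTUM is expressed in 10-millisecond units and that a compute-bound process always runs to quantum end; both are assumptions to confirm against the system parameter documentation. The function name and the sample values are illustrative only.

```python
def max_quantum_ends_per_second(quantum, tick_ms=10):
    """Upper bound on quantum-end events per CPU per second.

    quantum is the QUANTUM parameter value; tick_ms is the assumed size
    of one QUANTUM unit in milliseconds.
    """
    if quantum <= 0 or tick_ms <= 0:
        raise ValueError("quantum and tick_ms must be positive")
    return 1000.0 / (quantum * tick_ms)


if __name__ == "__main__":
    # Doubling the quantum halves the bound on clock-based rescheduling.
    for q in (20, 40):
        print(q, max_quantum_ends_per_second(q))
```

The figure is only a ceiling: processes that wait voluntarily before quantum end produce fewer clock-based reschedules than the bound suggests.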
9.2.6 RMS Misused
If the MONITOR MODES display indicates that a great deal of time is
spent in executive mode, it is possible that RMS is being misused. If
you suspect this problem, proceed to the steps described in
Section 8.3.3 for RMS-induced I/O limitations, making any changes that
seem indicated. You should also consult the Guide to OpenVMS File Applications.
9.2.7 CPU at Full Capacity
If at this point in your investigation the MONITOR MODES display
indicates that most of the time is spent in supervisor mode or user
mode, you are confronted with a situation where the CPU is performing
real work and the demand exceeds the capacity. You must either make
adjustments in the work load to reduce demand (by more efficient coding
of applications, for example) or you must add CPU capacity (see
Section 13.7).
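The checks in Sections 9.2.3 through 9.2.7 can be summarized as an ordered decision procedure. The following is a rough sketch, not a definitive diagnostic. The names `ModesSample` and `next_step` are hypothetical, and `kernel_limit` and `exec_limit` are placeholder thresholds, because the text says only "excessive" and "a great deal" without giving numbers. The 20 percent interrupt threshold comes from Section 9.2.3. The priority checks of Sections 9.2.1 and 9.2.2 need per-process data and are not modeled here.

```python
from dataclasses import dataclass


@dataclass
class ModesSample:
    """Percentages of CPU time per mode, as read from a MONITOR MODES display."""
    interrupt: float
    kernel: float
    executive: float
    supervisor: float
    user: float
    idle: float


def next_step(sample, com_or_como, kernel_limit=30.0, exec_limit=30.0):
    """Suggest which section to consult next.

    com_or_como is the number of processes in the COM or COMO state.
    kernel_limit and exec_limit are assumed thresholds, not documented values.
    """
    if com_or_como == 0:
        return "No CPU limitation: no processes in COM or COMO"
    if sample.interrupt >= 20.0:
        return "9.2.3: excessive interrupt state activity"
    if sample.idle > 0.0:
        return "9.2.4: idle time present; suspect a disguised memory limitation"
    if sample.kernel >= kernel_limit:
        return "9.2.5: investigate operating system overhead"
    if sample.executive >= exec_limit:
        return "9.2.6: investigate RMS usage"
    return "9.2.7: CPU at full capacity"


if __name__ == "__main__":
    # Illustrative percentages only.
    busy = ModesSample(interrupt=5, kernel=10, executive=5,
                       supervisor=0, user=80, idle=0)
    print(next_step(busy, com_or_como=3))
```

The order of the tests mirrors the order of the sections, so an earlier finding takes precedence over a later one.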
9.3 MONITOR Statistics for the CPU Resource
Use the following MONITOR commands to obtain the appropriate statistic:
Command | Statistic |
---|---|
Compute Queue | |
STATES | Number of processes in compute (COM) and compute outswapped (COMO) scheduling states |
Estimating CPU Capacity | |
STATES | All items |
MODES | Idle time |
Voluntary Wait States | |
STATES | Number of processes in local event flag wait (LEF), common event flag wait (CEF), hibernate (HIB), and suspended (SUSP) states |
LOCK | ENQs Forced to Wait Rate |
MODES | MP Synchronization |
Involuntary Wait States | |
STATES | Number of processes in miscellaneous resource wait (MWAIT) state |
PROCESSES | Types of resource waits (RWxxx) |
Reducing CPU Consumption | |
MODES | All items |
Interrupt State | |
IO | Direct I/O Rate, Buffered I/O Rate, Page Read I/O Rate, Page Write I/O Rate |
DLOCK | All items |
SCS | All items |
MP Synchronization Mode | |
MODES | MP Synchronization |
IO | Direct I/O Rate, Buffered I/O Rate |
DLOCK | All items |
PAGE | All items |
DISK | Operation Rate |
Kernel Mode | |
MODES | Kernel mode |
IO | Page Fault Rate, Inswap Rate, Logical Name Translation Rate |
LOCK | New ENQ Rate, Converted ENQ Rate, DEQ Rate |
FCB | All items |
PAGE | Demand Zero Fault Rate, Global Valid Fault Rate, Page Read I/O |
DECNET | Sum of packet rates |
CPU Load Balancing | |
MODES | Time spent by processors in each mode |
See Table B-1 for a summary of MONITOR data items.