OpenVMS System Manager's Manual
20.8.4 Using Live Recording Monitoring
Use live recording to capture MONITOR data for future use. Possible
uses include the following:
- Installation checkout, tuning, and troubleshooting; that is, all the
uses that are listed for live display monitoring. Choose recording
over display when you want to capture more classes than you can
reasonably watch at a terminal, when a terminal is not available, or
when you want to gather data about the system but cannot spend time at
the terminal until later.
- Routine performance data gathering for long-term analysis. You
can record MONITOR data on a routine basis and summarize it to gather
data about system resource use over long periods of time.
Caution
Because data is continuously added to the recording file, be careful
that the file does not grow too large.
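How quickly a recording file grows depends largely on how often MONITOR
collects data, so a rough count of collection intervals per day can help
with planning. The following Python sketch is an illustration only: it
counts intervals rather than bytes, because record sizes vary by class
and are not stated here. The function name intervals_per_day is
hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def intervals_per_day(interval_seconds: int) -> int:
    """Return how many whole collection intervals fit in 24 hours."""
    if interval_seconds <= 0:
        raise ValueError("interval must be a positive number of seconds")
    return SECONDS_PER_DAY // interval_seconds


# Interval values taken from the examples in this chapter.
for interval in (3, 60, 300):
    print(f"/INTERVAL={interval}: {intervals_per_day(interval)} collections per day")
```

Under these assumptions, a 3-second interval produces 100 times as many
collections per day as a 300-second interval, which is why routine
long-term recording generally uses a large /INTERVAL value.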
The following example shows how to use the live recording mode of
operation.
Example
$ MONITOR/NODE=(LARRY,MOE)/NODISPLAY/RECORD MODES+STATES
The command in this example records data on the time spent in each
processor mode and on the number of processes in each scheduler state
for nodes LARRY and MOE. The command does not display this information.
20.8.5 Using Concurrent Display and Recording Monitoring
Use the concurrent display and recording mode of operation when you
want to both retain performance data and watch as it is being
collected. Because MONITOR allows shared read access to the recording
file, a separate display process can play back the recording file as it
is being written by another process.
The following examples show how to use the concurrent display and
recording mode of operation. The first example both displays and
records data in the same command. The second and third examples show
how you can perform concurrent recording and display using two separate
processes: the process in the second example performs recording; the
process in the third example plays back the file to obtain a summary.
Examples
-
$ MONITOR/RECORD FCP/AVERAGE,FILE_SYSTEM_CACHE/MINIMUM
This command collects and records file system data and file system
cache data every 3 seconds. It also displays, in bar graph form,
average FCP statistics and minimum FILE_SYSTEM_CACHE statistics. The
display alternates between the two graphs every 3 seconds. You can
obtain current statistics in a subsequent playback request.
-
$ MONITOR/RECORD=SYS$MANAGER:ARCHIVE.DAT -
_$ /INTERVAL=300/NODISPLAY ALL_CLASSES
This command archives data for all classes once every 5 minutes.
You might find it convenient to execute a similar command in a batch
job, taking care to monitor disk space usage.
-
$ MONITOR/INPUT=SYS$MANAGER:ARCHIVE.DAT -
_$ /NODISPLAY/SUMMARY/BEGINNING="-1" PAGE,IO
The command in this example produces a summary of page and I/O
activity that occurred during the previous hour, perhaps as part of an
investigation of a reported performance problem. Note that because the
recording process executes an OpenVMS RMS flush operation every 5
minutes, up to 5 minutes of the most recently collected data is not
available to the display process. You can specify the time between
flush operations explicitly with the /FLUSH_INTERVAL qualifier. Note
also that the display process must have read access to the recording
file.
20.8.6 Using Playback Monitoring
Use playback of a recording file to obtain terminal output and summary
reports of all collected data or a subset of it. You can make a subset
of data according to class, node, or time segment. For example, if you
collect several classes of data for an entire day, you can examine or
summarize the data on one or more classes during any time period in
that day.
You can also display or summarize data with a different interval than
the one at which it was recorded. You control the actual amount of time
between displays of screen images with the /VIEWING_TIME qualifier. The
following examples show how to use the playback mode of operation.
Examples
-
$ MONITOR/RECORD/INTERVAL=5 IO
.
.
.
$ MONITOR/INPUT IO
The commands in this example produce system I/O statistics. The
first command gathers and displays data every 5 seconds, beginning when
you enter the command and ending when you press Ctrl/Z. In addition,
the first command records binary data in the default output file
MONITOR.DAT. The second command plays back the I/O statistics
display, using the data in MONITOR.DAT for input. The default viewing
time for the playback is 3 seconds, but each screen display represents
5 seconds of monitored I/O statistics.
-
$ MONITOR/RECORD/NODISPLAY -
_$ /BEGINNING=08:00:00 -
_$ /ENDING=16:00:00 -
_$ /INTERVAL=120 DISK
.
.
.
$ MONITOR/INPUT/DISPLAY=HOURLY.LOG/INTERVAL=3600 DISK
The sequence of commands in this example illustrates data recording
with a relatively small interval and data playback with a relatively
large interval. This is useful for producing average, minimum, and
maximum statistics that cover a wide range of time, but have greater
precision than if they had been gathered using the larger interval.
The first command records data on I/O operations for all disks on
the system for the indicated 8-hour period, using an interval of 2
minutes. The second command plays the data back with an interval of 1
hour, storing display output in the file HOURLY.LOG. You can then type
or print this file to show the cumulative average disk use at each hour
throughout the 8-hour period.
Note
The current statistic in HOURLY.LOG shows the current data in terms of
the original collection interval of 120 seconds, not the new collection
interval of 3600 seconds.
-
$ MONITOR/INPUT/NODISPLAY/SUMMARY=DAILY.LOG DISK
The command in this example uses the recording file created in the
previous example to produce a one-page summary report file showing
statistics for the indicated 8-hour period. The summary report has the
same format as a screen display. For example:
                          OpenVMS Monitor Utility
                            DISK I/O STATISTICS
 on node TLC                                    From: 25-JAN-2000 08:00:00
 SUMMARY                                        To:   25-JAN-2000 16:00:00

I/O Operation Rate                       CUR       AVE       MIN       MAX

DSA0:              SYSTEM_0             0.53      1.50      0.40      3.88
DSA1:              SYSTEM_1             0.00      0.39      0.00      8.38
DSA4:              WORK_0               0.00      0.11      0.00      1.29
DSA5:              WORK_1               0.03      0.87      0.00      5.95
DSA6:              WORK_2               0.03      0.25      0.00      2.69
DSA7:              WORK_3               0.04      0.97      0.00     20.33
DSA17:             TOM_DISK             0.00      0.04      0.00      0.80
DSA23:             MKC                  0.00      0.00      0.00      0.13
$4$DUA0: (RABBIT)  SYSTEM_0             0.20      0.65      0.17      1.97
$4$DUA2: (RABBIT)  SYSTEM_0             0.20      0.65      0.17      1.97
$4$DUA3: (RABBIT)  SYSTEM_1             0.00      0.14      0.00      2.49

                      PLAYBACK                SUMMARIZING
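The interval arithmetic behind the DISK recording and playback example
above can be worked through as follows. This Python sketch is an
idealized illustration: it ignores partial intervals at the start and
end of the recording period, and the function name interval_counts is
hypothetical.

```python
from datetime import datetime


def interval_counts(begin: str, end: str, record_interval: int, playback_interval: int):
    """Return (recorded intervals, playback displays, recorded intervals per display)."""
    fmt = "%H:%M:%S"
    span = (datetime.strptime(end, fmt) - datetime.strptime(begin, fmt)).total_seconds()
    recorded = int(span // record_interval)
    displays = int(span // playback_interval)
    per_display = playback_interval // record_interval
    return recorded, displays, per_display


# Values from the example: /BEGINNING=08:00:00, /ENDING=16:00:00,
# recorded with /INTERVAL=120 and played back with /INTERVAL=3600.
recorded, displays, per_display = interval_counts("08:00:00", "16:00:00", 120, 3600)
print(recorded, displays, per_display)  # 240 8 30
```

In other words, each of the 8 hourly displays in HOURLY.LOG draws on
about 30 recorded intervals, which is why the average, minimum, and
maximum values are more precise than they would be if the data had been
collected only once an hour.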
20.8.7 Using Remote Playback Monitoring
If suitably privileged, you can collect MONITOR data from any system to
which your system has a DECnet connection. You can then display the
data live on your local system. To do so, follow these steps:
- In the default DECnet directory on each remote system, create a
file named MONITOR.COM, similar to the following example:
$ !
$ ! * Enable MONITOR remote playback *
$ !
$ MONITOR /NODISPLAY/RECORD=SYS$NET ALL_CLASSES
- On your local system, define a logical name for the remote system
from which you want to collect data. Use the following syntax:
DEFINE remotenodename_mon node::task=monitor
You might want to define, in a login command procedure, a series of
logical names for all the systems you want to access.
- To display the remote MONITOR data as it is being collected, enter
a command using the following syntax:
MONITOR/INPUT=remotenodename_mon classnames
You can also place MONITOR.COM files in directories other than the
default DECnet directory and use access control strings or proxy
accounts to invoke these command files remotely.
When you invoke MONITOR on your local system, a process is created on
the remote system that executes the MONITOR.COM command file. The
remote system therefore experiences some associated CPU and DECnet
overhead. You can regulate the overhead in the MONITOR.COM file by
using the /INTERVAL qualifier and the list of class names.
Section 20.8.10 describes remote monitoring in a mixed-version cluster
system.
20.8.8 Rerecording Monitoring
Rerecording is a combination of playback and recording. You can use it
for data reduction of recording files. When you play back an existing
recording file, all MONITOR options are available to you; thus, you can
choose to record a subset of the recorded classes, a subset of the
recorded time segment, and a larger interval value.
All these techniques produce a new, smaller recording file at the
expense of some of the recorded data. A larger interval value reduces
the volume of the collected data, so displays and summary output
produced from the newer recorded file will be less precise. Note that
average rate values are not affected in this case, but average level
values are less precise (since the sample size is reduced), as are
maximum and minimum values. The following example shows how to use the
rerecording mode of operation:
Example
MONREC.COM contains the following commands:
$ MONITOR/NODISPLAY/RECORD/INTERVAL=60 /BEGINNING=8:00/ENDING=16:00 DECNET,LOCK
$ MONITOR/INPUT/NODISPLAY/RECORD DECNET
The first command runs in a batch job, recording DECnet and lock
management data once every minute between the hours of 8 A.M. and
4 P.M. The second command, which is issued after the first command
completes, rerecords the data by creating a new version of the
MONITOR.DAT file, containing only the DECnet data.
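The effect of rerecording with a larger interval on average, minimum,
and maximum values can be illustrated with a small model. The following
Python sketch uses invented data and rests on two assumptions that are
not taken from the MONITOR implementation: a rate is derived from the
change in a cumulative counter between successive records, and a level
is a value sampled at each record. The function names are hypothetical.

```python
def rates(counter_samples, interval_seconds):
    """Per-interval rates from successive samples of a cumulative counter."""
    return [
        (later - earlier) / interval_seconds
        for earlier, later in zip(counter_samples, counter_samples[1:])
    ]


def summarize(values):
    """Return (average, minimum, maximum) of a list of values."""
    return sum(values) / len(values), min(values), max(values)


# Hypothetical data recorded once a minute for 10 minutes.
counter = [0, 60, 120, 1380, 1440, 1500, 1560, 1620, 1680, 1740, 1800]
levels = [40, 42, 41, 80, 43, 40, 41, 42, 40, 41, 39]

fine_rates = rates(counter, 60)          # original 60-second interval
coarse_rates = rates(counter[::5], 300)  # keep every fifth record: 300-second interval

print("rate,   60 s:", summarize(fine_rates))
print("rate,  300 s:", summarize(coarse_rates))
print("level,  60 s:", summarize(levels))
print("level, 300 s:", summarize(levels[::5]))
```

In this model the average rate is the same at both intervals, but the
one-minute burst that produced the maximum rate is smoothed away at the
larger interval, and the brief peak in the level data is missed entirely
because fewer samples remain.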
20.8.9 Running MONITOR Continuously
You can develop a database of performance information for your system
by running MONITOR continuously as a background process.
This section contains examples of procedures that you, as cluster
manager, might use to create multifile clusterwide summaries.
You can adapt the command procedures to suit conditions at your site.
Note that you must define the logical names SYS$MONITOR and MON$ARCHIVE
in SYSTARTUP.COM before executing any of the command files.
The directory with the logical name SYS$EXAMPLES includes three command
procedures that you can use to establish the database. Instructions for
installing and running the procedures are in the comments at the
beginning of each procedure. Table 20-10 contains a brief summary of
these procedures.
Table 20-10 MONITOR Command Procedures

Procedure     Description

MONITOR.COM   Creates a summary file from the recording file of the
              previous boot, and then begins recording for this boot.
              The recording interval is 5 minutes (/INTERVAL=300 in
              Example 20-7).

MONSUM.COM    Generates two clusterwide multifile summary reports that
              are mailed to the system manager: one report is for the
              previous 24 hours, and the other is for the previous
              day's prime-time period (9 A.M. to 6 P.M.). The procedure
              resubmits itself to run each day at midnight.

SUBMON.COM    Starts MONITOR.COM as a detached process. Invoke
              SUBMON.COM from the site-specific startup command
              procedure.

While MONITOR records data continuously, a summary report can cover any
finite time segment. The MONSUM.COM command procedure, which is
executed every midnight, produces and mails the two multifile summary
reports described in Table 20-10. Because these reports are not
saved as files, to keep them, you must either extract them from your
mail file or alter the MONSUM.COM command procedure to save them.
20.8.9.1 Using the MONITOR.COM Procedure
The procedure in Example 20-7 archives the recording file and summary
file from the previous boot and initiates continuous recording for the
current boot. (Note that this procedure does not purge recording files.)
Example 20-7 MONITOR.COM Procedure
$ SET VERIFY
$ !
$ ! MONITOR.COM
$ !
$ ! This command file is to be placed in a cluster-accessible directory
$ ! called SYS$MONITOR and submitted at system startup time as a detached
$ ! process via SUBMON.COM. For each node, MONITOR.COM creates, in
$ ! SYS$MONITOR, a MONITOR recording file that is updated throughout the
$ ! life of the boot. It also creates, in MON$ARCHIVE, a summary file from
$ ! the recording file of the previous boot, along with a copy of that
$ ! recording file. Include logical name definitions for both cluster-
$ ! accessible directories, SYS$MONITOR and MON$ARCHIVE, in SYSTARTUP.COM.
$ !
$ SET DEF SYS$MONITOR
$ SET NOON
$ PURGE MONITOR.LOG/KEEP:2
$ !
$ ! Compute executing node name and recording and summary file names
$ ! (incorporating node name and date).
$ !
$ NODE = F$GETSYI("NODENAME")
$ SEP = ""
$ IF NODE .NES. "" THEN SEP = "_"
$ DAY = F$EXTRACT (0,2,F$TIME())
$ IF F$EXTRACT(0,1,DAY) .EQS. " " THEN DAY = F$EXTRACT(1,1,DAY)
$ MONTH = F$EXTRACT(3,3,F$TIME())
$ ARCHFILNAM = "MON$ARCHIVE:"+NODE+SEP+"MON"+DAY+MONTH
$ RECFIL = NODE+SEP+"MON.DAT"
$ SUMFIL = ARCHFILNAM+".SUM"
$ !
$ ! Check for existence of recording file from previous boot and skip
$ ! summary if not present.
$ !
$ OPEN/READ/ERROR=NORECFIL RECORDING 'RECFIL'
$ CLOSE RECORDING
$ !
$ !
$ ! Generate summary file from previous boot.
$ !
$ MONITOR /INPUT='RECFIL' /NODISPLAY /SUMMARY='SUMFIL' -
$ ALL_CLASSES+MODE/ALL+STATES/ALL+SCS/ITEM=ALL+SYSTEM/ALL+DISK/ITEM=ALL
$ !
$ !
$ ! Compute subject string and mail summary file to cluster manager.
$ !
$ !
$ A="""
$ B=" MONITOR Summary "
$ SUB = A+NODE+B+F$TIME()+A
$ MAIL/SUBJECT='SUB' 'SUMFIL' CLUSTER_MANAGER
$ !
$ !
$ ! Archive recording file and delete it from SYS$MONITOR.
$ !
$ COPY 'RECFIL' 'ARCHFILNAM'.DAT
$ DELETE 'RECFIL';*
$ !
$ NORECFIL:
$ SET PROCESS/PRIORITY=15
$ !
$ !
$ ! Begin recording for this boot. The specified /INTERVAL value is
$ ! adequate for long-term summaries; you might need a smaller value
$ ! to get reasonable "semi-live" playback summaries (at the expense
$ ! of more disk space for the recording file).
$ !
$ MONITOR /INTERVAL=300 /NODISPLAY /RECORD='RECFIL' ALL_CLASSES
$ !
$ !
$ ! End of MONITOR.COM
$ !
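The file-name computation near the top of Example 20-7 can be easier to
follow when traced with concrete values. The following Python sketch
mirrors that DCL logic for illustration only. It assumes the usual
F$TIME() layout of dd-mmm-yyyy hh:mm:ss.cc, in which a single-digit day
is preceded by a space; the function name monitor_file_names is
hypothetical, and the node name LARRY is borrowed from an earlier
example.

```python
def monitor_file_names(node: str, vms_time: str) -> dict:
    """Mirror the NODE, SEP, DAY, MONTH logic of MONITOR.COM."""
    sep = "_" if node else ""          # SEP is "_" only when a node name exists
    day = vms_time[0:2]                # F$EXTRACT(0,2,F$TIME())
    if day[0] == " ":                  # drop the leading space of a one-digit day
        day = day[1]
    month = vms_time[3:6]              # F$EXTRACT(3,3,F$TIME())
    arch = "MON$ARCHIVE:" + node + sep + "MON" + day + month
    return {
        "recording": node + sep + "MON.DAT",
        "archive": arch + ".DAT",
        "summary": arch + ".SUM",
    }


names = monitor_file_names("LARRY", "25-JAN-2000 08:00:00.00")
for kind, name in names.items():
    print(f"{kind}: {name}")
```

Because the archive and summary names contain only the day and month,
names produced on the same calendar date in different years are
identical, which is worth keeping in mind when you decide how long to
retain files in MON$ARCHIVE.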
20.8.9.2 Using the SUBMON.COM Procedure
The procedure in Example 20-8 submits MONITOR.COM as a detached
process from SYSTARTUP.COM to initiate continuous recording for the
current boot.
Example 20-8 SUBMON.COM Procedure
$ SET VERIFY
$ !
$ ! SUBMON.COM
$ !
$ ! This command file is to be placed in a cluster-accessible directory
$ ! called SYS$MONITOR. At system startup time, for each node, it is
$ ! executed by SYSTARTUP.COM, following logical name definitions for
$ ! the cluster-accessible directories SYS$MONITOR and MON$ARCHIVE.
$ !
$ !
$ ! Submit detached MONITOR process to do continuous recording.
$ !
$ !
$ RUN SYS$SYSTEM:LOGINOUT.EXE -
/UIC=[1,4] -
/INPUT=SYS$MONITOR:MONITOR.COM -
/OUTPUT=SYS$MONITOR:MONITOR.LOG -
/ERROR=SYS$MONITOR:MONITOR.LOG -
/PROCESS_NAME="Monitor" -
/WORKING_SET=512 -
/MAXIMUM_WORKING_SET=512 -
/EXTENT=512/NOSWAPPING
$ !
$ !
$ ! End of SUBMON.COM
$ !
20.8.9.3 Using the MONSUM.COM Procedure
The procedure in Example 20-9 produces daily and prime-time
clusterwide summaries.
Example 20-9 MONSUM.COM Procedure
$ SET VERIFY
$ !
$ ! MONSUM.COM
$ !
$ ! This command file is to be placed in a cluster-accessible directory
$ ! called SYS$MONITOR and executed at the convenience of the cluster
$ ! manager. The file generates both 24-hour and "prime time" cluster
$ ! summaries and resubmits itself to run each day at midnight.
$ !
$ SET DEF SYS$MONITOR
$ SET NOON
$ !
$ ! Compute file specification for MONSUM.COM and resubmit the file.
$ !
$ FILE = F$ENVIRONMENT("PROCEDURE")
$ FILE = F$PARSE(FILE,,,"DEVICE")+F$PARSE(FILE,,,"DIRECTORY")+F$PARSE(FILE,,,"NAME")
$ SUBMIT 'FILE' /AFTER=TOMORROW /NOPRINT
$ !
$ ! Generate 24-hour cluster summary.
$ !
$ !
$ MONITOR/INPUT=(SYS$MONITOR:*MON*.DAT;*,MON$ARCHIVE:*MON*.DAT;*) -
/NODISPLAY/SUMMARY=MONSUM.SUM -
ALL_CLASSES+DISK/ITEM=ALL+SCS/ITEM=ALL-
/BEGIN="YESTERDAY+0:0:0.00" /END="TODAY+0:0:0.00" /BY_NODE
$ !
$ !
$ ! Mail 24-hour summary file to cluster manager and delete the file from
$ ! SYS$MONITOR.
$ !
$ !
$ MAIL/SUBJECT="Daily Monitor Clusterwide Summary" MONSUM.SUM CLUSTER_MANAGER
$ DELETE MONSUM.SUM;*
$ !
$ ! Generate prime-time cluster summary.
$ !
$ !
$ MONITOR/INPUT=(SYS$MONITOR:*MON*.DAT;*,MON$ARCHIVE:*MON*.DAT;*) -
/NODISPLAY/SUMMARY=MONSUM.SUM -
ALL_CLASSES+DISK/ITEM=ALL+SCS/ITEM=ALL-
/BEGIN="YESTERDAY+9:0:0.00" /END="YESTERDAY+18:0:0.00" /BY_NODE
$ !
$ !
$ ! Mail prime-time summary file to cluster manager and delete the file
$ ! from SYS$MONITOR.
$ !
$ !
$ MAIL/SUBJECT="Prime-Time Monitor Clusterwide Summary" MONSUM.SUM CLUSTER_MANAGER
$ DELETE MONSUM.SUM;*
$ !
$ ! End of MONSUM.COM
$ !
Note that the MAIL commands in this procedure send files to user
CLUSTER_MANAGER. Replace CLUSTER_MANAGER with the appropriate user name
or logical name for your site.
Because summary data might be extensive, Compaq recommends that you
print out summary files.
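The /BEGIN and /END values in Example 20-9 select two time windows
relative to the day on which the procedure runs. The following Python
sketch shows which periods those windows cover for a sample run time.
It is an illustration only; the function name summary_windows and the
sample date are hypothetical.

```python
from datetime import datetime, timedelta


def summary_windows(now: datetime):
    """Return the (begin, end) pairs for the 24-hour and prime-time summaries."""
    today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    yesterday = today - timedelta(days=1)
    daily = (yesterday, today)                      # YESTERDAY+0:0:0.00 to TODAY+0:0:0.00
    prime = (yesterday + timedelta(hours=9),        # YESTERDAY+9:0:0.00
             yesterday + timedelta(hours=18))       # YESTERDAY+18:0:0.00
    return daily, prime


# Suppose the batch job runs a few minutes after midnight on 26 January 2000.
daily, prime = summary_windows(datetime(2000, 1, 26, 0, 5))
print("24-hour summary:   ", daily[0], "to", daily[1])
print("prime-time summary:", prime[0], "to", prime[1])
```

Both windows refer to the day before the run, so a job that starts
shortly after midnight reports on the complete previous day.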
20.8.10 Using Remote Monitoring
MONITOR can use either TCP/IP or DECnet as a transport mechanism.
Beginning with OpenVMS Version 7.0, to use TCP/IP, you must start the
TCP/IP server by issuing the following command inside
SYS$STARTUP:SYSTARTUP_VMS.COM:
$ @SYS$STARTUP:VPM$STARTUP.COM
DECnet continues to work as in the past: a network object is created at
the time of the request.
Remote Monitoring in a Mixed-Version OpenVMS Cluster System
You can monitor any node in an OpenVMS Cluster system either by issuing
the MONITOR CLUSTER command or by adding the /NODE qualifier to any
interactive MONITOR request.
Remote monitoring in an OpenVMS Cluster system might not be compatible,
however, between nodes that are running different versions of OpenVMS.
Table 20-11 shows the compatibility of versions for remote
monitoring.
If you attempt to monitor a remote node that is incompatible, the
system displays the following message:
%MONITOR-E-SRVMISMATCH, MONITOR server on remote node is an incompatible version
If this is the case, contact your Compaq support representative for a
remedial kit that corrects this problem. Before you install the
remedial kit, you can still use MONITOR to obtain data about the remote
node. To do this, record the data on the remote node and then run the
MONITOR playback feature to examine the data on the local node.
Another difference exists when you monitor remote nodes in an OpenVMS
Cluster system. Beginning with OpenVMS Version 6.2, the limit on the
number of disks that can be monitored was raised from 799 to 909 for
record output and from 799 to 1817 for display and summary outputs. If
you monitor a remote node running OpenVMS Version 6.2 or later from a
system running a version earlier than OpenVMS Version 6.2, the old
limit of 799 applies.
For more information about MONITOR, refer to the OpenVMS System Management Utilities Reference Manual.