HP OpenVMS Systems Documentation

OpenVMS System Manager's Manual


19.8.3 System Load Test Phase

The purpose of the system load test is to simulate a number of terminal users who are demanding system resources simultaneously. The system load tests, directed by the file UETLOAD00.DAT, create a number of detached processes that execute various command procedures. Each process simulates a user logged in at a terminal; the commands within each procedure are the same types of commands that a user enters from a terminal. The load test creates the detached processes in quick succession, and the processes generally execute their command procedures simultaneously. The effect on the system is analogous to an equal number of users concurrently issuing commands from terminals. In this way, the load test creates an environment that is similar to normal system use.

The load test uses the logical name LOADS to determine the number of detached processes to create. When you initiate the UETP command procedure, it prompts for the number of users to be simulated (see Section 19.4.3) and consequently the number of detached processes to be created. Your response, which depends on the amount of memory and the swapping and paging space in your system, defines the group logical name LOADS.

The UETP master command procedure deassigns all group logical names assigned by its tests as part of the termination phase. The group logical name LOADS remains assigned only if the UETP package does not complete normally.
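If a run ends abnormally, you can check whether LOADS was left behind and remove it by hand. The following is a minimal sketch, assuming you are logged in to the account (and UIC group) that ran UETP and have the privilege needed to change group logical names:

```dcl
$ ! Sketch only: run from the account and UIC group that ran UETP.
$ SHOW LOGICAL/GROUP LOADS
$ ! Remove the leftover definition if the previous command found one.
$ DEASSIGN/GROUP LOADS
```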

The command procedures executed by the load test can generate a large amount of output, depending on the number of detached processes created. For each detached process (or user), the test creates a version of an output file called UETLOnnnn.LOG (nnnn represents a string of numeric characters). The console displays only status information as the load test progresses.

Whether the load test runs as part of the entire UETP or as an individual phase, UETP combines the UETLOnnnn.LOG files, writes the output to the file UETP.LOG, and deletes the individual output files.
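Because the combined log can be long, one way to review it is to search for messages with warning, error, or fatal severity. This is only a sketch; it assumes UETP.LOG is in the SYS$TEST directory and that the messages of interest use the standard severity codes W, E, and F:

```dcl
$ ! Sketch only: assumes the log file is SYS$TEST:UETP.LOG.
$ SEARCH SYS$TEST:UETP.LOG "-W-","-E-","-F-"
```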

You can run the system load test as a single phase by selecting LOAD from the choices offered in the startup dialog. (See Section 19.4.1.)

19.8.4 DECnet for OpenVMS Test Phase

If DECnet for OpenVMS software is included in your OpenVMS system, a run of the entire UETP automatically tests DECnet hardware and software. Communications devices that are allocated to DECnet cannot be tested by the UETP device test, so UETP does not test the Ethernet adapter if DECnet for OpenVMS or another application has allocated the device. The DECnet node and circuit counters are zeroed at the beginning of the DECnet test to allow for failure monitoring during the run.

As with other UETP phases, you can run the DECnet for OpenVMS phase individually by following the procedure described in Section 19.4.1.

19.8.4.1 Environment

The DECnet for OpenVMS test works on OpenVMS systems connected to all DECnet-supported node types, including routing and nonrouting nodes and several different types of operating systems (such as RSTS, RSX, TOPS, and RT). To copy files between systems, the remote systems must have some type of default access. The DECnet phase tests the following nodes and circuits:

  • The node on which UETP is running.
  • All circuits in sequence, unless you have defined the logical name UETP$NODE_ADDRESS to be the remote node that you want to run the test on. If you have defined a remote node, the DECnet phase tests only one circuit.
  • All adjacent or first-hop nodes and all circuits in parallel.

No limit exists on the number of communication lines supported by the tests. A test on one adjacent node should last no more than two minutes at normal communications transfer rates.
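To restrict the DECnet phase to a single remote node, define UETP$NODE_ADDRESS before starting UETP. The following is a sketch only: it assumes the name is defined in the group logical name table, like the other UETP logical names described in this chapter, and node_address is a placeholder for the address of your remote node.

```dcl
$ ! Sketch only: replace node_address with the address of the remote node.
$ DEFINE/GROUP UETP$NODE_ADDRESS node_address
```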

Note

UETP assumes your system has default access for the FAL object, even though the network configuration command procedure NETCONFIG.COM does not provide access for the FAL object by default. When you install DECnet software with the defaults presented by NETCONFIG.COM, the UETP DECnet phase can produce error messages. You can ignore these error messages. See Section 19.7.13 for more information.

19.8.4.2 How the DECnet Phase Works

UETP (under the control of UETPHAS00.EXE) reads the file UETDNET00.DAT and completes the following steps during the DECnet for OpenVMS phase:

  1. Executes a set of Network Control Program (NCP) LOOP EXECUTOR commands to test the node on which UETP is running.
  2. Uses NCP to execute the command SHOW ACTIVE CIRCUITS. The results are placed in UETININET.TMP, from which UETP creates the data file UETININET.DAT. The UETININET.TMP file contains the following information for any circuit in the ON state but not in transition:
    • Circuit name
    • Node address
    • Node name (if one exists)

    The UETININET.TMP file is used throughout the DECnet phase to determine which devices to test.
  3. Uses the UETININET.TMP file to create an NCP command procedure for each testable circuit. Each command procedure contains a set of NCP commands to zero the circuit and node counters and to test the circuit and adjacent node by copying files back and forth.

    Note

    If you do not want the counters zeroed, do not test the DECnet for OpenVMS software.
  4. Executes the command procedures from Step 3 in parallel to simulate a heavy user load. The simulated user load is the lesser of the following values:
    • The number of testable circuits, multiplied by two
    • The maximum number of user-detached processes that can be created on the system before it runs out of resources (determined by UETINIT00)
  5. Executes a program, UETNETS00.EXE, that uses the UETININET.DAT file to check the circuit and node counters for each testable circuit. If a counter indicates possible degradation (by being nonzero), its name and value are reported to the console. All counters are reported in the log file, but only the counters that indicate degradation are reported to the console. An example of UETNETS00 output follows.


    %UETP-S-BEGIN, UETNETS00 beginning at  22-JUN-2000 13:45:33.18
    %UETP-W-TEXT, Circuit DMC-0 to (NODENAME1) OK.
    %UETP-I-TEXT, Node  (NODENAME2) over DMC-1 response timeouts = 1.
    %UETP-I-TEXT, Circuit DMC-1 to (NODENAME2) local buffer errors = 34.
    %UETP-I-TEXT, Node  (NODENAME3) over DMP-0 response timeouts = 3.
    %UETP-S-ENDED, UETNETS00 ended at 22-JUN-2000 13:45:36.34
    

    Because degradation is not necessarily an error, the test's success is determined by you, not by the system. The following counters indicate possible degradation:
    For Circuits
    • Arriving congestion loss
    • Corruption loss
    • Transit congestion loss
    • Line down
    • Initialization failure
    • Data errors inbound
    • Data errors outbound
    • Remote reply timeouts
    • Local reply timeouts
    • Remote buffer errors
    • Local buffer errors
    • Selection timeouts
    • Remote process errors
    • Local process errors
    • Locally initiated resets
    • Network initiated resets

    For Nodes
    • Response timeouts
    • Received connect resource errors
    • Node unreachable packet loss
    • Node out of range packet loss
    • Oversized packet loss
    • Packet format error
    • Partial routing update loss
    • Verification reject
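To look at the same information by hand, you can enter NCP commands interactively. The following session is a sketch, not a reproduction of the procedures that UETP generates; the circuit and node names are borrowed from the sample output above and stand in for your own. It deliberately omits any ZERO command, so the counters are left as they are.

```dcl
$ ! Sketch only: DMC-1 and NODENAME2 are placeholders from the sample output.
$ RUN SYS$SYSTEM:NCP
NCP> LOOP EXECUTOR
NCP> SHOW ACTIVE CIRCUITS
NCP> SHOW CIRCUIT DMC-1 COUNTERS
NCP> SHOW NODE NODENAME2 COUNTERS
NCP> EXIT
```

Compare any nonzero values against the circuit and node counter lists above before deciding whether they indicate a problem.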

19.8.5 Cluster-Integration Test Phase

The cluster-integration test phase consists of a single program and a command file that depend heavily on DECnet for OpenVMS software. This phase uses DECnet for OpenVMS software to create SYSTEST_CLIG processes on each OpenVMS node in the cluster and to communicate with each node. SYSTEST_CLIG is an account that is parallel to SYSTEST, but limited so that it can only be used as part of the cluster-integration test. The following restrictions on the SYSTEST_CLIG account are necessary for a correct run of the cluster test phase:

  • The account must be enabled and the password must be null. For more information, see Section 19.3.16.
  • The UIC must be the same as that of the SYSTEST account.
  • The account must have the same privileges and quotas as the SYSTEST account. For more information, see Section 19.7.2.
  • The account can allow login only through DECnet for OpenVMS software.
  • The account must be locked into running UETCLIG00.COM when it logs in.

These items are necessary to ensure the security and privacy of your system. If the test cannot create a SYSTEST_CLIG process on an OpenVMS node, it gives the reason for the failure and ignores that node for the lock tests and for sharing access during the file test. Also, the test does not copy log files from any node on which it cannot create the SYSTEST_CLIG process. If a communication problem occurs with a SYSTEST_CLIG process after the process has been created, the test excludes the process from further lock and file sharing tests. At the end of the cluster-integration test, the test attempts to report any errors seen by that node.
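Before running the cluster test, you might compare the two account records with the Authorize utility. The following session is a minimal sketch that only displays the records; it assumes you have the privileges needed to read the user authorization file.

```dcl
$ ! Sketch only: displays both account records without changing them.
$ SET DEFAULT SYS$SYSTEM
$ RUN AUTHORIZE
UAF> SHOW SYSTEST_CLIG
UAF> SHOW SYSTEST
UAF> EXIT
```

In the displays, check the items listed above: the account flags, the UIC, the privileges and quotas, the access restrictions, and the login command procedure.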

UETCLIG00.EXE has two threads of execution: the primary and the secondary. The first, or primary thread, checks the cluster configuration (OpenVMS nodes, HSC nodes, and the attached disks that are available to the node running the test). For selected OpenVMS nodes, the primary thread attempts to start up a SYSTEST_CLIG process through DECnet software. If the primary thread was able to start a SYSTEST_CLIG process on a node, the node runs the command file UETCLIG00.COM, which starts up UETCLIG00.EXE and runs the secondary execution thread.

The process running the primary thread checks to see that it can communicate with the processes running the secondary threads. It then instructs them to take out locks so that a deadlock situation is created.

The primary thread tries to create a file on some disk on selected OpenVMS and HSC nodes in the cluster. It writes a block, reads it back, and verifies it. Next, it selects one OpenVMS node at random and asks that node to read the block and verify it. The primary thread then extends the file by writing another block and has the secondary thread read and verify the second block. The file is deleted.

The secondary processes exit. They copy the contents of their SYS$ERROR files to the primary process, so that the UETP log file and console report show all problems in a central place. DECnet for OpenVMS software automatically creates a NETSERVER.LOG in SYS$TEST as the test is run, so that if necessary, you can read that file later from the node in question.

During the test run, the primary process uses the system service SYS$BRKTHRU to announce the beginning and ending of the test to each OpenVMS node's console terminal.

You can define the group logical name MODE to the equivalence string DUMP to trace most events as they occur. Note that the logical name definitions apply only to the node on which they were defined. You must define MODE on each node in the cluster on which you want to trace events.
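As a sketch, the definition described above could be entered as follows on each node to be traced; the DEASSIGN command, shown here as an assumption about how you would turn tracing off again, removes the definition afterward.

```dcl
$ ! Sketch only: repeat on every node whose events you want traced.
$ DEFINE/GROUP MODE DUMP
$ ! After the test run, remove the definition to stop tracing.
$ DEASSIGN/GROUP MODE
```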


Chapter 20
Getting Information About the System

This chapter discusses setting up and maintaining system log files, maintaining error log files, and using system management utilities to monitor the system.

This chapter describes the following tasks:

Task | Section
Using the Error Formatter (ERRFMT) | Section 20.3
Using ERROR LOG to produce reports | Section 20.4
Using DECevent to report system events | Section 20.5
Setting up, maintaining, and printing the operator log file | Section 20.6
Using security auditing | Section 20.7
Using the Monitor utility to monitor system performance | Section 20.8

This chapter explains the following concepts:

Concept | Section
System log files | Section 20.1
Error logging | Section 20.2
Error Log utility (ERROR LOG) | Section 20.4.1
DECevent Event Management utility | Section 20.5.1
Operator log file | Section 20.6.1
OPCOM messages | Section 20.6.2
Security auditing | Section 20.7.1
Monitor utility (MONITOR) | Section 20.8.1

20.1 Understanding System Log Files

In maintaining your system, collect and review information about system events. The operating system provides several log files that record information about the use of system resources, error conditions, and other system events. Table 20-1 briefly describes each file and provides references to sections that discuss the files in more detail.

Table 20-1 System Log Files
Log File | Description | For More Information
Error log file | The system automatically records device and CPU error messages in this file. | Section 20.2
Operator log file | The operator communication manager (OPCOM) records system events in this file. | Chapter 2 and Section 20.6
Accounting file | The accounting file tracks the use of system resources. | Chapter 21
Security audit log file | The audit server process preallocates disk space to and writes security-relevant system events to this file. | Section 20.7

20.2 Understanding Error Logging

The error logging facility automatically writes error messages to the latest version of the error log file, SYS$ERRORLOG:ERRLOG.SYS. Error log reports are primarily intended for use by Compaq support representatives to identify hardware problems. System managers often find error log reports useful in identifying recurrent system failures that require outside attention.

Starting with OpenVMS Version 7.2, DECevent Version 2.9 or later is required for analyzing error log files. DECevent Version 2.9 provides a separate utility, the Binary Error Log Translation utility, in the DECevent kit. This utility converts the new Common Event Header (CEH) binary error log file into a binary error log file whose header format and structure can be read by earlier versions of DECevent and by the older Error Log utility.

For more information about the Binary Error Log Translation utility, refer to its documentation, which is included in the DECevent kit shipped with the OpenVMS kit.

Parts of the Error Logging Facility

The error logging facility consists of the parts shown in Table 20-2.

Table 20-2 Parts of the Error Logging Facility
Part | Description
Executive routines | Detect errors and events, and write relevant information into error log buffers in memory.
Error Formatter (ERRFMT) | The ERRFMT process, which starts when the system is booted, periodically empties error log buffers, transforms the descriptions of errors into standard formats, and stores formatted information in an error log file on the system disk. (See Section 20.3.2.) The Error Formatter allows you to send mail to the SYSTEM account or another user if the ERRFMT process encounters a fatal error and deletes itself. (See Section 20.3.3.)
Error Log utility (ERROR LOG) | Invokes the Error Log Report Formatter (ERF), which selectively reports the contents of an error log file. You invoke ERROR LOG by entering the DCL command ANALYZE/ERROR_LOG. (See Section 20.4.2.)
DECevent | Selectively reports the contents of an event log file; you invoke DECevent by entering the DCL command DIAGNOSE. (See Section 20.5.) DECevent Version 2.9 and higher includes the Binary Error Log Translation utility.

The executive routines and the Error Formatter (ERRFMT) process operate continuously without user intervention. The routines fill the error log buffers in memory with raw data on every detected error and event. When one of the available buffers becomes full, or when a time allotment expires, ERRFMT automatically writes the buffers to SYS$ERRORLOG:ERRLOG.SYS.

Sometimes a burst of errors can cause the buffers to fill up before ERRFMT can empty them. You can detect this condition by noting a skip in the error sequence number of the records reported in the error log reports. As soon as ERRFMT frees the buffer space, the executive routines resume preserving error information in the buffers.

The ERRFMT process displays an error message on the system console terminal and stops itself if it encounters excessive errors while writing the error log file. Section 20.3.1 explains how to restart the ERRFMT process.

20.3 Using the Error Formatter

The Error Formatter (ERRFMT) process is started automatically at boot time. The following sections explain how to perform these tasks:

Task | Section
Restart the ERRFMT process, if necessary | Section 20.3.1
Maintain error log files | Section 20.3.2
Send mail if the ERRFMT process is deleted | Section 20.3.3

20.3.1 Restarting the ERRFMT Process

To restart the ERRFMT process, follow these steps:

  1. Log in to the system manager's account so that you have the required privileges to perform the operation.
  2. Execute the site-independent startup command procedure (STARTUP.COM), specifying ERRFMT as the command parameter, as follows:


    $ @SYS$SYSTEM:STARTUP ERRFMT
    

    Note

    If disk quotas are enabled on the system disk, ERRFMT starts only if UIC [1,4] has sufficient quotas.
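Before restarting, you might confirm that the process is really gone and, if disk quotas are in use, check the quota for UIC [1,4]. The following is a sketch under two assumptions: that the system disk is reachable through the logical name SYS$SYSDEVICE, and that you have the privileges needed to display another UIC's quota.

```dcl
$ ! Sketch only: look for a process named ERRFMT in the display.
$ SHOW SYSTEM
$ ! If disk quotas are enabled, check the quota available to UIC [1,4].
$ SHOW QUOTA/DISK=SYS$SYSDEVICE/USER=[1,4]
```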

20.3.2 Maintaining Error Log Files

Because the error log file, SYS$ERRORLOG:ERRLOG.SYS, is a shared file, ERRFMT can write new error log entries while the Error Log utility reads and reports on other entries in the same file.

ERRLOG.SYS increases in size and remains on the system disk until you explicitly rename or delete it. Therefore, devise a plan for regular maintenance of the error log file. One method is to rename ERRLOG.SYS on a daily basis. If you do this, the system creates a new error log file. You might, for example, rename the current copy of ERRLOG.SYS to ERRLOG.OLD every morning at 9:00. To free space on the system disk, you can then back up the renamed version of the error log file on a different volume and delete the file from the system disk.

Another method is to keep the error log file on a disk other than the system disk by defining the logical name SYS$ERRORLOG to be the device and directory where you want to keep error log files; for example:


$ DEFINE/SYSTEM/EXECUTIVE SYS$ERRORLOG DUA2:[ERRORLOG]

To define this logical name each time you start up the system, add the logical name definition to your SYLOGICALS.COM procedure. See Section 5.2.5 for details.

Be careful not to delete error log files inadvertently. You might also want to adopt a file-naming convention that includes a beginning or ending date for the data in the file name.
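The rename, backup, and naming suggestions above can be combined in a short command procedure. This is a sketch, not a supported procedure: DUA2:[ERRLOG_ARCHIVE] is a hypothetical archive directory on another volume, and the date in the file name is the date of the rename, that is, the ending date of the data.

```dcl
$ ! Sketch only: DUA2:[ERRLOG_ARCHIVE] is a hypothetical archive location.
$ ! Build a YYYYMMDD stamp by removing the hyphens from today's date.
$ stamp = F$CVTIME("","COMPARISON","DATE") - "-" - "-"
$ ! Rename the current file; the system then creates a new ERRLOG.SYS.
$ RENAME SYS$ERRORLOG:ERRLOG.SYS SYS$ERRORLOG:ERRLOG_'stamp'.OLD
$ ! Copy the renamed file to another volume, then free the system disk space.
$ COPY SYS$ERRORLOG:ERRLOG_'stamp'.OLD DUA2:[ERRLOG_ARCHIVE]
$ DELETE SYS$ERRORLOG:ERRLOG_'stamp'.OLD;*
```

Verify that the copy succeeded before you run the DELETE command, so that an error log file is not lost inadvertently.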

20.3.3 Using ERRFMT to Send Mail

The Error Formatter (ERRFMT) can send mail to the system manager or to another designated user if the ERRFMT process encounters a fatal error and deletes itself.

Two system logical names, ERRFMT$_SEND_MAIL and ERRFMT$_SEND_TO, control this feature:

  • ERRFMT$_SEND_MAIL
    To enable sending mail, this logical name must translate to the string TRUE (the comparison is case insensitive). Any other value disables the sending of mail.
  • ERRFMT$_SEND_TO
    Must translate to a user name (the current default is SYSTEM).
    Compaq recommends that you do not use distribution lists or multiple user names.

You can define these logical names in one of two ways:

  • Dynamically, using DCL DEFINE/SYSTEM commands
    After you make the changes, you must stop and restart ERRFMT for the changes to take effect.
  • Permanently, in SYS$STARTUP:SYLOGICALS.COM
    The logical names you define take effect the next time the system is rebooted. The following instructions use this method.
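Under either method, the definitions themselves are ordinary DEFINE/SYSTEM commands. The following lines are a sketch that you could enter interactively or add to the startup procedure; OPERATOR is a hypothetical user name chosen for illustration.

```dcl
$ ! Sketch only: OPERATOR is a hypothetical user name.
$ DEFINE/SYSTEM ERRFMT$_SEND_MAIL TRUE
$ DEFINE/SYSTEM ERRFMT$_SEND_TO OPERATOR
```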

