
OpenVMS Version 7.3 Release Notes



5.9.17 Multipath Failover Fails Infrequently on HSZ70/HSZ80 Controllers

V7.2-1

Under heavy load, a host-initiated manual or automatic path switch from one controller to another may fail on an HSZ70 or HSZ80 controller. Testing has shown this to occur infrequently.

Note

This problem has been corrected for the HSZ70 in the firmware revision HSOF V7.7 (and later versions) and will be corrected for the HSZ80 in a future release. It does not occur on the HSG80 controller.

5.9.18 SCSI Multipath Incompatibility with Some Third-Party Products

V7.2

OpenVMS Alpha Version 7.2 introduced the SCSI multipath feature, which provides support for failover between the multiple paths that can exist between a system and a SCSI device.

This SCSI multipath feature may be incompatible with some third-party disk caching, disk shadowing, or similar products. Compaq advises you to avoid the use of such software on SCSI devices that are configured for multipath failover (for example, SCSI devices that are connected to HSZ70 and HSZ80 controllers in multibus mode) until this feature is supported by the manufacturer of the software.

Third-party products that rely on altering the Driver Dispatch Table (DDT) of the OpenVMS Alpha SCSI disk class driver (SYS$DKDRIVER.EXE) may require changes to work correctly with the SCSI multipath feature. Manufacturers of such software can contact Compaq at vms_drivers@zko.dec.com for more information.

For more information about OpenVMS Alpha SCSI multipath features, see Guidelines for OpenVMS Cluster Configurations.

5.9.19 Gigabit Ethernet Switch Restriction in an OpenVMS Cluster System

V7.3

Attempts to add a Gigabit Ethernet node to an OpenVMS Cluster system over a Gigabit Ethernet switch will fail if the switch does not support autonegotiation. The DEGPA enables autonegotiation by default, but not all Gigabit Ethernet switches support autonegotiation. For example, the current Gigabit Ethernet switch made by Cabletron does not.

Furthermore, the messages that are displayed may be misleading. If the node is being added using CLUSTER_CONFIG.COM and the option to install a local page and swap disk is selected, the problem may look like a disk-serving problem. The node running CLUSTER_CONFIG.COM displays the message "waiting for node-name to boot," while the booting node displays "waiting to tune system." The list of available disks is never displayed because of a missing network path. The network path is missing because of the autonegotiation mismatch between the DEGPA and the switch.

To avoid this problem, disable autonegotiation on the new node's DEGPA, as follows:

  • Perform a conversational boot when first booting the node into the cluster.
  • Set the new node's system parameter LAN_FLAGS to a value of 32 to disable autonegotiation on the DEGPA (a sample boot sequence follows this list).
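
For example, the conversational boot sequence might look like the following minimal sketch; the boot device name (DKA0) and the system root are assumptions and should be replaced with the new node's actual values:

>>> boot -flags 0,1 dka0
SYSBOOT> SET LAN_FLAGS 32
SYSBOOT> CONTINUE

Setting LAN_FLAGS to 32 disables autonegotiation on the DEGPA; CONTINUE resumes the normal bootstrap.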

5.9.20 DQDRIVER Namespace Collision Workaround

V7.3

Multiple systems in a cluster could each have IDE, ATA, or ATAPI devices potentially sharing the following names: DQA0, DQA1, DQB0, and DQB1.

Such sharing of device names could lead to confusion or errors. Starting with OpenVMS Version 7.2-1, you can avoid this problem by creating devices with unique names.

To create a list of uniquely named devices on your cluster, use the following procedure:

  1. In SYSGEN, make sure DEVICE_NAMING is set to 1 and ALLOCLASS is set to a nonzero value (a sample SYSGEN session appears after step 2).
  2. Create a file named SYS$SYSTEM:SYS$DEVICES.DAT that specifies a port allocation class of 0 for the two DQ controllers (DQA and DQB).
    You can either edit this file to add the information manually, or update this file automatically by using the following commands at bootstrap time:


    SYSBOOT> SET /CLASS DQA 0
    SYSBOOT> SET /CLASS DQB 0
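
Step 1 can be performed interactively with SYSGEN, as in the following minimal sketch; the ALLOCLASS value of 1 is an example only, and you should use the nonzero value appropriate for your cluster:

$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> USE CURRENT
SYSGEN> SET DEVICE_NAMING 1
SYSGEN> SET ALLOCLASS 1
SYSGEN> WRITE CURRENT
SYSGEN> EXIT

USE CURRENT loads the current system parameter file, and WRITE CURRENT saves the modified values so that they take effect on the next boot. To keep the settings across future AUTOGEN runs, also add them to SYS$SYSTEM:MODPARAMS.DAT.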
    

Following is a sample SYS$SYSTEM:SYS$DEVICES.DAT file (for node ACORN::):


[Port ACORN$DQA]
Allocation Class = 0

[Port ACORN$DQB]
Allocation Class = 0

This procedure causes all DQ devices to be named according to the following format, which allows for unique device names across the cluster:

node-name$DQxn:

where:

node-name is the system name.
x is either A or B.
n is either 0 or 1.

Port allocation classes are described in the OpenVMS Cluster Systems manual, where this technique is fully documented.

You have the option of using a nonzero port allocation class in the SYS$DEVICES.DAT file. However, if you use nonzero port allocation classes, be sure to follow the rules outlined in the OpenVMS Cluster Systems manual.

Restriction:

If you attempt to use the DCL command $ INITIALIZE to initialize an IDE hard drive on a remote system using the mass storage control protocol (MSCP) server, you may receive a warning message about the lack of a bad block file on the volume. You can safely ignore this warning message.

Additionally, previously unused drives from certain vendors contain factory-written data that mimics the data pattern used on a head alignment test disk. In this case, the OpenVMS software will not initialize this disk remotely. As a workaround, initialize the disk from its local system. Note that this workaround also avoids the bad block file warning message.

5.9.21 Shadowing Restriction on Fibre Channel Multiple-Switch Fabrics Removed

V7.3

Multiple-switch Fibre Channel fabrics are supported by OpenVMS, starting with the DEC-AXPVMS-VMS721_FIBRECHAN-V0200 remedial kit. However, a significant restriction existed in the use of Volume Shadowing for OpenVMS in configurations with a multiple-switch fabric. All Fibre Channel hosts that mounted the shadow set had to be connected to the same switch, or all the Fibre Channel shadow set members had to be connected to the same switch. If the Fibre Channel host or shadow set member was connected to multiple fabrics, then this rule had to be followed for each fabric.

Changes have been made to Volume Shadowing for OpenVMS in OpenVMS Version 7.3 that remove these configuration restrictions. These same changes are available for OpenVMS Versions 7.2-1 and 7.2-1H1 in the current version-specific Volume Shadowing remedial kits.

5.9.22 Fibre Channel Installation May Require Additional NPAGEVIR

V7.3

This problem is corrected in OpenVMS Alpha Version 7.2-1H1 and OpenVMS Alpha Version 7.3 with a new system parameter, NPAGECALC, which automatically calculates values for NPAGEVIR and NPAGEDYN based on the amount of physical memory in the system.
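
If you want to examine the values in effect on a particular system, you can display them with SYSGEN; the following is simply an informational sketch:

$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> USE ACTIVE
SYSGEN> SHOW NPAGECALC
SYSGEN> SHOW NPAGEDYN
SYSGEN> SHOW NPAGEVIR
SYSGEN> EXIT

USE ACTIVE displays the values currently in use by the running system; substitute USE CURRENT to see the values that will be used on the next boot.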

5.9.23 Fibre Channel Adapters Off Line After a System Boot

V7.3

The problem of Fibre Channel adapters being off line after a system boot has been corrected in the following versions:

  • OpenVMS Alpha Version 7.2-1 with the remedial kit DEC-AXPVMS-VMS721_FIBRECHAN-V0300-4.PCSI
  • OpenVMS Alpha Version 7.2-1H1
  • OpenVMS Alpha Version 7.3

5.9.24 SHOW DEVICE Might Fail in Large Fibre Channel Configurations

V7.2-1

The problem of SHOW DEVICE failing in large Fibre Channel configurations has been corrected in the following versions:

  • OpenVMS Alpha Version 7.2-1 with the remedial kit DEC-AXPVMS-VMS721_UPDATE-V0100-4.PCSI
  • OpenVMS Alpha Version 7.2-1H1
  • OpenVMS Alpha Version 7.3

Before the correction, SHOW DEVICE might fail with a virtual address space full error on systems with more than 2400 unit control blocks (UCBs). In multipath SCSI and FC configurations, there is a UCB for each path from the host to every storage device.

Note that any procedure that calls SHOW DEVICE, such as CLUSTER_CONFIG, can also experience this problem.

5.9.25 Boot Failure with the KGPSA Loopback Connector Installed

V7.2-1

The problem of boot failure with the KGPSA loopback connector has been corrected in the following versions:

  • OpenVMS Version 7.2-1 systems with the current FIBRE_SCSI update kit for OpenVMS Alpha Version 7.2-1 installed
  • OpenVMS Alpha Version 7.2-1H1
  • OpenVMS Alpha Version 7.3

Before the correction, the system failed to boot if there was a KGPSA in the system with a loopback connector installed. The loopback connector is the black plastic protective cover over the GLMs/fiber-optic ports of the KGPSA.

If possible, install OpenVMS Alpha Version 7.2-1 and the current FIBRE_SCSI update kit for OpenVMS Alpha Version 7.2-1 before installing the KGPSA in your system.

If the KGPSA is installed on your system and the current FIBRE_SCSI update kit for OpenVMS Alpha Version 7.2-1 is not installed, you can connect the KGPSA to your Fibre Channel storage subsystem and then boot OpenVMS.

If you are not ready to connect the KGPSA to your Fibre Channel storage subsystem, you can do either of the following:

  • Remove the loopback connector but leave the KGPSA in the Alpha system, then boot OpenVMS. Do not replace the loopback connector after you boot OpenVMS.
  • Remove the KGPSA from the Alpha system, then boot OpenVMS.

If you attempt to boot OpenVMS when a KGPSA is installed with the loopback connector still attached, the system hangs early in the boot process, at the point when it should configure the Fibre Channel adapters.

5.9.26 Fibre Channel Path Name Syntax Permits Quotation Marks

V7.2

Enclosing a Fibre Channel path name in quotation marks is valid, starting in OpenVMS Alpha Version 7.3.

Prior to OpenVMS Version 7.3, the documentation and help text indicated that a path name could be enclosed in quotation marks, for example:


$ SET DEVICE $1$dga166:/PATH="PGA0.5000-1FE1-0000-1501"/SWITCH

In versions of the system prior to OpenVMS Version 7.3, this command fails with the following error:


%SET-E-NOSUCHPATH, path "PGA0.5000-1FE1-0000-1501" does not exist for device $1$DGA166:

To prevent this problem on systems running prior versions of OpenVMS, omit the quotation marks that surround the path identifier string, as follows:


$ SET DEVICE $1$dga166:/PATH=PGA0.5000-1FE1-0000-1501/SWITCH
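
On any version, you can display the paths that exist for a multipath device before switching by using SHOW DEVICE/FULL, as in the following sketch (the device name is carried over from the preceding examples):

$ SHOW DEVICE/FULL $1$DGA166:

The display lists the paths configured for the device and indicates which path is currently in use.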

5.9.27 Reconfigured Fibre Channel Disks Do Not Come On Line

V7.2

The following problem is corrected in OpenVMS Alpha Version 7.2-1H1 and OpenVMS Alpha Version 7.3.

Each Fibre Channel device has two identifiers on the HSG80. The first is the logical unit number (LUN). This value is established when you use the command ADD UNIT Dnnn on the HSG80, where nnn is the LUN value. The LUN value is used by the HSG80 to deliver packets to the correct destination. The LUN value must be unique on the HSG80 subsystem. The second identifier is the device identifier. This value is established when you use the following command, where nnnnn is the device identifier:


$ SET Dmmm IDENTIFIER=nnnnn

The device identifier is used in the OpenVMS device name and must be unique in your cluster.
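
Taken together, a hypothetical HSG80 CLI session that creates a unit and assigns its device identifier might look like the following sketch; the unit number (D5), container name (DISK10000), and identifier (166) are illustrative only, and the commands are entered at the HSG80 console prompt:

HSG80> ADD UNIT D5 DISK10000
HSG80> SET D5 IDENTIFIER=166

With these values, the LUN seen by the controller is 5, while the OpenVMS device name on every host in the cluster is $1$DGA166.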

5.9.28 Device Identifier Requirement for the HSG80 CCL

V7.2

In OpenVMS Alpha Version 7.2-1H1 and OpenVMS Alpha Version 7.3, assigning a device identifier to the HSG CCL is optional. If you do not assign one, OpenVMS will not configure the $1$GGA device but will configure the other devices on the HSG subsystem.

5.9.29 Undesired Automatic Path Switches

V7.2

The problem described in this note is corrected in OpenVMS Alpha Version 7.3 by giving preference to the current path, thereby avoiding path switching after a transient error.

Every I/O error that invokes mount verification causes the multipath failover code to search for a working path. In earlier versions of OpenVMS, the multipath algorithm started with the primary path (that is, the first path configured by OpenVMS) and performed a search, giving preference to any direct paths to an HSx controller that has the device on line. Before the correction, the algorithm did not test the current path first, and did not stay on that path if the error condition had cleared.

5.10 OpenVMS Registry

The release notes in this section pertain to the OpenVMS Registry.

5.10.1 Registry Services in a Mixed OpenVMS V7.3/V7.2-1 Cluster

V7.3

Removing the data transfer size restrictions on the OpenVMS NT Registry required a change in the communication protocol used by the Registry. The change means that components of the Registry (the $REGISTRY system service and the Registry server) in OpenVMS V7.3 are incompatible with their counterparts in OpenVMS V7.2-1.

If you plan to run a cluster with mixed versions of OpenVMS, and you plan to use the $REGISTRY service or a product that uses it (such as Advanced Server or COM for OpenVMS), then you are restricted to running these services on the OpenVMS V7.3 nodes only or on the V7.2-1 nodes only, but not both.

If you need to run Registry services on both V7.3 and V7.2-1 nodes in the same cluster, please contact your Compaq Services representative.

5.10.2 Backup and Restore of the OpenVMS NT Registry Database

V7.3

The backup and restore of the OpenVMS NT Registry database is discussed in the OpenVMS Connectivity Developer Guide. Compaq would like to stress the importance of periodic backups. Database corruptions are rare, but they have been exposed during testing of previous versions of OpenVMS with databases larger than 2 megabytes. A database of this size is itself rare; the initial database size is 8 kilobytes. Moreover, the corruptions have occurred only when an entire cluster is rebooted.

The Registry server provides a way of backing up the database automatically. By default, the Registry server takes a snapshot of the database once per day. However, this operation is basically a file copy, and by default it purges the copies so that only the five most recent are kept. It is conceivable that a particular area of the database could become corrupted while Registry operations continue, as long as applications do not access that portion of the database. This means that the daily backup could in fact be making a copy of an already corrupt file.

To safeguard against this, Compaq recommends that you take these additional steps:

  1. Ensure that the SYS$REGISTRY directory is part of your incremental or full backup regimen, so that previous versions of the database are always preserved (a sample BACKUP command follows this list). If, for example, you perform backups on a weekly basis, you may want to increase the number of snapshot versions that are retained from 5 to 8. See the OpenVMS Connectivity Developer Guide for instructions on how to change this parameter.
  2. Periodically export the database. Exporting the database has several advantages. First, it causes the server to read every key and value, so it effectively validates the database. Second, it writes out the database in a form that can be edited and repaired; this is not true of the snapshot files, which may be difficult to repair. Third, by periodically exporting the database, creating a new database, and importing the saved export file, you effectively compact the database, keeping it smaller and more efficient.
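
For the first step, a periodic image of the registry directory can be captured with BACKUP; the following is only a sketch, and the save-set device and name are assumptions:

$ BACKUP/LOG SYS$REGISTRY:*.*;* DKA100:[SAVESETS]REGISTRY.BCK/SAVE_SET

For the export step, use the REG$CP utility; see the OpenVMS Connectivity Developer Guide for the exact command syntax.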

It should also be noted that in previous versions of OpenVMS, the EXPORT command may have failed to complete the operation under some conditions. You could normally recover simply by re-invoking the REG$CP image and retrying the operation until it was successful.

In addition, in previous versions of OpenVMS, the IMPORT command failed to properly import keys with classnames or links. The only way to recover from this was to modify the keys to add the classnames or links, or to recreate the keys in question.

5.11 Performance---Comparing Application Performance Data

V7.3

The OpenVMS virtual I/O cache (VIOC) and the extended file cache (XFC) are file-oriented disk caches that can help to reduce I/O bottlenecks and improve performance. (Note that the XFC appears on Alpha systems beginning with Version 7.3.) Cache operation is transparent to application software. Frequently used file data can be cached in memory so that the file data can be used to satisfy I/O requests directly from memory rather than from disk.

Prior to Version 7.0, when an I/O was avoided because the data was returned from the cache, the direct I/O (DIO) count for the process was not incremented because the process did not actually perform an I/O operation to a device. Starting with Version 7.0, a change was made to cause all I/O requests---even those I/Os that were actually avoided because of data being returned from the cache---to be counted as direct I/Os.

This change can be a potential cause for confusion when you are comparing application performance data on different versions of OpenVMS. Applications running on Version 7.0 and later may appear to be performing more I/O than they did when run on earlier versions, even though the actual amount of I/O to the disk remains the same.
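
When comparing application runs across versions, it can therefore help to look at the cache statistics as well as the raw DIO counts. The following standard DCL commands are shown only as a sketch of where to look:

$ SHOW MEMORY/CACHE/FULL      ! XFC or VIOC statistics, including hit rates
$ MONITOR IO                  ! system-wide direct and buffered I/O rates
$ SHOW PROCESS/ACCOUNTING     ! direct I/O count for the current process

A high cache hit rate combined with an unchanged disk I/O rate indicates that higher DIO counts reflect cache hits rather than additional physical I/O.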

5.12 Point-to-Point Utility Documentation

V7.3

The Point-to-Point utility (PPPD) initiates and manages a Point-to-Point Protocol (PPP) network connection and its link parameters from an OpenVMS Alpha host system.

A chapter in the OpenVMS System Management Utilities Reference Manual: M--Z describes the PPPD commands with their parameters and qualifiers, which support PPP connections.

5.13 Queue Manager---Long Boot Times

V7.3

In certain instances, the queue journal file (SYS$QUEUE_MANAGER.QMAN$JOURNAL) may grow to a large size (over 500,000 blocks), especially if there is a very large volume of queue activity. This may cause either a long boot time or the display of the error message QMAN-E-NODISKSPACE in OPERATOR.LOG. The long boot time is caused by the queue manager requiring a large amount of disk space to accommodate the queue journal file.
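
To check the current size of the journal file, you can use a DIRECTORY command similar to the following sketch; the file specification assumes the default queue database location, so adjust it if your site relocates the queue database (for example, by redefining QMAN$MASTER):

$ DIRECTORY/SIZE=ALL SYS$COMMON:[SYSEXE]SYS$QUEUE_MANAGER.QMAN$JOURNAL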

The following example shows the error messages displayed in OPERATOR.LOG:


%%%%%%%%%%%  OPCOM   2-MAR-2000 23:05:31.24  %%%%%%%%%%%
Message from user QUEUE_MANAGE on PNSFAB
%QMAN-E-OPENERR, error opening $1$DUA0:[SYS3.SYSCOMMON.][SYSEXE]
    SYS$QUEUE_MANAGER.QMAN$JOURNAL;1

%%%%%%%%%%%  OPCOM   2-MAR-2000 23:05:32.42  %%%%%%%%%%%
Message from user QUEUE_MANAGE on PNSFAB
-RMS-F-FUL, device full (insufficient space for allocation)

%%%%%%%%%%%  OPCOM   2-MAR-2000 23:05:32.87  %%%%%%%%%%%
Message from user QUEUE_MANAGE on PNSFAB
-SYSTEM-W-DEVICEFULL, device full - allocation failure

%%%%%%%%%%%  OPCOM   2-MAR-2000 23:05:32.95  %%%%%%%%%%%
Message from user QUEUE_MANAGE on PNSFAB
%QMAN-E-NODISKSPACE, disk space not available for queue manager to continue

%%%%%%%%%%%  OPCOM   2-MAR-2000 23:05:33.07  %%%%%%%%%%%
Message from user QUEUE_MANAGE on PNSFAB
-QMAN-I-FREEDISK, free up 191040 blocks on disk _$1$DUA0

You can shrink the size of the journal file by having a privileged user issue the following DCL command:


$ MCR JBC$COMMAND DIAG 7

Executing this DCL command checkpoints the queue journal file and shrinks it to the minimum size required for queue system operation.

Until this problem is fixed, use this workaround to keep the journal file small.

5.14 RMS Journaling

The following release notes pertain to RMS Journaling for OpenVMS.

5.14.1 Modified Journal File Creation

V7.2

Prior to Version 7.2, recovery unit (RU) journals were created temporarily in the [SYSJNL] directory on the same volume as the file that was being journaled. The file name for the recovery unit journal had the form RMS$process_id (where process_id is the hexadecimal representation of the process ID) and a file type of RMS$JOURNAL.

The following changes have been introduced to RU journal file creation in OpenVMS Version 7.2:

  • The files are created in node-specific subdirectories of the [SYSJNL] directory.
  • The file name for the recovery unit journal has been shortened to the form: YYYYYYYY, where YYYYYYYY is the hexadecimal representation of the process ID in reverse order.

These changes reduce the directory overhead associated with journal file creation and deletion.

The following example shows both the previous and current versions of journal file creation:

Previous versions: [SYSJNL]RMS$214003BC.RMS$JOURNAL;1
Current version: [SYSJNL.NODE1]CB300412.;1

If RMS does not find either the [SYSJNL] directory or the node-specific directory, RMS creates them automatically.

5.14.2 Recovery Unit Journaling Incompatible with Kernel Threads (Alpha Only)

V7.3

Because DECdtm Services is not supported in a multiple kernel threads environment and RMS recovery unit journaling relies on DECdtm Services, RMS recovery unit journaling is not supported in a process with multiple kernel threads enabled.

5.14.3 After-Image (AI) Journaling

V6.0

You can use after-image (AI) journaling to recover a data file that becomes unusable or inaccessible. AI recovery uses the AI journal file to roll forward a backup copy of the data file to produce a new copy of the data file at the point of failure.

In the case of either a process deletion or a system crash, an update can be written to the AI journal file but fail to reach the data file. If only AI journaling is in use, the data file and the journal are not automatically made consistent. If additional updates are then made to the data file and recorded in the AI journal, a subsequent roll-forward operation could produce an inconsistent data file.

If you use Recovery Unit (RU) journaling with AI journaling, the automatic transaction recovery restores consistency between the AI journal and the data file.

Under some circumstances, an application that uses only AI journaling can take proactive measures to guard against data inconsistencies after process deletions or system crashes. For example, a manual roll forward of AI-journaled files ensures consistency after a system crash involving either an unshared AI application (single accessor) or a shared AI application executing on a standalone system.

However, in a shared AI application, there may be nothing to prevent further operations from being executed against a data file that is out of synchronization with the AI journal file after a process deletion or system crash in a cluster. Under these circumstances, consistency among the data files and the AI journal file can be provided by using a combination of AI and RU journaling.
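
As a point of reference, both forms of journaling are enabled on a data file with the DCL SET FILE command. The following is a minimal sketch; the data file and journal file specifications are hypothetical, and the full qualifier syntax is documented in the RMS Journaling for OpenVMS manual:

$ SET FILE/AI_JOURNAL=(FILE=DISK$JNL:[JOURNALS]SALES.AIJ)/RU_JOURNAL SALES.DAT

The backup copy and the subsequent roll-forward of the AI journal are performed with the RMS recovery procedures described in that manual.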

