HP OpenVMS Systems Documentation

Guidelines for OpenVMS Cluster Configurations


6.7.10 Automatic Failback to a Direct Path

Multipath failover, as described in Section 6.7.9, applies to MSCP served paths as well. That is, if the current path is an MSCP served path and that path fails, mount verification can trigger an automatic failover back to a working direct path.

However, an I/O error on the MSCP served path is required to trigger the failback to a direct path. Consider the following sequence of events:

  • A direct path is being used to a device.
  • All direct paths fail and the next I/O to the device fails.
  • Mount verification provokes an automatic path switch to the MSCP served path.
  • Some time later, a direct path to the device is restored.

In this case, the system would continue to use the MSCP served path to the device even though the direct path is preferable. This is because no error occurred on the MSCP served path to provoke the path selection procedure.

The automatic failback feature is designed to address this situation. Multipath path polling attempts to fail back the device to a direct path when it detects that all of the following conditions apply:

  • A direct path to a device is responsive
  • The current path to the device is an MSCP served path
  • The current path was not selected by a manual path switch
  • An automatic failback has not been attempted on this device within the last MPDEV_AFB_INTRVL seconds.

The automatic failback is attempted by triggering mount verification and, as a result, the automatic failover procedure on the device.
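
The four conditions can be read as a single predicate. The following Python sketch is an illustrative model only, not the OpenVMS implementation: the `DeviceState` record and its field names are hypothetical, and the interval handling follows the MPDEV_AFB_INTRVL behavior described later in this section (default 300 seconds, 0 disables failback).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeviceState:
    """Hypothetical snapshot of what the path poller knows about one device."""
    direct_path_responsive: bool            # some direct path answers polls
    current_path_is_mscp: bool              # device is on the MSCP served path
    manually_switched: bool                 # current path chosen by manual switch
    last_failback_attempt: Optional[float]  # time in seconds; None if never tried


def should_attempt_failback(dev: DeviceState, now: float,
                            afb_interval: int = 300) -> bool:
    """Return True only when all four documented conditions hold."""
    if afb_interval == 0:
        # An interval of 0 disables automatic failback on this system.
        return False
    recently_tried = (
        dev.last_failback_attempt is not None
        and now - dev.last_failback_attempt < afb_interval
    )
    return (
        dev.direct_path_responsive
        and dev.current_path_is_mscp
        and not dev.manually_switched
        and not recently_tried
    )
```

In this model, a device that was manually switched to the MSCP served path never qualifies, which mirrors the way a manual switch temporarily disables automatic failback.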

The main purpose of multipath path polling is to test the status of unused paths and avoid situations such as the following:

  • A system has paths A and B to a device.
  • The system is using path A.
  • Path B becomes inoperative, and the failure goes unnoticed.
  • Much later and independently, path A breaks.
  • The system attempts to failover to path B, but finds that it is broken.

The poller would detect the failure of path B within one minute of its failure and issue an OPCOM message. An alert system manager can initiate corrective action immediately.
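
The scenario above can be modeled in a few lines. This Python sketch is a simplified illustration under stated assumptions: `probe` is a hypothetical stand-in for whatever check the poller performs on a path, and the returned list stands in for the paths that would be reported to the operator. It does not reproduce the real poller's timing or message format.

```python
from typing import Callable, Dict, Iterable, List


def poll_unused_paths(
    paths: Iterable[str],
    current_path: str,
    probe: Callable[[str], bool],
    known_status: Dict[str, bool],
) -> List[str]:
    """Probe every path except the current one; return paths that newly failed.

    `known_status` remembers the previous result per path and is updated
    in place, so a failure is reported once rather than on every poll.
    """
    newly_failed = []
    for path in paths:
        if path == current_path:
            continue  # only unused paths are polled in this sketch
        ok = probe(path)
        if known_status.get(path, True) and not ok:
            newly_failed.append(path)  # would be reported to the operator
        known_status[path] = ok
    return newly_failed
```

Calling the function once per polling period with paths A and B, current path A, and a probe that fails for B reports B on the first poll after it breaks and stays quiet afterward.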

Note that a device might respond successfully to the SCSI INQUIRY commands issued by the path poller yet fail to complete a path switch or mount verification on that path. For this reason, a system manager or operator can control automatic failback in three ways:

  1. The minimum interval between automatic failback attempts on any given device is specified by the MPDEV_AFB_INTRVL system parameter. The default value is 300 seconds.
  2. If the value of MPDEV_AFB_INTRVL is 0, no automatic failback is attempted on any device on this system.
  3. Automatic failback on a specific device can be temporarily disabled by manually switching the device to the MSCP served path. You can do this even if the current path is an MSCP served path.

Because of the path selection procedure, the automatic failover procedure, and the automatic failback feature, the current path to a mounted device is usually a direct path when there are both direct and MSCP served paths to that device. The primary exceptions to this are when the path has been manually switched to the MSCP served path or when there are no working direct paths.
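
The preference for direct paths can be sketched as an ordered search. The Python function below is a deliberately reduced model, assuming only that direct paths are tried before the MSCP served path; the actual path selection procedure described in Section 6.7.9 may apply further rules, and `is_working` is a hypothetical callback.

```python
from typing import Callable, List, Optional


def select_path(
    direct_paths: List[str],
    mscp_path: Optional[str],
    is_working: Callable[[str], bool],
) -> Optional[str]:
    """Try direct paths first, then the MSCP served path; None if none works."""
    for path in direct_paths:
        if is_working(path):
            return path
    if mscp_path is not None and is_working(mscp_path):
        return mscp_path
    return None
```

With this ordering, the MSCP served path is chosen only when no direct path works, which matches the second exception named above.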

6.7.11 Enabling or Disabling Paths as Path Switch Candidates

By default, all paths are candidates for path switching. You can disable or re-enable a path as a switch candidate by using the SET DEVICE command with the /[NO]ENABLE qualifier. The reasons you might want to do this include the following:

  • You know a specific path is broken, or that a failover to that path will cause some members of the cluster to lose access.
  • You want to prevent automatic switching to a selected path while it is being serviced.

Note that the current path cannot be disabled.

The command syntax for enabling or disabling a path is:


$ SET DEVICE device-name/[NO]ENABLE/PATH=path-identifier

The following command enables the MSCP served path of device $2$DKA502.


$ SET DEVICE $2$DKA502/ENABLE/PATH=MSCP

The following command disables a local path of device $2$DKA502.


$ SET DEVICE $2$DKA502/NOENABLE/PATH=PKC0.5

Be careful when disabling paths. Avoid creating an invalid configuration, such as the one shown in Figure 6-20.
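
The rules in this section (every path is a candidate by default, and the current path cannot be disabled) can be captured in a small model. The `PathSet` class below is a hypothetical Python illustration of those rules only; it is not an interface to SET DEVICE and does not check for invalid cluster configurations.

```python
from typing import Set


class PathSet:
    """Hypothetical model of a multipath set and its switch candidates."""

    def __init__(self, paths: Set[str], current: str) -> None:
        if current not in paths:
            raise ValueError("current path must belong to the set")
        self.paths = set(paths)
        self.current = current
        self.enabled = set(paths)  # by default every path is a candidate

    def disable(self, path: str) -> None:
        if path not in self.paths:
            raise KeyError(path)
        if path == self.current:
            raise ValueError("the current path cannot be disabled")
        self.enabled.discard(path)

    def enable(self, path: str) -> None:
        if path not in self.paths:
            raise KeyError(path)
        self.enabled.add(path)

    def switch_candidates(self) -> Set[str]:
        """Enabled paths other than the current one."""
        return self.enabled - {self.current}
```

For example, with paths PKB0.5, PKC0.5, and MSCP and PKB0.5 current, disabling PKC0.5 leaves MSCP as the only switch candidate, while an attempt to disable PKB0.5 is rejected.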

6.7.12 Performance Considerations

The presence of an MSCP served path in a multipath set has no measurable effect on steady state I/O performance when the MSCP path is not the current path.

Note that the presence of an MSCP served path in a multipath set may increase the time it takes to find a working path during mount verification under certain unusual failure cases. Because direct paths are tried first, the presence of an MSCP served path should not normally affect recovery time.

However, the ability to dynamically switch from a direct path to an MSCP served path might significantly increase the I/O serving load on a given MSCP server system with a direct path to the multipath disk storage. Because served I/O takes precedence over almost all other activity on the MSCP server, failover to an MSCP served path can affect the responsiveness of other applications on that MSCP server, depending on the capacity of the server and the increased rate of served I/O requests.

For example, a given OpenVMS Cluster configuration may have sufficient CPU and I/O bandwidth to handle an application workload when all the shared SCSI storage is accessed by direct SCSI paths. Such a configuration might be able to work acceptably as failures force a limited number of devices to switch over to MSCP served paths. However, as more failures occur, the load on the MSCP served paths may approach the capacity of the cluster and cause the performance of the application to degrade to an unacceptable level.

System parameters MSCP_BUFFER and MSCP_CREDITS allow the system manager to control the resources allocated to MSCP serving. If the MSCP server does not have enough resources to serve all incoming I/O requests, there will be a degradation of performance on systems that are accessing devices on the MSCP path on this MSCP server.

You can use the MONITOR MSCP command to determine if the MSCP server is short of resources. If the "Buffer Wait Rate" is non-zero, the MSCP server has had to stall some I/O while waiting for resources.
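
As a minimal illustration of that rule, the following Python helper flags a shortage when any sampled rate is nonzero. The samples are assumed to be values read from the "Buffer Wait Rate" line of the MONITOR display by some means outside this sketch; the function name is hypothetical.

```python
from typing import Iterable


def mscp_server_short_of_resources(buffer_wait_rates: Iterable[float]) -> bool:
    """True if any sampled Buffer Wait Rate is nonzero."""
    return any(rate > 0 for rate in buffer_wait_rates)
```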

It is not possible to recommend correct values for these parameters. Note, however, that the default value for MSCP_BUFFER was increased from 128 to 1024 in OpenVMS Alpha releases later than V7.2-1.

As noted in the online help for the SYSGEN utility, MSCP_BUFFER specifies the number of pagelets to be allocated to the MSCP server's local buffer area and MSCP_CREDITS specifies the number of outstanding I/O requests that can be active from one client system. For example, a system with many disks being served to several OpenVMS systems may have MSCP_BUFFER set to a value of 4000 or higher and MSCP_CREDITS set to 128 or higher.

Please see the "Managing System Parameters" chapter in the OpenVMS System Manager's Manual for information on making modifications to system parameters.

Compaq recommends that you test configurations that rely on failover to MSCP served paths at the worst-case MSCP served path load level. If you are configuring a multi-site disaster tolerant cluster that uses a multi-site SAN, consider the possible failures that can partition the SAN and force the use of MSCP served paths. In a symmetric dual-site configuration, Compaq recommends that you provide capacity for fifty percent of the SAN storage to be accessed by an MSCP served path.

You can test the capacity of your configuration by using manual path switching to force the use of MSCP served paths.

6.7.13 Console Considerations

This section describes how to use the console with parallel SCSI multipath devices. Refer to Section 7.6 for information on using the console with FC multipath devices.

The console uses traditional, path-dependent, SCSI device names. For example, the device name format for disks is DK, followed by a letter indicating the host adapter, followed by the SCSI target ID, and the LUN.

This means that a multipath device will have multiple names, one for each host adapter it is accessible through. In the following sample output of a console show device command, the console device name is in the left column. The middle column and the right column provide additional information, specific to the device type.

Notice, for example, that the devices dkb100 and dkc100 are really two paths to the same device. The name dkb100 is for the path through adapter PKB0, and the name dkc100 is for the path through adapter PKC0. This can be determined by referring to the middle column, where the informational name includes the HSZ allocation class. The HSZ allocation class allows you to determine which console "devices" are really paths to the same HSZ device.

Note

The console may not recognize a change in the HSZ allocation class value until after you issue a console INIT command.


>>>sho dev
dkb0.0.0.12.0              $55$DKB0                       HSZ70CCL  XB26
dkb100.1.0.12.0            $55$DKB100                        HSZ70  XB26
dkb104.1.0.12.0            $55$DKB104                        HSZ70  XB26
dkb1300.13.0.12.0          $55$DKB1300                       HSZ70  XB26
dkb1307.13.0.12.0          $55$DKB1307                       HSZ70  XB26
dkb1400.14.0.12.0          $55$DKB1400                       HSZ70  XB26
dkb1500.15.0.12.0          $55$DKB1500                       HSZ70  XB26
dkb200.2.0.12.0            $55$DKB200                        HSZ70  XB26
dkb205.2.0.12.0            $55$DKB205                        HSZ70  XB26
dkb300.3.0.12.0            $55$DKB300                        HSZ70  XB26
dkb400.4.0.12.0            $55$DKB400                        HSZ70  XB26
dkc0.0.0.13.0              $55$DKC0                       HSZ70CCL  XB26
dkc100.1.0.13.0            $55$DKC100                        HSZ70  XB26
dkc104.1.0.13.0            $55$DKC104                        HSZ70  XB26
dkc1300.13.0.13.0          $55$DKC1300                       HSZ70  XB26
dkc1307.13.0.13.0          $55$DKC1307                       HSZ70  XB26
dkc1400.14.0.13.0          $55$DKC1400                       HSZ70  XB26
dkc1500.15.0.13.0          $55$DKC1500                       HSZ70  XB26
dkc200.2.0.13.0            $55$DKC200                        HSZ70  XB26
dkc205.2.0.13.0            $55$DKC205                        HSZ70  XB26
dkc300.3.0.13.0            $55$DKC300                        HSZ70  XB26
dkc400.4.0.13.0            $55$DKC400                        HSZ70  XB26
dva0.0.0.1000.0            DVA0
ewa0.0.0.11.0              EWA0              08-00-2B-E4-CF-0B
pka0.7.0.6.0               PKA0                  SCSI Bus ID 7
pkb0.7.0.12.0              PKB0                  SCSI Bus ID 7  5.54
pkc0.7.0.13.0              PKC0                  SCSI Bus ID 7  5.54
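
The relationship between console names and paths in the output above can be checked mechanically. The Python sketch below is an illustration under two assumptions drawn from the sample output rather than from a console specification: the unit number in a console disk name equals the SCSI target ID times 100 plus the LUN, and two console names are paths to the same device when their informational names share the allocation class and unit number.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

# Console disk name: "dk", adapter letter, then the unit number.
_CONSOLE_DISK = re.compile(r"^dk([a-z])(\d+)(?:\.|$)", re.IGNORECASE)
# Informational name: $<allocation class>$DK<adapter letter><unit number>
_INFO_NAME = re.compile(r"^\$(\d+)\$DK[A-Z](\d+)$", re.IGNORECASE)


def parse_console_disk(name: str) -> Tuple[str, int, int]:
    """Split a console disk name into (adapter letter, target ID, LUN)."""
    match = _CONSOLE_DISK.match(name)
    if match is None:
        raise ValueError(f"not a console disk name: {name!r}")
    unit = int(match.group(2))
    return match.group(1).lower(), unit // 100, unit % 100


def group_paths(show_dev_output: str) -> Dict[Tuple[int, int], List[str]]:
    """Map (allocation class, unit number) to the console names that reach it."""
    groups: Dict[Tuple[int, int], List[str]] = defaultdict(list)
    for line in show_dev_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        info = _INFO_NAME.match(fields[1])
        if info and _CONSOLE_DISK.match(fields[0]):
            key = (int(info.group(1)), int(info.group(2)))
            groups[key].append(fields[0].split(".")[0])
    return dict(groups)
```

Applied to the sample output, `parse_console_disk("dkb1307.13.0.12.0")` yields adapter b, target 13, LUN 7, and `group_paths` places dkb100 and dkc100 under the same key, allocation class 55 and unit 100.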

The console does not automatically attempt to use an alternate path to a device if I/O fails on the current path. For many console commands, however, it is possible to specify a list of devices that the console will attempt to access in order. In a multipath configuration, you can specify a list of console device names that correspond to the multiple paths of a device. For example, a boot command, such as the following, will cause the console to attempt to boot the multipath device through the DKB100 path first, and if that fails, it will attempt to boot through the DKC100 path:


BOOT DKB100, DKC100


Chapter 7
Configuring Fibre Channel as an OpenVMS Cluster Storage Interconnect

A major benefit of OpenVMS is its support of a wide range of interconnects and protocols for network configurations and for OpenVMS Cluster System configurations. This chapter describes OpenVMS Alpha support for Fibre Channel as a storage interconnect for single systems and as a shared storage interconnect for multihost OpenVMS Cluster systems.

For information about multipath support for Fibre Channel configurations, see Chapter 6.

Note

The Fibre Channel interconnect is shown generically in the figures in this chapter. It is represented as a horizontal line to which the node and storage subsystems are connected. Physically, the Fibre Channel interconnect is always radially wired from a switch, as shown in Figure 7-1.

The representation of multiple SCSI disks and SCSI buses in a storage subsystem is also simplified. The multiple disks and SCSI buses, which one or more HSGx controllers serve as a logical unit to a host, are shown in the figures as a single logical unit.

For ease of reference, the term HSG is used throughout this chapter to represent both an HSG60 and an HSG80, except where it is important to note any difference, as in Table 7-2. In those instances, HSG60 or HSG80 is used.

7.1 Overview of Fibre Channel

Fibre Channel is an ANSI standard network and storage interconnect that offers many advantages over other interconnects. Its most important features are described in Table 7-1.

Table 7-1 Fibre Channel Features
Feature | Description
High-speed transmission | 1.06 gigabits per second, full duplex, serial interconnect (can simultaneously transmit and receive 100 megabytes of data per second)
Choice of media | OpenVMS support for fiber-optic media.
Long interconnect distances | OpenVMS support for multimode fiber at 500 meters per link and for single-mode fiber up to 100 kilometers per link.
Multiple protocols | OpenVMS support for SCSI-3. Possible future support for IP, 802.3, HIPPI, ATM, IPI, and others.
Numerous topologies | OpenVMS support for switched FC (highly scalable, multiple concurrent communications) and for multiple switches. Possible future support for mixed arbitrated loop and switches.
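
The two figures in the high-speed transmission row are related by the link encoding. As a rough worked check, the Python sketch below assumes the nominal 1.0625-gigabaud signaling rate and 8b/10b encoding of first-generation Fibre Channel; neither value is stated in this document, and frame overhead is ignored.

```python
LINE_RATE_BAUD = 1.0625e9     # assumed nominal signaling rate, bits per second
ENCODING_EFFICIENCY = 8 / 10  # assumed 8b/10b: 8 data bits per 10 line bits


def payload_megabytes_per_second(line_rate: float = LINE_RATE_BAUD) -> float:
    """Data rate in one direction, in units of 10**6 bytes per second."""
    data_bits_per_second = line_rate * ENCODING_EFFICIENCY
    return data_bits_per_second / 8 / 1e6
```

Under these assumptions the result is about 106 megabytes per second in each direction before frame overhead, which is consistent with the rounded figure of 100 megabytes per second in the table.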

Currently, the OpenVMS implementation supports:

  • Single FC switch topology
  • Multiple FC switches (fabric)
  • Multimode fiber-optic media, at distances of up to 500 meters per link
  • Single-mode fiber-optic media for interswitch links (ISLs) at distances up to 100 kilometers

Figure 7-1 shows a logical view of a switched topology. The FC nodes are either Alpha hosts or storage subsystems. Each link from a node to the switch is a dedicated FC connection. The switch provides store-and-forward packet delivery between pairs of nodes and supports concurrent communication between disjoint pairs of nodes.

Figure 7-1 Switched Topology (Logical View)


Figure 7-2 shows a physical view of a Fibre Channel switched topology. The configuration in Figure 7-2 is simplified for clarity. Typical configurations will have multiple Fibre Channel interconnects for high availability, as shown in Section 7.3.4.

Figure 7-2 Switched Topology (Physical View)


7.2 Fibre Channel Configuration Support

OpenVMS Alpha supports the Fibre Channel devices listed in Table 7-2. Note that Fibre Channel hardware names typically use the letter G to designate hardware that is specific to Fibre Channel. Fibre Channel configurations with other Fibre Channel equipment are not supported. To determine the required minimum versions of the operating system and firmware, see the release notes.

Compaq recommends that all OpenVMS Fibre Channel configurations use the latest update kit for the OpenVMS version they are running.

The root name of these kits is FIBRE_SCSI, a change from the earlier naming convention of FIBRECHAN. The kits are available from the following web site:


http://h18000.www1.hp.com/support/

Table 7-2 Fibre Channel Hardware Components
Component Name | Description
AlphaServer 800 [1], 1000A [2], 1200, 4000, 4100, 8200, 8400, DS10, DS20, DS20E, ES40, GS60, GS60E, GS80, GS140, GS160, and GS320 | Alpha host.
HSG80 | Fibre Channel controller module with two Fibre Channel host ports and support for six SCSI drive buses.
HSG60 | Fibre Channel controller module with two Fibre Channel host ports and support for two SCSI buses.
MDR | Fibre Channel Modular Data Router, a bridge to a SCSI tape or a SCSI tape library. The MDR must be connected to a Fibre Channel switch. It cannot be connected directly to an Alpha system.
KGPSA-BC, KGPSA-CA | OpenVMS Alpha PCI to multimode Fibre Channel host adapters.
DSGGA-AA or -AB and DSGGB-AA or -AB | 8-port or 16-port Fibre Channel switch.
VLGBICs | Very long Gigabit interface converters (GBICs), which are used in long-distance configurations with single-mode fiber-optic cables. The order number is 169887-B21 for a pair of VLGBICs.
Single-mode, fiber-optic cable | Single-mode fiber-optic cable up to 100 kilometers can be used.
BNGBX-nn | Multimode fiber-optic cable (nn denotes length in meters).

[1] On the AlphaServer 800, the integral S3 Trio must be disabled when the KGPSA is installed.
[2] Console support for FC disks is not available on this model.

OpenVMS supports the Fibre Channel SAN configurations described in the latest Compaq StorageWorks Heterogeneous Open SAN Design Reference Guide (order number: AA-RMPNA-TE) and the Data Replication Manager (DRM) user documentation. This includes support for:

  • Multiswitch FC fabrics.
  • Up to 500 meters of multimode fiber, and interswitch links (ISLs) of up to 100 kilometers using single-mode fiber. In addition, DRM configurations provide longer-distance ISLs through the use of the Open Systems Gateway and Wave Division Multiplexors.
  • Sharing of the fabric and the HSG storage with non-OpenVMS systems.

The StorageWorks documentation is available from the StorageWorks web site. First locate the product; then you can access the documentation. The web address is:


http://h18000.www1.hp.com/storage

Within the configurations described in the StorageWorks documentation, OpenVMS provides the following capabilities and restrictions:

  • All OpenVMS disk functions are supported: system disk, dump disks, shadow set member, quorum disk, and MSCP served disk. Each virtual disk must be assigned an identifier that is unique clusterwide.
  • OpenVMS provides support for the number of hosts, switches, and storage controllers specified in the StorageWorks documentation. In general, the number of hosts and storage controllers is limited only by the number of available fabric connections.
  • The number of Fibre Channel host bus adapters per platform depends on the platform type. Currently, the largest platforms support up to 26 adapters (independent of the number of OpenVMS instances running on the platform).
  • OpenVMS requires that the HSG operate in SCSI-3 mode, and if the HSG is in a dual redundant configuration, then the HSG must be in multibus failover mode. The HSG can only be shared with other systems that operate in these modes.
  • The OpenVMS Fibre Channel host bus adapter must be connected directly to the FC switch. The host bus adapter is not supported on a Fibre Channel loop, nor in a point-to-point connection to another Fibre Channel end node. Other ports on the Fibre Channel switch can be connected to Fibre Channel loops or to QuickLoops. (The Compaq StorageWorks SAN Switch QuickLoop is a Fibre Channel topology that combines the services of fabric and Fibre Channel Arbitrated Loop (FC-AL) management.)
  • Neither the KGPSA-BC nor the KGPSA-CA can be connected to the same PCI bus as the S3 Trio 64V+ Video Card (PB2GA-JC/JD). On the AlphaServer 800, the integral S3 Trio must be disabled when the KGPSA is installed.
  • Hosts on the fabric can be configured as a single cluster or as multiple clusters and/or nonclustered nodes. It is critical to ensure that each cluster and each non-clustered system has exclusive access to its storage devices. HSG access IDs and/or FC switch zoning can be used to ensure that each HSG storage device is accessible to only one cluster or one non-clustered system.
  • The HSG supports a limited number of connections. A connection is a non-volatile record of a particular host bus adapter communicating with a particular port on the HSG. (Refer to the HSG CLI command SHOW CONNECTIONS.) The HSG ACS V8.5 allows a maximum of 64 connections and ACS V8.4 allows a maximum of 32 connections. The connection limit is the same for both single and dual redundant controllers.
    If your FC fabric is large, and the number of active connections exceeds the HSG limit, then you must reconfigure the fabric, or use FC switch zoning to "hide" some of the adapters from some of the HSG ports, in order to reduce the number of connections.
    The HSG does not delete connection information from the connection table when a host bus adapter is disconnected. Instead, the user must prevent the table from becoming full by explicitly deleting the connection information using a CLI command.
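
Because a connection is a pairing of one host bus adapter with one HSG port, the effect of zoning on the connection count is simple arithmetic. The Python sketch below is a planning aid under stated assumptions: the adapter and port names are hypothetical, the limits are the ACS V8.5 and V8.4 values quoted above, and stale entries that the HSG has not deleted are not counted.

```python
from typing import Dict, Set

ACS_CONNECTION_LIMITS = {"V8.5": 64, "V8.4": 32}  # limits quoted in the text


def count_connections(visible_ports: Dict[str, Set[str]]) -> int:
    """Count (adapter, HSG port) pairs for one HSG subsystem.

    `visible_ports` maps each host bus adapter to the HSG ports it can
    reach after zoning is applied.
    """
    return sum(len(ports) for ports in visible_ports.values())


def exceeds_limit(visible_ports: Dict[str, Set[str]], acs_version: str) -> bool:
    """True if the pairs counted exceed the limit for the given ACS version."""
    return count_connections(visible_ports) > ACS_CONNECTION_LIMITS[acs_version]
```

For instance, ten adapters that each see four HSG ports produce 40 connections, which fits within the ACS V8.5 limit but exceeds the ACS V8.4 limit; zoning each adapter to two ports halves the count to 20.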

This configuration support is in effect as of the revision date of this document. OpenVMS plans to increase these limits in future releases.

In addition to the configurations already described, OpenVMS also supports the SANworks Data Replication Manager. This is a remote data vaulting solution that enables the use of Fibre Channel over longer distances. For more information, see the Compaq StorageWorks web site at:


http://h18000.www1.hp.com/products/storageworks

