10.5.1 Two-Node MEMORY CHANNEL Cluster
In Figure 10-11, two nodes are connected by a MEMORY CHANNEL
interconnect, a LAN (Ethernet, FDDI, or ATM) interconnect, and a Fibre
Channel interconnect.
Figure 10-11 Two-Node MEMORY CHANNEL OpenVMS Cluster
The advantages and disadvantages of the configuration shown in
Figure 10-11 include:
Advantages
- Both nodes have shared, direct access to all storage.
- The Ethernet/FDDI/ATM interconnect enables failover if the MEMORY
CHANNEL interconnect fails.
- The limit of two MEMORY CHANNEL nodes means that no hub is
required; one PCI adapter serves as a virtual hub.
Disadvantages
- The amount of storage that is directly accessible to all nodes is
limited.
- A single SCSI interconnect or HSZ controller can become a single
point of failure.
If the OpenVMS Cluster in Figure 10-11 required more processing power
and better redundancy, this could lead to a configuration like the one
shown in Figure 10-12.
10.5.2 Three-Node MEMORY CHANNEL Cluster
In Figure 10-12, three nodes are connected by a high-speed MEMORY
CHANNEL interconnect, as well as by a LAN (Ethernet, FDDI, or ATM)
interconnect. These nodes also have shared, direct access to storage
through the Fibre Channel interconnect.
Figure 10-12 Three-Node MEMORY CHANNEL OpenVMS Cluster
The advantages and disadvantages of the configuration shown in
Figure 10-12 include:
Advantages
- All nodes have shared, direct access to storage.
- The Ethernet/FDDI/ATM interconnect enables failover if the MEMORY
CHANNEL interconnect fails.
- The addition of a MEMORY CHANNEL hub increases the limit on the
number of nodes to a total of four.
Disadvantage
- The amount of storage that is directly accessible to all nodes is
limited.
If the configuration in Figure 10-12 required more storage, this could
lead to a configuration like the one shown in Figure 10-13.
10.5.3 Four-Node MEMORY CHANNEL OpenVMS Cluster
In Figure 10-13, each node is connected by a MEMORY CHANNEL interconnect
as well as by a CI interconnect.
Figure 10-13 MEMORY CHANNEL Cluster with a CI Cluster
The advantages and disadvantages of the configuration shown in
Figure 10-13 include:
Advantages
- All nodes have shared, direct access to all of the storage.
- This configuration has more than double the storage and processing
capacity of the configuration shown in Figure 10-12.
- If the MEMORY CHANNEL interconnect fails, the CI can take over
internode communication.
- The CIPCA adapters on two of the nodes enable the addition of Alpha
systems to a CI cluster that formerly comprised VAX (CIXCD-based)
systems.
- Multiple CIs between processors and storage provide twice the
performance of one path. Bandwidth further increases because MEMORY
CHANNEL offloads internode traffic from the CI, enabling the CI to be
devoted only to storage traffic. This improves the performance of the
entire cluster.
- Volume shadowed, dual-ported disks increase data availability.
Disadvantage
- This configuration is complex and requires the care of an
experienced system manager.
10.6 Scalability in SCSI OpenVMS Clusters
SCSI-based OpenVMS Clusters allow commodity-priced storage devices to
be used directly in OpenVMS Clusters. Using a SCSI interconnect in an
OpenVMS Cluster offers you variations in distance, price, and
performance capacity. This SCSI clustering capability is an ideal
starting point when configuring a low-end, affordable cluster solution.
SCSI clusters can range from desktop to deskside to departmental and
larger configurations.
Note the following general limitations when using the SCSI interconnect:
- Because the SCSI interconnect handles only storage traffic, it
must always be paired with another interconnect for node-to-node
traffic. In the figures shown in this section, MEMORY CHANNEL is the
alternate interconnect; but CI, DSSI, Ethernet, and FDDI could also be
used.
- Total SCSI cable lengths must take into account each system's
internal cable length. For example, an AlphaServer 1000 rackmount uses
1.6 m of internal cable to connect the internal adapter to the external
connector. Two AlphaServer 1000s joined by a 2 m SCSI cable would
therefore use 1.6 m within each system, for a total SCSI bus length of
5.2 m (see the sketch following this list).
Reference: For more information about internal
SCSI cable lengths as well as highly detailed information about
clustering SCSI devices, see Appendix A.
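The cable-length arithmetic above is easy to automate when several
systems and external cables share one bus. The following Python sketch
is illustrative only (the function name and the example lists are not
part of OpenVMS); it totals the internal and external cable lengths and
checks the result against the 25-m fast-wide differential limit cited
later in this section.

    # Illustrative sketch: total SCSI bus length is the sum of the external
    # cable segments plus each system's internal cabling (for example,
    # 1.6 m for an AlphaServer 1000 rackmount, as described above).

    def total_scsi_bus_length(internal_lengths_m, external_cables_m):
        """Return the total SCSI bus length in meters."""
        return sum(internal_lengths_m) + sum(external_cables_m)

    # Example from the text: two AlphaServer 1000s (1.6 m internal each)
    # joined by a 2 m external SCSI cable.
    total = total_scsi_bus_length([1.6, 1.6], [2.0])
    print(f"Total SCSI bus length: {total:.1f} m")   # 5.2 m

    # Check against the 25-m fast-wide differential (FWD) limit used in
    # the configurations of Section 10.6; other SCSI variants differ.
    assert total <= 25.0, "Bus exceeds the 25-m FWD SCSI limit"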
The figures in this section show a progression from a two-node SCSI
configuration with modest storage to a four-node SCSI hub configuration
with maximum storage and further expansion capability.
10.6.1 Two-Node Fast-Wide SCSI Cluster
In Figure 10-14, two nodes are connected by a 25-m, fast-wide
differential (FWD) SCSI bus, with a MEMORY CHANNEL interconnect (or any
other interconnect) for internode traffic. The BA356 storage cabinet
contains a power
supply, a DWZZB single-ended to differential converter, and six disk
drives. This configuration can have either narrow or wide disks.
Figure 10-14 Two-Node Fast-Wide SCSI Cluster
The advantages and disadvantages of the configuration shown in
Figure 10-14 include:
Advantages
- Low-cost SCSI storage is shared by two nodes.
- With the BA356 cabinet, you can use a narrow (8 bit) or wide (16
bit) SCSI bus.
- The DWZZB converts single-ended signals to differential.
- The fast-wide SCSI interconnect provides 20 MB/s performance.
- MEMORY CHANNEL handles internode traffic.
- The differential SCSI bus can be 25 m.
Disadvantage
- Somewhat limited storage capability.
If the configuration in Figure 10-14 required even more storage, this
could lead to a configuration like the one shown in Figure 10-15.
10.6.2 Two-Node Fast-Wide SCSI Cluster with HSZ Storage
In Figure 10-15, two nodes are connected by a 25-m, fast-wide
differential (FWD) SCSI bus, with a MEMORY CHANNEL interconnect (or any
other interconnect) for internode traffic. Multiple storage shelves are
within the HSZ controller.
Figure 10-15 Two-Node Fast-Wide SCSI Cluster with HSZ
Storage
The advantages and disadvantages of the configuration shown in
Figure 10-15 include:
Advantages
- Costs slightly more than the configuration shown in Figure 10-14,
but offers significantly more storage. (The HSZ controller enables you
to add more storage.)
- Cache in the HSZ, which also provides RAID 0, 1, and 5
technologies. The HSZ is a differential device; no converter is needed.
- MEMORY CHANNEL handles internode traffic.
- The FWD bus provides 20 MB/s throughput.
- Includes a 25 m differential SCSI bus.
Disadvantage
- This configuration is more expensive than the one shown in
Figure 10-14.
10.6.3 Three-Node Fast-Wide SCSI Cluster
In Figure 10-16, three nodes are connected by two 25-m, fast-wide
differential (FWD) SCSI interconnects. Multiple storage shelves are
contained in
each HSZ controller, and more storage is contained in the BA356 at the
top of the figure.
Figure 10-16 Three-Node Fast-Wide SCSI Cluster
The advantages and disadvantages of the configuration shown in
Figure 10-16 include:
Advantages
- Combines the advantages of the configurations shown in
Figure 10-14 and Figure 10-15:
- Significant (25 m) bus distance and scalability.
- Includes cache in the HSZ, which also provides RAID 0, 1, and 5
technologies. The HSZ contains multiple storage shelves.
- FWD bus provides 20 MB/s throughput.
- With the BA356 cabinet, you can use narrow (8 bit) or wide (16 bit)
SCSI bus.
Disadvantage
- This configuration is more expensive than those shown in previous
figures.
10.6.4 Four-Node Ultra SCSI Hub Configuration
Figure 10-17 shows four nodes connected by a SCSI hub. The SCSI hub
obtains power and cooling from the storage cabinet, such as the BA356.
The SCSI hub does not connect to the SCSI bus of the storage cabinet.
Figure 10-17 Four-Node Ultra SCSI Hub Configuration
The advantages and disadvantages of the configuration shown in
Figure 10-17 include:
Advantages
- Provides significantly more bus distance and scalability than the
configuration shown in Figure 10-15.
- The SCSI hub provides fair arbitration on the SCSI bus. This
provides more uniform, predictable system behavior. Four CPUs are
allowed only when fair arbitration is enabled.
- Up to two dual HSZ controllers can be daisy-chained to the storage
port of the hub.
- Two power supplies in the BA356 (one for backup).
- Cache in the HSZs, which also provides RAID 0, 1, and 5
technologies.
- Ultra SCSI bus provides 40 MB/s throughput.
Disadvantages
- You cannot add CPUs to this configuration by daisy-chaining a SCSI
interconnect from a CPU or HSZ to another CPU.
- This configuration is more expensive than those shown in
Figure 10-14 and Figure 10-15.
- Only HSZ storage can be connected. You cannot attach a storage
shelf with disk drives directly to the SCSI hub.
10.7 Scalability in OpenVMS Clusters with Satellites
The number of satellites in an OpenVMS Cluster and the amount of
storage that is MSCP served determine the quantity and capacity of the
servers required.
Satellites are systems that do not have direct access to a system disk
and other OpenVMS Cluster storage. Satellites are usually workstations,
but they can be any OpenVMS Cluster node that is served storage by
other nodes in the OpenVMS Cluster.
Each Ethernet LAN segment should have only 10 to 20 satellite nodes
attached. Figure 10-18, Figure 10-19, Figure 10-20, and
Figure 10-21 show a progression from a 6-satellite LAN to a
45-satellite LAN.
10.7.1 Six-Satellite OpenVMS Cluster
In Figure 10-18, six satellites and a boot server are connected by
Ethernet.
Figure 10-18 Six-Satellite LAN OpenVMS Cluster
The advantages and disadvantages of the configuration shown in
Figure 10-18 include:
Advantages
- The MSCP server is enabled, which allows satellites to be added and
to access more storage.
- With one system disk, system management is relatively simple.
Reference: For information about managing system
disks, see Section 11.2.
Disadvantage
- The Ethernet is a potential bottleneck and a single point of
failure.
If the boot server in Figure 10-18 became a bottleneck, a
configuration like the one shown in Figure 10-19 would be required.
10.7.2 Six-Satellite OpenVMS Cluster with Two Boot Nodes
Figure 10-19 shows six satellites and two boot servers connected by
Ethernet. Boot server 1 and boot server 2 perform MSCP server dynamic
load balancing: they arbitrate and share the workload between them,
and if one node stops functioning, the other takes over. MSCP dynamic
load balancing requires shared access to storage.
Figure 10-19 Six-Satellite LAN OpenVMS Cluster with Two Boot
Nodes
The advantages and disadvantages of the configuration shown in
Figure 10-19 include:
Advantages
- The MSCP server is enabled, which allows satellites to be added and
to access more storage.
- Two boot servers perform MSCP dynamic load balancing.
Disadvantage
- The Ethernet is a potential bottleneck and a single point of
failure.
If the LAN in Figure 10-19 became an OpenVMS Cluster bottleneck, this
could lead to a configuration like the one shown in Figure 10-20.
10.7.3 Twelve-Satellite LAN OpenVMS Cluster with Two LAN Segments
Figure 10-20 shows 12 satellites and 2 boot servers connected by two
Ethernet segments. These two Ethernet segments are also joined by a LAN
bridge. Because each satellite has dual paths to storage, this
configuration also features MSCP dynamic load balancing.
Figure 10-20 Twelve-Satellite OpenVMS Cluster with Two LAN
Segments
The advantages and disadvantages of the configuration shown in
Figure 10-20 include:
Advantages
- The MSCP server is enabled, which allows satellites to be added and
to access more storage.
- Two boot servers perform MSCP dynamic load balancing. From the
perspective of a satellite on the Ethernet LAN, the dual paths to the
Alpha and VAX nodes provide the benefit of MSCP load balancing.
- Two LAN segments provide twice the amount of LAN capacity.
Disadvantages
- This OpenVMS Cluster configuration is limited by the number of
satellites that it can support.
- The single HSJ controller is a potential bottleneck and a single
point of failure.
If the OpenVMS Cluster in Figure 10-20 needed to grow beyond its
current limits, this could lead to a configuration like the one shown
in Figure 10-21.
10.7.4 Forty-Five Satellite OpenVMS Cluster with FDDI Ring
Figure 10-21 shows a large, 51-node OpenVMS Cluster that includes 45
satellite nodes. The three boot servers, Alpha 1, Alpha 2, and Alpha 3,
share three disks: a common disk, a page and swap disk, and a system
disk. The FDDI ring has three LAN segments attached. Each segment has
15 workstation satellites as well as its own boot node.
Figure 10-21 Forty-Five Satellite OpenVMS Cluster with FDDI
Ring
The advantages and disadvantages of the configuration shown in
Figure 10-21 include:
Advantages
- Decreased boot time, especially for an OpenVMS Cluster with such a
high node count.
Reference: For information about booting an OpenVMS Cluster like the
one in Figure 10-21, see Section 11.2.4.
- The MSCP server is enabled so that satellites can access more storage.
- Each boot server has its own page and swap disk, which reduces I/O
activity on the system disks.
- All of the environment files for the entire OpenVMS Cluster are on
the common disk. This frees the satellite boot servers to serve only
root information to the satellites.
Reference: For
more information about common disks and page and swap disks, see
Section 11.2.
- The FDDI ring provides 10 times the capacity of one Ethernet
interconnect.
Disadvantage
- The satellite boot servers on the Ethernet LAN segments can boot
satellites only on their own segments.
10.7.5 High-Powered Workstation OpenVMS Cluster
Figure 10-22 shows an OpenVMS Cluster configuration that provides high
performance and high availability on the FDDI ring.
Figure 10-22 High-Powered Workstation Server
Configuration
In Figure 10-22, several Alpha workstations, each with its own system
disk, are connected to the FDDI ring. Putting Alpha workstations on the
FDDI provides high performance because each workstation has direct
access to its system disk. In addition, the FDDI bandwidth is higher
than that of the Ethernet. Because Alpha workstations have FDDI
adapters, putting these workstations on an FDDI is a useful alternative
for critical workstation requirements. FDDI is 10 times faster than
Ethernet, and Alpha workstations have processing capacity that can take
advantage of FDDI's speed.
10.7.6 Guidelines for OpenVMS Clusters with Satellites
The following are guidelines for setting up an OpenVMS Cluster with
satellites:
- Extra memory is required for satellites in large LAN configurations
because each node must maintain a connection to every other node (see
the sketch following this list).
- Place only 10 to 20 satellites on each LAN segment.
- Maximize resources with MSCP dynamic load balancing, as shown in
Figure 10-19 and Figure 10-20.
- Keep the number of nodes that require MSCP serving minimal for good
performance.
Reference: See Section 10.8.1 for more
information about MSCP overhead.
- To save time, ensure that the booting sequence is efficient,
particularly when the OpenVMS Cluster is large or has multiple
segments. See Section 11.2.4 for more information about how to reduce
LAN and system disk activity and how to boot separate groups of nodes
in sequence.
- Use two or more LAN adapters per host (up to four adapters are
supported for OpenVMS Cluster communications), and connect to
independent LAN paths. This enables simultaneous two-way communication
between nodes and allows traffic to multiple nodes to be spread over
the available LANs.
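To make the first guideline concrete, the following Python sketch
(illustrative only; it is not an OpenVMS utility and uses no
OpenVMS-specific values) shows how the number of connections each node
maintains, and the cluster-wide total, grow with node count, which is
why satellite memory requirements rise in large LAN configurations.

    # Illustrative sketch: every cluster member maintains a connection to
    # every other member, so per-node connections grow linearly with node
    # count and the cluster-wide total grows roughly quadratically.

    def connection_counts(node_count):
        per_node = node_count - 1                   # one node's connections
        cluster_total = node_count * (node_count - 1) // 2
        return per_node, cluster_total

    for nodes in (8, 20, 51):                       # 51 matches Figure 10-21
        per_node, total = connection_counts(nodes)
        print(f"{nodes:3d} nodes: {per_node:3d} connections per node, "
              f"{total:5d} cluster-wide")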
10.7.7 Extended LAN Configuration Guidelines
You can use bridges between LAN segments to form an extended LAN
(ELAN). This can increase availability, distance, and aggregate
bandwidth as compared with a single LAN. However, an ELAN can increase
delay and can reduce bandwidth on some paths. Factors such as packet
loss, queuing delays, and packet size can also affect ELAN performance.
Table 10-3 provides guidelines for ensuring adequate LAN performance
when dealing with such factors.
Table 10-3 ELAN Configuration Guidelines
Factor: Propagation delay
Guidelines: The amount of time it takes a packet to traverse the ELAN
depends on the distance it travels and the number of times it is
relayed from one link to another by a bridge or a station on the FDDI
ring. If responsiveness is critical, you must control these factors.
When FDDI is used for OpenVMS Cluster communications, the ring latency
when the FDDI ring is idle should not exceed 400 microseconds. FDDI
packets travel at 5.085 microseconds/km, and each station causes an
approximate 1-microsecond delay between receiving and transmitting. You
can calculate FDDI latency by using the following formula:
Latency = (distance in km) * (5.085 microseconds/km) + (number of
stations) * (1 microsecond/station)
For high-performance applications, limit the number of bridges between
nodes to two. For situations in which high performance is not required,
you can use up to seven bridges between nodes.
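The latency formula above is straightforward to evaluate. The following
Python sketch is illustrative only (the ring distance and station count
in the example are hypothetical); it applies the formula and flags a
ring that exceeds the 400-microsecond guideline.

    # Illustrative sketch of the FDDI latency formula given above:
    # latency = (distance in km) * 5.085 microseconds/km
    #           + (number of stations) * 1 microsecond/station

    def fddi_ring_latency_us(distance_km, station_count):
        """Estimated idle-ring latency in microseconds."""
        return distance_km * 5.085 + station_count * 1.0

    # Hypothetical ring: 40 km of fiber and 25 stations.
    latency = fddi_ring_latency_us(40, 25)
    print(f"Estimated idle-ring latency: {latency:.1f} microseconds")

    if latency > 400:
        print("Ring latency exceeds the 400-microsecond guideline")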
Factor: Queuing delay
Guidelines: Queuing occurs when the instantaneous arrival rate at
bridges and host adapters exceeds the service rate. You can control
queuing by:
- Reducing the number of bridges between nodes that communicate
frequently.
- Using only high-performance bridges and adapters.
- Reducing traffic bursts in the LAN. In some cases, for example, you
can tune applications by combining small I/Os so that a single packet
is produced rather than a burst of small ones.
- Reducing LAN segment and host processor utilization levels by using
faster processors and faster LANs, and by using bridges for traffic
isolation.
Factor: Packet loss
Guidelines: Packets that are not delivered by the ELAN require
retransmission, which wastes network resources, increases delay, and
reduces bandwidth. Bridges and adapters discard packets when they
become congested. You can reduce packet loss by controlling queuing, as
previously described.
Packets are also discarded when they are damaged in transit. You can
control this problem by observing LAN hardware configuration rules,
removing sources of electrical interference, and ensuring that all
hardware is operating correctly.
Packet loss can also be reduced by using VMS Version 5.5-2 or later,
which has PEDRIVER congestion control.
The retransmission timeout rate, which is a symptom of packet loss,
must be less than 1 timeout in 1000 transmissions for OpenVMS Cluster
traffic from one node to another. ELAN paths that are used for
high-performance applications should have a significantly lower rate.
Monitor the occurrence of retransmission timeouts in the OpenVMS
Cluster.
Reference: For information about monitoring the occurrence of
retransmission timeouts, see OpenVMS Cluster Systems.
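One way to apply the 1-in-1000 threshold to monitored counters is
sketched below in Python (illustrative only; the counter values are
hypothetical and would come from whatever monitoring procedure you use,
as described in OpenVMS Cluster Systems).

    # Illustrative sketch: compare an observed retransmission timeout rate
    # with the ceiling of 1 timeout per 1000 transmissions for OpenVMS
    # Cluster traffic from one node to another.

    def timeout_rate_acceptable(timeouts, transmissions, ceiling=1 / 1000):
        """Return True if the observed timeout rate is below the ceiling."""
        if transmissions == 0:
            return True              # nothing transmitted, nothing to flag
        return (timeouts / transmissions) < ceiling

    # Hypothetical counters gathered from cluster monitoring.
    print(timeout_rate_acceptable(timeouts=1, transmissions=5000))   # True
    print(timeout_rate_acceptable(timeouts=7, transmissions=5000))   # False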
Factor: Bridge recovery delay
Guidelines: Choose bridges with a fast self-test time, and adjust
bridges for fast automatic reconfiguration.
Reference: Refer to OpenVMS Cluster Systems for more information about
LAN bridge failover.
Factor: Bandwidth
Guidelines: All LAN paths used for OpenVMS Cluster communication must
operate with a nominal bandwidth of at least 10 Mb/s. The average LAN
segment utilization should not exceed 60% for any 10-second interval.
Use FDDI exclusively on the communication paths that have the highest
performance requirements. Do not put an Ethernet LAN segment between
two FDDI segments; FDDI bandwidth is significantly greater, and the
Ethernet LAN will become a bottleneck. This strategy is especially
ineffective if a server on one FDDI must serve clients on another FDDI
with an Ethernet LAN between them. A more appropriate strategy is to
put the server on an FDDI and the clients on an Ethernet LAN, as
Figure 10-21 shows.
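As a rough way to apply the 60% utilization guideline, the Python
sketch below (illustrative only; it assumes per-second utilization
samples are available from your LAN monitoring tool) checks every
10-second window of samples against the limit.

    # Illustrative sketch: verify that average LAN segment utilization
    # stays at or below 60% over every 10-second interval, given one
    # utilization percentage sample per second.

    def utilization_within_limit(samples_pct, window_s=10, limit_pct=60.0):
        for start in range(0, max(len(samples_pct) - window_s + 1, 1)):
            window = samples_pct[start:start + window_s]
            if window and sum(window) / len(window) > limit_pct:
                return False
        return True

    # Hypothetical per-second utilization readings for one LAN segment.
    readings = [35, 40, 42, 55, 61, 58, 44, 39, 37, 41, 36, 38]
    print(utilization_within_limit(readings))   # True for these samples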
Factor: Traffic isolation
Guidelines: Use bridges to isolate and localize the traffic between
nodes that communicate with each other frequently. For example, use
bridges to separate the OpenVMS Cluster from the rest of the ELAN, and
to separate nodes within an OpenVMS Cluster that communicate frequently
from the rest of the OpenVMS Cluster.
Provide independent paths through the ELAN between critical systems
that have multiple adapters.
Factor: Packet size
Guidelines: You can adjust the NISCS_MAX_PKTSZ system parameter to use
the full FDDI packet size. Ensure that the ELAN path supports a data
field of at least 4474 bytes end to end.
Some failures cause traffic to switch from an ELAN path that supports
4474-byte packets to a path that supports only smaller packets. It is
possible to implement automatic detection and recovery from these kinds
of failures. This capability requires that the ELAN set the value of
the priority field in the FDDI frame-control byte to zero when the
packet is delivered on the destination FDDI link. Ethernet-to-FDDI
bridges that conform to the IEEE 802.1 bridge specification provide
this capability.