Guidelines for OpenVMS Cluster Configurations
7.7.6 Configuring the HSG Host Connections
You can allow the HSG to create default host-connection information
automatically, or you can set host connection information manually, as
described in the HSG80 Array Controller ACS Version 8.4
Configuration and CLI Reference Guide. If you use the automatic
method, you will need to make some changes to the default settings.
Initially, the HSG has no host connections. Then, when hosts are
connected to the FC interconnect and powered up, the HSG detects them
and creates default connection information for them, as shown in
Example 7-9. After this default information is created, you may want
to rename the connections to values that have meaning in your
environment, as shown in Example 7-9.
Then, you must:
- Set the operating system name to VMS for each connection, as shown
in Example 7-9.
- Check that the unit offset value is set to zero (0) for each
connection. If it is not, set it to zero, as shown in the sketch that
follows this list.
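The following sketch shows both settings applied to one connection from
the HSG console. The connection name FCNOD1A_T is taken from Example
7-9; the UNIT_OFFSET keyword is described in the HSG80 CLI reference
guide, so verify the exact syntax for your ACS version.

HSG> SHOW CONNECTIONS
HSG> SET FCNOD1A_T OPERATING_SYSTEM = VMS
HSG> SET FCNOD1A_T UNIT_OFFSET = 0

A subsequent SHOW CONNECTIONS display confirms the result; the Unit
Offset column for each connection should read 0.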
Example 7-9 Configuring the HSG Host Connections
HSG> SHOW CONNECTIONS
No Connections
HSG> SHOW CONNECTIONS
Connection Unit
Name Operating system Controller Port Address Status Offset
!NEWCON44 WINNT THIS 2 210713 OL this 0
HOST_ID=1000-0000-C920-A7DB ADAPTER_ID=1000-0000-C920-A7DB
!NEWCON45 WINNT OTHER 2 210713 OL other 0
HOST_ID=1000-0000-C920-A7DB ADAPTER_ID=1000-0000-C920-A7DB
!NEWCON46 WINNT THIS 1 210713 OL this 0
HOST_ID=1000-0000-C920-A694 ADAPTER_ID=1000-0000-C920-A694
!NEWCON47 WINNT OTHER 1 210713 OL other 0
HOST_ID=1000-0000-C920-A694 ADAPTER_ID=1000-0000-C920-A694
HSG> RENAME !NEWCON44 FCNOD1A_T
HSG> RENAME !NEWCON45 FCNOD1A_O
HSG> RENAME !NEWCON46 FCNOD1B_T
HSG> RENAME !NEWCON47 FCNOD1B_O
HSG> SHOW CONNECTIONS
Connection Unit
Name Operating system Controller Port Address Status Offset
FCNOD1A_O WINNT OTHER 2 210713 OL other 0
HOST_ID=1000-0000-C920-A7DB ADAPTER_ID=1000-0000-C920-A7DB
FCNOD1A_T WINNT THIS 2 210713 OL this 0
HOST_ID=1000-0000-C920-A7DB ADAPTER_ID=1000-0000-C920-A7DB
FCNOD1B_O WINNT OTHER 1 210713 OL other 0
HOST_ID=1000-0000-C920-A694 ADAPTER_ID=1000-0000-C920-A694
FCNOD1B_T WINNT THIS 1 210713 OL this 0
HOST_ID=1000-0000-C920-A694 ADAPTER_ID=1000-0000-C920-A694
HSG> SET FCNOD1A_O OPERATING_SYSTEM = VMS
HSG> SET FCNOD1A_T OPERATING_SYSTEM = VMS
HSG> SET FCNOD1B_O OPERATING_SYSTEM = VMS
HSG> SET FCNOD1B_T OPERATING_SYSTEM = VMS
HSG> SHOW CONNECTIONS
Connection Unit
Name Operating system Controller Port Address Status Offset
FCNOD1A_O VMS OTHER 2 210713 OL other 0
HOST_ID=1000-0000-C920-A7DB ADAPTER_ID=1000-0000-C920-A7DB
FCNOD1A_T VMS THIS 2 210713 OL this 0
HOST_ID=1000-0000-C920-A7DB ADAPTER_ID=1000-0000-C920-A7DB
FCNOD1B_O VMS OTHER 1 210713 OL other 0
HOST_ID=1000-0000-C920-A694 ADAPTER_ID=1000-0000-C920-A694
FCNOD1B_T VMS THIS 1 210713 OL this 0
HOST_ID=1000-0000-C920-A694 ADAPTER_ID=1000-0000-C920-A694
7.8 Creating a Cluster with a Shared FC System Disk
To configure nodes in an OpenVMS Cluster system, you must execute the
CLUSTER_CONFIG.COM (or CLUSTER_CONFIG_LAN.COM) command procedure. (You
can run either the full version, which provides more information about
most prompts, or the brief version.)
For the purposes of CLUSTER_CONFIG, a shared Fibre Channel (FC) bus is
treated like a shared SCSI bus, except that the allocation class
parameters do not apply to FC. The rules for setting node allocation
class and port allocation class values remain in effect when parallel
SCSI storage devices are present in a configuration that includes FC
storage devices.
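Before running the procedure, you may find it helpful to check the
allocation class settings currently in effect on a node. The following
DCL sketch assumes the standard file locations; SYS$DEVICES.DAT exists
only if port allocation classes have already been assigned.

$ ! Display the node allocation class (system parameter ALLOCLASS)
$ MCR SYSGEN SHOW ALLOCLASS
$ ! List any port allocation class entries recorded for this node
$ TYPE SYS$SYSTEM:SYS$DEVICES.DAT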
To configure a new OpenVMS Cluster system, you must first enable
clustering on a single, or standalone, system. Then you can add
additional nodes to the cluster.
Example 7-10 shows how to enable clustering using the brief version of
CLUSTER_CONFIG_LAN.COM on a standalone node called FCNOD1. At the end
of the procedure, FCNOD1 reboots and forms a one-node cluster.
Example 7-11 shows how to run the brief version of
CLUSTER_CONFIG_LAN.COM on FCNOD1 to add a second node, called FCNOD2,
to form a two-node cluster. At the end of the procedure, the cluster is
configured to allow FCNOD2 to boot off the same FC system disk as
FCNOD1.
The following steps are common to both examples:
- Select the default option [1] for ADD.
- Answer Yes when CLUSTER_CONFIG_LAN.COM asks whether there will be a
shared SCSI bus. SCSI in this context refers to FC as well as to
parallel SCSI.
The allocation class parameters are not affected by
the presence of FC.
- Answer No when the procedure asks whether the node will be a
satellite.
Example 7-10 Enabling Clustering on a Standalone FC Node
$ @CLUSTER_CONFIG_LAN BRIEF
Cluster Configuration Procedure
Executing on an Alpha System
DECnet Phase IV is installed on this node.
The LAN, not DECnet, will be used for MOP downline loading.
This Alpha node is not currently a cluster member
MAIN MENU
1. ADD FCNOD1 to existing cluster, or form a new cluster.
2. MAKE a directory structure for a new root on a system disk.
3. DELETE a root from a system disk.
4. EXIT from this procedure.
Enter choice [1]: 1
Is the node to be a clustered node with a shared SCSI or Fibre Channel bus (Y/N)? Y
Note:
Every cluster node must have a direct connection to every other
node in the cluster. Since FCNOD1 will be a clustered node with
a shared SCSI or FC bus, and Memory Channel, CI, and DSSI are not present,
the LAN will be used for cluster communication.
Enter this cluster's group number: 511
Enter this cluster's password:
Re-enter this cluster's password for verification:
Will FCNOD1 be a boot server [Y]? Y
Verifying LAN adapters in LANACP database...
Updating LANACP LAN server process volatile and permanent databases...
Note: The LANACP LAN server process will be used by FCNOD1 for boot
serving satellites. The following LAN devices have been found:
Verifying LAN adapters in LANACP database...
LAN TYPE ADAPTER NAME SERVICE STATUS
======== ============ ==============
Ethernet EWA0 ENABLED
CAUTION: If you do not define port allocation classes later in this
procedure for shared SCSI buses, all nodes sharing a SCSI bus
must have the same non-zero ALLOCLASS value. If multiple
nodes connect to a shared SCSI bus without the same allocation
class for the bus, system booting will halt due to the error or
IO AUTOCONFIGURE after boot will keep the bus offline.
Enter a value for FCNOD1's ALLOCLASS parameter [0]: 5
Does this cluster contain a quorum disk [N]? N
Each shared SCSI bus must have a positive allocation class value. A shared
bus uses a PK adapter. A private bus may use: PK, DR, DV.
When adding a node with SCSI-based cluster communications, the shared
SCSI port allocation classes may be established in SYS$DEVICES.DAT.
Otherwise, the system's disk allocation class will apply.
A private SCSI bus need not have an entry in SYS$DEVICES.DAT. If it has an
entry, its entry may assign any legitimate port allocation class value:
n where n = a positive integer, 1 to 32767 inclusive
0 no port allocation class and disk allocation class does not apply
-1 system's disk allocation class applies (system parameter ALLOCLASS)
When modifying port allocation classes, SYS$DEVICES.DAT must be updated
for all affected nodes, and then all affected nodes must be rebooted.
The following dialog will update SYS$DEVICES.DAT on FCNOD1.
There are currently no entries in SYS$DEVICES.DAT for FCNOD1.
After the next boot, any SCSI controller on FCNOD1 will use
FCNOD1's disk allocation class.
Assign port allocation class to which adapter [RETURN for none]: PKA
Port allocation class for PKA0: 10
Port Alloclass 10 Adapter FCNOD1$PKA
Assign port allocation class to which adapter [RETURN for none]: PKB
Port allocation class for PKB0: 20
Port Alloclass 10 Adapter FCNOD1$PKA
Port Alloclass 20 Adapter FCNOD1$PKB
WARNING: FCNOD1 will be a voting cluster member. EXPECTED_VOTES for
this and every other cluster member should be adjusted at
a convenient time before a reboot. For complete instructions,
check the section on configuring a cluster in the "OpenVMS
Cluster Systems" manual.
Execute AUTOGEN to compute the SYSGEN parameters for your configuration
and reboot FCNOD1 with the new parameters. This is necessary before
FCNOD1 can become a cluster member.
Do you want to run AUTOGEN now [Y]? Y
Running AUTOGEN -- Please wait.
The system is shutting down to allow the system to boot with the
generated site-specific parameters and installed images.
The system will automatically reboot after the shutdown and the
upgrade will be complete.
Example 7-11 Adding a Node to a Cluster with a Shared FC System Disk
$ @CLUSTER_CONFIG_LAN BRIEF
Cluster Configuration Procedure
Executing on an Alpha System
DECnet Phase IV is installed on this node.
The LAN, not DECnet, will be used for MOP downline loading.
FCNOD1 is an Alpha system and currently a member of a cluster
so the following functions can be performed:
MAIN MENU
1. ADD an Alpha node to the cluster.
2. REMOVE a node from the cluster.
3. CHANGE a cluster member's characteristics.
4. CREATE a duplicate system disk for FCNOD1.
5. MAKE a directory structure for a new root on a system disk.
6. DELETE a root from a system disk.
7. EXIT from this procedure.
Enter choice [1]: 1
This ADD function will add a new Alpha node to the cluster.
WARNING: If the node being added is a voting member, EXPECTED_VOTES for
every cluster member must be adjusted. For complete instructions
check the section on configuring a cluster in the "OpenVMS Cluster
Systems" manual.
CAUTION: If this cluster is running with multiple system disks and
common system files will be used, please, do not proceed
unless appropriate logical names are defined for cluster
common files in SYLOGICALS.COM. For instructions, refer to
the "OpenVMS Cluster Systems" manual.
Is the node to be a clustered node with a shared SCSI or Fibre Channel bus (Y/N)? Y
Will the node be a satellite [Y]? N
What is the node's SCS node name? FCNOD2
What is the node's SCSSYSTEMID number? 19.111
NOTE: 19.111 equates to an SCSSYSTEMID of 19567
Will FCNOD2 be a boot server [Y]? Y
What is the device name for FCNOD2's system root
[default DISK$V72_SSB:]?
What is the name of FCNOD2's system root [SYS10]?
Creating directory tree SYS10 ...
System root SYS10 created
CAUTION: If you do not define port allocation classes later in this
procedure for shared SCSI buses, all nodes sharing a SCSI bus
must have the same non-zero ALLOCLASS value. If multiple
nodes connect to a shared SCSI bus without the same allocation
class for the bus, system booting will halt due to the error or
IO AUTOCONFIGURE after boot will keep the bus offline.
Enter a value for FCNOD2's ALLOCLASS parameter [5]:
Does this cluster contain a quorum disk [N]? N
Size of pagefile for FCNOD2 [RETURN for AUTOGEN sizing]?
A temporary pagefile will be created until resizing by AUTOGEN. The
default size below is arbitrary and may or may not be appropriate.
Size of temporary pagefile [10000]?
Size of swap file for FCNOD2 [RETURN for AUTOGEN sizing]?
A temporary swap file will be created until resizing by AUTOGEN. The
default size below is arbitrary and may or may not be appropriate.
Size of temporary swap file [8000]?
Each shared SCSI bus must have a positive allocation class value. A shared
bus uses a PK adapter. A private bus may use: PK, DR, DV.
When adding a node with SCSI-based cluster communications, the shared
SCSI port allocation classes may be established in SYS$DEVICES.DAT.
Otherwise, the system's disk allocation class will apply.
A private SCSI bus need not have an entry in SYS$DEVICES.DAT. If it has an
entry, its entry may assign any legitimate port allocation class value:
n where n = a positive integer, 1 to 32767 inclusive
0 no port allocation class and disk allocation class does not apply
-1 system's disk allocation class applies (system parameter ALLOCLASS)
When modifying port allocation classes, SYS$DEVICES.DAT must be updated
for all affected nodes, and then all affected nodes must be rebooted.
The following dialog will update SYS$DEVICES.DAT on FCNOD2.
Enter [RETURN] to continue:
$20$DKA400:<VMS$COMMON.SYSEXE>SYS$DEVICES.DAT;1 contains port
allocation classes for FCNOD2. After the next boot, any SCSI
controller not assigned in SYS$DEVICES.DAT will use FCNOD2's
disk allocation class.
Assign port allocation class to which adapter [RETURN for none]: PKA
Port allocation class for PKA0: 11
Port Alloclass 11 Adapter FCNOD2$PKA
Assign port allocation class to which adapter [RETURN for none]: PKB
Port allocation class for PKB0: 20
Port Alloclass 11 Adapter FCNOD2$PKA
Port Alloclass 20 Adapter FCNOD2$PKB
Assign port allocation class to which adapter [RETURN for none]:
WARNING: FCNOD2 must be rebooted to make port allocation class
specifications in SYS$DEVICES.DAT take effect.
Will a disk local only to FCNOD2 (and not accessible at this time to FCNOD1)
be used for paging and swapping (Y/N)? N
If you specify a device other than DISK$V72_SSB: for FCNOD2's
page and swap files, this procedure will create PAGEFILE_FCNOD2.SYS
and SWAPFILE_FCNOD2.SYS in the [SYSEXE] directory on the device you
specify.
What is the device name for the page and swap files [DISK$V72_SSB:]?
%SYSGEN-I-CREATED, $20$DKA400:[SYS10.SYSEXE]PAGEFILE.SYS;1 created
%SYSGEN-I-CREATED, $20$DKA400:[SYS10.SYSEXE]SWAPFILE.SYS;1 created
The configuration procedure has completed successfully.
FCNOD2 has been configured to join the cluster.
The first time FCNOD2 boots, NETCONFIG.COM and
AUTOGEN.COM will run automatically.
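The SCSSYSTEMID value reported by the procedure is derived from the
DECnet Phase IV address: SCSSYSTEMID = (area * 1024) + node number. For
FCNOD2, the address 19.111 therefore yields (19 * 1024) + 111 = 19567,
which matches the NOTE in Example 7-11.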
7.9 Online Reconfiguration
The FC interconnect can be reconfigured while the hosts are running
OpenVMS. This includes the ability to:
- Add, move, or remove FC switches and HSGs.
- Add, move, or remove HSG virtual disk units.
- Change the device identifier or LUN value of the HSG virtual disk
units.
- Disconnect and reconnect FC cables. Reconnection can be to the same
or different adapters, switch ports, or HSG ports.
OpenVMS does not automatically detect most FC reconfigurations. You
must use the following procedure to safely perform an FC
reconfiguration, and to ensure that OpenVMS has adjusted its internal
data structures to match the new state:
- Dismount all disks that are involved in the reconfiguration.
- Perform the reconfiguration.
- Enter the following commands on each host that is connected to the
Fibre Channel:
SYSMAN> IO SCSI_PATH_VERIFY
SYSMAN> IO AUTOCONFIGURE
The purpose of the SCSI_PATH_VERIFY command is to check each FC path in
the system's IO database to determine whether the attached device has
been changed. If a device change is detected, then the FC path is
disconnected in the IO database. This allows the path to be
reconfigured for a new device by using the IO AUTOCONFIGURE command.
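As an illustration, the complete sequence on one host might look like
the following sketch. The disk name $1$DGA100 is an assumption for this
example; repeat the SYSMAN commands on every host connected to the
Fibre Channel.

$ ! Dismount each disk involved in the reconfiguration, clusterwide
$ DISMOUNT/CLUSTER $1$DGA100:
$ ! ... perform the physical FC reconfiguration here ...
$ RUN SYS$SYSTEM:SYSMAN
SYSMAN> IO SCSI_PATH_VERIFY
SYSMAN> IO AUTOCONFIGURE
SYSMAN> EXIT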
Note
In the current release, the SCSI_PATH_VERIFY command only operates on
FC disk devices. It does not operate on generic FC devices, such as the
HSG command console LUN (CCL). (Generic FC devices have names such as
$1$GGAnnnnn.)
This means that once the CCL of an HSG has been configured by OpenVMS
with a particular device identifier, its device identifier should not
be changed.
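To avoid changing the identifier inadvertently, you can display the
value currently in use before reconfiguring. This sketch assumes that
your ACS version reports the identifier in the SHOW THIS_CONTROLLER
display and that the CCL is already configured on OpenVMS.

HSG> SHOW THIS_CONTROLLER
$ SHOW DEVICE GG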
Chapter 8 Configuring OpenVMS Clusters for Availability
Availability is the percentage of time that a computing system provides
application service. By taking advantage of OpenVMS Cluster features,
you can configure your OpenVMS Cluster system for various levels of
availability, including disaster tolerance.
This chapter provides strategies and sample optimal configurations for
building a highly available OpenVMS Cluster system. You can use these
strategies and examples to help you make choices and tradeoffs that
enable you to meet your availability requirements.
8.1 Availability Requirements
You can configure OpenVMS Cluster systems for different levels of
availability, depending on your requirements. Most organizations fall
into one of the broad (and sometimes overlapping) categories shown in
Table 8-1.
Table 8-1 Availability Requirements
Availability Requirement: Conventional
Description: For business functions that can wait with little or no
effect while a system or application is unavailable.

Availability Requirement: 24 x 365
Description: For business functions that require uninterrupted
computing services, either during essential time periods or during most
hours of the day throughout the year. Minimal down time is acceptable.

Availability Requirement: Disaster tolerant
Description: For business functions with stringent availability
requirements. These businesses need to be immune to disasters like
earthquakes, floods, and power failures.
8.2 How OpenVMS Clusters Provide Availability
OpenVMS Cluster systems offer the following features that provide
increased availability:
- A highly integrated environment that allows multiple systems to
share access to resources
- Redundancy of major hardware components
- Software support for failover between hardware components
- Software products to support high availability
8.2.1 Shared Access to Storage
In an OpenVMS Cluster environment, users and applications on multiple
systems can transparently share storage devices and files. When you
shut down one system, users can continue to access shared files and
devices. You can share storage devices in two ways:
- Direct access
Connect disk and tape storage subsystems to CI
and DSSI interconnects rather than to a node. This gives all nodes
attached to the interconnect shared access to the storage system. The
shutdown or failure of a system has no effect on the ability of other
systems to access storage.
- Served access
Storage devices attached to a node can be served
to other nodes in the OpenVMS Cluster. MSCP and TMSCP server software
enables you to make local devices available to all OpenVMS Cluster
members (see the parameter sketch following this list). However, the
shutdown or failure of the serving node affects the ability of other
nodes to access storage.
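Served access is controlled by system parameters. The following
MODPARAMS.DAT excerpt is a minimal sketch; the values shown are
illustrative assumptions, and OpenVMS Cluster Systems describes the
full set of serving options. Run AUTOGEN after editing the file.

! SYS$SYSTEM:MODPARAMS.DAT -- illustrative values only
MSCP_LOAD = 1          ! load the MSCP disk server at boot time
MSCP_SERVE_ALL = 1     ! serve disks to other cluster members
TMSCP_LOAD = 1         ! load the TMSCP tape server at boot time
TMSCP_SERVE_ALL = 1    ! serve tapes to other cluster members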
8.2.2 Component Redundancy
OpenVMS Cluster systems allow for redundancy of many components,
including:
- Systems
- Interconnects
- Adapters
- Storage devices and data
With redundant components, if one component fails, another is available
to users and applications.
8.2.3 Failover Mechanisms
OpenVMS Cluster systems provide failover mechanisms that enable
recovery from a failure in part of the OpenVMS Cluster. Table 8-2
lists these mechanisms and the levels of recovery that they provide.
Table 8-2 Failover Mechanisms
Mechanism: DECnet-Plus cluster alias
What happens if a failure occurs: If a node fails, OpenVMS Cluster
software automatically distributes new incoming connections among other
participating nodes.
Type of recovery: Manual. Users who were logged in to the failed node
can reconnect to a remaining node. Automatic for appropriately coded
applications. Such applications can reinstate a connection to the
cluster alias node name, and the connection is directed to one of the
remaining nodes.

Mechanism: I/O paths
What happens if a failure occurs: With redundant paths to storage
devices, if one path fails, OpenVMS Cluster software fails over to a
working path, if one exists.
Type of recovery: Transparent, provided another working path is
available.

Mechanism: Interconnect
What happens if a failure occurs: With redundant or mixed
interconnects, OpenVMS Cluster software uses the fastest working path
to connect to other OpenVMS Cluster members. If an interconnect path
fails, OpenVMS Cluster software fails over to a working path, if one
exists.
Type of recovery: Transparent.

Mechanism: Boot and disk servers
What happens if a failure occurs: If you configure at least two nodes
as boot and disk servers, satellites can continue to boot and use disks
if one of the servers shuts down or fails. Failure of a boot server
does not affect nodes that have already booted, providing they have an
alternate path to access MSCP served disks.
Type of recovery: Automatic.

Mechanism: Terminal servers and LAT software
What happens if a failure occurs: Attach terminals and printers to
terminal servers. If a node fails, the LAT software automatically
connects to one of the remaining nodes. In addition, if a user process
is disconnected from a LAT terminal session, when the user attempts to
reconnect to a LAT session, LAT software can automatically reconnect
the user to the disconnected session.
Type of recovery: Manual. Terminal users who were logged in to the
failed node must log in to a remaining node and restart the
application.

Mechanism: Generic batch and print queues
What happens if a failure occurs: You can set up generic queues to feed
jobs to execution queues (where processing occurs) on more than one
node. If one node fails, the generic queue can continue to submit jobs
to execution queues on remaining nodes. In addition, batch jobs
submitted using the /RESTART qualifier are automatically restarted on
one of the remaining nodes.
Type of recovery: Transparent for jobs waiting to be dispatched.
Automatic or manual for jobs executing on the failed node.

Mechanism: Autostart batch and print queues
What happens if a failure occurs: For maximum availability, you can set
up execution queues as autostart queues with a failover list. When a
node fails, an autostart execution queue and its jobs automatically
fail over to the next logical node in the failover list and continue
processing on another node. Autostart queues are especially useful for
print queues directed to printers that are attached to terminal
servers.
Type of recovery: Transparent.
Reference: For more information about cluster alias,
generic queues, and autostart queues, see OpenVMS Cluster Systems.
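As an illustration of the last two entries in Table 8-2, the following
DCL sketch creates a generic batch queue that feeds execution queues on
two nodes, and an autostart print queue with a failover list. All node,
queue, and device names are assumptions for this example.

$ ! Execution batch queues, one per node
$ INITIALIZE/QUEUE/BATCH/ON=NODEA::/START NODEA_BATCH
$ INITIALIZE/QUEUE/BATCH/ON=NODEB::/START NODEB_BATCH
$ ! Generic batch queue that feeds jobs to either execution queue
$ INITIALIZE/QUEUE/BATCH/GENERIC=(NODEA_BATCH,NODEB_BATCH)/START CLUSTER_BATCH
$ ! Autostart print queue that fails over from NODEA to NODEB
$ INITIALIZE/QUEUE/AUTOSTART_ON=(NODEA::LTA101:,NODEB::LTA101:)/START LASER_Q
$ ! Allow autostart queues to run on this node
$ ENABLE AUTOSTART/QUEUES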