HP OpenVMS Cluster Systems



C.12 Integrity server Satellite Booting Messages

Table C-9 lists the Integrity server satellite booting messages.

Table C-9 Integrity server Satellite Booting Messages
Booting message Comments
MAC address
Booting over the network

Loading.: EIA0 Mac(00-17-a4-51-ce-4a)
This message displays the MAC address of the satellite system that is being used for booting.
BOOTP database
Client MAC Address: 00 17 A4 51 CE 4A ./

Client IP Address: 15.146.235.22
Subnet Mask: 255.255.254.0
BOOTP Server IP Address: 15.146.235.23
DHCP Server IP Address: 0.240.0.0
Boot file name: $2$DKA0:[SYS10.SYSCOMMON.SYSEXE]
VMS_LOADER.EFI
This message displays the BOOTP database of the satellite system. It shows all the information provided on the boot server while configuring the satellite.
Small memory configurations
ERROR: Unable to allocate aligned memory

%VMS_LOADER-I-Cannot allocate 256Meg for memory disk.
Falling back to 64Meg.
%VMS_LOADER-I-Memorydisk allocated at:0x0000000010000000
When booting OpenVMS Integrity server systems over the network or while booting OpenVMS as a guest OS under Integrity VM, OpenVMS allocates a memory disk from the main memory. For OpenVMS Version 8.4, the size of this memory disk defaults to 256 MB. However, on some older systems with relatively small memory configurations, this size cannot be allocated, and the system displays the following error message:
Unable to allocate aligned memory.

After this message is displayed, OpenVMS adopts a fallback strategy by allocating only 64 MB and excludes some newer drivers from the initial boot. The fallback message indicates that the action was performed. If the fallback message is displayed with no further error messages, the initial error message can be ignored.

Boot progress
Retrieving File Size.

Retrieving File (TFTP).
Starting: EIA0 Mac(00-17-a4-51-ce-4a)
Loading memory disk from IP 15.146.235.23
..................................................
Loading file: $2$DKA0:[SYS10.SYSCOMMON.SYSEXE]IPB.EXE
from IP 15.146.235.23
%IPB-I-SATSYSDIS, Satellite boot from system device $2$DKA0:
The system displays detailed boot progress as a series of messages: first a system message when VMS_LOADER is obtained from the network, then one period character written to the console device for every file downloaded to start the boot sequence, and finally a message indicating that IPB (the primary bootstrap image) has been loaded.
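When you compare the MAC address in the boot message with the one in the BOOTP database display, note that the two use different notations (hyphen-separated lowercase and space-separated uppercase). The following Python sketch is an illustration only and is not part of OpenVMS; the function names normalize_mac and mac_from_boot_line are hypothetical, and the sketch assumes only the two notations shown in Table C-9.

```python
import re

def normalize_mac(text):
    """Return a MAC address as 12 uppercase hex digits, ignoring separators."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", text)
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {text!r}")
    return digits.upper()

def mac_from_boot_line(line):
    """Extract the address from a console line containing a Mac(...) field."""
    match = re.search(r"Mac\(([0-9A-Fa-f-]+)\)", line)
    if match is None:
        raise ValueError(f"no Mac(...) field in: {line!r}")
    return normalize_mac(match.group(1))

# Values copied from the sample messages in Table C-9.
boot_mac = mac_from_boot_line("Loading.: EIA0 Mac(00-17-a4-51-ce-4a)")
bootp_mac = normalize_mac("00 17 A4 51 CE 4A")
print(boot_mac == bootp_mac)
```

Pass only the address itself to normalize_mac; a full line such as the "Client MAC Address:" line contains other letters that would be mistaken for hexadecimal digits.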

Caution: A satellite node boot may fail if you register the hardware address of an Integrity server satellite node for multiple purposes.

For example, if you attempt a satellite boot of an Integrity server node in a cluster that has an Integrity server node configured and another cluster node configured as an Infoserver boot node with the same MAC address, the Integrity server satellite node fails its satellite boot.

This is because the hardware address of the Integrity server satellite node is registered as an Infoserver boot node as well as an Integrity server satellite node.

An output similar to the following is displayed:


Loading.: eib0 Mac(00-0e-7f-7e-08-d9) 
Running LoadFile() 
 
CLIENT MAC ADDR: 00 0E 7F 7E 08 D9 
CLIENT IP: 16.116.42.85  MASK: 255.0.0.0  DHCP IP: 0.240.0.0 
 
TSize.Running LoadFile() 
 
Starting: eib0 Mac(00-0e-7f-7e-08-d9) 
 
 
Loading memory disk from IP 16.116.40.168  
 
Unable to open SYS$MEMORYDISK.DAT 
 
FATAL ERROR: Unable to boot using memorydisk method. 

In this output, 16.116.40.168 is the IP address of the Alpha Infoserver node.
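If you keep your own list of which hardware addresses are registered for which purpose, the conflict described in this caution can be detected before you attempt the boot. The following Python sketch is a minimal illustration under that assumption; the record layout, the purpose names, and the function name find_multi_purpose_macs are hypothetical and do not correspond to any OpenVMS database format.

```python
from collections import defaultdict

def find_multi_purpose_macs(registrations):
    """Return {mac: sorted purposes} for addresses registered more than once.

    registrations is an iterable of (mac, purpose) pairs. MAC strings are
    compared case-insensitively with separators ignored.
    """
    purposes_by_mac = defaultdict(set)
    for mac, purpose in registrations:
        key = "".join(ch for ch in mac if ch.isalnum()).upper()
        purposes_by_mac[key].add(purpose)
    return {mac: sorted(p) for mac, p in purposes_by_mac.items() if len(p) > 1}

# Hypothetical registration list; the addresses are taken from the sample
# console output in this section, the purposes are made up for illustration.
registrations = [
    ("00-0e-7f-7e-08-d9", "integrity-satellite"),
    ("00 0E 7F 7E 08 D9", "infoserver-boot"),
    ("00-17-a4-51-ce-4a", "integrity-satellite"),
]
conflicts = find_multi_purpose_macs(registrations)
for mac, purposes in conflicts.items():
    print(mac, "is registered as:", ", ".join(purposes))
```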


Appendix D
Sample Programs for LAN Control

Sample programs are provided to start and stop the NISCA protocol on a LAN adapter, and to enable LAN network failure analysis. The following programs are located in SYS$EXAMPLES:
Program                      Description
LAVC$START_BUS.MAR           Starts the NISCA protocol on a specified LAN adapter.
LAVC$STOP_BUS.MAR            Stops the NISCA protocol on a specified LAN adapter.
LAVC$FAILURE_ANALYSIS.MAR    Enables LAN network failure analysis.
LAVC$BUILD.COM               Assembles and links the sample programs.

Reference: The NISCA protocol, responsible for carrying messages across Ethernet LANs to other nodes in the cluster, is described in Appendix F.

D.1 Purpose of Programs

The port emulator driver, PEDRIVER, starts the NISCA protocol on all of the LAN adapters in the cluster. LAVC$START_BUS.MAR and LAVC$STOP_BUS.MAR are provided for cluster managers who want to split the network load according to protocol type and therefore do not want the NISCA protocol running on all of the LAN adapters.

Reference: See Section D.5 for information about editing and using the network failure analysis program.

D.2 Starting the NISCA Protocol

The sample program LAVC$START_BUS.MAR, provided in SYS$EXAMPLES, starts the NISCA protocol on a specific LAN adapter.

To build the program, perform the following steps:
Step Action
1 Copy the files LAVC$START_BUS.MAR and LAVC$BUILD.COM from SYS$EXAMPLES to your local directory.
2 Assemble and link the sample program using the following command:
$ @LAVC$BUILD.COM LAVC$START_BUS.MAR

D.2.1 Start the Protocol

To start the protocol on a LAN adapter, perform the following steps:
Step Action
1 Use an account that has the PHY_IO privilege; you need this privilege to run LAVC$START_BUS.EXE.
2 Define the foreign command (DCL symbol).
3 Execute the foreign command (LAVC$START_BUS.EXE), followed by the name of the LAN adapter on which you want to start the protocol.

Example: The following example shows how to start the NISCA protocol on LAN adapter ETA0:


$ START_BUS:==$SYS$DISK:[]LAVC$START_BUS.EXE
$ START_BUS ETA

D.3 Stopping the NISCA Protocol

The sample program LAVC$STOP_BUS.MAR, provided in SYS$EXAMPLES, stops the NISCA protocol on a specific LAN adapter.

Caution: Stopping the NISCA protocol on all LAN adapters causes satellites to hang and could cause cluster systems to fail with a CLUEXIT bugcheck.

Follow the steps below to build the program:
Step Action
1 Copy the files LAVC$STOP_BUS.MAR and LAVC$BUILD.COM from SYS$EXAMPLES to your local directory.
2 Assemble and link the sample program using the following command:
$ @LAVC$BUILD.COM LAVC$STOP_BUS.MAR

D.3.1 Stop the Protocol

To stop the NISCA protocol on a LAN adapter, perform the following steps:
Step Action
1 Use an account that has the PHY_IO privilege; you need this privilege to run LAVC$STOP_BUS.EXE.
2 Define the foreign command (DCL symbol).
3 Execute the foreign command (LAVC$STOP_BUS.EXE), followed by the name of the LAN adapter on which you want to stop the protocol.

Example: The following example shows how to stop the NISCA protocol on LAN adapter ETA0:


$ STOP_BUS:==$SYS$DISK:[]LAVC$STOP_BUS.EXE
$ STOP_BUS ETA

D.3.2 Verify Successful Execution

When the LAVC$STOP_BUS module executes successfully, the following device-attention entry is written to the system error log:


DEVICE ATTENTION...
 
NI-SCS SUB-SYSTEM...
 
FATAL ERROR DETECTED BY DATALINK...

In addition, the following hexadecimal values are written to the STATUS field of the entry:

First longword (00000001)
Second longword (00001201)

The error-log entry indicates expected behavior and can be ignored. However, if the first longword of the STATUS field contains a value other than hexadecimal value 00000001, an error has occurred and further investigation may be necessary.
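The check on the first longword can be written as a small helper. This Python sketch is an illustration only; the function name is hypothetical, and it assumes you have already read the first STATUS longword from the error-log entry as a hexadecimal string.

```python
EXPECTED_FIRST_LONGWORD = 0x00000001  # value documented above for a successful stop

def stop_bus_entry_is_expected(first_longword_hex):
    """Return True if the first STATUS longword matches the documented value."""
    return int(first_longword_hex, 16) == EXPECTED_FIRST_LONGWORD

print(stop_bus_entry_is_expected("00000001"))
print(stop_bus_entry_is_expected("0000000C"))
```

Any result other than True for the first longword means the entry does not match the expected pattern and may need further investigation.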

D.4 Analyzing Network Failures

LAVC$FAILURE_ANALYSIS.MAR is a sample program, located in SYS$EXAMPLES, that you can edit and use to help detect and isolate a failed network component. When the program executes, it provides the physical description of your cluster communications network to the set of routines that perform the failure analysis.

D.4.1 Failure Analysis

Using the network failure analysis program can help reduce the time necessary for detection and isolation of a failing network component and, therefore, significantly increase cluster availability.

D.4.2 How the LAVC$FAILURE_ANALYSIS Program Works

The following table describes how the LAVC$FAILURE_ANALYSIS program works.
Step Program Action
1 The program groups channels that fail and compares them with the physical description of the cluster network.
2 The program then develops a list of nonworking network components related to the failed channels and uses OPCOM messages to display the names of components with a probability of causing one or more channel failures.

If the network failure analysis cannot verify that a portion of a path (containing multiple components) works, the program:

  1. Calls out the first component in the path as the primary suspect (%LAVC-W-PSUSPECT)
  2. Lists the other components as secondary or additional suspects (%LAVC-I-ASUSPECT)
3 When the component works again, OPCOM displays the message %LAVC-S-WORKING.
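The suspect-reporting rule in step 2 can be sketched as follows. This is a simplified Python illustration, not the algorithm used by the failure analysis routines: it assumes that a failed path is an ordered list of component labels and that a set of labels already verified as working is available. The function name and the sample labels are hypothetical.

```python
def classify_suspects(path, verified_working):
    """Split the unverified components of a path into primary and additional suspects.

    path: ordered list of component labels along one failed path.
    verified_working: set of labels known to work through some other path.
    Returns (primary, additional); primary is None if every label is verified.
    """
    unverified = [label for label in path if label not in verified_working]
    if not unverified:
        return None, []
    return unverified[0], unverified[1:]

# Hypothetical failed path in which only the first component is verified.
path = ["Sa", "MPR_A", "A1", "A"]
primary, additional = classify_suspects(path, verified_working={"Sa"})
print("primary suspect:", primary)
print("additional suspects:", additional)
```

In this sketch the first unverified component plays the role of the %LAVC-W-PSUSPECT callout and the remaining ones the role of the %LAVC-I-ASUSPECT entries.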

D.5 Using the Network Failure Analysis Program

Table D-1 describes the steps you perform to edit and use the network failure analysis program.

Table D-1 Procedure for Using the LAVC$FAILURE_ANALYSIS.MAR Program
Step Action Reference
1 Collect and record information specific to your cluster communications network. Section D.5.1
2 Edit a copy of LAVC$FAILURE_ANALYSIS.MAR to include the information you collected. Section D.5.2
3 Assemble, link, and debug the program. Section D.5.3
4 Modify startup files to run the program only on the node for which you supplied data. Section D.5.4
5 Execute the program on one or more of the nodes where you plan to perform the network failure analysis. Section D.5.5
6 Modify MODPARAMS.DAT to increase the values of nonpaged pool parameters. Section D.5.6
7 Test the Local Area OpenVMS Cluster Network Failure Analysis Program. Section D.5.7

D.5.1 Create a Network Diagram

Follow the steps in Table D-2 to create a physical description of the network configuration and include it in electronic form in the LAVC$FAILURE_ANALYSIS.MAR program.

Table D-2 Creating a Physical Description of the Network
Step Action Comments
1 Draw a diagram of your OpenVMS Cluster communications network. When you edit LAVC$FAILURE_ANALYSIS.MAR, you include this drawing (in electronic form) in the program. Your drawing should show the physical layout of the cluster and include the following components:
  • LAN segments or rings
  • LAN bridges
  • Wiring concentrators, DELNI interconnects, or DEMPR repeaters
  • LAN adapters
  • Integrity servers and Alpha systems

For large clusters, you may need to verify the configuration by tracing the cables.

2 Give each component in the drawing a unique label. If your OpenVMS Cluster contains a large number of nodes, you may want to replace each node name with a shorter abbreviation. Abbreviating node names can help save space in the electronic form of the drawing when you include it in LAVC$FAILURE_ANALYSIS.MAR. For example, you can replace the node name ASTRA with A and call node ASTRA's two LAN adapters A1 and A2.
3 List the following information for each component:
  • Unique label
  • Type [SYSTEM, LAN_ADP, DELNI]
  • Location (the physical location of the component)
  • LAN address or addresses (if applicable)
Devices such as DELNI interconnects, DEMPR repeaters, and cables do not have LAN addresses.
4 Classify each component into one of the following categories:
  • Node: Integrity server or Alpha system in the OpenVMS Cluster configuration.
  • Adapter: LAN adapter on the system that is normally used for OpenVMS Cluster communications.
  • Component: Generic component in the network. Components in this category can usually be shown to be working if at least one path through them is working. Wiring concentrators, DELNI interconnects, DEMPR repeaters, LAN bridges, and LAN segments and rings typically fall into this category.
  • Cloud: Generic component in the network. Components in this category cannot be shown to be working even if one or more paths are shown to be working.
The cloud component is necessary only when multiple paths exist between two points within the network, such as with redundant bridging between LAN segments. At a high level, multiple paths can exist; however, during operation, this bridge configuration allows only one path to exist at one time. In general, this bridge example is probably better handled by representing the active bridge in the description as a component and ignoring the standby bridge. (You can identify the active bridge with such network monitoring software as RBMS or DECelms.) With the default bridge parameters, failure of the active bridge will be called out.
5 Use the component labels from step 3 to describe each of the connections in the OpenVMS Cluster communications network.  
6 Choose a node or group of nodes to run the network failure analysis program. You should run the program only on a node that you included in the physical description when you edited LAVC$FAILURE_ANALYSIS.MAR. The network failure analysis program on one node operates independently from other systems in the OpenVMS Cluster. Therefore, choose systems that are not normally shut down. Other good candidates for running the program are systems with the following characteristics:
  • Faster CPU speed
  • Larger amounts of memory
  • More LAN adapters (running the NISCA protocol)

Note: The physical description is loaded into nonpaged pool, and all processing is performed at IPL 8. CPU use increases as the average number of network components in the network path increases. CPU use also increases as the total number of network paths increases.
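Before you edit the source file, the information gathered in steps 2 through 4 can be kept in a simple structure and checked for consistency. The following Python sketch is a minimal illustration; the Component record and the check_inventory function are hypothetical names, the labels follow the abbreviation example in step 2, and the LAN addresses are placeholders.

```python
from dataclasses import dataclass

# The four categories described in step 4.
CATEGORIES = {"NODE", "ADAPTER", "COMPONENT", "CLOUD"}

@dataclass
class Component:
    label: str                  # unique label from the drawing
    kind: str                   # for example SYSTEM, LAN_ADP, DELNI
    category: str               # one of CATEGORIES
    location: str               # physical location of the component
    lan_addresses: tuple = ()   # empty for devices without LAN addresses

def check_inventory(components):
    """Return a list of problems: duplicate labels or unknown categories."""
    problems = []
    seen = set()
    for c in components:
        if c.label in seen:
            problems.append(f"duplicate label: {c.label}")
        seen.add(c.label)
        if c.category not in CATEGORIES:
            problems.append(f"unknown category for {c.label}: {c.category}")
    return problems

# Placeholder inventory; the last entry deliberately reuses the label A1.
inventory = [
    Component("A", "SYSTEM", "NODE", "Computer room"),
    Component("A1", "LAN_ADP", "ADAPTER", "Computer room", ("08-00-2B-00-00-01",)),
    Component("A2", "LAN_ADP", "ADAPTER", "Computer room", ("08-00-2B-00-00-02",)),
    Component("A1", "DELNI", "COMPONENT", "Computer room"),
]
print(check_inventory(inventory))
```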

D.5.2 Edit the Source File

Follow these steps to edit the LAVC$FAILURE_ANALYSIS.MAR program.
Step Action
1 Copy the following files from SYS$EXAMPLES to your local directory:
  • LAVC$FAILURE_ANALYSIS.MAR
  • LAVC$BUILD.COM
2 Use the OpenVMS Cluster network map and the other information you collected to edit the copy of LAVC$FAILURE_ANALYSIS.MAR.

Example D-1 shows the portion of LAVC$FAILURE_ANALYSIS.MAR that you edit.

Example D-1 Portion of LAVC$FAILURE_ANALYSIS.MAR to Edit

;       *** Start edits here *** 
 
;       Edit 1. 
; 
;               Define the hardware components needed to describe 
;               the physical configuration. 
; 
 
        NEW_COMPONENT   SYSTEM          NODE 
        NEW_COMPONENT   LAN_ADP         ADAPTER 
        NEW_COMPONENT   DEMPR           COMPONENT 
        NEW_COMPONENT   DELNI           COMPONENT 
        NEW_COMPONENT   SEGMENT         COMPONENT 
        NEW_COMPONENT   NET_CLOUD       CLOUD 
 
 
;       Edit 2. 
; 
;                       Diagram of a multi-adapter local area OpenVMS Cluster 
; 
; 
;        Sa   -------+---------------+---------------+---------------+-------  
;                    |               |               |               | 
;                    |             MPR_A             |               | 
;                    |          .----+----.          |               | 
;                    |         1|        1|         1|               | 
;                   BrA       ALPHA     BETA       DELTA            BrB 
;                    |         2|        2|         2|               | 
;                    |          `----+----'          |               | 
;                    |             LNI_A             |               | 
;                    |               |               |               | 
;        Sb   -------+---------------+---------------+---------------+-------  
; 
; 
;       Edit 3. 
; 
; Label    Node                       Description            
; -----   ------  -----------------------------------------------    
 
  SYSTEM  A,      ALPHA,  < - MicroVAX II; In the Computer room>...
  LAN_ADP A1,     ,       <XQA; ALPHA - MicroVAX II; Computer room>,...
  LAN_ADP A2,     ,       <XQB; ALPHA - MicroVAX II; Computer room>,...
 
  SYSTEM  B,      BETA,   < - MicroVAX 3500; In the Computer room>...
  LAN_ADP B1,     ,       <XQA; BETA - MicroVAX 3500; Computer room>,...
  LAN_ADP B2,     ,       <XQB; BETA - MicroVAX 3500; Computer room>,...
  
  SYSTEM  D,      DELTA,  < - VAXstation II; In Dan's office>...
  LAN_ADP D1,     ,       <XQA; DELTA - VAXstation II; Dan's office>,...
  LAN_ADP D2,     ,       <XQB; DELTA - VAXstation II; Dan's office>,...
 
;       Edit 4. 
; 
;               Label each of the other network components. 
; 
 
        DEMPR   MPR_A, , <Connected to segment A; In the Computer room> 
        DELNI   LNI_A, , <Connected to segment B; In the Computer room> 
 
        SEGMENT Sa,  , <Ethernet segment A> 
        SEGMENT Sb,  , <Ethernet segment B> 
 
        NET_CLOUD       BRIDGES, , <Bridging between ethernet segments A and B> 
 
;       Edit 5. 
; 
;               Describe the network connections. 
; 
        CONNECTION      Sa,     MPR_A 
        CONNECTION              MPR_A,  A1 
        CONNECTION                      A1,     A 
        CONNECTION              MPR_A,  B1 
        CONNECTION                      B1,     B 
 
        CONNECTION      Sa,     D1 
        CONNECTION              D1,     D 
 
        CONNECTION      Sa,     BRIDGES 
        CONNECTION      Sb,     BRIDGES 
 
        CONNECTION      Sb,     LNI_A 
        CONNECTION              LNI_A,  A2 
        CONNECTION                      A2,     A 
        CONNECTION              LNI_A,  B2 
        CONNECTION                      B2,     B 
 
        CONNECTION      Sb,     D2 
        CONNECTION              D2,     D 
 
        .PAGE 
 
;       *** End of edits *** 

In the program, each comment of the form "Edit n" (where n is a number) identifies a place where you edit the program to incorporate information about your network. Make the following edits to the program:
Location Action
Edit 1 Define a category for each component in the configuration. Use the information from step 4 in Section D.5.1. Use the following format:
NEW_COMPONENT component_type category

Example: The following example shows how to define a DEMPR repeater as part of the component category:

NEW_COMPONENT DEMPR COMPONENT

Edit 2 Incorporate the network map you drew for step 1 of Section D.5.1. Including the map here in LAVC$FAILURE_ANALYSIS.MAR gives you an electronic record of the map that you can locate and update more easily than a drawing on paper.
Edit 3 List each OpenVMS Cluster node and its LAN adapters. Use one line for each node and one line for each of its LAN adapters. Each line should include the following information. Separate the items of information with commas to create a table of the information.
  • Component type, followed by a comma.
  • Label from the network map, followed by a comma.
  • Node name (for SYSTEM components only). If there is no node name, enter a comma.
  • Descriptive text that the network failure analysis program displays if it detects a failure with this component. Put this text within angle brackets (< >). This text should include the component's physical location.
  • LAN hardware address (for LAN adapters).
  • DECnet LAN address for the LAN adapter that DECnet uses.
Edit 4 List each of the other network components. Use one line for each component. Each line should include the following information:
  • Component name and category you defined with NEW_COMPONENT.
  • Label from the network map.
  • Descriptive text that the network failure analysis program displays if it detects a failure with this component. Include a description of the physical location of the component.
  • LAN hardware address (optional).
  • Alternate LAN address (optional).
Edit 5 Define the connections between the network components. Use the CONNECTION macro and the labels for the two components that are connected. Include the following information:
  • CONNECTION macro name
  • First component label
  • Second component label
Reference: You can find more detailed information about this exercise within the source module SYS$EXAMPLES:LAVC$FAILURE_ANALYSIS.MAR.
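Because each CONNECTION entry must refer to labels defined in Edit 3 or Edit 4, a quick cross-check of the edited description can catch typing errors before you assemble the program. The following Python sketch is an illustration that works on a simplified text form of the entries; it is not a MACRO-32 parser, the function names are hypothetical, and the sample lines are shortened versions of the entries in Example D-1.

```python
def parse_description(lines):
    """Collect defined labels and CONNECTION pairs from simplified macro lines."""
    types, labels, connections = set(), set(), []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(";"):
            continue                      # blank line or full-line comment
        parts = line.split(None, 1)
        if len(parts) < 2:
            continue                      # directive without arguments, such as .PAGE
        keyword, rest = parts
        if keyword == "NEW_COMPONENT":
            types.add(rest.split()[0])    # remember the component type name
        elif keyword == "CONNECTION":
            first, second = [f.strip() for f in rest.split(",")][:2]
            connections.append((first, second))
        elif keyword in types:
            labels.add(rest.split(",")[0].strip())
    return labels, connections

def undefined_labels(lines):
    """Return labels used in CONNECTION lines that no component line defines."""
    labels, connections = parse_description(lines)
    used = {label for pair in connections for label in pair}
    return sorted(used - labels)

# Shortened sample; Sb and A2 are deliberately left undefined.
sample = """
        NEW_COMPONENT   SYSTEM          NODE
        NEW_COMPONENT   LAN_ADP         ADAPTER
        NEW_COMPONENT   SEGMENT         COMPONENT
        SYSTEM  A,      ALPHA,  <In the Computer room>
        LAN_ADP A1,     ,       <XQA; Computer room>
        SEGMENT Sa,  , <Ethernet segment A>
        CONNECTION      Sa,     A1
        CONNECTION              A1,     A
        CONNECTION      Sb,     A2
        .PAGE
""".splitlines()
print(undefined_labels(sample))
```

The sketch relies on the NEW_COMPONENT lines appearing before the component lines that use them, as they do in Example D-1.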

