OpenVMS Cluster Systems
 
 
F.2.3 Excessive Packet Losses on LAN Paths
Prior to OpenVMS Version 7.3, an SCS virtual circuit closure was the
first indication that a LAN path had become unusable. In OpenVMS
Version 7.3, whenever the last usable LAN path is losing packets at an
excessive rate, PEDRIVER displays the following console message:
 
 
  
    
       
      
%PEA0, Excessive packet losses on LAN path from local-device-name
 to device-name on REMOTE NODE node-name
 
This message is displayed when PEDRIVER recently had to perform an
excessively high rate of packet retransmissions on the LAN path
consisting of the local device, the intervening network, and the device
on the remote node. The message indicates that the LAN path has
degraded and is approaching, or has reached, the point where reliable
communications with the remote node are no longer possible. It is
likely that the virtual circuit to the remote node will close if the
losses continue. Furthermore, continued operation with high LAN packet
losses can result in significant loss in performance because of the
communication delays resulting from the packet loss detection timeouts
and packet retransmission.
 
The corrective steps to take are:
 
  - Check the local and remote LAN device error counts to see whether a
  problem exists on the devices. Issue the following commands on each
  node:
 
  
    
       
      
$ SHOW DEVICE local-device-name
$ MC SCACP
SCACP> SHOW LAN device-name
$ MC LANCP
LANCP> SHOW DEVICE device-name/COUNT
 
   - If device error counts on the local devices are within normal
  bounds, contact your network administrators to request that they
  diagnose the LAN path between the devices.
  
F.2.4 Preliminary Network Diagnosis
If the symptoms and preliminary diagnosis indicate that you might have
a network problem, troubleshooting LAN communication failures should
start with the step-by-step procedures described in Appendix C.
Appendix C helps you diagnose and solve common Ethernet and FDDI LAN
communication failures during the following stages of OpenVMS Cluster
activity:
 
  - When a computer or a satellite fails to boot
  
 - When a computer fails to join the OpenVMS Cluster
  
 - During run time when startup procedures fail to complete
  
 - When an OpenVMS Cluster hangs
  
The procedures in Appendix C require that you verify a number of
parameters during the diagnostic process. Because system parameter
settings play a key role in effective OpenVMS Cluster communications,
Section F.2.6 describes several system parameters that are especially
important to the timing of LAN bridges, disk failover, and channel
availability.
F.2.5 Tracing Intermittent Errors
 
Because PEDRIVER communication is based on channels, LAN network
problems typically fall into these areas:
 
  - Channel formation and maintenance
    Channels are formed when HELLO datagram messages are received from a
    remote system. A failure can occur when the HELLO datagram messages
    are not received or when the channel control message contains the
    wrong data.
  - Retransmission
    A well-configured OpenVMS Cluster system should not perform excessive
    retransmissions between nodes. Retransmissions between any nodes that
    occur more frequently than once every few seconds deserve network
    investigation.
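
The retransmission guideline above can be checked by sampling a retransmission counter twice and dividing the elapsed time by the increase. The following Python sketch is illustrative only: the function names are hypothetical, and the 5-second threshold is an assumed reading of "once every few seconds", not a documented limit.

```python
def seconds_per_retransmission(rexmt_before, rexmt_after, elapsed_seconds):
    """Return the average number of seconds between retransmissions,
    or None if the counter did not change during the interval."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    delta = rexmt_after - rexmt_before
    if delta < 0:
        # The counter wrapped or was reset between the two samples,
        # so the pair cannot be used.
        raise ValueError("counter went backwards; take a fresh pair of samples")
    if delta == 0:
        return None
    return elapsed_seconds / delta


def needs_investigation(rexmt_before, rexmt_after, elapsed_seconds,
                        threshold_seconds=5.0):
    """True if retransmissions occur more often than once per
    threshold_seconds (the threshold is an assumed value)."""
    interval = seconds_per_retransmission(rexmt_before, rexmt_after,
                                          elapsed_seconds)
    return interval is not None and interval < threshold_seconds


# Two samples taken 600 seconds apart (sample values are made up).
print(needs_investigation(128, 188, 600))   # 60 retransmissions in 10 minutes
print(needs_investigation(128, 428, 600))   # 300 retransmissions in 10 minutes
```

Sampling over several minutes smooths out short bursts; a shorter interval is more sensitive but noisier.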
  
Diagnosing failures at this level becomes more complex because the
errors are usually intermittent. Moreover, even though PEDRIVER is
aware when a channel is unavailable and performs error recovery based
on this information, it does not provide notification when a channel
failure occurs; PEDRIVER provides notification only for virtual circuit
failures.
 
However, the Local Area OpenVMS Cluster Network Failure Analysis
Program (LAVC$FAILURE_ANALYSIS), available in SYS$EXAMPLES, can help
you use PEDRIVER information about channel status. The
LAVC$FAILURE_ANALYSIS program (documented in Appendix D) analyzes
long-term channel outages, such as hard failures in LAN network
components that occur during run time.
 
This program uses tables in which you describe your LAN hardware
configuration. During a channel failure, PEDRIVER uses the hardware
configuration represented in the table to isolate which component might
be causing the failure. PEDRIVER reports the suspected component
through an OPCOM display. You can then isolate the LAN component for
repair or replacement.
 
Reference: Section F.7 addresses the kinds of
problems you might find in the NISCA protocol and provides methods for
diagnosing and solving them.
F.2.6 Checking System Parameters
 
Table F-3 describes several system parameters relevant to the
recovery and failover time limits for LANs in an OpenVMS Cluster.  
 
  Table F-3 System Parameters for Timing
  
    | Parameter  | 
    Use  | 
   
  
    | RECNXINTERVAL  | 
   
  
    | 
      Defines the amount of time to wait before removing a node from the
      OpenVMS Cluster after detection of a virtual circuit failure, which
      could result from a LAN bridge failure.
     | 
    
 If your network uses multiple paths and you want the OpenVMS Cluster to
 survive failover between LAN bridges, make sure the value of
 RECNXINTERVAL is greater than the time it takes to fail over those
 paths.
 
      Reference: The formula for calculating this parameter
      is discussed in Section 3.4.7.
      | 
   
  
    | MVTIMEOUT  | 
   
  
    | 
      Defines the amount of time the OpenVMS operating system tries to
      recover a path to a disk before returning failure messages to the
      application.
     | 
    
       Relevant when an OpenVMS Cluster configuration is set up to serve disks
       over either the Ethernet or FDDI. MVTIMEOUT is similar to RECNXINTERVAL
       except that RECNXINTERVAL is CPU to CPU, and MVTIMEOUT is CPU to disk.
     | 
   
  
    | SHADOW_MBR_TIMEOUT  | 
   
  
    | 
      Defines the amount of time that the Volume Shadowing for OpenVMS tries
      to recover from a transient disk error on a single member of a
      multiple-member shadow set.
     | 
    
       SHADOW_MBR_TIMEOUT differs from MVTIMEOUT because it removes a failing
       shadow set member quickly. The remaining shadow set members can recover
       more rapidly once the failing member is removed.
     | 
   
 
Note: The TIMVCFAIL system parameter, which optimizes
the amount of time needed to detect a communication failure, is not
recommended for use with LAN communications. This parameter is intended
for CI and DSSI connections. PEDRIVER (which is for Ethernet and FDDI)
usually surpasses the detection provided by TIMVCFAIL with the listen
timeout of 8 to 9 seconds.
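
The RECNXINTERVAL guideline in Table F-3 amounts to a simple comparison against the slowest failover you expect the cluster to survive. The sketch below is a hypothetical helper, not part of OpenVMS; the failover times are placeholder measurements that you would obtain from your own network.

```python
def recnxinterval_covers_failover(recnxinterval_s, failover_times_s):
    """True if RECNXINTERVAL (seconds) is greater than the slowest
    measured LAN bridge failover time (seconds)."""
    if not failover_times_s:
        raise ValueError("at least one failover time is required")
    return recnxinterval_s > max(failover_times_s)


# Placeholder values: RECNXINTERVAL of 20 seconds, two measured failovers.
print(recnxinterval_covers_failover(20, [8.0, 15.0]))
print(recnxinterval_covers_failover(20, [8.0, 30.0]))
```

The comparison is strict because the table asks for a value greater than the failover time, not equal to it.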
F.2.7 Channel Timeouts
 
Channel timeouts are detected by PEDRIVER as described in
Table F-4.  
 
  Table F-4 Channel Timeout Detection
  
    | PEDRIVER Actions  | 
     Comments  | 
   
  
    | 
      Listens for HELLO datagram messages, which are sent over channels at
      least once every 3 seconds
     | 
    
       Every node in the OpenVMS Cluster multicasts HELLO datagram messages on
       each LAN adapter to notify other nodes that it is still functioning.
       Receiving nodes know that the network connection is still good.
     | 
   
  
    | 
      Closes a channel when HELLO datagrams or sequenced messages have not
      been received for a period of 8 to 9 seconds
     | 
    
       Because HELLO datagram messages are transmitted at least once every 3
       seconds, PEDRIVER times out a channel only if at least two HELLO
       datagram messages are lost and there is no sequenced message traffic.
     | 
   
  
    
    | 
      Closes a virtual circuit when:
      - No channels are available.
      - The packet size of the only available channels is insufficient.
  
     | 
    
       The virtual circuit is not closed if any other channels to the node are
       available except when the packet sizes of available channels are
       smaller than the channel being used for the virtual circuit. For
       example, if a channel fails over from FDDI to Ethernet, PEDRIVER may
       close the virtual circuit and then reopen it after negotiating the
       smaller packet size that is necessary for Ethernet segmentation.
     | 
   
  
    | 
      Does not report errors when a channel is closed
     | 
    
       OPCOM "Connection loss" errors or SYSAP messages are not sent
       to users or other system applications until after the virtual circuit
       shuts down. This fact is significant, especially if there are multiple
       paths to a node and a LAN hardware failure occurs. In this case, you
       might not receive an error message; PEDRIVER continues to use the
       virtual circuit over another available channel.
     | 
   
  
    | 
      Reestablishes a virtual circuit when a channel becomes available again
     | 
    
      PEDRIVER reopens a channel when HELLO datagram messages are received
      again.
     | 
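
The listen timeout behavior in Table F-4 can be illustrated with a small simulation. This Python sketch is a simplified model, not PEDRIVER's implementation: it assumes a fixed 8-second listen timeout (the low end of the 8 to 9 second range), treats every arrival as either a HELLO or a sequenced message, and does not model reopening the channel. All names are hypothetical.

```python
def channel_close_time(arrivals, end_time, listen_timeout=8.0):
    """Return the time (seconds) at which the channel would be closed,
    or None if it stays open until end_time.

    arrivals: sorted times at which a HELLO datagram or sequenced
    message was received on the channel; the first entry opens it.
    """
    if not arrivals:
        raise ValueError("need at least one arrival to open the channel")
    last = arrivals[0]
    for t in arrivals[1:]:
        if t - last > listen_timeout:
            # Nothing was heard for longer than the listen timeout.
            return last + listen_timeout
        last = t
    if end_time - last > listen_timeout:
        return last + listen_timeout
    return None


# HELLOs are expected every 3 seconds.
print(channel_close_time([0, 3, 9, 12], end_time=12))   # one HELLO lost (at 6)
print(channel_close_time([0, 3, 12, 15], end_time=15))  # two HELLOs lost (at 6, 9)
```

With a 3-second HELLO interval, one lost HELLO leaves a 6-second gap and the channel stays open; two consecutive losses leave a 9-second gap, which exceeds the assumed timeout.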
   
 
F.3 Using SDA to Monitor LAN Communications
This section describes how to use SDA to monitor LAN communications.
F.3.1 Isolating Problem Areas
 
If your system shows symptoms of intermittent failures during run time,
you need to determine whether there is a network problem or whether the
symptoms are caused by some other activity in the system.
 
Generally, you can diagnose problems in the NISCA protocol or the
network using the OpenVMS System Dump Analyzer utility (SDA). SDA is an
effective tool for isolating problems on specific nodes running in the
OpenVMS Cluster system.
 
Reference: The following sections describe the use of
some SDA commands and qualifiers. You should also refer to the
OpenVMS Alpha System Analysis Tools Manual or the OpenVMS VAX System Dump Analyzer Utility Manual for complete information about SDA
for your system.
F.3.2 SDA Command SHOW PORT
 
The SDA command SHOW PORT provides relevant information that is useful
in troubleshooting PEDRIVER and LAN adapters in particular. Begin by
entering the SHOW PORT command, which causes SDA to define cluster
symbols. Example F-1 illustrates how the SHOW PORT command provides a
summary of OpenVMS Cluster data structures.
 
 
  
    | Example F-1 SDA Command SHOW PORT Display | 
   
  
    
       
      
$ ANALYZE/SYSTEM
SDA> SHOW PORT
VAXcluster data structures
--------------------------
                  --- PDT Summary Page ---
 PDT Address          Type         Device          Driver Name
 -----------          ----         -------         -----------
  80C3DBA0             pa          PAA0            PADRIVER
  80C6F7A0             pe          PEA0            PEDRIVER
F.3.3 Monitoring Virtual Circuits
To examine information about the virtual circuit (VC) that carries
messages between the local node (where you are running SDA) and a
remote node, enter the SDA command SHOW
PORT/VC=VC_remote-node-name. Example F-2 shows how to
examine information about the virtual circuit running between a local
node and the remote node, NODE11.
 
 
  
    | Example F-2 SDA Command SHOW PORT/VC
    Display | 
   
  
    
       
      
SDA> SHOW PORT/VC=VC_NODE11
VAXcluster data structures
--------------------------
                 --- Virtual Circuit (VC) 98625380 ---
Remote System Name:  NODE11  (0:VAX)     Remote SCSSYSTEMID:  19583
Local System ID:  217 (D9)              Status: 0005 open,path
------ Transmit -------  ----- VC Closures -----  (7)--- Congestion Control ----
Msg Xmt(1)      46193196  SeqMsg TMO            0  Pipe Quota/Slo/Max(8) 31/ 7/31
  Unsequence          3  CC DFQ Empty          0  Pipe Quota Reached(9)   213481
  Sequence     41973703  Topology Change(5)     0  Xmt C/T(10)              0/1984
  ReXmt(2)       128/106  NPAGEDYN Low(6)        0  RndTrp uS(11)        18540+7764
  Lone ACK      4219362                           UnAcked Msgs                0
Bytes Xmt     137312089                           CMD Queue Len/Max        0/21
------- Receive -------  - Messages  Discarded -  ----- Channel Selection -----
Msg Rcv(3)      47612604  No Xmt Chan           0  Preferred Channel    9867F400
  Unsequence          3  Rcv Short Msg         0  Delay Time           FAAD63E0
  Sequence     37877271  Illegal Seq Msg       0  Buffer Size              1424
  ReRcv(4)         13987  Bad Checksum          0  Channel Count              18
  Lone ACK      9721030  TR DFQ Empty          0  Channel Selections      32138
  Cache             314  TR MFQ Empty          0  Protocol                1.3.0
  Ill ACK             0  CC MFQ Empty          0  Open(12) 8-FEB-1994 17:00:05.12
Bytes Rcv    3821742649  Cache Miss            0  Cls(13) 17-NOV-1858 00:00:00.00
The SHOW PORT/VC=VC_remote-node-name command displays a number
of performance statistics about the virtual circuit for the target
node. The display groups the statistics into general categories that
summarize such things as packet transmissions to the remote node,
packets received from the remote node, and congestion control behavior.
The statistics most useful for problem isolation are called out in
Example F-2 and described in Table F-5.
 
Note: The counters shown in Example F-2 are stored
in fixed-size fields and are automatically reset to 0 when a field
reaches its maximum value (or when the system is rebooted). Because
fields have different maximum sizes and growth rates, the field
counters are likely to reset at different times. Thus, for a system
that has been running for a long time, some field values may seem
illogical and appear to contradict others.  
 
  Table F-5 SHOW PORT/VC Display
  
    | Field  | 
    Description  | 
   
  
    | 
      (1) Msg Xmt (messages transmitted)
     | 
    
      Shows the total number of packets transmitted over the virtual circuit
      to the remote node, including both sequenced and unsequenced (channel
      control) messages, and lone acknowledgments. (All application data is
      carried in sequenced messages.) The counters for sequenced messages and
      lone acknowledgments grow more quickly than most other fields.
     | 
   
  
    | 
      (2) ReXmt (retransmission)
     | 
    
      Indicates the number of retransmissions and retransmit-related
      timeouts for the virtual circuit.
      - The rightmost number (106) in the ReXmt field indicates the number
        of times a timeout occurred. A timeout indicates one of the
        following problems:
        - The remote system NODE11 did not receive the sequenced message
          sent by UPNVMS.
        - The sequenced message arrived but was delayed in transit to
          NODE11.
        - The local system UPNVMS did not receive the acknowledgment to
          the message sent to remote node NODE11.
        - The acknowledgment arrived but was delayed in transit from
          NODE11.
        Congestion either in the network or at one of the nodes can cause
        the following problems:
        - Congestion in the network can result in delayed or lost packets.
          Network hardware problems can also result in lost packets.
        - Congestion in UPNVMS or NODE11 can result either in packet delay
          because of queuing in the adapter or in packet discard because
          of insufficient buffer space.
      - The leftmost number (128) indicates the number of packets actually
        retransmitted. For example, if the network loses two packets at
        the same time, one timeout is counted but two packets are
        retransmitted. A retransmission occurs when the local node does
        not receive an acknowledgment for a transmitted packet within a
        predetermined timeout interval.

      Although you should expect to see a certain number of
      retransmissions (especially in heavily loaded networks), an
      excessive number of retransmissions wastes network bandwidth and
      indicates excessive load or intermittent hardware failure. If the
      leftmost value in the ReXmt field is greater than about 0.01% to
      0.05% of the total number of the transmitted messages shown in the
      Msg Xmt field, the OpenVMS Cluster system probably is experiencing
      excessive network problems or local loss from congestion.
   
     | 
   
  
    | 
      (3) Msg Rcv (messages received)
     | 
    
      Indicates the total number of messages received by local node UPNVMS
      over this virtual circuit. The values for sequenced messages and lone
      acknowledgments usually increase at a rapid rate.
     | 
   
  
    | 
      (4) ReRcv (rereceive)
     | 
    
      Displays the number of packets received redundantly by this system. A
      remote system may retransmit packets even though the local node has
      already successfully received them. This happens when the cumulative
      delay of the packet and its acknowledgment is longer than the
      estimated round-trip time being used as a timeout value by the remote
      node. Therefore, the remote node retransmits the packet even though
      it is unnecessary.

      Underestimation of the round-trip delay by the remote node is not
      directly harmful, but the retransmission and subsequent
      congestion-control behavior on the remote node have a detrimental
      effect on data throughput. Large numbers indicate frequent bursts of
      congestion in the network or adapters leading to excessive delays. If
      the value in the ReRcv field is greater than approximately 0.01% to
      0.05% of the total messages received, there may be a problem with
      congestion or network delays.
      | 
   
  
    | 
      (5) Topology Change
     | 
    
      Indicates the number of times PEDRIVER has performed a failover from
      FDDI to Ethernet, which necessitated closing and reopening the virtual
      circuit. In Example F-2, there have been no failovers. However, if
      the field indicates a number of failovers, a problem may exist on the
      FDDI ring.
     | 
   
  
    | 
      (6) NPAGEDYN (nonpaged dynamic pool)
     | 
    
      Displays the number of times the virtual circuit was closed because of
      a pool allocation failure on the local node. If this value is nonzero,
      you probably need to increase the value of the NPAGEDYN system
      parameter on the local node.
     | 
   
  
    | 
      (7) Congestion Control
     | 
    
      Displays information about the virtual circuit that is used to
      control the pipe quota (the number of messages that can be sent to
      the remote node [put into the "pipe"] before receiving an
      acknowledgment) and the retransmission timeout. PEDRIVER varies the
      pipe quota and the timeout value to control the amount of network
      congestion.
     | 
   
  
    | 
      (8) Pipe Quota/Slo/Max
     | 
    
      Indicates the current thresholds governing the pipe quota.
      - The leftmost number (31) is the current value of the pipe quota
        (transmit window). After a timeout, the pipe quota is reset to 1 to
        decrease congestion and is allowed to increase quickly as
        acknowledgments are received.
      - The middle number (7) is the slow-growth threshold (the size at
        which the rate of increase is slowed) to avoid congestion on the
        network again.
      - The rightmost number (31) is the maximum value currently allowed
        for the VC based on channel limitations.
  
 
      Reference: See Appendix G for PEDRIVER congestion
      control and channel selection information.
      | 
   
  
    | 
      (9) Pipe Quota Reached
     | 
    
      Indicates the number of times the entire transmit window was full. If
      this number is small as compared with the number of sequenced messages
      transmitted, it indicates that the local node is not sending large
      bursts of data to the remote node.
     | 
   
  
    | 
      (10) Xmt C/T (transmission count/target)
     | 
    
      Shows both the number of successful transmissions since the last time
      the pipe quota was increased and the target value at which the pipe
      quota is allowed to increase. In the example, the count is 0 because
      the pipe quota is already at its maximum value (31), so successful
      transmissions are not being counted.
     | 
   
  
    | 
      (11) RndTrp uS (round trip in microseconds)
     | 
    
      Displays values that are used to calculate the retransmission timeout
      in microseconds. The leftmost number (18540) is the average round-trip
      time, and the rightmost number (7764) is the average variation in
      round-trip time. In the example, the values indicate that the round
      trip is about 19 milliseconds plus or minus about 8 milliseconds.
     | 
   
  
    | 
      (12) Open and Cls
     | 
    
      Displays open (Open) and closed (Cls) timestamps for the last
      significant changes in the virtual circuit. The repeated loss of one or
      more virtual circuits over a short period of time (fewer than 10
      minutes) indicates network problems.
     | 
   
  
    | 
      (13) Cls
     | 
    
       If you are analyzing a crash dump, you should check whether the
       crash-dump time corresponds to the timestamp for virtual circuit
       closures (Cls).
     | 
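
The ReXmt and ReRcv guidelines in Table F-5 can be applied to the counters in Example F-2 with a short calculation. The Python sketch below is illustrative: the function names and the "borderline" label for values inside the 0.01% to 0.05% band are assumptions made here, and the result is meaningful only if none of the counters has wrapped since the virtual circuit opened.

```python
def percent_of(part, total):
    """Return part as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


def classify(percent, low=0.01, high=0.05):
    """Classify a percentage against the 0.01% to 0.05% guideline.
    The three labels are an assumed convention, not OpenVMS output."""
    if percent < low:
        return "normal"
    if percent <= high:
        return "borderline"
    return "excessive"


# Counters copied from Example F-2.
msg_xmt, rexmt_packets = 46193196, 128   # Msg Xmt, leftmost ReXmt value
msg_rcv, rercv = 47612604, 13987         # Msg Rcv, ReRcv

rexmt_pct = percent_of(rexmt_packets, msg_xmt)
rercv_pct = percent_of(rercv, msg_rcv)
print(f"ReXmt: {rexmt_pct:.5f}% ({classify(rexmt_pct)})")
print(f"ReRcv: {rercv_pct:.5f}% ({classify(rercv_pct)})")
```

For the example display, retransmissions are far below the guideline, while rereceived packets fall inside the band, which suggests that delays on the path from NODE11 are worth watching.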
   
 
F.3.4 Monitoring PEDRIVER Buses
The SDA command SHOW PORT/BUS=BUS_LAN-device is useful
for displaying the PEDRIVER representation of a LAN adapter. To
PEDRIVER, a bus is the logical representation of the LAN adapter. (To
list the names and addresses of buses, enter the SDA command SHOW
PORT/ADDR=PE_PDT and then press the Return key twice.) Example F-3
shows a display for the LAN adapter named EXA.
 
 
  
    | Example F-3 SDA Command SHOW PORT/BUS
    Display | 
   
  
    
       
      
SDA> SHOW PORT/BUS=BUS_EXA
VAXcluster data structures
--------------------------
--- BUS: 817E02C0  (EXA)  Device: EX_DEMNA  LAN Address: AA-00-04-00-64-4F ---
                                   LAN Hardware Address: 08-00-2B-2C-20-B5
Status: 00000803 run,online(1),restart
------- Transmit ------  ------- Receive -------  ---- Structure Addresses ---
Msg Xmt        20290620  Msg Rcv        67321527  PORT Address        817E1140
  Mcast Msgs    1318437    Mcast Msgs   39773666  VCIB Addr           817E0478
  Mcast Bytes 168759936    Mcast Bytes 159660184  HELLO Message Addr  817E0508
Bytes Xmt    2821823510  Bytes Rcv    3313602089  BYE Message Addr    817E0698
Outstand I/Os         0  Buffer Size        1424  Delete BUS Rtn Adr  80C6DA46
Xmt Errors(2)      15896  Rcv Ring Size        31
Last Xmt Error 0000005C         Time of Last Xmt Error(3)21-JAN-1994 15:33:38.96
--- Receive Errors ----  ------ BUS Timer ------  ----- Datalink Events ------
TR Mcast Rcv          0  Handshake TMO  80C6F070  Last  7-DEC-1992 17:15:42.18
Rcv Bad SCSID         0  Listen TMO     80C6F074  Last Event          00001202
Rcv Short Msg         0  HELLO timer           3  Port Usable                1
Fail CH Alloc         0  HELLO Xmt err(4)    1623  Port Unusable              0
Fail VC Alloc         0                           Address Change             1
Wrong PORT            0                           Port Restart Fail          0
  
    | Field  | 
    Description  | 
   
  
    | 
      (1) Status:
     | 
    
      The Status line should always display a status of "online" to
      indicate that PEDRIVER can access its LAN adapter.
     | 
   
  
    | 
      (2) Xmt Errors (transmission errors)
     | 
    
      Indicates the number of times PEDRIVER has been unable to transmit a
      packet using this LAN adapter.
     | 
   
  
    | 
      (3) Time of Last Xmt Error
     | 
    
You can compare the time shown in this field with the Open and Cls
times shown in the VC display in Example F-2 to determine whether the
time of the LAN adapter failure is close to the time of a virtual
circuit failure.
 
      Note: Transmission errors at the LAN adapter bus level
      cause a virtual circuit breakage.
      | 
   
  
    | 
      (4) HELLO Xmt err (HELLO transmission error)
     | 
    
      Indicates how many times a message transmission failure has
      "dropped" a PEDRIVER HELLO datagram message. (The Channel
      Control [CC] level description in Section F.1 briefly describes the
      purpose of HELLO datagram messages.) If many HELLO transmission
      errors occur, PEDRIVER on other nodes probably is timing out a
      channel, which could eventually result in closure of the virtual
      circuit.

      The 1623 HELLO transmission failures shown in Example F-3
      contributed to the high number of transmission errors (15896).
      Because HELLO transmission failures are included in the
      transmission error count, it is impossible to have a low number of
      transmission errors and a high number of HELLO transmission errors.
      | 
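
Comparing the Time of Last Xmt Error with the Open and Cls timestamps from the VC display is easier when the times are converted to a common form. The following Python sketch is a hypothetical helper, not an SDA feature: it parses OpenVMS absolute time strings as they appear in the examples, and the 60-second window used to decide whether two events are "close" is an assumed value. A Cls value of 17-NOV-1858 00:00:00.00 is the OpenVMS base date, which is read here as "no closure recorded".

```python
from datetime import datetime

_MONTHS = {"JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
           "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12}

# OpenVMS base date; a timestamp with this value has never been set.
VMS_BASE_DATE = datetime(1858, 11, 17)


def parse_vms_time(text):
    """Parse an absolute time such as '21-JAN-1994 15:33:38.96'."""
    date_part, time_part = text.split()
    day, month, year = date_part.split("-")
    hms, _, fraction = time_part.partition(".")
    hour, minute, second = (int(x) for x in hms.split(":"))
    # Pad the fractional part to microseconds ('96' becomes 960000).
    micro = int((fraction + "000000")[:6]) if fraction else 0
    return datetime(int(year), _MONTHS[month.upper()], int(day),
                    hour, minute, second, micro)


def is_unset(text):
    """True if the timestamp is still at the OpenVMS base date."""
    return parse_vms_time(text) == VMS_BASE_DATE


def seconds_apart(a, b):
    """Absolute difference between two timestamps, in seconds."""
    return abs((parse_vms_time(a) - parse_vms_time(b)).total_seconds())


def events_are_close(a, b, window_seconds=60.0):
    """True if the two timestamps fall within the assumed window."""
    return seconds_apart(a, b) <= window_seconds


last_xmt_error = "21-JAN-1994 15:33:38.96"   # from Example F-3
vc_open = " 8-FEB-1994 17:00:05.12"          # from Example F-2
vc_cls = "17-NOV-1858 00:00:00.00"           # from Example F-2

print(is_unset(vc_cls))
print(events_are_close(last_xmt_error, vc_open))
```

In the example displays the last transmit error occurred about 18 days before the virtual circuit was opened, so the two events are unlikely to be related.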
   
 
  
  
		
	
 
  