Cluster interface choice and ICMP response times

From: Nigel Rantor <wiggly_at_wiggly.org>
Date: Thu, 01 Mar 2001 14:55:31 +0000

Hi there,

Question regarding how a cluster alias is used by the cluster when a
single instance application is run on a particular node.

Topic: ICMP response times wrt cluster aliases
Platform: Tru64 DS20s, 2 per cluster, OS rev 5.0a
Responses to: wiggly_at_wiggly.org

I believe we have a problem related to the order in which nodes come up
within the cluster. Basically we see different response times on ICMP
packets (ping/traceroute) on the different nodes. In our case NodeA has
double the response time of NodeB. This occurs if NodeA is the first
node to be booted in the cluster. In other words we can make NodeA have
double the response time of NodeB if we boot NodeB first and then bring
up NodeA.

I do not know enough about the way in which Tru64 uses its cluster alias
address to know for sure what is going on but my hypothesis is below.

- The first node to come up has the cluster alias assigned to it so that
requests to the cluster can be handled.

- When a program is run from the node that is not the node that has been
assigned the cluster alias address any packets that are sent from the
node are 'modified' so that their source address is actually the cluster
alias rather than the specific machine address that generated the
packets.

- When the packets return they are dealt with by the node that has been
assigned the cluster alias. It recognises that it was not the node that
the packets originated from and forwards them to the correct machine.

- Therefore packets generated by any node that does not have the cluster
alias assigned to it will have to go through the node that does have the
cluster alias assigned to it because of the way in which the cluster
alias is used to ensure any node in the cluster can handle requests.
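
To make the hypothesis above concrete, here is a minimal Python sketch of
the reply path it implies. It models only the hypothesis as stated, not
documented Tru64 behaviour. The names alias_holder and reply_path are
invented for illustration, and the rule "first node booted holds the
alias" is the assumption from the first point above.

```python
def alias_holder(boot_order):
    """Assumption from the hypothesis: the first node booted holds the alias."""
    return boot_order[0]


def reply_path(boot_order, sender):
    """Hops an ICMP reply would take back to `sender` under the hypothesis.

    Replies are addressed to the cluster alias, so they always arrive at
    the alias holder first. If the sender is a different node, the alias
    holder has to forward the reply on to it.
    """
    holder = alias_holder(boot_order)
    if sender == holder:
        return ["remote host", holder]
    return ["remote host", holder, sender]


# NodeB booted first, then NodeA (hypothetical boot order).
boot_order = ["NodeB", "NodeA"]
for node in ("NodeA", "NodeB"):
    print(node, "<-", " <- ".join(reversed(reply_path(boot_order, node))))
```

Under these assumptions only the node that does not hold the alias gets
the extra forwarding hop. The sketch says nothing about how much latency
that hop would add.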

Does this sound reasonable to anyone as the way in which things are
handled? I am seeing (for example) 200ms vs 400ms between different
nodes in a cluster. Does anyone have a way of getting around this so
that both nodes get the same (good) response time? The ICMP response
times are critical to my application and I want to be able to have a
single instance app controlled by CAA that can fail over to the other
node if necessary. At the moment this will corrupt my data since the
response times suddenly double from internet hosts.

On the subject of the cluster alias: if I do not want a cluster alias,
do I need one? If I merely want single instance applications running
with the ability for CAA to fail over between nodes, is it necessary for
me to have a cluster alias at all?

I have been looking through as much stuff as I could find on this but I
cannot find any definitive answers. Any thoughts, comments, ideas, fixes
or pointers to other resources would be welcomed.

Kind regards,

 Nigel Rantor - wiggly_at_wiggly.org
Received on Thu Mar 01 2001 - 14:56:21 NZDT
