Hi managers,
I just installed TruCluster 1.6 with two members (DS20E).
The cluster seems to work correctly: the disk service relocates both
manually and automatically when one member shuts down. But if the Ethernet
connection of one member loses link, services disappear from both members
and only return when the link comes back up.
The private Ethernet connections between the two members are marked primary
and not monitored; the public Ethernet connections are marked backup and
monitored.
If the member that has the disk service loses its public link, it loses the
disk service, but the service is not relocated to the other member.
Looking at kern.log I see:
First system, called ALPHA1:
Nov 5 14:13:08 ALPHA1 ASE: local HSM Warning: Network interface ee0
192.168.2.202 DOWN
Nov 5 14:13:08 ALPHA1 ASE: local HSM ***ALERT:
HSM_NI_STATUS:192.168.2.202:DOWN
Nov 5 14:13:08 ALPHA1 ASE: local Simulator Notice: snd: exiting...
Nov 5 14:13:08 ALPHA1 ASE: local Director ***ALERT: Network connection down
... exiting
Nov 5 14:13:08 ALPHA1 ASE: local Director Warning: Director exiting...
Nov 5 14:13:12 ALPHA1 ASE: local Agent Notice: will not start a director,
the local network is disconnected
Nov 5 14:13:12 ALPHA1 ASE: local AseMgr Error: Director hasn't started up
yet or local network is disconnected ...
Nov 5 14:13:22 ALPHA1 last message repeated 2 times
Nov 5 14:13:22 ALPHA1 ASE: local HSM Warning: Can't ping ALPHA2 over the
network
Nov 5 14:13:22 ALPHA1 ASE: local HSM Notice: /var/ase/sbin/ase_run_sh:
writing to routing socket: No such process
Nov 5 14:13:22 ALPHA1 ASE: local HSM Notice: /var/ase/sbin/ase_run_sh:
change host 192.168.2.203: gateway 10.10.1.2: not in table
Nov 5 14:13:22 ALPHA1 ASE: local HSM Notice: /var/ase/sbin/ase_run_sh:
writing to routing socket: No such process
Nov 5 14:13:22 ALPHA1 ASE: local HSM Notice: /var/ase/sbin/ase_run_sh:
change host 10.10.1.2: gateway 10.10.1.2: not in table
Nov 5 14:13:22 ALPHA1 ASE: local HSM ***ALERT:
HSM_PATH_STATUS:192.168.2.203:DOWN:10.10.1.2:UP
Nov 5 14:13:27 ALPHA1 ASE: local AseMgr Error: Director hasn't started up
yet or local network is disconnected ...
Nov 5 14:14:02 ALPHA1 last message repeated 7 times
Nov 5 14:15:27 ALPHA1 last message repeated 17 times
Nov 5 14:15:29 ALPHA1 ASE: local HSM Notice: Able to ping ALPHA2 over the
network
Nov 5 14:15:29 ALPHA1 ASE: local HSM Notice: /var/ase/sbin/ase_run_sh:
writing to routing socket: No such process
Nov 5 14:15:29 ALPHA1 ASE: local HSM Notice: /var/ase/sbin/ase_run_sh:
change host 192.168.2.203: gateway 192.168.2.203: not in table
Nov 5 14:15:29 ALPHA1 ASE: local HSM Notice: /var/ase/sbin/ase_run_sh:
writing to routing socket: No such process
Nov 5 14:15:29 ALPHA1 ASE: local HSM Notice: /var/ase/sbin/ase_run_sh:
change host 10.10.1.2: gateway 10.10.1.2: not in table
Nov 5 14:15:29 ALPHA1 ASE: local HSM ***ALERT:
HSM_PATH_STATUS:192.168.2.203:UP:10.10.1.2:UP
Nov 5 14:15:32 ALPHA1 ASE: local AseMgr Error: Director hasn't started up
yet or local network is disconnected ...
Nov 5 14:15:35 ALPHA1 ASE: local HSM Notice: Network interface ee0
192.168.2.202 UP
Nov 5 14:15:35 ALPHA1 ASE: local HSM ***ALERT:
HSM_NI_STATUS:192.168.2.202:UP
Nov 5 14:15:35 ALPHA1 ASE: local Agent Error: netChange: no directors
found, starting one.
Nov 5 14:15:35 ALPHA1 ASE: local Simulator Notice: snd: exiting...
Second system, called ALPHA2:
Nov 5 14:12:58 ALPHA2 ASE: local HSM Warning: member ALPHA1 is disconnected
from the network
Nov 5 14:12:58 ALPHA2 ASE: local Agent ***ALERT: All monitored networks on
member 'ALPHA1' are down
Nov 5 14:13:03 ALPHA2 ASE: local HSM Warning: Can't ping ALPHA1 over the
network
Nov 5 14:13:03 ALPHA2 ASE: local HSM Notice: /var/ase/sbin/ase_run_sh:
writing to routing socket: No such process
Nov 5 14:13:03 ALPHA2 ASE: local HSM Notice: /var/ase/sbin/ase_run_sh:
change host 192.168.2.202: gateway 10.10.1.1: not in table
Nov 5 14:13:03 ALPHA2 ASE: local HSM Notice: /var/ase/sbin/ase_run_sh:
writing to routing socket: No such process
Nov 5 14:13:03 ALPHA2 ASE: local HSM Notice: /var/ase/sbin/ase_run_sh:
change host 10.10.1.1: gateway 10.10.1.1: not in table
Nov 5 14:13:03 ALPHA2 ASE: local HSM ***ALERT:
HSM_PATH_STATUS:192.168.2.202:DOWN:10.10.1.1:UP
Nov 5 14:13:18 ALPHA2 ASE: local AseMgr Error: timeout connecting to Agent
on ALPHA2
Nov 5 14:13:18 ALPHA2 ASE: local AseMgr Error: connect to Agent on ALPHA2
failed
Nov 5 14:13:18 ALPHA2 ASE: local AseMgr Notice: can't connect to local
agent, retrying...
Nov 5 14:13:28 ALPHA2 ASE: local Agent Notice: starting a new director
Nov 5 14:13:28 ALPHA2 ASE: local Agent Notice: client process hung up
during connect phase
Nov 5 14:13:28 ALPHA2 ASE: local AseMgr Error: Director hasn't started up
yet or local network is disconnected ...
Nov 5 14:13:33 ALPHA2 ASE: local AseMgr Error: can't find director in
portmap, retrying...
Nov 5 14:13:43 ALPHA2 last message repeated 2 times
Nov 5 14:13:44 ALPHA2 ASE: local Director Notice: can't ping agent on
ALPHA1
Nov 5 14:14:08 ALPHA2 ASE: local AseMgr Error: timeout connecting to
Director on ALPHA2
Nov 5 14:14:08 ALPHA2 ASE: local AseMgr Error: connect to Director on
ALPHA2 failed
Nov 5 14:14:08 ALPHA2 ASE: local AseMgr Error: can't connect to the
director, retrying
Nov 5 14:14:15 ALPHA2 ASE: local Director Notice: client process hung up
during connect phase
Nov 5 14:14:15 ALPHA2 ASE: local Agent Notice: starting service cluster
Nov 5 14:14:16 ALPHA2 ASE: local Director Notice: started service cluster
on ALPHA2
Nov 5 14:15:23 ALPHA2 ASE: local Director Notice: agent on ALPHA1 came
ONLINE
Nov 5 14:15:25 ALPHA2 ASE: local HSM Notice: member ALPHA1 is UP
Nov 5 14:15:25 ALPHA2 ASE: local Director Notice: Redistributing services
(an agent's monitored network came up)
Nov 5 14:15:25 ALPHA2 ASE: local Director Notice: finished processing agent
state change from HSM: agent ALPHA1 state NIT_UP
Nov 5 14:15:30 ALPHA2 ASE: local HSM Notice: Able to ping ALPHA1 over the
network
Nov 5 14:15:30 ALPHA2 ASE: local HSM Notice: /var/ase/sbin/ase_run_sh:
writing to routing socket: No such process
Nov 5 14:15:30 ALPHA2 ASE: local HSM Notice: /var/ase/sbin/ase_run_sh:
change host 192.168.2.202: gateway 192.168.2.202: not in table
Nov 5 14:15:30 ALPHA2 ASE: local HSM Notice: /var/ase/sbin/ase_run_sh:
writing to routing socket: No such process
Nov 5 14:15:30 ALPHA2 ASE: local HSM Notice: /var/ase/sbin/ase_run_sh:
change host 10.10.1.1: gateway 10.10.1.1: not in table
Nov 5 14:15:30 ALPHA2 ASE: local HSM ***ALERT:
HSM_PATH_STATUS:192.168.2.202:UP:10.10.1.1:UP
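To line up the two logs, something like the following can pull out just the
HSM status alerts with their timestamps. This is only a rough sketch: the
function names (join_wrapped, hsm_alerts) are my own and not part of ASE, and
it assumes the entries look exactly like the excerpts above, including the
line wrapping added by the mail client.

```python
import re

# Start of a syslog entry as seen above, e.g. "Nov 5 14:13:08 ALPHA1 ".
ENTRY_START = re.compile(r"^[A-Z][a-z]{2} +\d+ \d\d:\d\d:\d\d \S+ ")

# An HSM alert entry once its wrapped continuation line has been rejoined.
ALERT = re.compile(
    r"^\w+ +\d+ (?P<time>\d\d:\d\d:\d\d) (?P<host>\S+) ASE: local HSM "
    r"\*\*\*ALERT: (?P<kind>HSM_\w+):(?P<detail>\S+)"
)


def join_wrapped(lines):
    """Rejoin entries that were wrapped onto a following line."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)          # a new timestamped entry
        else:
            entries[-1] += " " + line     # continuation of the previous one
    return entries


def hsm_alerts(lines):
    """Return (time, host, kind, detail) for each HSM ***ALERT entry."""
    alerts = []
    for entry in join_wrapped(lines):
        m = ALERT.match(entry)
        if m:
            alerts.append((m["time"], m["host"], m["kind"], m["detail"]))
    return alerts


if __name__ == "__main__":
    sample = [
        "Nov 5 14:13:08 ALPHA1 ASE: local HSM ***ALERT:",
        "HSM_NI_STATUS:192.168.2.202:DOWN",
        "Nov 5 14:13:22 ALPHA1 ASE: local HSM ***ALERT:",
        "HSM_PATH_STATUS:192.168.2.203:DOWN:10.10.1.2:UP",
    ]
    for alert in hsm_alerts(sample):
        print(*alert)
```

Running it over both excerpts and sorting by time gives a compact view of
when each member saw the interface and path go DOWN and come back UP.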
Received on Mon Nov 05 2001 - 22:03:34 NZDT