SUMMARY: Cluster: long failover time

From: Pang Wai Man Raymond <wmpang_at_se.cuhk.edu.hk>
Date: Thu, 18 Dec 1997 09:39:28 +0800

Hi,

Thanks to Andrew J Cosgriff for the solution. The problem went away after
the DE500-XA was replaced with a DE500-AA.

Regards,
Raymond
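
P.S. The ARP broadcast that the rev. 1 card fails to send after the alias
is added (see Andrew's explanation below) is commonly called a gratuitous
ARP. Here is a minimal Python sketch of what such a frame looks like,
assuming the standard Ethernet/IPv4 ARP layout. The function name, the MAC
address and the IP address are made up for illustration, and the code only
builds the bytes; it does not send anything.

```python
import socket
import struct


def build_gratuitous_arp(mac: bytes, ip: bytes) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request.

    Hypothetical helper for illustration only.
    mac: 6-byte hardware address of the interface taking over the service.
    ip:  4-byte IPv4 address of the relocated service.
    """
    if len(mac) != 6 or len(ip) != 4:
        raise ValueError("expected a 6-byte MAC and a 4-byte IPv4 address")

    broadcast = b"\xff" * 6
    # Ethernet header: destination, source, EtherType 0x0806 (ARP).
    eth_header = broadcast + mac + struct.pack("!H", 0x0806)

    # ARP header: hardware type 1 (Ethernet), protocol 0x0800 (IPv4),
    # address lengths 6 and 4, operation 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    # Sender MAC/IP, then target MAC (zeros, ignored) and target IP.
    # Sender IP == target IP is what makes the request "gratuitous".
    arp += mac + ip + b"\x00" * 6 + ip

    return eth_header + arp


# Made-up example values, not taken from the cluster in this thread.
example_mac = bytes.fromhex("08:00:2b:12:34:56".replace(":", ""))
example_ip = socket.inet_aton("192.0.2.10")
frame = build_gratuitous_arp(example_mac, example_ip)
```

Hosts that receive such a broadcast update their ARP cache entry for the
address right away, which is why the A to B relocation looked instant and
the B to A one had to wait for the cache entry to expire.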

On Mon, Dec 15, 1997 at 08:51:27PM +1100, Andrew J Cosgriff wrote:
>
> Pang Wai Man Raymond <wmpang_at_se.cuhk.edu.hk> wrote:
>
> >We have a cluster of two identical 4100s with Memory Channel
> >installed. Both are running DU4.0b and TCR141 with the latest patches
> >applied. Our problem is that relocating the NFS service takes
> >unacceptably long. With the service mounted by a client, relocation
> >from member A to member B takes just a few seconds. From B to A,
> >however, it takes 1 to 4 minutes.
>
> >I then used ping and arp to check which MAC address the service
> >resolves to. In the first case, i.e. from A to B, the MAC address of the
> >NFS service changed immediately. In the latter case, it stayed at the
> >original MAC address until what I think is the ARP cache timeout, or
> >until I cleared the entry with "arp -d nfs_service". On an AlphaStation
> >this takes about 1-2 minutes, and on a Sun running Solaris 2-4 minutes.
>
> Sounds like you've been bitten by the DE500 ethernet card bug. Check
> the revision (it's printed at boot time, or run "dmesg"). If it's
> less than 2.0, then you've got a DE500-XA, and you should pester DEC to
> swap it for a DE500-AA, which doesn't have this problem. (The problem
> is that the rev. 1 cards won't do an ARP broadcast after an ifconfig
> that aliases a new address to an interface when running at 100Mb/s;
> they only do it at 10Mb/s.)
>
> Enjoy,
> Andrew
> --
> - Andrew J Cosgriff - ajc_at_bing.wattle.id.au
> Well, O.K. I'll compromise with my principles because of EXISTENTIAL DESPAIR!
Received on Thu Dec 18 1997 - 02:40:15 NZDT
