I'm not in Kansas anymore. Current version of TruCluster (1.5) doesn't
support VMS style clustering. V2.0 with DUnix 5.0 is supposed to have this
capability. Right now I can set up an NFS service that will fail over to
another box but that is about as far as it goes.
Thanks to:
alan_at_nabeth.cxo.dec.com
Balaji Chandrasekar [digital_at_astro.ocis.temple.edu]
Dr. Tom Blinn, 603-884-0646 [tpb_at_doctor.zk3.dec.com]
Vipin Gokhale, Compaq SBU, Oracle Corporation [VGOKHALE_at_us.oracle.com]
Randall R. Cable [randy.cable_at_mci.com]
Bruce Hines [Bruce.Hines_at_mci.com]
Lars Bro [lbr_at_dksin.dk]
C.Ruhnke [i769646_at_smrs013a.mdc.com]
Randy Rodgers [randy.rodgers_at_ci.ft-wayne.in.us]
K.McManus_at_greenwich.ac.uk
Ryan Ziegler [Zieglerr_at_novachem.com]
All of the respondents said basically the same thing. For completeness and
the archives I'm going to insert a couple of the replies I got.
First from Chris Ruhnke:
--------------------------------------------
In 25 words (more or less)...
TruCluster doesn't (yet) give you True Clustering in the VMS sense.
Primarily it gives you fault tolerant disk and application servicing. A
UNIX filesystem can be physically mounted on only one DU system as of DU
4.0D and TCR V1.5. The other members of the cluster must access the
filesystem via NFS. Of course with Memory Channel available for
TruCluster, these cluster NFS accesses can be done more quickly than via
Ethernet. The fault tolerance comes into play when a member of the
cluster dies. The disk service (or application service e.g. database
server) can be restarted on a surviving member and accesses to the
disk/application can be resumed -- something VMScluster did not
automatically provide.
Hope that made things a little clearer.
----------------------------------------------
and some good history from Lars Bro:
----------------------------------------------
You cannot do this kind of mount. DU 5.0 should be able to.
History:
DECSafe Available Server was the first product to let
DU systems communicate over a shared SCSI bus and the
ethernet to decide who was alive. All systems could
see all disks via the shared SCSI but if two or more
systems mounted the same filesystem, a crash was inevitable.
It is possible to do reservations on the shared bus, i.e.
a member reserves a disk and it is now impossible for other
members to access that disk. Unfortunately, when a system
boots, it resets all devices on the shared buses, meaning that
all reservations disappear. There is a menu in the management
program 'asemgr' to re-reserve the disks according to ASE's
understanding of what the state is. Unfortunately, it is exactly
when systems crash or reboot that problems occur, so the
reset described above is a great risk. I have personally seen clusters
go down due to more than one machine mounting the same filesystem
(and properly logging this) just because of some misunderstanding
among the machines.
This is the worst problem since 'failover' cannot occur
before all filesystems have been properly unmounted. And
if just one process has its current directory in a file
system that is owned by a service, this service will not
be able to fail over unless the system is rebooted.
This is normally dealt with by use of the system call fuser(2)
that is able to provide a list of processes that have files
open on a given filesystem. You can then kill those processes.
(However, a strange property of DU is that a zombie process
that, while it was alive, executed a file located on
the filesystem still holds the lock although it is completely
deallocated and therefore not visible to fuser(2). It is thus
possible to have a situation where a service cannot fail over
and the reason cannot be detected.) Digital claims that this is
not an error but merely a dispute over the design. The standards
do not specify exactly what shall be deallocated upon exit(). I
have, though, tried the same on Solaris, and that one also frees
the lock on the executable.
With TruCluster came the memory channel, a wire that could connect
the systems and the lock manager. The idea was to have processes
of different members sharing disks. Today, RDBMSs like Oracle
can do this. You will then be forced to have Oracle place its
tablespaces on 'raw devices' so that it can manage its own locks
through the lock manager.
In DU 5.0 (so I am told) the filesystems may also take advantage
of the lock manager. This will enable you to have the same
filesystems mounted on more than one member. And you may put Oracle
on such a filesystem instead of having Oracle manage the locks.
---------------------------------------------------
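For the archives, here is a rough sketch of the "list the processes
holding the filesystem, then kill them" step that Lars describes. It only
parses text that looks like fuser output; it does not run fuser or kill
anything. The output format assumed here (a name, a colon, then PIDs
with optional letter suffixes such as "1234c") and the sample path
/data are my assumptions, not something confirmed for Digital UNIX,
and parse_fuser_pids is a made-up helper name.

```python
import re

# Assumed token shape: a PID optionally followed by usage letters, e.g. "1234c".
PID_TOKEN = re.compile(r"^(\d+)[a-zA-Z]*$")


def parse_fuser_pids(output):
    """Return the unique PIDs found in fuser-style text, in first-seen order.

    Hypothetical helper: the "name: pid pid ..." layout is an assumption
    about what the local fuser prints.
    """
    pids = []
    for line in output.splitlines():
        # Drop the leading "name:" part when a colon is present.
        _name, sep, rest = line.partition(":")
        fields = (rest if sep else line).split()
        for field in fields:
            match = PID_TOKEN.match(field)
            if match:
                pid = int(match.group(1))
                if pid not in pids:
                    pids.append(pid)
    return pids


if __name__ == "__main__":
    sample = "/data:   1234c  5678o  1234c\n"  # made-up sample text
    print(parse_fuser_pids(sample))
```

The resulting list is what you would review by hand before signalling
anything. As Lars notes, a process that is not visible to fuser will
not show up in such a list either, so an empty result does not prove
the filesystem is free.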
Daniel Monjar
Manager, Systems
Organon Teknika
Mailto:Daniel.Monjar_at_orgtek.com
> -----Original Message-----
> From: Monjar, Daniel [mailto:Monjar_at_orgtek.com]
> Sent: Tuesday, October 20, 1998 4:02 PM
> To: 'alpha-osf-managers_at_ornl.gov'
> Subject: what does TruCluster give me?
>
>
> I have three 4100s and an RA7000. I'm using TruCluster 1.5 and Unix 4.0D.
> I need some pointers to some docs that will tell me what I can do with
> TruCluster. I have a lot of experience with clustering on VMS but I have
> a feeling I am in a different world with TruCluster.
>
> To give you something specific to answer: I have created a disk set on my
> HSZ and formatted it as an AdvFS file system. I want each of the three
> 4100s to mount this file system and see what the others see, just like
> VMS makes possible. Can I? The TruCluster stuff talks about a distributed
> lock manager which sounds like a VMSish thing. Is it the same?
>
> Daniel Monjar
> Manager, Systems
> Organon Teknika
> Mailto:Daniel.Monjar_at_orgtek.com
>
Received on Wed Oct 21 1998 - 14:09:27 NZDT