Hello,
SUMMARY:
It is neither safe nor supported to access a shared disk from multiple
systems simultaneously, and even READ-ONLY mounts may cause problems.
However, DEC is working toward this.
RESPONDERS:
Robert B. Reinhardt
Dave Golden
alan_at_nabeth.cxo.dec.com
Jonathan B. Craig
Gordon Schumacher
Christophe Sawicki
Dave Cherkus
Kurt Carlson
Regards,
Richard Jackson                     George Mason University
Computer Systems Senior Engineer    UCIS / ISO
                                    Computer Systems Engineering
QUESTION:
Has anyone successfully accessed/used disks simultaneously from two
different systems?
I have two 2100's running DU 3.2c attached via a tri-link connector
to a SW300 cabinet. The SW300 has a HSZ40 controller with RZ29B's
and RZ28M's, and each of the 2100's has a KZPSA PCI-to-SCSI Storage
adapter.
It would be nice if both systems could SAFELY mount the virtual disk
supplied by the HSZ40 locally, rather than having one system mount it
locally and NFS-export it to the other system.
I can access the virtual disk via both systems simultaneously, but how safe
is this?
RESPONSES:
-------------------------------------------------------------------
From: "Robert B. Reinhardt" <breinhar_at_access.digex.net>
From the SCSI point of view, a la DECsafe ASE, etc., YES. But
Digital Unix does not yet have a Unix filesystem that can handle
being dual-mounted. I understand that Digital is working on this,
something to be derived from an old VMS filesystem.
-------------------------------------------------------------------
From: golden_at_falcon.invincible.com
This isn't supported and it won't work. There's no cluster lock
manager like there is in VMS. You can run DECSafe (the available
server product) and make sure that in the event of a failure, the other
system mounts the disk and exports it with the same host name.
-------------------------------------------------------------------
From: alan_at_nabeth.cxo.dec.com
Utterly unsafe, unless you mount both file systems read-only.
Even then, it probably isn't a supported configuration. Two
systems trying to access the same file system have no way
of knowing today what modification the other system has made
to the file system. The buffer caches of the two systems
won't be consistent with respect to each other. It is only
a matter of time before one system thinks the file system
has become corrupt from the other changing it.
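The cache problem described above can be illustrated with a toy model. This is only a sketch, not Digital Unix code: the SharedDisk and Host classes, the block number, and the block contents are all hypothetical, and each host is assumed to keep a private cache with no lock manager to invalidate it.

```python
class SharedDisk:
    """A block device visible to several hosts (hypothetical model)."""

    def __init__(self):
        self.blocks = {}


class Host:
    """A host with a private buffer cache and no cluster lock manager."""

    def __init__(self, name, disk):
        self.name = name
        self.disk = disk
        self.cache = {}

    def read(self, blkno):
        # A block already in the cache is served from memory;
        # the disk is not consulted again.
        if blkno not in self.cache:
            self.cache[blkno] = self.disk.blocks.get(blkno)
        return self.cache[blkno]

    def write(self, blkno, data):
        # Write-through for simplicity; only THIS host's cache is updated.
        self.cache[blkno] = data
        self.disk.blocks[blkno] = data


disk = SharedDisk()
disk.blocks[7] = "free-list v1"
host_a, host_b = Host("a", disk), Host("b", disk)

host_b.read(7)                     # host b caches block 7
host_a.write(7, "free-list v2")    # host a changes it on disk
stale = host_b.read(7)             # host b still sees its old copy

print("host b sees:", stale)
print("disk holds: ", disk.blocks[7])
```

Host b keeps acting on "free-list v1" while the disk holds "free-list v2". Nothing tells either host that its view is out of date.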
The current implementation of mount might have checks to
prevent multiple read/write mounts anyway. I think the
file system is written as being "dirty" when mounted so
that if the system crashes an fsck will be forced (for
UFS only obviously). The other system will see a dirty
file system and not want to mount it. You can override
that with an option, but you should expect one or both
systems to panic eventually with file system inconsistencies.
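A minimal sketch of the "dirty flag" behaviour guessed at above. It is a sketch under assumptions, not the actual mount code: the superblock is modelled as a dict with a single hypothetical "clean" field, and the force argument stands in for whatever override option the real mount command provides.

```python
class DirtyFilesystemError(Exception):
    """Raised when a mount is refused because the clean flag is not set."""


def mount(superblock, force=False):
    # Refuse a file system that was not cleanly unmounted, unless forced.
    if not superblock["clean"] and not force:
        raise DirtyFilesystemError("file system is dirty; fsck it or force the mount")
    # Mark the file system dirty for as long as it stays mounted,
    # so that a crash leaves evidence behind for fsck.
    superblock["clean"] = False
    return True


def unmount(superblock):
    # A clean unmount sets the flag again.
    superblock["clean"] = True


# The dict stands for the on-disk superblock that both hosts can read.
shared_sb = {"clean": True}

mount(shared_sb)                      # first host mounts normally
try:
    mount(shared_sb)                  # second host sees the dirty flag
    second_refused = False
except DirtyFilesystemError:
    second_refused = True

forced = mount(shared_sb, force=True)  # the override "works", unsafely
print(second_refused, forced)
```

The check only stops an accidental second mount. Once forced, nothing coordinates the two hosts afterwards.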
By mounting both read-only, nothing is allowed to change,
so there shouldn't be any inconsistencies. The problem
here is simply the one of having multiple SCSI initiators on
a common bus. ASE takes care of this, but it may include
special driver bits (I've never looked closely).
You might be able to partition a logical unit and have the
two systems mount separate non-overlapping partitions, but
this may still have the multi-initiator problem. Software
that does its own locking to ensure consistency might also
be able to have multi-access.
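If the separate-partitions approach is tried, it is worth confirming that the partitions really do not overlap before giving one to each system. A small sketch of that check follows. The Partition tuple, the partition names, and the block offsets are invented for illustration and are not read from a real disklabel.

```python
from collections import namedtuple
from itertools import combinations

# start and length are in blocks; all values below are made up.
Partition = namedtuple("Partition", "name start length")


def overlapping_pairs(partitions):
    """Return the names of every pair of partitions that share blocks."""
    bad = []
    for p, q in combinations(partitions, 2):
        # Two half-open ranges [start, start + length) intersect
        # when each one starts before the other one ends.
        if p.start < q.start + q.length and q.start < p.start + p.length:
            bad.append((p.name, q.name))
    return bad


layout = [
    Partition("a", 0, 1000),
    Partition("b", 1000, 2000),
    Partition("g", 2500, 500),
]
conflicts = overlapping_pairs(layout)
print(conflicts)
```

Here "b" and "g" share blocks, so they must not be mounted by different systems. This check says nothing about the multi-initiator problem on the bus itself.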
-------------------------------------------------------------------
From: "Jonathan B. Craig" <jcraig_at_i2k.net>
Take a backup because you will definitely clobber your system. Both
systems will store their own copy of the superblock and be unable to see
the work that the other system is doing. Now, if you have the DECsafe ASE
(failover) software you may be aware of the DECsafe TrueClusters that will
be available soon. This product will allow both systems to mount at the
same time but will require the PCI Memory Channel hardware (2 systems
approx 8-10K). This is done by running an NFS-like mount across the
PCI Memory Channel card giving you 100MB access with very low latency.
If you have the hardware to test, one thing I thought to try would be to
mount the file system "READ-ONLY" on one server and Full access on the other.
This may work but no promises are made. I have the DECsafe software
and it accidentally double-mounted the SW300 cab and it did not crash us.
But when we shut the system back down, the idle system was last and it
blasted the superblock of the filesystem.
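The shutdown incident described above is consistent with a stale in-memory superblock being written back last. A toy reconstruction follows. It is an assumption about the mechanism, not a trace of what DECsafe or the file system actually did, and the Host class, field names, and numbers are hypothetical.

```python
import copy

# The on-disk superblock, modelled as a dict (hypothetical fields).
disk = {"superblock": {"free_blocks": 1000}}


class Host:
    def __init__(self, name):
        self.name = name
        self.sb = None

    def mount(self, disk):
        # Each host takes its own in-memory copy at mount time.
        self.sb = copy.deepcopy(disk["superblock"])

    def allocate(self, disk, nblocks):
        # The active host updates its copy and flushes it to disk.
        self.sb["free_blocks"] -= nblocks
        disk["superblock"] = copy.deepcopy(self.sb)

    def unmount(self, disk):
        # At unmount the in-memory copy is written back unconditionally.
        disk["superblock"] = copy.deepcopy(self.sb)
        self.sb = None


active, idle = Host("active"), Host("idle")
active.mount(disk)
idle.mount(disk)               # the accidental second mount

active.allocate(disk, 400)     # real work happens on one host only
active.unmount(disk)
after_active = disk["superblock"]["free_blocks"]

idle.unmount(disk)             # the idle host shuts down last
after_idle = disk["superblock"]["free_blocks"]

print(after_active, after_idle)
```

After the idle host unmounts, the superblock again claims 1000 free blocks even though 400 of them are in use.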
-------------------------------------------------------------------
From: SXRGS <sxrgs_at_orca.alaska.edu>
Basically no. We've been pursuing similar ideas here and have gotten the
definitive answer from Digital that shared dasd (in the real sense of the
word) is not doable or supported. What we have been doing here is sharing
HSZ40's between 2 systems (and not in an ASE environment) but only mounting
a particular RAID set or disk to a single system at a time. Even with this
configuration we've had some problems that we have suspected might be
related to that configuration, i.e., file system corruption on the HSZ40
disks when one of the systems shuts down and the other is active. We
pursued this question with Digital at the last DECUS and were told that
theoretically the HSZ40 should correctly handle bus resets on shared
controllers, but that it's only been qualified in a DECsafe ASE
configuration.
I did ask about whether true shared disk (on the MVS model) would ever be
supported by DUNIX and was told, unofficially by a couple different people,
yes...
-------------------------------------------------------------------
From: Christophe Sawicki <chris_at_cae.ca>
If I understand correctly the configuration you described, you would like
to share the same set of disks between two (or more) computers attached to
a common SCSI bus.
In a configuration where the same disks are mounted locally on several
nodes, nothing ensures synchronization between the file system caches
running on the different nodes. Such a configuration can be used safely
under the following restrictions:
a) the file systems are mounted READ-ONLY; or
b) the disks are divided into different partitions which can be both read
and written. However, each such partition can be locally mounted on only
one system at a time; or
c) a Distributed Lock Manager is used to implement the distributed file
system cache (not available in Digital Unix yet).
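The three restrictions listed above can be written down as a check on a planned set of mounts. This is a sketch only: the function name, the (host, partition, mode) tuples, and the device names are hypothetical, and it encodes just the rules as stated in this reply. Other replies caution that even read-only sharing is probably unsupported.

```python
from collections import defaultdict


def unsafe_partitions(mounts, have_dlm=False):
    """Return the partitions whose planned mounts break rules a) to c).

    mounts is a list of (host, partition, mode) tuples, mode "ro" or "rw".
    """
    if have_dlm:
        # Rule c): a distributed lock manager coordinates the caches.
        return []
    users = defaultdict(list)
    for host, partition, mode in mounts:
        users[partition].append((host, mode))
    bad = []
    for partition, entries in users.items():
        hosts = {host for host, _ in entries}
        all_read_only = all(mode == "ro" for _, mode in entries)
        # Rule a): any number of hosts, as long as every mount is read-only.
        # Rule b): read/write is fine when a single host has the partition.
        if len(hosts) > 1 and not all_read_only:
            bad.append(partition)
    return sorted(bad)


plan = [
    ("node1", "rz16a", "ro"), ("node2", "rz16a", "ro"),   # rule a)
    ("node1", "rz16g", "rw"),                             # rule b)
    ("node1", "rz16h", "rw"), ("node2", "rz16h", "ro"),   # violates both
]
problems = unsafe_partitions(plan)
print(problems)
```

Only "rz16h" is flagged, since one node writes to it while another has it mounted.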
-------------------------------------------------------------------
From: Dave Cherkus <cherkus_at_UniMaster.COM>
It's not as easy as it sounds.
What you propose as a start isn't going to work. The current DEC file
systems will not deal with a second node modifying the on-disk data
structures underneath the first node. The software just doesn't
support it. A clustered file system is required. Given DEC's interest
in clusters, I would imagine a clustered file system is coming, but
none is here today.
Also, the SCSI device drivers in the base DU product don't deal with
two hosts on the bus. In particular, when one issues a SCSI bus reset
(i.e. when booting) the other host would/could just go off the air.
Are you familiar with the DEC ASE (Available Server Environment)
product? If you are willing to go that route, you will get device
drivers that support multiple hosts on the bus (assuming you have the
right kind of SCSI controllers), and an environment that will support
load balancing and failover of services. It still is _not_ a clustered
file system, but users of the two systems could access the systems via
NFS, and get a degree of load balancing and transparent failover.
-------------------------------------------------------------------
From: Kurt Carlson <SXKAC_at_orca.alaska.edu>
To the best of my knowledge this is not supported by Digital Unix;
there is no locking mechanism to control this kind of access to
a single disk. For a single disk, NFS is your only choice.
You can access two different disks from two systems connected to the
same HSZ40 via an H885 tri-link. Technically this is supported only
under DECSafe-ASE, but it does work if you're careful... we're doing it.
Under this configuration I've seen two AdvFS panics which can likely
be attributed to this configuration... first was when both systems
were booting at the same time and apparently both issued scsi bus
resets which collided on one system, the second was when somebody
issued, I believe, a showfdmns on a domain already active on the other
system.
Not safe at all; DU does not have a locking mechanism for shared disk
at this time. Maybe IBM will offer that technology to OSF someday
(don't hold your breath).
With the development of memory channel clusters my guess is they'll not
have it in any foreseeable future; the direction I heard at DECUS with the
future hub/kzps? was that an I/O can be initiated on a channeled system
without ever touching the memory/cpu of the owning system.
-------------------------------------------------------------------
Received on Wed Dec 20 1995 - 15:28:42 NZDT