change in LSM volrootmir policy

From: Tim Mooney <mooney_at_dogbert.cc.ndsu.NoDak.edu>
Date: Tue, 11 Aug 1998 17:30:54 -0500 (CDT)

All-

This is not a summary, and it's not a question -- it's more of a "heads up"
regarding a change with volrootmir that may affect a number of people. This
change only affects you if you're mirroring your boot and primary
swap volumes to a disk that is larger than your primary boot disk.

In October of 1996 I asked a question on this list regarding volrootmir and
restrictions on how you could use the remaining parts (those not used by the
mirroring process) of the disk containing the mirror plexes. A number of
people responded indicating that the disk had to be completely free *before*
you ran volrootmir, but afterwards you could use the extra space for
whatever. It was also noted that the mirror disk had to be "at least as large"
as the original, though it was OK to use a larger disk as the mirror if you
wanted.

We've been doing OS-mirroring on a number of machines ever since, and I just
recently discovered a change in volrootmir. The mirror disk on one of our
machines (running Digital Unix 4.0b, plus most of patch set #7, version 1) started logging
a huge number of bad block and timeout errors to the binary error log, so
I had our friendly FSE bring me a replacement disk.

I disabled mirroring on the failing disk, an RZ28M, and removed it from the
LSM configuration. After replacing the disk with a new RZ28M and disklabeling
the drive as an rz28m with a & b partitions that match the size of the a & b
partitions on the primary root disk (which is an RZ26L, a smaller disk), I
attempted to re-enable mirroring using

        volrootmir -a rz16

When I tried this, volrootmir exited with an error indicating that
"Both disks must be of the same type."
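
For anyone who wants to see ahead of time whether the two disks would trip
this check, comparing the disk type recorded in each label is one option.
What follows is only a sketch under assumptions: it assumes `disklabel -r`
prints a BSD-style "disk: <type>" line, it has not been confirmed that this
is the field volrootmir itself compares, the helper names are made up for
illustration, and rz8 is a hypothetical name for the boot disk (rz16 is the
mirror from the command above).

```shell
# Extract the "disk:" field from disklabel-style text on stdin.
# Assumes a header line of the form "disk: rz26l" (BSD-style label).
label_disk_type() {
    awk -F': *' '$1 == "disk" { print $2; exit }'
}

# Compare two type strings case-insensitively; succeeds if they match.
same_disk_type() {
    a=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    b=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    [ "$a" = "$b" ]
}

# Example use on a live system (device names are hypothetical):
#   boot=$(disklabel -r rz8  | label_disk_type)
#   mirr=$(disklabel -r rz16 | label_disk_type)
#   same_disk_type "$boot" "$mirr" || echo "types differ: $boot vs $mirr"
```

The same comparison, run after the mirror is established, would also show
whether the mirror's label now reports the boot disk's type.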

As a temporary measure to get mirroring working again, I copied the volrootmir
script and went into it with an editor, removing the stanza of code that
checks to make sure they're the same type of drive.

After doing this, I was able to re-establish the mirror and it's working
reliably, but unfortunately the volrootmir script replaced the disklabel on
the rz28m with an exact copy of the disklabel from the boot disk, an rz26l.
This means that the mirror now shows up as an rz26l, even though it's really
an rz28m. This also means that there's no way I could use extra space on the
rz28m if I wanted to. That's not a big deal for me since I wasn't using the
space anyway, but it might be a big deal for someone else.

I checked with Digital support, and the very knowledgeable person I spoke with
knew where I was going with the problem description before I was done
explaining. He indicated that the change fixes `volrootmir' to work the way
it's supposed to, and that although you could previously use volrootmir to
mirror a smaller disk to a larger disk and still use the remaining parts of
the larger disk, volrootmir previously was "in error". The new behavior is
"broken, as designed", as he put it.

I wanted to let others know, so that other people don't get the same surprise
I got. You'll only see this change when you go through the initial mirroring
process, so if you've already run through root mirroring with volrootmir under
a previous version of Digital Unix (as I did), things will continue to work as
long as you don't remove the mirror. If you remove the mirror and then try
to set it up again using volrootmir, that's when you'll notice the problem.

Hope this information helps someone out,

Tim
-- 
Tim Mooney                              mooney_at_dogbert.cc.ndsu.NoDak.edu
Information Technology Services         (701) 231-1076 (Voice)
Room 242-J1, IACC Building              (701) 231-8541 (Fax)
North Dakota State University, Fargo, ND 58105-5164
Received on Tue Aug 11 1998 - 22:31:56 NZST
