SUMM: Advice needed for system recovery/availability

From: Guy Dallaire <dallaire_at_total.net>
Date: Thu, 03 Apr 1997 10:04:07

Sorry for the late summary, and thanks to all who replied. I cannot name
everyone, but you know who you are.

Recovery is a broad subject, and I received many suggestions. People were
even willing to help me set up DECsafe ASE, but this is not an option for
now. It looks dangerous to set it up while the prod machine is running.

Meanwhile, I was looking for a way to have my system disks mirrored. I
will finally opt for a second internal drive in each of my Alpha 2100s and
will mirror the disks (/, /usr, swap and /var partitions) with LSM, even
though I find this product a bit complicated, considering I already use
hardware RAID and AdvFS.

Original Post:

-------------------------------

Hello,

We are currently exploring ways to ensure fast recovery of our production
server in case of a machine failure (board) or in case of a system disk
failure.

We have two AlphaServer 2100/RM machines (one dev + one prod), and the
hardware is set up for DECsafe. We use dual-redundant HSZ40s, and all our
production data is protected by RAID-5 or mirrorsets.

Unfortunately, DECsafe is not an option right now. We should have configured
the systems with DECsafe right from the start, but the factory-installed
setup did not match our expectations at all, so we disabled it. Now we are
in production, and implementing DECsafe would incur too much downtime on the
machines. Production staff work from 8:00 to midnight, Monday to Friday and
on weekends, and developers are working on the development machine.

We may implement DECsafe in the future, if we can understand how it
works. I've never seen a product so badly documented: there are no examples
in the manuals, and there is no text describing what goes on behind the
curtains...

For now, we are planning the following and would like to know what you
people think of it:

1st option: Is there a way to put a 2nd INTERNAL hard disk in an Alpha
2100/Rack Mount and boot from it if the original disk fails? We would copy
the system disk to the second disk every night. That would fix the problem
of a system disk failure on both machines, but would not do anything useful
in case of a main board failure.
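
Something like this is what we have in mind for the nightly copy (a
sketch only; the mount point and domain name are hypothetical, and the
spare disk would first need a disk label, boot blocks, and its own
AdvFS domain):

    # nightly clone of the root fileset to the spare internal disk
    mount backroot_domain#root /backroot
    vdump -0 -f - / | vrestore -x -f - -D /backroot
    umount /backroot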

2nd option: Could we make a copy of the prod system disk on a unit behind
the HSZ40s? If the prod machine went down, we could boot the dev machine
from the HSZ using the copy of the failed prod machine's system disk. We
could then use the dev server as a backup machine for the prod server,
though the dev people could no longer work. The only problem I see (if we
can boot from an HSZ40 disk) is that some configuration files would have to
be modified on the prod system disk copy, because the dev machine has a
different network card MAC address, a different amount of memory (so Oracle
config files would need to be modified as well), etc. The different HSZ40
devices are already defined on both machines.
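
The per-host fixups we anticipate look roughly like this (paths follow
Digital UNIX conventions; the /clone mount point is hypothetical, and
the list is surely incomplete):

    # edit the mounted copy of the prod system disk before booting it
    # on the dev machine
    vi /clone/etc/rc.config    # HOSTNAME, network interface settings
    vi /clone/etc/hosts        # host/IP entries that differ
    # plus Oracle parameters sized for the dev machine's memory,
    # e.g. buffer settings in the instance's init.ora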

If someone has a better solution, or could tell me which of the 2 above is
best, or if we are missing something here, please share your experience
with us.

                                        Thanks !

-------------------------------------------------------

REPLIES:

     First let me start by saying that what I propose will certainly
cause a crash if you do not do it correctly. It also has a reasonable
chance of causing data corruption.

     The single most important feature of DECsafe is that it attempts to
prevent having a single partition mounted on more than one machine at a
time. All the other stuff is fluff... nice fluff, but fluff nonetheless.

     IF TWO SYSTEMS MOUNT THE SAME PARTITION AND START MODIFYING IT,
BOTH ARE LIKELY TO CRASH.

     Having got that out of the way, here is what I propose. First, you
need to establish a method of determining whether the primary machine has
crashed. I think a simple shell script would do: just have it ping the
host, and when it fails to get more than a few pings in a row back, presume
it has crashed. This, of course, will kill you if you merely have a
network failure, since the primary host will still have access to the
disks. If these hosts were dual-homed, that would probably help prevent
this problem.
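
     A minimal sketch of such a watcher (hostname, threshold, and the
alert action are placeholders, and it inherits the network-failure
caveat above):

    #!/bin/sh
    # crude failure detector: declare the primary dead after five
    # consecutive failed pings, then alert a human
    PRIMARY=prodhost                      # placeholder hostname
    FAILS=0
    while true; do
        if ping -c 1 $PRIMARY > /dev/null 2>&1; then
            FAILS=0
        else
            FAILS=`expr $FAILS + 1`
        fi
        if [ $FAILS -ge 5 ]; then
            echo "$PRIMARY appears down" | mailx -s ALERT root
            exit 1                        # or kick off the failover steps
        fi
        sleep 10
    done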

     Once the primary host has crashed, you can mount the disks on the
backup host. If you are using UFS file systems, you will have to run fsck
before mounting them. If you are using AdvFS, you need to make sure that
you have a copy of /etc/fdmns/DOMAINNAME on both machines.
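
     In command terms, the takeover might look like this (device, domain,
and mount-point names are hypothetical):

    # only after the primary is known dead -- never while it is up!
    # UFS: clean the file system, then mount it
    fsck -p /dev/rrz17c && mount /dev/rz17c /shared
    # AdvFS: with /etc/fdmns/prod_domain already copied over,
    # the fileset mounts directly
    mount prod_domain#data /shared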

     In order to ensure that you don't have problems, I would not list the
shared file systems in /etc/fstab; that way, if the primary machine
recovers, it will not try to mount the disks. This does mean that starting
the primary machine will involve a manual operation each time, or a
sophisticated script.

     Alternatively, you may just want to have alarms go off and do the
failover manually.

     The best solution would be to use DECsafe. It really isn't that
hard to do. If you are in the Boston area, and are not a competitor, I
could probably arrange to show you our setup. I work for State Street
Bank.

-cliff

----------------------------------------------

> We may implement DECsafe in the future, if we can understand how it
> works. I've never seen a product so badly documented: there are no examples
> in the manuals, and there is no text describing what goes on behind the
> curtains...

Are you sure you have all the documentation? We have used DECsafe, and the
documentation is fine; it is almost trivial to set up for NFS services.

> 1st option: Is there a way to put a 2nd INTERNAL hard disk in an Alpha
> 2100/Rack Mount and boot from it if the original disk fails? We would copy
> the system disk to the second disk every night. That would fix the problem
> of a system disk failure on both machines, but would not do anything useful
> in case of a main board failure.

Yes, use LSM with root mirroring. We use that here with a search path for
BOOTDEF_DEV, and it works fine. There is no need to do backups; the system
continues even if one of the disks fails. We pulled one of our disks with
the system up just to be sure that it works. Users don't even notice, and
system administrators won't notice either unless they check.
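
The broad strokes, as I recall, are something like this (the LSM guide
has the real procedure; disk and console device names are examples only):

    volsetup                 # one-time LSM initialization
    volencap rz0             # encapsulate the existing boot disk, then reboot
    volrootmir rz1           # mirror root/swap onto the second disk,
                             # if your LSM version provides this command

and at the SRM console:

    >>> set bootdef_dev "dka0 dka100"   # search path: try the mirror second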

-steve

----------------------------------------------------

I am using a DECsafe environment in almost exactly the configuration that
you are talking about. We have a 2000, a 2100, and a 4000. We have two
StorageWorks arrays (a 300 and a 500), and DECsafe works quite nicely.
If you are physically set up (i.e. KZPSAs, SCSI buses, etc.) for
DECsafe, implementing it is quite easy and very quick. The environment is
not at all difficult to work with (though you are correct, the
documentation stinks!). If you wish to look at that option again, I
would be happy to help as much as I can.

If you are physically set up to implement, but still leery, you can
implement some of the capabilities without DECsafe (this means a lot of
VERY careful manual intervention).

As to some of the other solutions: you can boot from other drives and
may even be able to set this up to happen automatically. If you have the
space, consider running /usr and /var from a device separate from root,
to make root as generic as possible and reduce the possibility of a
total loss. This would also allow duplication or mirroring of the /usr
and /var partitions.
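
An illustrative /etc/fstab fragment for that split (device names are
made up; AdvFS filesets would use domain#fileset entries instead):

    /dev/rz0a   /      ufs  rw  1 1
    /dev/rz1g   /usr   ufs  rw  1 2
    /dev/rz1h   /var   ufs  rw  1 2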

Good luck and if I can be of further assistance please feel free to
contact me.

- Jef Hamlin

---------------------------------------------------------

     Couple of options:
     
     You can encapsulate your system disk and create a mirror on an HSZ40.
     Check out volsetup and volencap. Then, for example, set your bootdef_dev
     to "dka0 dra5".
     
     If you have one internal shelf and some leftover money, you can
     invest in a BA356 split box, a terminator, a second SCSI adapter (KZPAA),
     a cable, a second power supply, and a disk to mirror the system disk.
     Install all of the above plus the second system disk, shadow them at the
     hardware level, and you cannot ask for better redundancy.
     
     Ronny
     The Walt Disney Company
     Disney Studios, Burbank, California
    
------------------------------------------------

Yes, I know that a 2100A/RM system can support two internal disks in the
RM enclosure -- not 100% sure about a 2100/RM, but it should be the same. If
you add a second disk, you should really look at LSM for mirroring the boot
disk. You should have LSM as part of the NAS200 base server PAK, which you
normally would have bought with a 2100-class system.

The LSM documentation can be a little hard to follow at points, but there
is a step-by-step for mirroring the boot disk (/, /usr, /var, and primary
swap). The configuration is designed to boot from the mirror if the primary
isn't available.
  
I'm trying to remember whether there are any restrictions on booting from
HSZ disks. I can't think of any off the top of my head, though I should say
that our 2100A/RMs boot from internal drives. If there aren't any
restrictions, why not migrate the boot drives to the HSZ40 and use hardware
mirroring -- it should have less overhead than software mirroring.
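
From the HSZ40 CLI, a boot-disk mirrorset would look roughly like this
(names and unit numbers are illustrative and from memory -- check the
HSZ manual before trying it):

    ADD MIRRORSET MIRR1 DISK110 DISK210
    INITIALIZE MIRR1
    ADD UNIT D100 MIRR1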

I think the only trick is in the setup for bootdef_dev at the SRM console.
The LSM configuration is to set it up with two devices; I believe that
it will fail over cleanly to the second if it can't get a valid boot block
off of the first.

E.g., if the primary is dsk100 and the secondary is dsk600:

set bootdef_dev "dsk100 dsk600"

NOTE - You need to use the quotes so the SRM console knows this is all
one parameter; the LSM documentation neglects to mention this.

+--------------------------------+------------------------------+
| Tom Webster                    | "Funny, I've never seen it   |
| SysAdmin MDA-SSD ISS-IS-HB-S&O | do THAT before...."          |
| webster_at_ssdpdc.mdc.com      | - Any user support person    |
+--------------------------------+------------------------------+
| Unless clearly stated otherwise, all opinions are my own.     |
+---------------------------------------------------------------+

Guy Dallaire
dallaire_at_total.net

"God only knows if god exists"
Received on Thu Apr 03 1997 - 17:58:08 NZST
