Advice needed: Recovery strategy

From: Guy Dallaire <dallaire_at_total.net>
Date: Fri, 21 Mar 1997 09:55:59

Hello,

We are currently exploring ways to ensure fast recovery of our production
server after a machine (main board) failure or a system disk failure.

We have two AlphaServer 2100/RM machines (one dev, one prod), and the
hardware is set up for DecSafe. We use dual-redundant HSZ40 controllers,
and all our production data is protected by RAID-5 sets or mirrorsets.

Unfortunately, DecSafe is not an option right now. We should have configured
the systems with DecSafe right from the start, but it was factory installed
and did not match our expectations at all, so we disabled it. Now we are
in production, and implementing DecSafe would incur too much downtime on the
machines. Production staff work from 8 to midnight, Monday to Friday, and
on weekends the developers work on the development machine.

We may implement DecSafe in the future, if we can understand how it
works. I've never seen a product so badly documented: there are no examples
in the manuals, and there is no text describing what goes on behind the
scenes...

For now, we are planning the following and would like to know what you
people think of it:

1st option: Is there a way to put a 2nd INTERNAL hard disk in an Alpha
2100/Rack Mount and boot from it if the original disk fails? We would take
a backup of the system disk to the second disk every night. That would fix
the problem of system disk failure for both machines, but would not do
anything useful in case of a main board failure.
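
As a rough illustration of the nightly copy in the 1st option, here is a
minimal Python sketch. It only covers a file-level copy of a mounted tree
plus a checksum check of each copied file. The function names (file_digest,
mirror_tree) are made up for this example, and making the second disk
actually bootable (disk label, boot blocks) is outside what it shows.

```python
import hashlib
import os
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def mirror_tree(source, target) -> list[str]:
    """Copy regular files from source to target and verify each copy.

    Returns the relative paths whose copy did not match the original.
    Symbolic links and special files are skipped in this sketch.
    """
    source, target = Path(source), Path(target)
    mismatches = []
    for dirpath, _dirnames, filenames in os.walk(source):
        rel_dir = Path(dirpath).relative_to(source)
        (target / rel_dir).mkdir(parents=True, exist_ok=True)
        for name in filenames:
            src = Path(dirpath) / name
            if src.is_symlink() or not src.is_file():
                continue
            dst = target / rel_dir / name
            shutil.copy2(src, dst)  # copies data and timestamps
            if file_digest(src) != file_digest(dst):
                mismatches.append(str(rel_dir / name))
    return mismatches
```

A nightly job could call mirror_tree() with the system disk's mount point
and the second disk's mount point (both site-specific), and raise an alert
if the returned list is not empty.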

2nd option: Could we make a copy of the prod system disk to a unit behind
the HSZ40s? If the prod machine went down, we could boot the dev machine
from the HSZ40 using the copy of the failed prod machine's system disk. The
dev server would then act as a backup machine for the prod server, but the
dev people could not work in the meantime. The only problem I see (if we can
boot from an HSZ40 disk) is that some configuration files would have to be
modified on the prod system disk copy, because the dev machine has a
different network card MAC address and a different amount of memory (so the
Oracle config files would also need to be modified), etc. The different
HSZ40 devices are already defined on both machines.
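
To make the list of adjustments for the 2nd option concrete, the
differences between the two machines could be written down ahead of time.
The Python sketch below simply compares two host profiles and reports the
settings that differ. The profile keys and all values are placeholders
invented for this example, not our real configuration.

```python
def settings_to_adjust(prod: dict, dev: dict) -> dict:
    """Return {setting: (prod_value, dev_value)} for settings that differ.

    A setting present on only one side is reported with None for the other.
    """
    keys = sorted(set(prod) | set(dev))
    return {
        key: (prod.get(key), dev.get(key))
        for key in keys
        if prod.get(key) != dev.get(key)
    }


# Placeholder profiles; every value here is hypothetical.
PROD_PROFILE = {
    "mac_address": "00:00:00:00:00:01",
    "memory_mb": 512,
    "hsz40_units_defined": True,
}
DEV_PROFILE = {
    "mac_address": "00:00:00:00:00:02",
    "memory_mb": 256,
    "hsz40_units_defined": True,
}

for name, (prod_value, dev_value) in settings_to_adjust(
    PROD_PROFILE, DEV_PROFILE
).items():
    print(f"{name}: prod={prod_value} dev={dev_value}")
```

Each reported setting would correspond to one file to edit on the copied
system disk before the dev machine could stand in for prod.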

If someone has a better solution, or could tell me which of the 2 above is
best, or if we are missing something here, please share your experience
with us.

                                        Thanks!


Guy Dallaire
dallaire_at_total.net

"God only knows if god exists"
Received on Fri Mar 21 1997 - 16:40:43 NZST
