SUMM (partial) New Machine Setup (RAID Newbie)

From: Mike Grau <m.grau_at_kcc.state.ks.us>
Date: Thu, 21 Aug 1997 13:41:47 -0500

I gotta feed that cat better food.
Many thanks for the _very_ helpful education on RAID and SWXCRMGR...
   Sean O'Connell
   Alexis Villagra
   alan_at_nabeth
   John Seel
   Tom Webster
   Calvin Chung
   kevin reardon
   Thomas Erskine
   Jason Neil

What I actually have is two rz28's JBOD and six rz29's in one RAID-5
all on the controller using just two of the channels with DU not loaded,
as most of you told me and I was able to confirm:
 _____________________________
| Tgt |   Channel Number      |
| ID  |   0      1      2     |   Grp/Drv: A/6, B/1, C/1
|-----------------------------|   Log Drv: 0 1 2
|  0  |   -     A-0    C-0    |
|     |         OPT    OPT    |
|  1  |   -     A-2    A-1    |
|     |         OPT    OPT    |
|  2  |   -     A-4    A-3    |
|     |         OPT    OPT    |
|  3  |   -     B-0    A-5    |
|     |         OPT    OPT    |
|  4  |   -      -      -     |
|     |                       |
|  5  |   -      -      -     |
|     |                       |
|  6  |   -      -      -     |
|-----------------------------|
I am going to change this to:
  two rz28's RAID-1 (re0 system disk; I'm anal retentive, I guess)
  three rz29's RAID-5
  three rz29's RAID-5
and I believe I actually can do it, thanks to you all. Below are some of
the responses I received, but first a few more questions if you will:

Using the above proposed disk configuration, is there any one matrix
that makes the most sense for configuring drive groups across all three
channels?

I've never used AdvFS. Should I on the RAID-5?

Assuming the RAID-1 drives are on separate channels, will the Oracle
Redo Log files be OK (from a performance point of view) on the RAID-1?

How is "striping across RAID's" best achieved?
--------------------------------------------------------------------------
Below are snippets of replies I received, with the original question at
the end:

If this is the pedestal 1000A, then the only internal storage is the
8-drive storage bay in the front, which appears to be connected to your
RAID controller. You do not appear to have anything other than the CD-ROM
on the internal bus.

If this is a rack mount, then there could be an internal drive, but
judging by the results of "show dev", you do not have one.

By the way, I would recommend using the RAID controller to mirror the
system disk rather than LSM. Mirroring the system disk with LSM is a
pain. Since you have the RAID controller, you should consider using it
(just my humble opinion; others may disagree).
---------------------------------------------------------------------------
In the case of a single controller like you have, I'd offload
all the mirroring work to the controller. If the two disks
were spread over two controllers then doing host mirroring has
its advantages. Just make sure the two disks are on different
backend busses, if the controller is the 3 channel model.
------------------------------------------------------------------------
You have some serious weirdness going on here....

You have a RAID controller which is showing up (thus the dra disks).
I don't know if your RAID controller is a KZPSC-AA (one bus) or a
KZPSC-BA (three bus), but it is something of a moot point. I don't
remember if these are supposed to show up on a "sho dev" or not.

You also have an embedded fast-wide controller on the motherboard
(PKA0). This is where the CDROM is attached; the KZPSC-xx family doesn't
support any devices other than disks. The default boot device in your
NVRAM (dka0) should also be hanging off of this bus. You may want to
crack the case and see if there isn't another rz28 inside the case
that has a loose cable.

If there isn't a drive sitting in the case waiting to be booted from,
things get strange. There seems to be some confusion over whether booting
from RAID disks is supported by DEC, and I can't give an answer to that.
I have heard that it is possible and does work. But I've never heard of
DEC shipping a system that does so from the factory.

If you don't find another disk in the cab, it looks like you are going
to have to do a full install. You may really want to think about just
mirroring the two disks that are set up as JBOD right now and then
loading DU on them. Hardware mirroring generally works a lot better than
software.

On a side note -- are you sure you want a big RAID-5 array for Oracle?
Oracle is real big on spreading things out over multiple independently
accessible spindles to reduce spindle contention. We figured it out
once and decided that for a production machine the 'ideal' setup
according to Oracle was something like 12 hardware-mirrored (24 total)
disks. This makes sure that data is separate from indexes and rollback
segments are separate from everything....
-------------------------------------------------------------------------
I can't help you with the problem of not being able to boot from any of
your installed disks. Though you could boot from the CD-ROM and mount
the disks and see if there is actually a system on any of them. By the
way, did you try just 'boot dra1' (i.e. without specifying all the
0.0.2000.0 information)?

But I can tell you how to change the SCSI numbers of your RAID devices
(so re0 is the system disk, not the RAID 5 diskset). I don't know if it
is best if the system disk has SCSI 0, but I've heard something along
those lines, so why not. It is a little convoluted (we expect nothing
less from Digital!), but not hard.

A) Run the swxcrmgr from the console:
- from the console (>>>) type 'arc' to go to the ARC console.
- select run another program
- put in the RAID Standalone Software 3.5" diskette
- type a:swxcrmgr

B) Modify groups
- select 1. View/Update Configuration
- select 1. Define Drive Group
- write down the current configuration
- select 2. Cancel Group and cancel all the groups (yikes! don't worry,
  it doesn't do anything to the disks, just to the RAID configuration)
- select 1. Create Group and recreate the groups, except first define
  one of the RZ28D-VW's as group A and the second as group B. Then define
  the RAID set as group C. These aren't strictly tied to the SCSI numbers,
  but it will reduce confusion.
- select 3. Arrange Group and select groups A, B, and C in order. This,
  from what I can tell, is how a group is connected to a specific SCSI
  number.
- now exit from that sub-menu and select 2. Define Logical Drive
- select 1. Create Logical Drives and create the three logical drives,
  presumably using the entire group in each case. The drive groups are
  presented in the order given by Arrange Group, hence the SCSI number
  <-> Arrange Group connection.
- exit to the main menu, choosing to save the configuration
- for the RAID 5 subset, you probably want to initialize (4. Initialize
  Logical Drive) the drives so they are all zero and the parity checking
  doesn't get messed up. I don't use RAID-5, so I'm not sure. But
  initializing the drives can take several hours. For the JBOD (or
  RAID-0), initialization is not necessary (but the program still
  complains).
- you can save the configuration on a diskette from the 1. Tools, 6.
  Backup/Restore Conf selection
- exit from the program, ignoring the complaint about un-initialized
  drives (the program will probably complain if you have initialized the
  RAID-5 set, but not the JBOD disks).

Voila! The SCSI numbers changed. Does it really have to be so difficult?
Apparently so!
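
The ordering rule described in the steps above (logical drives are numbered in the order the groups were arranged) can be sketched as a toy model. This is not swxcrmgr code and does not talk to the controller; the function name and the group labels are made up for illustration, and it assumes one logical drive per group using the whole group.

```python
# Toy model only: illustrates the "Arrange Group order -> logical drive number"
# rule described in the reply. Nothing here is part of swxcrmgr itself.

def assign_logical_drives(arranged_groups):
    """Return {group: device name}, numbering groups in Arrange Group order.

    Assumption: each group becomes exactly one logical drive (re0, re1, ...).
    """
    return {group: f"re{n}" for n, group in enumerate(arranged_groups)}


# Hypothetical labels: A and B are the two RZ28D-VW groups, C is the RAID-5 set.
mapping = assign_logical_drives(["A", "B", "C"])

for group, device in mapping.items():
    print(f"group {group} -> {device}")
```

Arranging the groups in a different order (for example C first) would, under the same assumption, put the RAID-5 set back on re0.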

Some other tips: Try to spread the drives across more than one of the
RAID channels. You can do this by splitting the backplane in the
AlphaServer so the top three slots are on one RAID channel and the bottom
four on another. This offers a noticeable speed improvement. For example,
put one of the RZ28D-VW's on each channel to reduce contention between
them for the controller. Also, split the disks in the RAID-5 array across
different channels.
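
A minimal sketch of the "spread the drives over the channels" tip, assuming a simple round-robin assignment. The disk labels and channel numbers are hypothetical; the real placement is decided by which backplane slot each disk sits in, not by software.

```python
from itertools import cycle


def spread_over_channels(disks, channels):
    """Assign disks to channels in round-robin order.

    Returns {channel: [disks on that channel]}.
    """
    layout = {ch: [] for ch in channels}
    for disk, ch in zip(disks, cycle(channels)):
        layout[ch].append(disk)
    return layout


# Hypothetical labels for the disks discussed in this thread.
mirror_pair = ["rz28-0", "rz28-1"]
raid5_set = [f"rz29-{i}" for i in range(6)]

# Two active channels, as with a split backplane (numbers are placeholders).
mirror_layout = spread_over_channels(mirror_pair, channels=[1, 2])
raid5_layout = spread_over_channels(raid5_set, channels=[1, 2])

print("mirror:", mirror_layout)
print("RAID-5:", raid5_layout)
```

With two channels this puts one RZ28 on each channel and alternates the RZ29's between them, which is the arrangement the tip describes.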

Why are you using LSM to mirror the disks? I admit I don't know anything
about it, but you could also make the two disks a RAID-1 set so they
would be mirrored via the hardware controller. That requires no extra
software.

Is one giant RAID-5 disk the best way to handle all the Oracle files? I
admit I'm no database expert, but from what I've read, it might be
better to have a couple of smaller disk sets spread out over multiple
channels of the RAID. Also, you could put the redo logs on a mirrored
disk (RAID 1) and put the other database files on non-mirrored disks
(RAID 0). I found an article from an old Oracle magazine that talks about
different Digital RAID types to use with Oracle. You can find it at:
http://www.oramag.com/archives/65OPDIGI.html
The Oracle documentation also talks about how to best combine Oracle
with RAID devices.
-----------------------------------------------------------------------------
The answer is "it depends". Which disk(s) show up as re0 depends on how
the *logical devices* have been configured in the RAID configuration
software. The re# devices are *logical* devices, which means that the
ordering may not be logical. :-) [The RAID configuration software used
to be called SWXCRMGR, but as Digital has renamed the SWXCR controller
to something which I can never remember, they may have renamed that too.
You have to boot to the prom prompt and get it to run SWXCRMGR. If you're
dealing with RAID, you'd better get used to this because you can't get
at it from the unix level.]

Anyway, given the disks you say you have and the lines above, I'd say
it's very likely that Digital has given you what you want (the two 2G
disks as separate devices and the 4G disks all lumped into a RAID5); they
just didn't name them the way you expected. If the "mis-naming" really
bothers you, then you'll have to re-organize the disks with the RCU and
re-install.
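
To see at a glance which re# device carries which RAID level, the boot messages quoted in the original question can be tabulated. The sketch below is a small parser written for exactly that three-line format; the regular expression is an assumption based on those sample lines only and may not match other kernel versions.

```python
import re

# Sample lines copied from the original question at the end of this post.
BOOT_LINES = """\
re0 at xcr0 unit 0 (unit status = ONLINE, raid level=5)
re1 at xcr0 unit 1 (unit status = ONLINE, raid level=JBOD)
re2 at xcr0 unit 2 (unit status = ONLINE, raid level=JBOD)
"""

# Pattern derived from the sample lines above (an assumption, not a spec).
PATTERN = re.compile(
    r"^(?P<dev>re\d+) at (?P<ctrl>\w+) unit (?P<unit>\d+) "
    r"\(unit status = (?P<status>\w+), raid level=(?P<level>\w+)\)$"
)


def parse_units(text):
    """Return one dict per recognised boot line; other lines are skipped."""
    units = []
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            units.append(match.groupdict())
    return units


units = parse_units(BOOT_LINES)
for u in units:
    print(f"{u['dev']}: unit {u['unit']} on {u['ctrl']}, "
          f"status {u['status']}, RAID level {u['level']}")
```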

It looks like all your disks are on the swxcr controller, which means
that SCSI IDs as you probably knew them are out the door. You use the
floppy disk with the standalone RAID swxcrmgr software on it. You put
this into the floppy drive and type arc at the >>> prompt, then select
run a program from the menu. Type a:swxcrmgr <return> and you're away --
you are in the swxcrmgr utility, which is where you define your drive
groups and your logical drives.

The drive numbers (rex) depend on how you arrange the drive groups, but
this is the basic gist (can't remember the exact menu items)...

a:swxcrmgr
define/arrange/create drives or groups
create drive group (you can do this for more than one group)
arrange drive group (this is arranging which group goes first)
define logical drive (define your groups as drives, selecting which RAID
flavour etc...)

then you exit that menu and save the config and then initialise the
disks -- all the new ones at the same time

Re: preinstalled software -- try booting off some of the other disks. It
may just be that you don't have any preinstalled software on them, in
which case you would have to install it yourself -- no big deal, it's not
that hard. As far as your disk configuration goes... You won't have mixed
disk sizes in a RAID 5 configuration; they will all be your rz29s -- but
you will confirm this with the swxcrmgr utility. I would say that your 2
JBODs are the rz28's, and I think it would be better to have your system
disk as a RAID 1 config with the two rz28's. You will probably have to
reconfigure your RAID and cancel all your groups, but seeing as you have
no data on them this is not a problem and even good practice -- best get
familiar with it when you're not under pressure. That way your system
disk is safe as houses, which is important -- and I don't trust LSM, and
it is a pain to install and configure -- hardware mirroring is much more
reliable and provides better throughput.
------------------------------------------------------------------------------
It would seem that you have only two channels on your swxcr active -- if
you have all three with disks on them then you may want to have 2 RAID 5
(x3 RZ29's) sets for your Oracle data spread over all three channels.
This way you lose capacity for redundancy, but you can spread your Oracle
database over the two RAID sets, and if one channel goes out you will
still have the other RAID set available -- it may be advantageous. The
main RAID hint, I think, is to spread your disks over different channels
for the above reason. If you have a mirror RAID on one channel (i.e. your
system disk) and that channel goes on your swxcr controller, then your
whole RAID set goes -- bad. Don't forget to have swap on a redundant RAID
set. If your swap goes, your machine goes -- doesn't pay to have it on a
JBOD.
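
The channel-failure reasoning above can be checked with a small model. This is only a sketch under simplifying assumptions: a failed channel takes every disk on it offline, RAID-1 here means a two-disk mirror, and RAID-1 and RAID-5 each tolerate the loss of one member. The layout names and channel numbers are hypothetical.

```python
# Number of member disks each level can lose and still serve data
# (assumption: RAID-1 is a two-way mirror).
TOLERATED_LOSSES = {"JBOD": 0, "RAID-1": 1, "RAID-5": 1}


def survives_channel_loss(raid_level, member_channels, failed_channel):
    """True if the set still works after every disk on failed_channel is lost.

    member_channels lists the channel number of each member disk.
    """
    lost = sum(1 for ch in member_channels if ch == failed_channel)
    return lost <= TOLERATED_LOSSES[raid_level]


# Hypothetical layouts: (RAID level, channel of each member disk).
layouts = {
    "mirror, both disks on channel 0": ("RAID-1", [0, 0]),
    "mirror, one disk per channel": ("RAID-1", [0, 1]),
    "3-disk RAID-5, one disk per channel": ("RAID-5", [0, 1, 2]),
    "3-disk RAID-5, two disks on channel 1": ("RAID-5", [1, 1, 2]),
}

for name, (level, member_channels) in layouts.items():
    outcome = {
        ch: survives_channel_loss(level, member_channels, ch)
        for ch in (0, 1, 2)
    }
    print(f"{name}: {outcome}")
```

Under this model a set survives any single channel failure only when no channel holds more than one of its members, which is the argument for putting the two mirror halves on separate channels.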
-----------------------------------------------------------------------------
Isn't the "0" disk always the internal disk?
NO
Does this mean the two RZ28's are part of the RAID-5?
NO
Don't I want:
re0 at xcr0 unit 0 (unit status = ONLINE, raid level=JBOD)
re1 at xcr0 unit 1 (unit status = ONLINE, raid level=JBOD)
re2 at xcr0 unit 2 (unit status = ONLINE, raid level=5)
and if so, _how_ do I get it? How do I know what disks are what?
Or do re0, re1, re2 not correlate like this because of RAID and I can
just go ahead and load UNIX on re1 and mirror re2?
 
YOU MUST USE THE RAID STANDALONE DISKETTE TO CONFIGURE YOUR RAID THE WAY
YOU WANT.
I RECOMMEND THAT YOU MAKE TWO GROUPS:
ONE RAID 5 WITH THE SIX RZ29 DISKS, WHICH IS ALREADY CONFIGURED,
AND ONE RAID 1 (MIRROR) WITH THE TWO RZ28 DISKS, SO YOU MUST DESTROY RE1
AND RE2, AND CREATE A NEW RE1 THAT CONTAINS BOTH RZ28 DISKS.
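
For comparison, the usable capacity of the layouts suggested in this thread can be estimated with the usual rules of thumb (RAID-1 keeps one disk's worth, RAID-5 gives up one disk per set to parity). This is a rough sketch: the disk sizes are the nominal 2.1 and 4.3 gig figures from the original question, and controller or filesystem overhead is ignored.

```python
def usable_gb(raid_level, disk_gb, n_disks):
    """Estimate usable capacity in GB for one set of identical disks."""
    if raid_level in ("JBOD", "RAID-0"):
        return disk_gb * n_disks
    if raid_level == "RAID-1":
        if n_disks != 2:
            raise ValueError("this sketch only models two-disk mirrors")
        return disk_gb
    if raid_level == "RAID-5":
        if n_disks < 3:
            raise ValueError("RAID-5 needs at least three disks")
        return disk_gb * (n_disks - 1)
    raise ValueError(f"unknown RAID level: {raid_level}")


RZ28_GB = 2.1  # nominal size from the original question
RZ29_GB = 4.3  # nominal size from the original question

system_mirror = usable_gb("RAID-1", RZ28_GB, 2)
one_big_raid5 = usable_gb("RAID-5", RZ29_GB, 6)
two_small_raid5 = 2 * usable_gb("RAID-5", RZ29_GB, 3)

print(f"RAID-1 system disk (2 x RZ28): {system_mirror:.1f} GB")
print(f"one 6-disk RAID-5:             {one_big_raid5:.1f} GB")
print(f"two 3-disk RAID-5 sets:        {two_small_raid5:.1f} GB")
```

Splitting the six RZ29's into two sets costs one extra disk of parity, which is the capacity-for-redundancy trade mentioned in an earlier reply.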
--VV----Original question---VV---
Advice, pointers, head-butts needed...
I have for the first time received a just-out-of-the-box Alpha 1000A
5/400 256 MB which I thought was going to come with DU 4.0b
pre-installed.
It has an internal 2.1 gig RZ28D-VW disk, and another RZ28D-VW in the
first shelf slot followed by six 4.3 gig RZ29B-VW's. I wanted the boot
disk with / and /usr to be the internal RZ28D-VW and to mirror this
disk (LSM) on the other RZ28D-VW in the first slot. The six RZ29B-VW's
were to be RAID-5 for data storage (Oracle). (I have no experience with
RAID; I hope this setup makes sense?)
I had visions of simply doing the FIS procedure and getting started.
Alas, I can only boot from the CDROM. Booting the default device fails
with:
   boot dka0.0.02000.0 -flags a     (where is this?)
   failed to open dka0.0.0.2000
"sho dev" reports:
   dka400.4.0.2000.0  DKA400  RRD46 0557
   dra0.0.0.13.0      DRA0    6 member RAID 5
   dra1.0.0.13.0      DRA1    1 member JBOD
   dra2.0.0.13.0      DRA2    1 member JBOD
   dva0.0.0.1000.0    DVA0
   ewa0.0.0.12.0      EWA0    00-00-F8-05-B4-DE
   pka0.7.0.2000.0    PKA0    SCSI BUS ID 7 2.10
Booting from the CD reports:
   re0 at xcr0 unit 0 (unit status = ONLINE, raid level=5)
   re1 at xcr0 unit 1 (unit status = ONLINE, raid level=JBOD)
   re2 at xcr0 unit 2 (unit status = ONLINE, raid level=JBOD)
Isn't the "0" disk always the internal disk? Does this mean the two
RZ28's are part of the RAID-5? Don't I want:
   re0 at xcr0 unit 0 (unit status = ONLINE, raid level=JBOD)
   re1 at xcr0 unit 1 (unit status = ONLINE, raid level=JBOD)
   re2 at xcr0 unit 2 (unit status = ONLINE, raid level=5)
and if so, _how_ do I get it? How do I know what disks are what?
Or do re0, re1, re2 not correlate like this because of RAID and I can
just go ahead and load UNIX on re1 and mirror re2?
Sorry to ask such basic questions, but I've been RTFM-ing and am
thoroughly confused. Could some kind souls _please_ enlighten me enough
to get me started? Many XOXOXO to those who do, or I'll kiss the cat
(cringe) when (if) I get home.
Received on Thu Aug 21 1997 - 21:13:23 NZST
