SUMMARY: Fileserver configuration

From: Martin Mokrejs <mmokrejs_at_mail.natur.cuni.cz>
Date: Wed, 17 Dec 1997 23:25:27 +0100 (MET)

Hello,
 thanks to everyone who replied. Now I understand much more about RAID and the names of
Digital products. ;-)

**************************
Original messages:

From: alan_at_nabeth.cxo.dec.com

>My questions are:
>I think about AS, singleprocessor because I'm not sure if there are some
>apps for multiprocessor environments.

It depends on whether you view the software as a single application
or the system as a set of applications. Even file servers run more
than a single process. On an MP system, all the processes can be
spread among the processors. If you're going to be running some
computationally complex (or intensive) application then having an
MP system could be of value because it will let the file services
have the use of the other CPUs to provide good response.

>RAM about 512MB or 1GB, probably no
>more. Disk space about 15-20GB. What do you think about RAID 0 on two
>disks on which will be located filesystem with databases (to speed up
>database searches)?

I like striping. It has value in applications that need higher bandwidth
for sequential reads and writes than a single disk can provide and
for applications where multiple parallel I/Os are needed. If you
can spread the I/O out enough, then you may want to consider a stripe
set with more members. But even two members can be better than a single
disk.
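
To make the striping idea concrete, here is a minimal sketch of how
a stripe set could spread logical blocks over its members. The chunk
size and the function name are assumptions chosen for illustration;
they are not taken from LSM or from any particular array controller.

```python
def locate_block(logical_block, members, chunk_blocks):
    """Map a logical block number to (member disk index, block on that disk)."""
    chunk = logical_block // chunk_blocks   # which chunk the block falls in
    offset = logical_block % chunk_blocks   # position inside that chunk
    member = chunk % members                # chunks rotate round-robin over the disks
    member_chunk = chunk // members         # index of that chunk on its member disk
    return member, member_chunk * chunk_blocks + offset


# Hypothetical two-member stripe set with a chunk of two blocks.
for lb in range(8):
    print(lb, locate_block(lb, members=2, chunk_blocks=2))
```

Consecutive chunks land on alternating disks, which is why a long
sequential transfer or several parallel I/Os can keep both members
busy at the same time.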

>I think we want to stripe about 2-4 GB, no more.
>RAID 0+1 would be nice, but what configuration and what price it would
>be?

The one significant drawback to striping is that if you have a disk
crash of one member you've basically lost the whole stripe set. By
mixing mirroring and striping, you have the best of both RAID levels
and few of the limitations. The price is likely to be little more than
the price of the extra disks and controllers.
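
A rough way to see the trade-off, as a sketch only: assume each disk
fails independently with the same probability over some period (the
5% figure below is a made-up number, and rebuild time is ignored).
A plain stripe set is lost when any member fails; a stripe over
two-way mirrors is lost only when both disks of some mirror fail.

```python
def stripe_loss_probability(p_disk, members):
    """Chance a plain stripe set loses data: any single member failing is fatal."""
    return 1.0 - (1.0 - p_disk) ** members


def striped_mirrors_loss_probability(p_disk, pairs):
    """Chance a stripe over two-way mirrors loses data: some pair loses both disks."""
    p_pair = p_disk ** 2                    # both halves of one mirror fail
    return 1.0 - (1.0 - p_pair) ** pairs


p = 0.05                                    # hypothetical per-disk failure probability
print("2-disk stripe:        ", stripe_loss_probability(p, 2))
print("2 mirrors, striped:   ", striped_mirrors_loss_probability(p, 2))
```

Under these assumptions the mirrored configuration uses twice the
disks but is far less likely to lose the whole set.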

The details depend on where you decide to implement each type of RAID.

> - we have access to DEC Campus licences so we will have sw like cc, c++,
>f77, f90, OpenGL, perl etc. Is there parallel version of any of these?
>I'm still not sure if we will utilize multiprocessor machine....

Are you asking if the applications themselves are multi-thread or
how easily they produce multi-threaded applications? If you have
a single person running the compiler and that single compile takes
a long time, then a multi-threaded compiler that can do work in
parallel has considerable value. If the compiles are small and/or
you have many of them running at once, letting the operating system
do scheduling of multiple processes on multiple CPUs may be sufficient.

I don't know about the capabilities of the various applications to
produce multi-threaded code. The Fortran 90 compiler may have the
best chance.
 
>I know there are some KZPSA-BB controllers, KZPSC-AA or BA
> -those are probably PCI /EISA drive controllers.

The KZPSA controllers are (relatively) simple SCSI adapters. Each
adapter is one SCSI bus and the host uses the SCSI device driver
stack to control the devices on it. The KZPSC (all the KZxxC
devices) are array controllers that use SCSI for the device
connections. The controller itself uses a custom device driver.

>There is some Storage Works(is it SW or HW)?
> -it's probably the external case. Is there inside also a controller? In
>all cases, or just a SCSI bus?

StorageWorks is the family name for a wide variety of hardware and
software products. Some of the newer media will even have the
StorageWorks logo on it. You're going to have to be much more
detailed in your question.

>What are RAID arrays? Some series like 210, 230, or 450... What's that.
>Also "just external boxes"? What's inside? Some default devices?

Poor use of the English language... RAID is Redundant Array of
Inexpensive (or Independent depending on who you talk to) Disks.
Therefore a "RAID array" is a "Redundant Array of Inexpensive
Disks array". When talking of the singular, the extra array
is really redundant. Certainly one can have arrays of RAIDs...
But I digress.

Particular to the products, the RAIDarray 210 and 230 are
packaged subsystems which consist of disks, shelves to put the
disks in, the KZP*C array controllers, software, cable, etc.

The RAIDarray 450 is a cabinet subsystem, disks, software and
array controller that connects to a SCSI bus, rather than use
a dedicated backplane controller like the 200 family.

Just looking at a StorageWorks Product Guide Winter 1997 there
are seven separate model numbers for the RAIDarray 230. Some
models are single channel versions of the backplane controller,
others are the three channel versions. Model numbers are just
the controller without disks or shelves. Within the one and
three channel part numbers the variants are different capacity
drives.

The RAIDarray 450 part numbers are for the cabinet, a single
HSZ50 array controller, power supplies, etc. The cabinet can
hold 24 disks, redundant power supplies and a redundant
controller. The HSZ50 is the 2nd generation StorageWorks
array controller. It has 6 SCSI busses for connecting
devices though the base cabinet only supports four devices
per bus. It isn't clear from the product guide whether
you can add expansion cabinets. The controller uses a
Fast/Wide/Differential SCSI for host connections and can be
used on a shared SCSI bus among multiple (cooperating)
systems. The controller supports RAID-0, RAID-1, adaptive
RAID-3/5, RAID 0+1 or simply presenting the back-end devices
as just a bunch of disks (JBOD). It also supports connecting
SCSI tapes and other devices.
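
As a back-of-the-envelope sketch of what those RAID levels mean for
a full base cabinet: six buses with four devices each gives the 24
disks mentioned above. The 9 GB disk size and the idea of treating
all disks as one set are assumptions for illustration; a real
controller may limit how many disks go into one set.

```python
BUSES = 6               # device-side SCSI buses on the controller
DEVICES_PER_BUS = 4     # devices per bus in the base cabinet


def usable_gb(level, disks, disk_gb):
    """Approximate usable capacity for one set built from identical disks."""
    if level in ("jbod", "raid0"):
        return disks * disk_gb              # no redundancy, all space usable
    if level in ("raid1", "raid0+1"):
        return disks * disk_gb / 2          # every block is stored twice
    if level == "raid5":
        return (disks - 1) * disk_gb        # one disk's worth of parity per set
    raise ValueError("unknown RAID level: " + level)


disks = BUSES * DEVICES_PER_BUS
for level in ("jbod", "raid0", "raid0+1", "raid5"):
    print(level, usable_gb(level, disks, disk_gb=9))   # 9 GB is a hypothetical size
```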

>If you have any suggestions to software, hardware or anything else I'd
>like to hear from you. So what machine would you dream about?

For the upper end of your capacity you can almost get that with two
9 GB disks. No need for an array of any sort at that point. For this
you can use the Logical Storage Manager to make a stripe set of the disks
and simply put them on one or two suitable SCSI adapters. If you
foresee adding additional hosts to provide file service failover,
stick with pure SCSI and avoid the backplane controllers. If you
see needing expansion into hundreds of gigabytes, think about the
RAIDarray 450 or the newer UltraSCSI RA7000 and ESA10000 and then
get as many disks as you'll need now.

For the small configuration get one or two StorageWorks pedestal
enclosures and suitable SCSI adapters for each one. I think the
current version of the pedestals use a Single-ended SCSI connection
so a Single-ended SCSI adapter would be best (KZPDA or the SE version
of the KZPBA). The Single-ended version of the KZPBA is probably
the better choice since it can support UltraSCSI disks and devices.

With the LSM license you can stripe and/or mirror the data among
the disks. Spreading the disks over two controllers allows setting
up the configuration for higher bandwidth or high availability.
If you want both and have the PCI slots, go with four adapters
and enclosures and really spread the data around.

Don't forget a tape. Today, I'd go with a TZ89 table top drive
or StorageWorks brick version. An 8mm AIT drive might be less
expensive. As your capacity expands you can get more single
drives or consider a small tape library.


******************************
From: peter_at_wiscpa.weizmann.ac.il

On the software side, Digital has a product called DXML (Digital
eXtended Math Lib) which is included in DECcampus and has parallel
versions for some of its subroutines, which include BLAS 1,2,3
(i.e. vector-vector, vector-matrix, matrix-matrix) Signal-processing
(e.g. FFT) et al. If your programs can call any of these built-in
subroutines, you are in great shape because all the optimization and
parallelization has been done for you at a low level. In fact, here

There is also a High Performance Fortran option which I believe is
part of DECcampus which lets you include parallel directives in F90.

You can also buy the KAP preprocessors (for C, C++, F77, F90) which
include parallelization options.

As for processors, if money is no object, go for the new EV6 processor
(SPECint95 ~40, SPECfp95 ~50). Digital is now selling the AS 4100
with a 533MHz EV56 chip (in fact, I believe that the 600MHz chip is
just out), with a built-in upgrade in six months to the EV6. This lets
you put in up to 4 cpus. If you need only two, I think that there are
other AS options, but I am not sure about the model numbers.


********************************
From: GGuethlein_at_GiantOfMaryland.com

 I've been lucky enough to use a fairly wide range of DEC
equipment in support of our DEC Unix systems, as follows:
     A/S 2100A +/- 60GB
     A/S 4100 (3) 60 - 100GB (1 uses a RAID Array 410)
     A/S 8400 500GB

I'll try to answer your questions, in no specific order.

1.) KZPSA-BB controllers are PCI-based FWD SCSI Adapters
       which can handle up to 15 SCSI devices.
2.) KZPSC controllers are PCI-based RAID controllers
       -AA is one port & can handle up to 7 drives max
       -BA is three port & can handle up to 21 drives max
        *** the actual # of drives depends on the Level of RAID used
3.) StorageWorks is the offshoot company that now handles ALL
       of the Digital Hardware (ie. cabinets, drives, cables, tape
       drives, etc)
4.) The RAID Arrays are preconfigured cabinets with the storage
       shelves hard-wired to the cabinet backplane. Our R/A410
       came with one SWXCR controller (similar to a KZPSC-AA ?)
       and handles up to 24 drives across 6 I/O channels (ports).
       We opted to buy a 2nd SWXCR and run in dual-redundant
       mode. NO DRIVES ARE INCLUDED, you buy them separately.
       These devices have a set capacity. If you think you may need
       to expand later, you should consider a larger storage
       cabinet and use the stand-alone HSZ controllers and
       storage shelves.
5.) ALL of our systems are at least dual-processor. We don't
       use all of the software you mentioned. But the software we
       do use runs fine on the dual CPU machines.
6.) The CPU speeds are changing FAST, buy the best you can.
       The 625MHz are out & we still have some 300MHz systems.
7.) The benefit of the RAID controllers (any of them) is that they
       have the on-board cache. We have 32MB in ours and it
       screams (for our database app). Also, you can use the
       RAID controller software to mirror (R 1) AND stripe (R 0) the
       same storage AT ONE TIME. ALWAYS do it in that order too.
       You create mirror sets first, then you stripe across the
       mirrors. That order will decrease the chance of an outage
       due to a second disk failure.
8.) You can accomplish your mirroring or striping by using the
       Logical Storage Manager (LSM) software. I'm not sure if it
       can do both (to the same piece of storage). We use this for
       R1 on our 500GB system, with great reliability. On another
       system we use LSM to create stripesets on top of disks that
       are defined as mirrorsets via RAID.
9.) Due to database constraints (Informix 2GB DBSpace limit),
       we needed LSM to break our 4GB drives into smaller Logical
       Volumes (regardless of the use of RAID).
10.) LSM will allow you to easily manage your disk storage. And,
        when used with AdvFS, management of filesystem sizing /
        resizing is easier too.
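
The advice in point 7 about creating mirror sets first and then
striping across them can be checked with a small enumeration. This
is a simplified model, not a description of any controller: it
assumes four disks, and it assumes that in the stripe-then-mirror
layout a stripe is out of service as soon as one of its members
fails. It counts how many of the possible two-disk failures lose data.

```python
from itertools import combinations


def fatal_pairs_stripe_of_mirrors(mirror_sets):
    """Mirror first, then stripe: data is lost if one mirror set loses all its disks."""
    disks = [d for m in mirror_sets for d in m]
    fatal = 0
    for failed in combinations(disks, 2):
        if any(set(m) <= set(failed) for m in mirror_sets):
            fatal += 1
    return fatal


def fatal_pairs_mirror_of_stripes(stripe_sets):
    """Stripe first, then mirror: data is lost if every stripe has a failed member."""
    disks = [d for s in stripe_sets for d in s]
    fatal = 0
    for failed in combinations(disks, 2):
        if all(set(s) & set(failed) for s in stripe_sets):
            fatal += 1
    return fatal


groups = [(0, 1), (2, 3)]                   # hypothetical four-disk layout
print("mirror then stripe:", fatal_pairs_stripe_of_mirrors(groups), "of 6")
print("stripe then mirror:", fatal_pairs_mirror_of_stripes(groups), "of 6")
```

In this model two of the six double failures are fatal when you
stripe across mirrors, against four of six when you mirror two
stripes, which agrees with the recommended order.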

Four last things to wrap this up :
1.) I think LSM & AdvFS should be considered a necessary part
of the Operating System. Their uses and benefits far out-weigh
the cost.
2.) If you want speed and reliability (Write-Back Cache) on the
application's access, then it would be hard to beat the RAID
technology.
3.) Get your hands on a copy of "The RAIDbook - A Source For
Disk Array Technology", published by the RAID Advisory Board. It
has everything you need to know about how RAID works, and
how to implement it.
4.) Your best investment (AND IT'S FREE !!!) would be to work
closely with your Digital Value Added Reseller (VAR). Their
Technical Consultants can help you figure out what you need to
do what you want. *** I'm just guessing that you have that type of
support network in the Czech Republic.

I hope this will be of some help to you. It all comes from my own
personal experience (mostly learning the hard way). Best of luck
on your choices.


-------------------------------------------------------------------------
| Martin MOKREJS - Net&SysAdmin |
| PGP 5.0i key at: finger://mail.natur.cuni.cz/mmokrejs |
| mmokrejs_at_natur.cuni.cz Faculty of Science, The Charles University |
| tel.: +420-2-2195 2315 Albertov 6, PRAGUE 2, 128 43, Czech Republic |
-------------------------------------------------------------------------
Received on Wed Dec 17 1997 - 23:26:23 NZDT
