[SUMMARY] Storage system, SCSI, RAID

From: Jarek Wieczorek <icesnowfire_at_hotmail.com>
Date: Wed, 10 May 2000 07:56:12 +0000 (GMT)

Hi

Thanks everybody for your suggestions. I found alan_at_nabeth.cxo.dec.com's
the most useful. I also learned that www.compaq.com/storage is a
reasonable source of information, especially regarding available hardware.

My original problem was:

I would like to hear your advice on my disk data storage system
configuration and its hardware.

My system consists of two subsystems (currently on one machine; later I
will dedicate a separate Alpha to each subsystem).

Subsystem I
Oracle DB with about 200 GB of raw data. Once the DB is created it will not
have a heavy update load, and only a few concurrent clients. The main
purpose of this DB is to produce other databases, done as a back-end
process. So speed does not matter much here. What matters is easy backup
and recovery, data safety, and redundancy.

Subsystem II
That is a bunch of custom-built static databases, each of them about 30 GB.
The databases are exclusively static (read-only) and are generated by
subsystem I. They go online and have a very heavy read load. So speed
matters, but redundancy and recovery do not, since those databases can be
easily recreated by subsystem I.

Currently available hardware:
Alpha Server DS20E + 1GB RAM + 9.1GB HD+
RAID200 (7 * 9.1GB HD)

Please advise on the hardware I should buy (RAID, SCSI, controllers, etc.)
and its configuration (RAID levels, file systems, etc.), keeping the $$$
criteria in mind, of course.

And the reply was:

        For the first database, I'd use controller-based RAID-5,
        controller-based RAID 0+1, or a mix of controller and
        host-based RAID 0+1. If you're not under time pressure
        to build the database initially, the relatively poor
        write performance you're likely to get from RAID-5 won't
        be much of an issue, and it will provide the striping-like
        read performance you want with reasonable redundancy.

        On the other hand, for the best redundancy and performance,
        you want RAID 0+1. With the right controllers, you can do
        this completely in the controller, or completely in the host
        or in both. Some of the combinations:

                o Build mirrors on the controller and stripe those
                  across one or more controllers.
                o Build stripe sets on one or more controllers and
                  mirror those on the host.
                o Use the controller(s) as a means to connect lots
                  of disks and then use LSM to mirror and stripe
                  the data.
                o Use the controller to build mirrored stripe sets
                  or striped mirror sets according to what it
                  supports.

        Using 36 GB disks you can reach the 200 GB capacity requirement
        with just 6 disks. With 12 you can do RAID 0+1. Using simple
        SCSI-connected disks over two or four busses you could do this
        entirely on the host. With the new StorageWorks packaging you
        could do it with a single dual-bus shelf. With the old
        packaging you'd need two shelves.
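        A quick sketch of the disk-count arithmetic above (the function
        name and numbers are illustrative, not a sizing tool):

```python
import math

def disks_needed(capacity_gb, disk_gb, raid01=False):
    """Disks required to reach capacity_gb using disk_gb drives.

    RAID 0+1 mirrors every stripe member, so it doubles
    the disk count needed for the raw capacity."""
    n = math.ceil(capacity_gb / disk_gb)
    return 2 * n if raid01 else n

print(disks_needed(200, 36))               # 6 plain disks
print(disks_needed(200, 36, raid01=True))  # 12 disks for RAID 0+1
```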

        I'll point out that you don't say how many 30 GB databases
        you're going to have on "Subsystem II", so it is hard to
        advise without knowing the capacity. The data organization
        I'd use would be striping if you're using a relatively
        small number of disks and RAID-5 for more disks.

        If you go the controller-based route, you have two choices:
        host-connected parallel SCSI or host-connected Fibre Channel
        SCSI. In the short term parallel SCSI will probably be less
        expensive, but Fibre Channel based SCSI gives you some
        interesting options, notably having just one subsystem.

        For this last case, an HSG80 based subsystem would work
        well. Each controller has two Fibre Channel interfaces
        and a subsystem supports two controllers for failover.
        With two systems you can still keep all the storage on a
        single subsystem and isolate access to units on a per-host
        basis. With the V5.0A version of TruCluster, you can
        make the two systems a cluster and share the storage
        completely.

        With parallel SCSI, you can probably still get an HSZ70 or
        HSZ80 based subsystem. You can connect the controllers to
        both hosts.

        For the single large database, you either want raw disk
        space or AdvFS. UFS will probably work for a file system
        that large, but it isn't supported beyond 128 GB these
        days. For a single large file the fsck wouldn't be too
        bad, but bad enough. Whether to use raw space or a file
        system depends as much on the database software as anything
        else. Each has its own advantages and disadvantages.

        One of the advantages that AdvFS offers is the ability to make
        a clone of the particular domain the large database is on.
        This will allow backing it up while it is in use, though you'll
        want to make the initial clone while it is idle. On the
        other hand, depending on how you're using the data, this may
        not be necessary.

        An AdvFS clone copies all the metadata for a fileset (file
        system) to a new fileset. Any time data changes on the
        original, the clone gets a copy of the original data. The
        original blocks are always referenced on reads. So, you
        could do something like:

                o Make initial database.
                o Make a clone.
                o In parallel...

                   + Backup the clone.
                   + Create the 2nd step databases.

        However, this will put a fairly heavy read load on the
        initial database, since the clone is reading all the
        original space. You'd probably be just as well off making
        the 2nd-step databases and then backing up the initial database.
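        The copy-on-write behavior described above can be modeled with
        a toy block store (purely illustrative; this is not the actual
        AdvFS implementation):

```python
class Fileset:
    """A trivial fileset: block number -> data."""
    def __init__(self, blocks):
        self.blocks = dict(blocks)

class Clone:
    """Copy-on-write snapshot: starts with metadata only, and holds
    just the blocks preserved before the original overwrote them."""
    def __init__(self, original):
        self.original = original
        self.preserved = {}

    def read(self, n):
        # Unchanged blocks are still read from the original fileset.
        return self.preserved.get(n, self.original.blocks[n])

def write(original, clone, n, data):
    # Save the old contents into the clone before overwriting.
    if n not in clone.preserved:
        clone.preserved[n] = original.blocks[n]
    original.blocks[n] = data

fs = Fileset({0: "db-v1", 1: "index-v1"})
snap = Clone(fs)
write(fs, snap, 0, "db-v2")
print(fs.blocks[0])   # db-v2
print(snap.read(0))   # db-v1 (preserved copy)
print(snap.read(1))   # index-v1 (still read from the original)
```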

        Going the V5 TruCluster route you'll have to use AdvFS
        for any read-write file system, since UFS won't be sharable
        in a cluster.

    Hardware.

        The disadvantage of a bunch of disks in shelves is that
        you need lots of SCSI busses to connect lots of disks.
        If the Subsystem II space requirement is comparable to
        that of Subsystem I, you're looking at twelve 36 GB disks and
        no redundancy (or 23 18 GB disks, or ~45 9 GB disks).
        That's a lot of shelves and controllers.

        Going with an HSZ70 or HSZ80 based subsystem (parallel
        SCSI), you could get two RA7000s/RA8000s which support
        24 disks each. Two of these do fit into a single cabinet
        if floor space is an issue. You can also expand each
        device cabinet to up to 72 disks. I believe the packaging
        names are:

        BA370 - The basic 24-disk building block of most of the
        following subsystems. The BA370 has six SCSI busses
        which support 4 disks each. The busses can be extended
        to additional cabinets to allow 48 or 72 disks per
        subsystem. The BA370 supports UltraFast, Wide SCSI
        devices.

        RAID Array 7000 - HSZ70 based subsystem in a pedestal
        that supports 24 disks.

        RAID Array 8000 - HSZ80 or HSG80 based subsystem in a
        pedestal that supports 24 disks.

        Enterprise Storage Array 10000 - The cabinet-mounted
        version of the RA7000. A single cabinet holds two of
        the 24-disk "shelves". A 2nd cabinet is needed for
        the 3rd BA370. The two common configurations of two
        BA370s were the pair connected to a pair of redundant
        controllers or each BA370 with its own pair of
        controllers.

        Enterprise Storage Array 12000 - The cabinet-mounted
        version of the RA8000 (parallel SCSI or Fibre Channel
        based controllers). I believe this has all the same
        configuration options as the ESA 10000.

        HSZ70 - A RAID controller supporting an UltraFast Wide
        Differential SCSI connection on the host side and six
        UltraFast Wide Single-ended busses on the device side.
        The subsystem supports pairs of controllers, which can
        be used to balance the I/O load when both are working,
        or fail over to one if the other fails. The subsystem
        supports a 128 MB cache per controller. This cache
        can be mirrored between the two controllers for better
        availability. A battery backup makes the cache non-
        volatile for a period sufficient to survive most power
        outages, allowing it to be used as a write-back cache.

        The controller operating system supports Striping (RAID-0),
        Mirroring (RAID-1), RAID 0+1 (striped mirror sets),
        adaptive RAID-3/5 (*) and the ability to present the
        disks individually (JBOD). It also supports partitioning
        of large storage sets into smaller logical units.

        HSZ80 - The follow-on controller to the HSZ70. The
        basic I/O subsystem is the same on the back-end. The
        HSZ80 supports a 512 MB cache. It uses a faster internal
        bus for better performance and has two host interfaces
        per controller instead of the one on the HSZ70. I believe
        the multi-path support in V5.0A can take advantage of this
        feature.

        HSG80 - Rather than having two UltraFast, Wide SCSI interfaces,
        this has two 100 MB/sec Fibre Channel interfaces. I believe
        it is only supported in a switched fabric, which does raise
        the subsystem cost.

        Adaptive RAID 3/5 - The controller will use RAID-3-like
        write algorithms to write data when possible instead of
        pure RAID-5 (read-modify-write). In loads doing sequential
        writes this can improve the performance of RAID-5.
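        A rough sketch of why full-stripe (RAID-3-like) writes beat
        read-modify-write (simplified I/O counting, not the actual
        HSZ firmware algorithm):

```python
def write_ios(data_blocks, stripe_width):
    """Approximate physical I/Os for a write to a RAID-5 set with
    stripe_width data disks plus one parity disk.

    Full stripes: RAID-3-style, parity computed from the new data,
    so just data + parity writes and no reads.
    Partial writes: classic read-modify-write, i.e. read old data,
    read old parity, write new data, write new parity per block."""
    if data_blocks % stripe_width == 0:
        stripes = data_blocks // stripe_width
        return stripes * (stripe_width + 1)  # data + parity writes only
    return 4 * data_blocks                   # 2 reads + 2 writes per block

print(write_ios(6, 6))  # full-stripe sequential write: 7 I/Os
print(write_ios(1, 6))  # small random write: 4 I/Os for one block
```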

        Modular Array 8000 and Enterprise Modular Array 12000 - These
        subsystems use the new StorageWorks packaging. Only Fibre
        Channel based subsystems are available. Where the old
        StorageWorks packaging used shelves that could only hold
        6 or 7 devices, these shelves hold 10 or 14 devices. They
        support split bus and dual power supplies/blowers just like
        the old packaging.

bye,
Jarek
Received on Wed May 10 2000 - 07:57:31 NZST
