Many thanks to the people who answered my original posting:
alan_at_nabeth.cxo.dec.com
Chris Jankowski chris_at_lagoon.meo.dec.com
Ron Barett ron_barett_at_corp.cubic.com
Derek Alexander Derek.Alexander_at_snl.co.uk
Christian Elias G=Christian;S=ELIAS;O=SCIC;P=cec;A=rtt;C=be
>I am still looking for experiences with Oracle databases over LSM + AdvFS.
>Below are some concerns about the best way to set up a new system, before installing Oracle.
>I plan to dedicate 2x3 RZ28 disks to Oracle data + indexes. Each set of 3 disks is mirrored
>(in software via LSM) onto a separate BA350 cabinet. Another set of 2x2 disks will be dedicated to
>Oracle structures such as redo logs, rollback segments, ...
>I plan to use AdvFS, because it brings useful tools (such as clone filesets for very
>fast backups, defragmentation, online volume extension, ...). I have the feeling that using
>AdvFS instead of raw devices will slow performance down, but this
>should be (at least) balanced by the fact that I will stripe each plex over the three disks. I decided
>to (mirror +) stripe (RAID 0+1 in software) when I saw that Oracle uses datafiles sequentially,
>that is, it fills up the first allocated datafile before going on to the second one.
>My questions are :
>a/ do I set up LSM volumes with the gen or the fsgen usage type? The LSM documentation only recommends
>fsgen if you plan to install a file system over the volume. This does not really help me.
---> All answers urged me to use fsgen, because I will install an (AdvFS) file system over the volume.
I did not get any further information on how the two usage types differ in LSM behaviour.
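For what it is worth, here is a rough sketch (mine, not from any of the replies) of how the fsgen volume and the AdvFS domain on top of it could be created. The disk group, volume and mount-point names and the sizes are invented for illustration, and the exact volassist attribute names (nstripe, stwidth, ...) should be verified against volassist(8) on your release; this is only meant to show where the usage type and the stripe parameters fit in.

    # Striped (3-way) volume of usage type fsgen in a hypothetical disk group oradg
    volassist -g oradg -U fsgen make oravol 3g layout=stripe nstripe=3 stwidth=64k
    # Add a mirror plex on the disks of the second BA350 (LSM picks or is told the disks)
    volassist -g oradg mirror oravol
    # Build an AdvFS file domain and fileset on the LSM volume, then mount it
    mkfdmn /dev/vol/oradg/oravol oracle_dmn
    mkfset oracle_dmn oracle_fs
    mount -t advfs oracle_dmn#oracle_fs /u01/oradata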
>b/ what would be a suitable stripe width (adapted to Oracle behavior)?
>I have not been able to get recommendations from either DEC or Oracle.
---> 64 KB seems to be an adequate stripe width to start with.
Refer to the attached message from alan_at_nabeth.cxo.dec.com. Some people prefer doing *per file*
striping with AdvFS rather than using LSM volume striping (see the attached mail from Chris Jankowski).
>Any expertise on this topic will be appreciated
>Jean-Claude PETIOT
>Agence Internationale de l'Energie - PARIS
>jean-claude.petiot_at_iea.fr
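Since the original posting mentions clone filesets for very fast backups, here is a rough sketch of that procedure as well, reusing the hypothetical domain and fileset names from the sketch above. Note that the clone is only a point-in-time image at the file-system level, so for a consistent Oracle backup the tablespaces would still have to be in backup mode (or the instance shut down) while the clone is created.

    # Create a read-only clone of the fileset and mount it
    clonefset oracle_dmn oracle_fs oracle_clone
    mount -t advfs oracle_dmn#oracle_clone /backup_clone
    # Back up from the clone while the real fileset stays in use
    vdump -0 -f /dev/nrmt0h /backup_clone
    # Remove the clone when the backup has completed
    umount /backup_clone
    rmfset oracle_dmn oracle_clone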
Alan e-mail :
-----------
From the description (using AdvFS on top of LSM to store
the Oracle data), you're using a file system and thus
fsgen is probably more appropriate. I don't know what
difference it makes to how LSM works, though.
Since you're layering the Oracle I/O load on top of a
file system, the I/O coming out of the file system will
do more to determine the right chunk size than what Oracle
wants. Consider: if Oracle prefers a 2 KB page size,
but the file system is always doing 8 KB transfers, then
planning the chunk size around 8 KB is the better idea.
For small random I/Os, 32 KB to 256 KB is probably a good
choice. Benchmarking a variety of chunk sizes may turn
up one which offers slightly better performance, but
there probably isn't one which is ideal. You want a
size that will distribute the load among the disks
evenly, but not exercise any data organization quirks
that make some structures cyclic and thus end up on the
same disk.
For example, if the software writes a summary block every
64 KB you don't want the chunk size to be 64 KB because
that summary block will always end up on the same disk.
You may want to prefer a size that spreads it across all
the disks. I've heard of instances where data was like
that.
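To make that quirk concrete (my own illustration, not from Alan's mail): in a simple
stripe, the member disk a given byte offset lands on is roughly

    disk = (offset / chunk_size) mod number_of_disks

so with 3 disks and a 64 KB chunk, anything written on a 192 KB period (3 x 64 KB)
always maps to the same disk:

    offset   0 KB -> chunk 0 -> disk 0
    offset 192 KB -> chunk 3 -> disk 0
    offset 384 KB -> chunk 6 -> disk 0

whereas the same data with, say, a 96 KB chunk would rotate across all three disks.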
You probably want the chunk size to be a multiple of the
I/O size to increase the probability that a single I/O
won't have to span two devices. I/O setup takes a non-trivial
amount of time. If the I/O size is too small per device,
then the setup time dominates the transfer time and there's
no advantage to being able to do the I/Os in parallel (which
I don't think LSM does anyway).
For large sequential I/Os, 64 KB and larger are better. Both
UFS and Advfs will limit the I/O size to 64 KB. With sizes
larger than that, a single disk I/O will tend to be as large as
feasible. I/Os crossing disks can still be reasonably large.
For an I/O load dominated by small random I/Os, I'd use 64 KB
to start and see how the load distributes. If one disk comes
up busier than the others try 72 KB. If the load will be
dominated by large sequential transfers, 64 KB to 256 KB is
a good range.
Benchmark, benchmark, benchmark. Be willing to experiment with
real I/O loads and different arrangements before going into
production. It takes time, but the choice of a particularly
bad chunk size may cause performance problems later. Don't
try to be exhaustive either. I/O loads change, and what was an
ideal choice yesterday may not be the ideal choice tomorrow.
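One way to "see how the load distributes", as Alan suggests, is simply to watch the per-disk counters while a representative workload runs (my addition; the disk and disk-group names are invented, and the option syntax should be checked against iostat(1) and volstat(8) on your release):

    # Per-disk traffic every 5 seconds while the test load runs
    iostat rz17 rz18 rz19 5
    # LSM's own per-volume and per-disk statistics for the disk group
    volstat -g oradg -i 5

If one member of the stripe is consistently busier than the others, that is the hint to try a different chunk size (e.g. the 72 KB Alan mentions above).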
Chris Jankowski e-mail :
------------------------
I recommend that you use LSM for mirroring only - build volumes of
mirrored drives. Then build AdvFS on those volumes.
Use AdvFS *per file* striping to stripe only the necessary Oracle files.
Normally those are only the redolog files.
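For reference, a rough sketch of the per-file striping Chris describes (my addition, with hypothetical paths). It assumes the AdvFS file domain already contains at least two volumes (added with addvol), and the exact usage should be checked against the AdvFS stripe(8) reference page; the AdvFS stripe utility only works on an empty (zero-length) file, so the redo log file is striped before anything is written to it.

    # The file must exist and be empty before it can be striped
    touch /u01/oradata/redo01.log
    # Spread its future extents across 2 of the volumes in the file domain
    stripe -n 2 /u01/oradata/redo01.log

How the pre-created file is then handed to Oracle (REUSE clause, copying an existing log into it, etc.) depends on how the database is being built, so I have not shown that part.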
Ron Barett e-mail :
-------------------
I'm running Oracle over Advfs, although I'm not using RAID 0 or 1 on that file
domain & I haven't gotten to any tuning steps yet. I just have a few people doing a
small number of updates and queries. RAID 0+1 gives maximum performance
and reliability, albeit at a higher cost (for the extra disks), so it sounds as though
your setup is optimized for data availability and performance. Since you are creating
AdvFS FILESYSTEMS, you will want the fsgen parameter on LSM as opposed to the gen
parameter. The stripe width determines WHEN you stop writing to the first RAID 1 volume
and start writing on the next one. When Oracle is reading sequentially, the striped
configuration balances the load very well across all the disks. There is a slight
performance penalty for the RAID 1 mirroring, but you probably won't notice it. I'm not
sure which stripe width will work best for you; I believe the default width is 128K, and my
stripes are 64K (for a filesystem which is being used for general-purpose, i.e. not
Oracle, work). I would venture to guess that the default stripe width would not make a
bad starting point, although changing the width requires a backup, volume recreation,
and data restoration from the backup.
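Ron's last point is worth spelling out: there is no in-place way to change the stripe width of an existing LSM volume, so the cycle is back up, destroy, recreate, restore. A rough sketch (my addition, reusing the hypothetical names from the earlier sketch; verify the exact options against the vdump(8), vrestore(8), voledit(8) and volassist(8) reference pages):

    # 1. Back up the fileset
    vdump -0 -f /dev/nrmt0h /u01/oradata
    # 2. Tear down the AdvFS domain and the old LSM volume
    umount /u01/oradata
    rmfdmn oracle_dmn
    voledit -g oradg -rf rm oravol
    # 3. Recreate the volume with the new stripe width and rebuild the domain,
    #    as in the earlier sketch (volassist ... stwidth=128k; mkfdmn; mkfset; mount)
    # 4. Restore the data from the backup
    vrestore -x -f /dev/nrmt0h -D /u01/oradata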
Derek Alexander e-mail :
------------------------
We have just completed a migration of a client/server application
using Oracle from OpenVMS to Digital UNIX on dual AlphaServer 2100s. We
have the disks connected via HSZ40s and a Fast/Wide SCSI controller on
the 2100. We have used LSM to stripe and mirror (0+1) the ORACLE
datafiles, software etc. We also used UFS rather than ADVFS. The
main reason for this was that we had no real experience of ADVFS in a
production environment and could give no guarantees regarding the
performance, reliability, etc. For backups we shut down the database,
drop off an LSM plex and start up the database; total downtime on the
database is about 30 seconds. In this way we get a clean backup and a
quick recovery time in the event of a disaster (we also have
archiving turned on for recording the changes between backups).
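For anyone wanting to copy Derek's scheme, the core of it is splitting off and later reattaching one plex of the mirrored volume (my sketch, with hypothetical volume and plex names; check volplex(8) for the exact syntax):

    # With the database cleanly shut down, dissociate the second plex
    volplex -g oradg dis oravol-02
    # ... restart Oracle; the dissociated plex now holds a frozen copy of the data ...
    # When the backup of that copy is finished, reattach the plex;
    # LSM resynchronises it against the live plex in the background
    volplex -g oradg att oravol oravol-02

To actually mount and dump the frozen copy you would normally wrap the dissociated plex in a temporary volume (volmake) and start it; I have left those steps out, so treat this only as the skeleton of the procedure.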
If you want to use AdvFS you will have to tag the volume as fsgen and
not gen. You would use gen in the same way as a raw partition, e.g.
swap. On our systems all volumes are fsgen except the swap volumes
which are gen. I believe that the volumes would have to be fsgen for
ADVFS.
The issue of stripe width size has been debated for a long time. DEC
licensed LSM from Veritas Software in the US, as have a number of
other computer manufacturers. We have some large Sequent Unix
machines running Oracle applications, which also have a version of the
Veritas software (SVM, Sequent Volume Manager). This has been in use
for 4 years now, and the stripe width debate has gone on for just as
long there. We have tried to get information from
Oracle, Sequent and DEC, and the only real answer is that it depends
on your application, number of disks in the stripe, the type of disks,
the disk controllers etc., and that it probably should be a
multiple/factor of the disk transfer size. We currently use a stripe
width of 64k. We have experimented a fair amount over the last 4
years and can predict the performance using this stripe width.
Christian Elias e-mail :
------------------------
I had the opportunity to read your mail about Oracle and LSM.
Our installation is similar to yours.
Our server is an Alpha AXP 2100/A500 with 4 processors running OSF/1 V3.2.
We use a RAID controller on an EISA bus (32 MB/sec) with 2x3 1 GB disks,
mirrored and striped with LSM (stripe width of 64 sectors and usage type
FSGEN). The choice of FSGEN or GEN depends on whether or not you use raw
devices; we decided to use normal file systems (ease of backup). As for
the stripe width, it was chosen so as to limit disk head movement.
Our experience leads us to believe that better solutions exist, and we are
going to migrate towards them. RAID levels 0 and 1 will be done in
hardware via the RAID controller, and no longer by LSM. The EISA RAID
controller will be replaced by a PCI RAID controller at 132 MB/sec.
As for the Oracle structures, the data and index tablespaces are placed
on the disks behind this controller, while the rollback segments, redo
logs and temporary segments are placed on ordinary disks without RAID 0
and 1, in order to avoid too much overhead.