Admins,
There were a few responses to the original question on problems or successes with
Wildfire implementation.
The vote count is 3 to 1 in favor of good performance with few implementation
problems.
The negative vote:
I am a bad person to ask that question to...... if you are running a
database, don't go with this. We have found so many issues with 5.1 and the
Wildfire system that we have already pushed the migration off by two
weeks because Compaq support can't get us answers.
The big thing we have found is that as long as the CPU and the memory it
needs to access are in the same QBB you are OK, but if it has to make a remote
memory call you will take a 3x hit (about 600 ns). I will forward some docs we
have on all of this stuff. I would read it all carefully and note the
"unhappy story" in one of the docs. I wish we had this before we spent xxxx.xx
on this hardware, since it will not run any faster than our GS140,
and if you go to the big GS320 it could be even slower.
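That 3x local/remote figure is easy to sanity-check on your own box. Here is
a minimal sketch (mine, not from the forwarded docs) of a standard
pointer-chasing microbenchmark in portable C; run one copy with the process
and its memory in the same QBB and one forced remote (Tru64 5.1's runon(1)
can bind a process to a processor or RAD; check the man page on your
version) and compare the reported figures.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N     (1 << 22)   /* 4M pointers * 8 bytes = 32 MB, past any cache */
    #define STEPS (1 << 24)   /* 16M dependent loads */

    /* rand() may return as few as 15 bits; combine two calls */
    static size_t xrand(void)
    {
        return ((size_t)rand() << 15) ^ (size_t)rand();
    }

    int main(void)
    {
        size_t i, j, t;
        size_t *chain;
        clock_t t0;
        double secs;

        chain = malloc(N * sizeof *chain);
        if (chain == NULL)
            return 1;

        /* Sattolo's shuffle: a random single-cycle permutation, so the
           walk below visits all N slots and defeats the prefetcher */
        for (i = 0; i < N; i++)
            chain[i] = i;
        for (i = N - 1; i > 0; i--) {
            j = xrand() % i;
            t = chain[i]; chain[i] = chain[j]; chain[j] = t;
        }

        /* every load depends on the previous one, so elapsed time
           divided by STEPS approximates one memory access */
        t0 = clock();
        for (i = 0, j = 0; i < STEPS; i++)
            j = chain[j];
        secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

        printf("~%.0f ns per access (sink %lu)\n",
               secs * 1e9 / STEPS, (unsigned long)j);
        free(chain);
        return 0;
    }

If the 3x claim holds, the remote run should report roughly three times the
nanoseconds of the local one.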
The documentation was very helpful... I would recommend anyone interested
contact Compaq and ask for the training documentation for the Wildfire...
The positives went like this:
In general, the answer is yes, even if you have 700 MHz EV67s with 8 MB cache
in the 8400s, simply because the architecture in the QBBs is better. The SMP
has lower overhead and the memory bandwidth is (much) better compared to the
8400. Of course there are two levels of performance with QBBs, the local (and
very good) and the remote QBBs (where access has to go through the switch). The
OS will generally try to keep things local.
Used correctly, this local/remote difference is a good thing because you
get really fast local access, and it's usually a good idea to consider this
when implementing the solution (partition your application where possible).
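One hedged illustration of such partitioning (the runon syntax here is from
memory of the Tru64 5.1 NUMA documentation, and start_instance is a
hypothetical wrapper script; verify against runon(1) on your own system):

    # On a Wildfire, one RAD corresponds roughly to one QBB.  Binding
    # each workload to its own RAD keeps its CPUs and the memory it
    # allocates together, so every access is the fast local case.
    runon -r 0 ./start_instance prod_a
    runon -r 1 ./start_instance prod_b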
The next thing is to look at the type of jobs/applications. A few very fast
CPUs will run a few single-threaded processes very fast. In my experience
this is what matters in most companies. The users don't care about
the ability to run 2000 processes concurrently, they care only about the
execution time of their own process, and a single 1 GHz CPU is usually much
better at this than two 500 MHz CPUs unless it's 100% utilized.....
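To put rough numbers on that point (mine, not the poster's): a serial job of
10^9 cycles runs in

    1 x 1000 MHz : 10^9 cycles / 10^9 cycles/s       = 1.0 s
    2 x  500 MHz : 10^9 cycles / (5 x 10^8 cycles/s) = 2.0 s  (second CPU idle)

The two slower CPUs only pull ahead once there are at least two such jobs to
run at the same time.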
I do not want to speak too much out of turn here, but we are doing something
similar to what you ask. The solution is new, but is modeled after another
customer's solution. We are running a GS-320 with 64 GB of RAM and about 20 TB
of fiber storage (modular, in the ESA12000 cabinet). From the conversations
that I have been privy to, we can expect a 7x-10x overall increase over the
8400 (2-node cluster w/ SCSI storage (Z70's)), and we have gotten 128%
parallelism with Oracle......
Just a quick note. We have made the transition here, but it was on our
OpenVMS platforms and not the Tru64 UNIX machines. On the OpenVMS boxes we
enjoyed a smooth transition, experiencing significant performance gains for
the application with the new hardware. Volume shadowing operations (host-
based mirroring) have received a significant boost (indicating vastly
improved I/O operation with the new Fibre Channel SANs), and CPU-intensive
tasks were also greatly boosted.
We were very fortunate in that we were able to utilize our test
cluster and hot-standby machines to build our new cluster and make all
configuration changes in advance of the switchover. We also performed the
application migration over a holiday (Easter) weekend, and hence had plenty
of leeway for our large DB restores.....
The first case was a server consolidation. For a certain application we used
six AlphaServers (4 production and 2 test and development), which we have
migrated to one GS160. The "old" AlphaServers, running Tru64 4.0-D, were
giving us performance problems and there was no room to upgrade the systems.
Therefore we looked at the GS160 and went to XXXXX with our applications
to test them on Tru64 5.1 on a Wildfire.
In XXXXX we had some small problems with some of our C programs, but
besides that we could run all of our applications fairly easily on the new
Tru64 5.1 version. I must mention, however, that we ran all our applications
on stand-alone 'domains' within the WF, each consisting of one or more QBBs.
Eventually (at the end of the proof of concept) we decided to implement a
TruCluster solution within the WF, to meet the business requirements on
downtime etc. Because it was the end of the POC we were not able to test
this in the lab.
Back in our home country we also added the goal of migrating from SCSI to our
fiber-optic-based SAN (based on EMC). Eventually the assignment became to
migrate the current stand-alone systems based on Tru64 UNIX 4.0-D to a GS160
with Tru64 v5.1. We planned to cluster three QBBs for
all production applications. The fourth QBB was intended to be the test and
development node. Furthermore, all QBBs were connected to the SAN and were to
use the SAN disks for booting as well as regular data storage.
So this was quite an ambitious assignment, as you can imagine. I must say
we had quite some problems along the migration path. To mention the most
important:
* Booting from SAN disks gave some problems, because knowledge about how
to do this was scarce (booting an AlphaServer node from an EMC disk on the
SAN was only officially supported by EMC at the end of 2000); see the
console sketch after this list.
* Not all layered software products were supported on Tru64 v5.1, although
most of them did work correctly on the new version. To avoid problems later
on, we decided to wait for the officially supported versions. Compaq states
that v5.1 is binary compatible with v4.0-x, but you have to make sure you
test every application (and relink from the sources if possible) to be
absolutely sure.
* Migrating to a cluster does require some good thinking and planning. I
think we've somewhat under-estimated this part ....
* TruCluster v5.1 is the first "production approved" TruCluster version.
We ran into some bugs, which are now mostly solved in PK3 for Tru64 v5.1.
* The SAN gave us some problems, so setting up your SAN is also something
to give some good thought.
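For the record, on the SAN-boot bullet above: the usual drill on the
AlphaServer SRM consoles of that generation was the wwidmgr utility. A
hedged sketch (the UDID and the resulting dga... boot path are purely
illustrative, since they depend on how the unit was set up on the HSG80 or
EMC side, and some models require 'set mode diag' before wwidmgr will run):

    P00>>> set mode diag
    P00>>> wwidmgr -show wwid
    P00>>> wwidmgr -quickset -udid 1
    P00>>> init
    P00>>> show device
    P00>>> set bootdef_dev dga1.1001.0.1.1
    P00>>> boot

The init is needed so the console picks up the new wwid entries before you
set bootdef_dev.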
Right now we are running the GS160 to our satisfaction. Performance is
better on 3 QBBs than on 4 AlphaServer 8200s, even with more applications on
the QBBs. The WF CPUs (EV67, 733 MHz) were said to deliver 100% better
performance, and I think this is almost the case.
We also installed a GS320, which until now we are only using as a database
server. This machine is also connected to fiber-optic storage (StorageWorks
disks, ESA12000 with HSG80's and Brocade switches). We are still planning to
move more applications to this machine, but are waiting for the certification
of some essential middleware products on Tru64 v5.1. The machine performs
well as a database server, as you would expect. And this machine was
installed from scratch, so we didn't have any of the migration problems
described with the GS160....
All names and references were stricken from the report to provide privacy...
Lee Brewer
Received on Mon Jun 11 2001 - 14:49:53 NZST