SUMMARY: Memory Channel questions

From: Eugene Chu <chu_at_musp0.Jpl.Nasa.Gov>
Date: Sat, 17 Jan 1998 12:28:48 -0800

The short answers to my questions are:

1. The current MCI hub is a simple hub, but a new version with full
    switching capability is on its way. With a switching hub, the MCI
    would probably provide slightly better performance than HIPPI.
    However, current TCP/IP over MCI may run at only FDDI speeds
    (about 1/8 of the peak channel speed).

2. The different classes of TruCluster software licenses correspond to
    the different classes of machines that will run the software. DEC
    defines the 8000 class as enterprise, the 4000 class as
    departmental, and the 1000 class as workgroup. I don't know
    whether there is a class for workstations.

3. For our applications, which currently run under MPI on an IBM SP2,
    the easiest transition would be to another system running MPI. An
    optimized MPI, such as DEC's, would make more efficient use of the
    machine, especially if it were coupled with a fast interconnect
    such as the MCI. We're currently experimenting with a public
    domain MPI library on a couple of AS 500/500 workstations
    connected through 100BaseT, which actually runs faster than FDDI.
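
For what it's worth, the test we're running on those two workstations
is just a simple MPI ping-pong between two ranks. Below is a minimal
sketch of it; the 1 MB message size and the 50 repetitions are
arbitrary choices, and building with mpicc / launching with mpirun
assumes the public domain MPICH we happen to be using. A vendor MPI
would have its own build procedure.

/* pingpong.c -- rough MPI bandwidth check between two ranks.
 * A sketch of the sort of test we run over 100BaseT between the two
 * AS 500/500 boxes; the message size and repetition count are
 * arbitrary.  Compile with the MPI wrapper (e.g. mpicc pingpong.c)
 * and run on exactly two processes (e.g. mpirun -np 2 a.out).
 */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, nprocs, i;
    const int reps = 50;
    const int nbytes = 1 << 20;          /* 1 MB per message */
    char *buf;
    double t0, t1;
    MPI_Status st;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    if (nprocs != 2) {
        if (rank == 0)
            fprintf(stderr, "run with exactly 2 processes\n");
        MPI_Finalize();
        return 1;
    }
    buf = malloc(nbytes);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < reps; i++) {
        if (rank == 0) {
            MPI_Send(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &st);
        } else {
            MPI_Recv(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &st);
            MPI_Send(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("avg round trip %.2f ms, effective bandwidth %.2f MB/s\n",
               1e3 * (t1 - t0) / reps,
               2.0 * nbytes * reps / (t1 - t0) / 1e6);

    free(buf);
    MPI_Finalize();
    return 0;
}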

Here are the responses and some of my comments. Thanks to all who
responded.

Julian.Rodriguez_at_digital.com:
>the MC hub is a simple one. We'll have an MC crossbar switch quite soon:
>contact your sales rep.
>
>The three TruCluster licenses correspond to the three sizes of
>AlphaServers Digital provides. For instance, for a 4100 you need a
>departmental type of license, whereas a 1000 needs a workgroup one.
>
>Maybe you only need the MC Driver license, instead of using the full
>TruCluster Production Server.
  
I was thinking of doing something like this: get the MCI and drivers,
and run my apps with DEC's MPI.
  
k.mcmanus_at_gre.ac.uk:
>I have three 4100's, each with four processors, interconnected with
>memory channel. We do not use TruCluster, as we seek the best parallel
>number-crunching performance, and so run MPI (and PVM) between the three boxes.
>As I understand it, and my speed-up results demonstrate, the MC hub
>is most certainly a hub. At the time of purchase we wanted a switch
>but this was not yet available. Perhaps DEC or whoever it is that
>makes memory channel has managed to make one now. With only three
>boxes it really doesn't make a great deal of difference, but if the
>hub were pushed, it looks like performance degradation would show. I
>would like to know more about TruCluster, which I understand gets in
>the way of MPI, but I have yet to find the time to investigate.
>
>You don't really need the hub, in theory. I really must get around
>to ripping the interface cards out of the hub and sticking them into
>the 4100's. Each 4100 would then have two MC cards, so I could
>connect them in a ring and throw away the hub bottleneck. It seems
>that nobody at DEC has yet tried this simple experiment, and I cannot
>afford the down time necessary to do so much hardware rearrangement.
>Taking this idea further, why not build the 4100's into a hypercube?
>Now if someone were to cross my palm I could have an interesting
>little experiment.

This is an interesting idea: a connection machine built from Alpha
nodes and MCI links, sort of like the back end of a Cray T3E (er,
SGI/Cray). But it would require at least that many PCI slots, and it
would be more effective if each MCI adapter were on its own PCI bus.

>I have no experience of HIPPI, but I understand that MC
>outperforms it by a large margin. Maybe I'm mistaken here, but I
>looked closely at message passing latency and DEC with MC trashed
>the competition, that is, SGI Origin, Sun Enterprise, Dolphin and
>Myrinet.

This is what I suspected: the MCI with DEC's optimized MPI software
would be more efficient than a generic MPI implementation designed to
work across many platforms.
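
To put a rough number on the latency point, the usual approach is a
small-message ping-pong, halving the measured round-trip time. Below
is a minimal sketch along those lines; the 8-byte payload and the 1000
repetitions are arbitrary. The attraction is that the same source
builds unchanged against DEC's optimized MPI or a public domain MPI
such as MPICH; only the compile/link step differs.

/* latency.c -- small-message MPI ping-pong, reporting the one-way
 * (half round-trip) time in microseconds.  A sketch only; payload
 * size and repetition count are arbitrary.
 */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, nprocs, i;
    const int reps = 1000;
    double t0, t1, msg[1];               /* 8-byte payload */
    MPI_Status st;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    if (nprocs < 2) {
        if (rank == 0)
            fprintf(stderr, "need at least 2 processes\n");
        MPI_Finalize();
        return 1;
    }
    msg[0] = 0.0;

    /* ranks 0 and 1 bounce the message; any other ranks just wait */
    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < reps; i++) {
        if (rank == 0) {
            MPI_Send(msg, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(msg, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &st);
        } else if (rank == 1) {
            MPI_Recv(msg, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &st);
            MPI_Send(msg, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("one-way latency ~ %.1f usec\n",
               1e6 * (t1 - t0) / (2.0 * reps));

    MPI_Finalize();
    return 0;
}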

>Most other groups in my field have bought SGI and gave me funny looks
>when I announced that I was going DEC. Since then we have set up some
>benchmarks and everyone has gone quiet. It seems that the Origin
>cannot compete with the 4100 MC cluster.

I get this too; a couple of groups in our section have O2000s (and
Power Challenges), and their proponents never fail to remind me how
wonderful they think those systems are. One thing that intrigues me
about those systems is the "super HIPPI" that they claim to use as
their connection medium. It must be something they picked up from
Cray. It's ironic that 5 of the top 10 supercomputing systems in the
world are SGI/Cray T3Es, running clusters of 21164/300 processors.

Ed.Murphy_at_ussurg.com:
>Digital has an article in their Technical Journal on version 2 of memory
>channel. http://www.digital.com/DTJP03/DTJP03HM.HTM
>This paper will answer some of the questions that you brought up, but not
>all, especially in the area of performance of TCP/IP over memory channel.
>You may want to look at Fibre Channel if it's purely TCP/IP that you are
>interested in as a transport. I'd suspect that this would be lower cost
>compared with HIPPI...

Fibre Channel crossed my mind too, but again the question of an FC hub
versus a switch comes up. I'll check into that article next week.

tpb_at_zk3.dec.com (Dr. Thomas P. Blinn):
>I can't speak to the relative performance of MC vs HIPPI for TCP/IP traffic,
>but my understanding is that in the current implementation, the performance
>of TCP/IP over MC is not substantially better than TCP/IP over FDDI. There
>are other kinds of cluster traffic that don't use TCP/IP over MC and that
>would not work as needed over FDDI or a similar interface.
>
>In making a choice, you need to consider a variety of issues, and if what
>you need is an "open" network supported by multiple vendors with components
>available from several sources, then MC isn't it, and FDDI or HIPPI should
>be a lot closer. The only real "gotcha" with HIPPI is that there is no
>support for it in DIGITAL UNIX as a native network interface, so you take
>the risk that if your third party needs to re-release some software for you
>to be able to move to a new version of DIGITAL UNIX (as *might* happen going
>from V4.0x to V5.0x), you would be delayed until they were ready. With a
>Digital FDDI you
>would get the updated support in the base OS kit. But it's a manageable risk
>and you are already familiar with these kinds of issues.
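
On the TCP/IP comparison itself, the quickest sanity check is probably
a crude socket throughput test run over whichever interface is
configured (MC, FDDI, or 100BaseT), pointing the sender at the hostname
bound to that interface. Below is a minimal sketch of such a test, not
anything DEC ships; the port number, buffer size, and 64 MB transfer
size are arbitrary choices.

/* tcpperf.c -- crude TCP throughput check between two hosts.
 *   receiver:  tcpperf -r
 *   sender:    tcpperf -s <receiver-hostname>
 * A sketch only; error handling is minimal.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

#define PORT     5001
#define BUFSIZE  65536
#define TOTAL_MB 64

static double now(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1e6;
}

int main(int argc, char **argv)
{
    static char buf[BUFSIZE];
    int n;

    if (argc >= 2 && strcmp(argv[1], "-r") == 0) {
        /* receiver: accept one connection, count bytes until EOF */
        struct sockaddr_in addr;
        int lsock = socket(AF_INET, SOCK_STREAM, 0);
        int conn;
        long total = 0;
        double t0, t1;

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(PORT);
        if (bind(lsock, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("bind");
            return 1;
        }
        listen(lsock, 1);
        conn = accept(lsock, NULL, NULL);
        t0 = now();
        while ((n = read(conn, buf, BUFSIZE)) > 0)
            total += n;
        t1 = now();
        printf("%ld bytes in %.2f s = %.1f MB/s\n",
               total, t1 - t0, total / (t1 - t0) / 1e6);
        close(conn);
        close(lsock);
    } else if (argc >= 3 && strcmp(argv[1], "-s") == 0) {
        /* sender: connect to the receiver and stream TOTAL_MB of data */
        struct hostent *h = gethostbyname(argv[2]);
        struct sockaddr_in addr;
        int sock;
        long left = (long)TOTAL_MB * 1024 * 1024;

        if (h == NULL) {
            fprintf(stderr, "unknown host %s\n", argv[2]);
            return 1;
        }
        sock = socket(AF_INET, SOCK_STREAM, 0);
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        memcpy(&addr.sin_addr, h->h_addr_list[0], h->h_length);
        addr.sin_port = htons(PORT);
        if (connect(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("connect");
            return 1;
        }
        memset(buf, 0, BUFSIZE);
        while (left > 0) {
            n = write(sock, buf, left > BUFSIZE ? BUFSIZE : (int)left);
            if (n <= 0)
                break;
            left -= n;
        }
        close(sock);
    } else {
        fprintf(stderr, "usage: %s -r | -s <host>\n", argv[0]);
        return 1;
    }
    return 0;
}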

The software compatibility issue is always present; we've often delayed
upgrading our OS because of questions about whether third-party
software or drivers would work with the latest version.