First, I neglected to say that all 9 ES40s were running
the same copy of the app from an NFS-mounted disk.
Second, no errors were reported in messages.
Third, I recompiled the driver for the app and "all is well".
It runs on all the nodes again.
That probably means one bit somewhere in memory is bad, and recompiling
just moved things enough to miss the bad bit.
Now all I have to do is convince Compaq to swap out all
the memory boards (fortunately the machine only has 512K in it).
Thanks to:
William H. Magill
Octave Orgeron
Dr. Thomas.Blinn
Selden E Ball Jr
Peter.Stern
Joe Fletcher
Original question:
>Hi,
>
>I've got a real puzzler. I have an application (Gaussian98)
>which has suddenly started core dumping on only one of my
>9 identical ES40's. I've rebooted, power cycled, threatened
>but it still core dumps the instant I invoke the app. Ladebug
>won't even let me examine the core, it dies on a bus error.
>
>Should I suspect hardware?
>
>thanks,
>jim
----------------------------------------------------------------------------
James W. Caldwell (voice) 415-476-8603
Department of Pharmaceutical Chemistry (fax) 415-502-1411
Mail Stop 0446 (email) caldwell_at_heimdal.ucsf.edu
513 Parnassus Avenue (video) farbauti.compchem.ucsf.edu
University of California (netmeeting)
San Francisco, CA 94143-0446
----------------------------------------------------------------------------
Received on Mon May 07 2001 - 20:35:34 NZST