Well, after much heartache and sweat, the problem has been resolved. My
original question was:
I've got a 2100 4/275 with two CPUs that's operating as an
application server for our SAP environment (it's running V3.2A).
Intermittently, it will lock up and stop accepting any kind of requests
(NFS, telnet, ping, SAP, etc). It won't even talk to the console. Once
the system has been reset and rebooted it will be fine for anywhere from
an hour to a week. There's nothing in the uerf at all. My question
is, is there a way to force a dump so I can see what's
going on? Has anybody run into this type of problem before?
As it turned out, a flaky 512MB memory board was causing the
problem. When I ran the system exercisers on the hardware, the memory
module passed the tests (even when run continuously overnight). The
next morning, after I rebooted the system, it actually started to see
problems in the memory module, and after about six power cycles the
module started failing its power-up tests. So the rule here, I guess,
is: don't always believe the test results.
My thanks go to those who suggested hitting the halt button and typing
crash at the console prompt, and checking my swap mode. They include:
Siggy Hensel <siggy%aed.uucp_at_Bonn.Germany.EU.net>
Phil Krause <phil_at_wolf.ncat.edu>
Jeff Finkelstein <finkels_at_alf.dec.com>
Matt Thomas <thomas_at_lkg.dec.com>
Wayne Blom <wayne.blom_at_faulding.com.au>
Hellebo Knut <Knut.Hellebo_at_nho.hydro.com>
Karl Bradshaw
System Programmer (et al.)
Agrium Inc.
(403) 258-5721
bradshac_at_cuug.ab.ca
Received on Mon Sep 18 1995 - 20:05:24 NZST