Help with my machine

From: Eugene Chu <chu_at_musp0.Jpl.Nasa.Gov>
Date: Sun, 25 Jun 1995 03:08:37 -0700

An urgent plea for help:

My system just started to act really weird, and I suspect it has something
to do with either the new memories or the new disk that I just added.
The system is a 3000/500 running OSF 1.3. It had 128 MB of RAM consisting
of 32 DEC and DATARAM SIMMs, and was upgraded by removing one set of 8 SIMMs
and replacing it with 8 higher-density (128 MB) DATARAM SIMMs, bringing
total capacity to 224 MB. The /, /usr, and /var partitions were on the
original RZ26 1 GB drive, now moved to a DEC DSP3210 2 GB drive.
Other things on the system: DEFTA FA, and Bit-3 Turbochannel half of
their TC to VME bus converter.

The problems: first, a mysterious crash during some disk dumps. I have
the crash-data file, which said something about a kernel memory fault.
But I got those before when I tried to do too large a read operation
through the SCSI CAM driver from our custom hardware.
Now (just a few minutes ago) various pieces of system software started
to core dump with the message:

inst fault=4, status word=8, pc=3ff800e80a4
Illegal instruction

Programs such as su, inetd, ftp, and a few others produce
messages like this. This particular message was produced by ls -l,
but plain ls works fine.

Right now, I can't do much with the system, as I only have one
non-privileged user logged into it. I can't rlogin or telnet into it,
I can't su, and I can't log into the console as root to shut it down.

From the one logged-in user account, I've looked at all the log messages
in syslogd.dated and uerf, and found no errors except secondary symptoms
(such as mail or inetd not working).

After the first crash, I ran the various runnable tests from the
console, and they all passed except for the NI test, which complained of
a problem with a loopback test. It eventually passed as well, and I
booted the system back up. Could this have something to do with illegal
instructions?

So, can anyone tell me what kind of schizophrenia my machine has
acquired? Is it the new memory or the new disk that has been corrupted?
Could the corruption be caused by a device driver? (I find this
unlikely, since all the drivers were used before addition of the new
hardware, and the symptoms started showing up afterward.)

Any info would be appreciated.

eyc
Received on Sun Jun 25 1995 - 12:46:42 NZST