SUMMARY: OT: strange hardware problem with alpha oem-board-based server

From: Horst G. Reiterer <horst_at_reiterer.net>
Date: Fri, 25 Feb 2000 15:00:11 +0100

My initial problem was that my LX164 machine was hanging at random. I
suspected a SCSI-related problem, as the SCSI components were the only ones
I hadn't yet replaced.

Thanks go to:

    Russell G Auld <rauld_at_grove.ufl.edu>

who pointed out that the problem could actually be bad SCSI cabling or
termination, and added that I should check whether the terminator is ACTIVE,
as ACTIVE ones tend to be more reliable.

I replaced the storage hardware, SCSI cabling and terminators to make sure
that they couldn't be the source of the problem.
Then I installed Tru64 in order to check whether the hangs could be related
to the software we're using. Tru64 enlightened me: the reason for the hangs
was a defective memory module. After the installation I added the remaining
2 DIMMs, and the system stopped at boot time with a memory access error:

...
trap: invalid memory read access from kernel mode

    faulting virtual address: 0x0000000000000018
    pc of faulting instruction: 0xfffffc00005cf43c
    ra contents at time of fault: 0xfffffc00005a2b5c
    sp contents at time of fault: 0xfffffe04141679c0

panic (cpu 0): kernel memory fault
...

Although SCSI doesn't seem to be the cause after all, I want to thank Russell
again for providing me with this valuable info!


cheers,

    Horst Reiterer
Received on Fri Feb 25 2000 - 14:00:15 NZDT