SUMMARY: Kernel memory fault crashes AlphaServer 8400

From: Lucia <lucia_at_calliope.cccfc.uam.es>
Date: Wed, 15 Oct 1997 10:15:41 +0200

Dear Managers:

Thanks to all who replied:

Susan Rodriguez <SUSROD_at_HBSI.COM>
Paul Kitwin <PAUKIT_at_HBSI.COM>
"Charles M. Richmond" <cmr_at_iisc.com>
Nick Hill <NMH1_at_axprl1.rl.ac.uk>

The "kernel memory fault" crashes were probably due to the #4
DU 4.0B patch distribution. Tomorrow I'm applying the September
set, but this time I will not install all of the patches, even
if DEC support tells me to do so.

I attach my original posting and some of the advice I received.

Regards,

Lucia.
lucia_at_calliope.cccfc.uam.es


-----------------------------------------------------------------
My original posting:

>
> Our AlphaServer 8400 running Digital UNIX 4.0B (with a
> variety of patches), with 2GB RAM, 4.3GB swap, has been
> crashing repeatedly and producing similar panic strings:
>
> ------------------------------------------------------------
> trap: invalid memory read access from kernel mode
>
> faulting virtual address: 0x00000000002a38c8
> pc of faulting instruction: 0xfffffc0000263ff8
> ra contents at time of fault: 0xfffffc00004856f4
> sp contents at time of fault: 0xffffffffb50cf288
>
> panic (cpu 3): kernel memory fault
> ------------------------------------------------------------
>
> We have just installed all patches from the August distribution
> (following DEC's advice) plus four kernel modules from the
> TotalView Beta Test patches (arch_alpha.mod, procfs.mod,
> std_kern.mod, vm.mod).
>
> Does the kernel memory fault have to do with the patches? We'd
> never had this problem before the patches were installed.
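
For readers comparing several crash logs, the panic block quoted above can be pulled apart mechanically. The following is a minimal sketch, not a diagnostic tool: the function names are made up for illustration, and treating an address as a kernel address when its top 16 bits are all set is an assumption based only on the shape of the pc, ra and sp values in this one panic.

```python
import re

# The panic text exactly as quoted in the original posting.
PANIC_TEXT = """\
trap: invalid memory read access from kernel mode

faulting virtual address: 0x00000000002a38c8
pc of faulting instruction: 0xfffffc0000263ff8
ra contents at time of fault: 0xfffffc00004856f4
sp contents at time of fault: 0xffffffffb50cf288

panic (cpu 3): kernel memory fault
"""

# Assumption for illustration only: an address whose top 16 bits are
# all ones is treated as lying in kernel address space.
KERNEL_MASK = 0xFFFF000000000000


def parse_panic(text):
    """Return a dict mapping each 'label: 0x...' line to an integer."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?):\s*(0x[0-9a-fA-F]+)\s*$", line)
        if match:
            fields[match.group(1)] = int(match.group(2), 16)
    return fields


def looks_like_kernel_address(addr):
    """Heuristic check (see the assumption above)."""
    return (addr & KERNEL_MASK) == KERNEL_MASK


if __name__ == "__main__":
    for label, value in parse_panic(PANIC_TEXT).items():
        kind = "kernel-like" if looks_like_kernel_address(value) else "low"
        print(f"{label:32s} {value:#018x}  {kind}")
```

Under that heuristic the pc, ra and sp values look like kernel addresses while the faulting address does not, which is consistent with the "invalid memory read access from kernel mode" message but does not by itself identify the failing patch.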

The advice I received:

From SUSROD_at_HBSI.COM:

** Don't forget to rebuild the kernel at the end.

> - Did you rebuild your kernel after applying patches? That was my
> first mistake when applying the jumbo patch for 4.0B (August
> distribution). I didn't build a new kernel and had a lot of problems. I
> posted a question about one of the problems I was having (with
> ddr.dbase), and most of the answers I got said to rebuild my kernel.
> They were correct; however, I couldn't get a kernel built. I ended up
> rebuilding the system.

** Don't apply all of the patches; some of them could make things worse.
 
> - How many of the patches did you apply? My second mistake was putting
> on every patch that the dupatch utility recommended. I had to rebuild
> my system a second time because some of the patches were incompatible
> with the base software subsets I had chosen (like putting on advfs
> patches when I don't have advfs). dupatch is supposed to figure this
> out for you, but I don't trust it anymore.
 
> We are currently running all but 10 of the patches that are part of
> jumbo #4. One of those patches fixes the dd command (used for kernel
> builds). I strongly suspect that this patch created the problem which
> forced me to rebuild my system.
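
The check Susan describes (not applying patches for software that is not installed, such as advfs patches on a system without advfs) can be sketched as a simple filter. This is only an illustration: the patch IDs, subset names and the function name are hypothetical, and the sketch does not reproduce how dupatch itself decides what to recommend.

```python
def split_patches(patches, installed_subsets):
    """Split patches into those whose target subset is installed and
    those whose target subset is missing.

    patches: dict mapping a patch ID to the subset it modifies.
    installed_subsets: set of subset names present on the system.
    """
    applicable, skipped = [], []
    for patch_id, subset in sorted(patches.items()):
        if subset in installed_subsets:
            applicable.append(patch_id)
        else:
            skipped.append(patch_id)
    return applicable, skipped


if __name__ == "__main__":
    # Hypothetical data: none of these names come from a real patch kit.
    installed = {"base", "networking"}
    patches = {
        "patch-001": "base",
        "patch-002": "advfs",       # advfs is not installed here
        "patch-003": "networking",
    }
    applicable, skipped = split_patches(patches, installed)
    print("apply:", applicable)
    print("skip: ", skipped)
```

Reviewing the skipped list by hand before running the real patch tool is the point of the exercise; the list of installed subsets would have to be gathered from the system itself.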

From PAUKIT_at_HBSI.COM:

** The EEPROMs should be rebuilt after patch installations.

> At the SRM console (P008>> prompt), type "build -e". This will rebuild
> the EEPROMs on the CPUs. Every time you make a major (sometimes even a
> minor) change to your system, especially to hardware, you should do this.
> So, after you do the first one, type "set cpu 1" to switch to the second
> CPU and run "build -e" again, then "set cpu 2", and so on.
> WARNING: Make sure you note the console settings (bootdef_dev, etc.), as
> rebuilding the EEPROMs will wipe the settings. They are easy to reset,
> though, as long as you write them down ahead of time.
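
If the console settings are captured to a text file before and after the rebuild, the two captures can be compared to see what needs to be restored. The sketch below assumes each captured line has the form "name value" separated by whitespace; that layout, the function names and the sample values are assumptions for illustration, not the documented console output format.

```python
def parse_settings(text):
    """Parse 'name value' lines into a dict; lines without a value are ignored."""
    settings = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            settings[parts[0]] = parts[1].strip()
    return settings


def changed_settings(before, after):
    """Return {name: (old, new)} for settings that were lost or changed.

    A setting missing from 'after' is reported with new value None.
    """
    return {
        name: (old, after.get(name))
        for name, old in before.items()
        if after.get(name) != old
    }


if __name__ == "__main__":
    # Made-up sample captures; the values are not from a real system.
    before = parse_settings("bootdef_dev dka0\nauto_action boot\n")
    after = parse_settings("bootdef_dev\nauto_action halt\n")
    for name, (old, new) in changed_settings(before, after).items():
        print(f"{name}: was {old!r}, now {new!r}")
```

Anything the comparison reports can then be set again by hand at the console.
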
Received on Wed Oct 15 1997 - 10:39:27 NZDT
