SUMMARY: kernel panic--how to analyze crash dump?

From: <kevin_at_meso.com>
Date: Mon, 01 Feb 1999 10:11:03 -0500

Many thanks to the following:

Tom Blinn
Whitney Latta
Chris H. Ruhnke
Joel Gallun

The kernel panic almost certainly was not a hardware problem; rather,
it originated in the NFS code. The recommendation was to apply patch
kit #8 (BL10) for V4.0B, which we had not installed at the time of the
panic.

I received some helpful pointers on analyzing the text crash dump
file (/var/adm/crash/crash_data.0 in my case). Some of the most
useful information in this file is in the stack trace; here's
what mine looked like:

Begin Trace for machine_slot[paniccpu].cpu_panic_thread:
> 0 boot(0x0, 0xfffffc0002c0e580, 0x2c0000002c, 0x35, 0x1)
["../../../../src/kernel/arch/alpha/machdep.c":2466, 0xfffffc00003ced18]
   1 panic(s = 0xfffffc00004e79f0 = "kernel memory fault")
["../../../../src/kernel/bsd/subr_prf.c":791, 0xfffffc000027a25c]
   2 trap() ["../../../../src/kernel/arch/alpha/trap.c":1512,
0xfffffc00003d6bb8]
   3 _XentMM(0x0, 0xfffffc000030aaa0, 0xfffffc000050c830, 0x0, 0x0)
["../../../../src/kernel/arch/alpha/locore.s":1424, 0xfffffc00003cacc4]
   4 nfs_getpage(0x0, 0xfffffc0000f844b8, 0x1c4f, 0x0, 0x0)
["../../../../src/kernel/nfs/nfs_vnodeops.c":3295, 0xfffffc000030aa9c]
End Trace for machine_slot[paniccpu].cpu_panic_thread:

This means that the kernel was in the nfs_getpage routine when it died.
The crash log lists the processes running at the time of the crash, and
also lists the "_current_pid", which in this case was tar; a user was
indeed tarring files from an NFS-mounted file system when the crash
occurred.
--Kevin Tyle <kevin_at_meso.com>
MESO, Inc.

Original Post:

> Hi managers,
>
> After nearly 2 years of dutiful service, one of our AlphaPC 500 MHZ
> machines running 4.0B panicked, with
> "trap: invalid memory read access from kernel mode"
> announcing the crash in the messages file. The three crash files
> appeared in /var/adm/crash, but I have no idea what to make of them.
> I've seen in past summaries that this message may or may not indicate
> a hardware problem--we haven't started running any new software in
> a while.
>
> Can anyone give me some guidance in how to make sense of the crash data?
> The text file isn't much help, at least at first glance.
Received on Mon Feb 01 1999 - 15:13:01 NZDT